Site Reliability/DevOps

4 days ago

Berlin, Germany Amazon Web Services Development Center Germany GmbH Full-time

- Experience supporting cloud systems or other services
- Proficiency in troubleshooting and anticipating problems that affect the performance, reliability, or availability of software systems
- Proficiency in executing standard operating procedures and following operational best practices

Your responsibilities will encompass overseeing the launch of the ESC in 2025, working closely with global AWS teams, and influencing the evolution of AWS services and technology. A typical day in this role involves collaborating with technology leaders, contributing to the enhancement of day-to-day operations, and ensuring improvements in availability, reliability, latency, performance, and efficiency of the ESC.
The overarching goal is to deliver scalable services and ensure a high-availability experience for EU customers. If you are an experienced professional ready for a challenging and impactful opportunity, we invite you to join our efforts in building a best-in-class development engineering and operations team that aligns with AWS' commitment to customer satisfaction and continual innovation.
European Sovereign Cloud (ESC) is a part of AWS Utility Computing (UC).
AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS. Within AWS UC, Managed Operations engineers engage with AWS customers who require specialized security solutions for their cloud services.
A day in the life
You’ll spend the majority of your time operating and improving one of the largest software systems. Over the course of a week, you review the operational health of the services in your team’s care, and as soon as you figure out the cause of an anomaly, you write up an actionable bug report. As a responsible engineer, you’ve learned never to make changes to production systems without a plan, so you review and then execute changes to one of the production systems in your care, following a change management process. Later in the week, you help resolve your team’s backlog of operational issues. You round off the week by writing a script, which you share with your team, that helps reach the root cause faster for a hard problem you diagnosed earlier.
You will occasionally be required to participate in an “on-call” rotation to resolve incidents that occur out of hours.
Eligibility requirements
- Fluency in written and spoken English is required
- Successful applicants must have the legal right to work in Germany
- Amazon will provide relocation support for successful applicants relocating within the European Union

About the team
Diverse Experiences
Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
- Experience working cross-organizationally and leading strategic team efforts requiring work from multiple team members
- Experience with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)

m/f/d

Site Reliability Engineer

2 weeks ago

Berlin, Berlin, Germany 1GLOBAL Full-time €60,000 - €120,000 per year

1GLOBAL is a technology-driven global mobile communications provider dedicated to empowering enterprises worldwide to unlock the full growth potential of mobile connectivity. With a best-in-class telecom technology platform, a comprehensive suite of globally viable regulatory licenses, and privileged access to the telecom wholesale market, 1GLOBAL is...
Site Reliability Engineer

4 days ago

Berlin, Berlin, Germany Zattoo Full-time €80,000 - €120,000 per year

THE ROLE & THE SRE TEAM At Zattoo, we're building the TV platform of the future. With our ever-growing demand for unicast TV delivery, we're scaling out our custom-built infrastructure to deliver live and on-demand video at multi-Tbps scale. Because we own the full chain - from ingest, encoding/transcoding, packaging, to delivery - our engineers have the...
Site Reliability Engineering Manager

1 week ago

Berlin, Germany Benker Headhunting Full-time

Senior Manager Site Reliability Engineering 24/7 Reporting Line: Director QA Team: approx. 15 Company Overview A leading European cloud provider seeking a proactive and technically capable Manager to lead the Cloud SRE 24x7 team in Berlin. The Role The Senior Manager Site Reliability Engineering (SRE) 24/7 will lead a high-impact operations team, drive service...
Site Reliability Engineer

4 days ago

Berlin, Germany Zattoo Full-time

YOUR FUTURE, ON DEMAND The ideal blend of stability and flexibility. A genuinely human employer that cares for people and the planet. True autonomy to shape what comes next, for us and you. This is the perfect platform to take your career where you want. Back in 2005, we pioneered Europe’s first TV streaming service. Today, we’re the world’s first...
Site Reliability Engineer

2 weeks ago

Berlin, Germany Zattoo Full-time

YOUR FUTURE, ON DEMAND The ideal blend of stability and flexibility. A genuinely human employer that cares for people and the planet. True autonomy to shape what comes next, for us and you. This is the perfect platform to take your career where you want. Back in 2005, we pioneered Europe’s first TV streaming service. Today, we’re the world’s first...
Site Reliability Engineer

5 days ago

Berlin, Berlin, Germany 1KOMMA5° Full-time €60,000 - €120,000 per year

1KOMMA5° We are looking for you as an addition to our tech-team in Berlin, Munich or Hamburg. 1KOMMA5° is building Germany's largest one-stop-shop for sale, installation and services related to solar, heat pumps, electricity and charging infrastructure. And they are all connected. Be a part of our mission Become a part of our mission and learn about our...
Senior Site Reliability Engineer

2 weeks ago

Berlin, Berlin, Germany Kombo Full-time €80,000 - €120,000 per year

Senior Site Reliability Engineer (SRE) @Kombo Berlin (On-site) · Full-time TL;DR Join Kombo as one of our first Senior SREs. You'll work on reliability, scale our infrastructure, and help define how SRE is done at Kombo, while staying hands-on. High impact, high autonomy, and the chance to shape (and later lead) our growing platform/SRE function. Why You...
Site Reliability Engineer

2 weeks ago

Berlin, Berlin, Germany Zattoo Full-time €80,000 - €120,000 per year

YOUR FUTURE, ON DEMAND The ideal blend of stability and flexibility. A genuinely human employer that cares for people and the planet. True autonomy to shape what comes next, for us and you. This is the perfect platform to take your career where you want. Back in 2005, we pioneered Europe's first TV streaming service. Today, we're the world's first certified...
Site Reliability Engineer

5 days ago

Berlin, Berlin, Germany 1KOMMA5° Full-time €80,000 - €120,000 per year

1KOMMA5° We are looking for you as an addition to our tech-team in Berlin, Munich or Hamburg. 1KOMMA5° is building Germany's largest one-stop-shop for sale, installation and services related to solar, heat pumps, electricity and charging infrastructure. And they are all connected. Be a part of our mission Become a part of our mission and learn more about our...
