Site Reliability Engineering Manager

6 days ago


Berlin, Germany · Benker Headhunting · Full-time

Senior Manager Site Reliability Engineering 24/7

Reporting Line: Director QA
Team: approx. 15

Company Overview

A leading European cloud provider is seeking a proactive and technically capable Manager to lead the Cloud SRE 24x7 team in Berlin.

The Role

The Senior Manager Site Reliability Engineering (SRE) 24/7 will lead a high-impact operations team, drive service management, ensure service availability, promote operational excellence, and facilitate continuous improvement in a mission-critical cloud environment.

Key Responsibilities

  • Lead and manage the Cloud SRE 24x7 team, including:
      • Continuous operations support
      • On-call management and incident response
      • Daily systems operations
      • Level 2 ticket resolution
      • Managing a team of ~16 direct reports
      • Reporting into the Director of Cloud SRE
  • Ensure operational excellence through:
      • Service management (Incident, Change, Problem, Risk)
      • Implementation of operational best practices
      • Development and maintenance of runbooks and incident drills
      • Facilitation of blameless postmortems
  • Drive and support:
      • Forecasting and capacity planning
      • Load/performance testing and cost optimization
      • Operational readiness for new product rollouts
      • Consultation on system scaling and reliability engineering
  • Serve as escalation point for incidents
  • Manage and resolve client-related incidents and service requests escalated from Level 1 Customer Service
  • Communicate effectively with customers to provide updates, resolutions, and service improvements
  • Utilize Atlassian Jira for ticket tracking and Atlassian Confluence for documentation and knowledge sharing
  • Collaborate with customer service, engineering, infrastructure, and product teams
  • Uphold 24x7 availability standards and compliance requirements

Qualifications

  • Minimum 5 years managing Level 2 technical operations teams (15+ members) in 24x7 environments
  • Proven experience leading technical operations and Level 2 teams
  • Strong understanding of cloud infrastructure, DevOps, and SRE principles
  • Solid grasp of ITIL processes and service management frameworks
  • Eligible for security clearance (German citizenship or long-term residency required)
  • Familiarity with observability tools, on-call tooling, and CI/CD practices
  • Ability to manage incidents under pressure and coordinate across teams
  • Excellent communication and customer-facing skills
  • Experience handling customer tickets and service requests in high-availability environments
  • Proficiency with Atlassian Jira and Confluence
  • Knowledge of information security standards such as ISO 27001, BSI C5, or BSI IT-Grundschutz
  • Strong mentoring and conflict resolution capabilities
  • Ability to manage operational demands and business priorities

Benefits

  • Hybrid working model
  • Flexible working hours through trust-based scheduling
  • Subsidized canteen and various free drinks (at some locations)
  • Modern office space with good transport connections
  • Employee discounts for activities and products
  • Employee events such as summer and winter parties, and workshops
  • Numerous training and development opportunities



  • Berlin, Germany · Benker Headhunting · Full-time

    Senior Manager Site Reliability Engineering 24/7. Reporting Line: Director QA. A leading European cloud provider is seeking a proactive and technically capable Manager to lead the Cloud SRE 24x7 team in Berlin. The Senior Manager Site Reliability Engineering (SRE) 24/7 will lead a high-impact operations team, drive service management, ensure service...


  • Berlin, Germany · Delivery Hero · Full-time

    Job Description: We are looking for a Principal Product Manager for our Site Reliability Engineering (SRE) team, a key part of our internal Developer Platform. In this critical individual contributor role, you will define the future of how Delivery Hero builds and operates reliable, large-scale systems. You will own the SRE product strategy and roadmap,...


  • Berlin, Berlin, Germany · Delivery Hero · Full-time · €120,000 - €180,000 per year

    Job Description: We are looking for a Principal Product Manager for our Site Reliability Engineering (SRE) team, a key part of our internal Developer Platform. In this critical individual contributor role, you will define the future of how Delivery Hero builds and operates reliable, large-scale systems. You will own the SRE product...


  • Berlin, Berlin, Germany · Delivery Hero · Full-time · €100,000 - €180,000 per year

    As the world's pioneering local delivery platform, our mission is to deliver an amazing experience, fast, easy, and to your door. We operate in over 70 countries worldwide, powered by tech, designed by people. As one of Europe's largest tech platforms, headquartered in Berlin, Germany, Delivery Hero has been listed on the Frankfurt Stock Exchange since 2017...


  • Berlin, Berlin, Germany · Canonical - Jobs · Full-time · €120,000 - €180,000 per year

    Canonical is a leading provider of open-source software and operating systems for global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and...


  • Berlin, Berlin, Germany · Blackfluo · Full-time · €84,000 - €85,000 per year

    Job Description. Location: Full remote, EU timezone (CET +/- 2 hours). Start Date: As soon as possible. Languages: English required. We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of...


  • Berlin, Berlin, Germany · KOMBO · Full-time · €100,000 - €150,000 per year

    Senior Site Reliability Engineer (Database) @ Kombo. Berlin (On-site) · Full-time. TL;DR: Join Kombo as one of our first Database Reliability Engineers. You'll take ownership of our Postgres infrastructure, ensuring performance, scalability, and reliability as we grow. High impact, high autonomy, and the chance to shape Kombo's database reliability practices from...


  • Berlin, Berlin, Germany · 1KOMMA5° · Full-time · €60,000 - €120,000 per year

    1KOMMA5°: We are looking for you as an addition to our tech team in Berlin, Munich or Hamburg. 1KOMMA5° is building Germany's largest one-stop shop for sale, installation and services related to solar, heat pumps, electricity and charging infrastructure. And they are all connected. Be a part of our mission: Become a part of our mission and learn about our...


  • Berlin, Berlin, Germany · Kombo · Full-time · €80,000 - €120,000 per year

    Senior Site Reliability Engineer (Database) @ Kombo. Berlin (On-site) · Full-time. TL;DR: Join Kombo as one of our first Database Reliability Engineers. You'll take ownership of our Postgres infrastructure, ensuring performance, scalability, and reliability as we grow. High impact, high autonomy, and the chance to shape Kombo's database reliability practices from...