Site Reliability Engineer

Posted 6 days ago


Berlin, Berlin, Germany · IONOS EN · Full-time · 80,000 € - 120,000 € per year
At IONOS, the leading European provider of cloud infrastructure, cloud services and hosting services, you will work together with a wide range of teams. We are characterized by open structures, a friendly working culture and flat hierarchies with a strong team spirit. We firmly believe that work and fun are compatible, and offer you the right environment for this. Our constant growth means that we are always looking for new colleagues. Become part of IONOS and grow with us.

Tasks
  • Participate in architecture reviews to ensure storage infrastructure meets performance, reliability, and scalability goals.
  • Develop automation for storage provisioning, monitoring, and scaling using tools like Ansible, SaltStack, Terraform, or custom Python/Go scripts.
  • Create self-healing and alerting mechanisms for storage-related issues.
  • Implement observability for storage systems (metrics, logs, tracing).
  • Troubleshoot complex storage performance or reliability issues and participate in on-call rotations.
  • Conduct root cause analysis (RCA) for incidents and develop preventive measures.
Qualifications
  • 5+ years of experience in Linux systems engineering, storage infrastructure, or SRE roles.
  • Good understanding of RDMA, InfiniBand, and RoCE protocols.
  • Strong experience with Linux MD RAID (mdadm) and LVM.
  • Proficiency in Linux performance tuning and network stack debugging (ethtool, perf, tcpdump, ibstat, ibtop).
  • Strong scripting and automation skills (Python, Bash, Go).
  • Experience with configuration management tools such as SaltStack or Ansible, as well as monitoring tools like Prometheus, Loki, and Grafana.
Nice to have:
  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related technical field.
Location: Berlin
Note: At the end of the application process, candidates must undergo a security check. Your consent will be requested in good time during the process.
Benefits
  • Hybrid working model with home office option.
  • Flexible working hours through trust-based working hours.
  • Subsidized canteen and various free drinks at some locations.
  • Modern office space with very good transport connections.
  • Various employee discounts for activities and products.
  • Employee events such as summer and winter parties, as well as workshops.
  • Numerous training and development opportunities.
  • Various health offers, such as sports and health courses.
About IONOS

IONOS is the leading European digitalization partner for small and medium-sized businesses (SMB). The company serves around six million customers and operates across 18 markets in Europe and North America, with its services being accessible worldwide. With its Web Presence & Productivity portfolio, IONOS acts as a 'one-stop shop' for all digitalization needs: from domains and web hosting to classic website builders and do-it-yourself solutions, from e-commerce to online marketing tools. In addition, the company offers Cloud Solutions to enterprises who are looking to move to the cloud as their businesses evolve.

We value diversity and welcome all applications - regardless of, for example, gender, nationality, ethnic or social origin, religion, disability, age as well as sexual orientation and identity, physical characteristics, marital status or any other irrelevant factor subject to applicable law.
