Site Reliability Engineer
6 days ago
Tasks
- Participate in architecture reviews to ensure storage infrastructure meets performance, reliability, and scalability goals.
- Develop automation for storage provisioning, monitoring, and scaling using tools like Ansible, SaltStack, Terraform, or custom Python/Go scripts.
- Create self-healing and alerting mechanisms for storage-related issues.
- Implement observability for storage systems (metrics, logs, tracing).
- Troubleshoot complex storage performance or reliability issues and participate in on-call rotations.
- Conduct root cause analysis (RCA) for incidents and develop preventive measures.
Requirements
- 5+ years of experience in Linux systems engineering, storage infrastructure, or SRE roles.
- Good understanding of RDMA, InfiniBand, and RoCE protocols.
- Strong experience with Linux MD RAID (mdadm) and LVM.
- Proficiency in Linux performance tuning and network stack debugging (ethtool, perf, tcpdump, ibstat, ibtop).
- Strong scripting and automation skills (Python, Bash, Go).
- Experience with configuration management tools such as SaltStack or Ansible, and with monitoring tools such as Prometheus, Loki, and Grafana.
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related technical field.
Note: At the end of the application process, candidates must undergo a security check. Your consent will be requested in good time during the process.
Benefits
- Hybrid working model with home office option.
- Flexible working hours through trust-based working hours.
- Subsidized canteen and various free drinks at some locations.
- Modern office space with very good transport connections.
- Various employee discounts for activities and products.
- Employee events such as summer and winter parties, as well as workshops.
- Numerous training and development opportunities.
- Various health offers, such as sports and health courses.
IONOS is the leading European digitalization partner for small and medium-sized businesses (SMBs). The company serves around six million customers and operates across 18 markets in Europe and North America, with its services accessible worldwide. With its Web Presence & Productivity portfolio, IONOS acts as a 'one-stop shop' for all digitalization needs: from domains and web hosting to classic website builders and do-it-yourself solutions, from e-commerce to online marketing tools. In addition, the company offers Cloud Solutions to enterprises that are looking to move to the cloud as their businesses evolve.
We value diversity and welcome all applications - regardless of, for example, gender, nationality, ethnic or social origin, religion, disability, age as well as sexual orientation and identity, physical characteristics, marital status or any other irrelevant factor subject to applicable law.
-
Senior Site Reliability Engineer
6 days ago
Berlin, Berlin, Germany · KOMBO · Full-time · €100,000 - €150,000 per year
Senior Site Reliability Engineer (Database) @Kombo · Berlin (On-site) · Full-time
TL;DR: Join Kombo as one of our first Database Reliability Engineers. You'll take ownership of our Postgres infrastructure, ensuring performance, scalability, and reliability as we grow. High impact, high autonomy, and the chance to shape Kombo's database reliability practices from...
-
Senior Site Reliability Engineer
1 week ago
Berlin, Berlin, Germany · Kombo · Full-time · €80,000 - €120,000 per year
Senior Site Reliability Engineer (Database) @Kombo · Berlin (On-site) · Full-time
TL;DR: Join Kombo as one of our first Database Reliability Engineers. You'll take ownership of our Postgres infrastructure, ensuring performance, scalability, and reliability as we grow. High impact, high autonomy, and the chance to shape Kombo's database reliability practices from...
-
Site Reliability Engineer
3 days ago
Berlin, Berlin, Germany · Blackfluo · Full-time · €84,000 - €85,000 per year
Job Description
Location: Full remote, EU timezone (CET +/- 2 hours)
Start Date: As soon as possible
Languages: English required
We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of...
-
Site Reliability Engineer
4 days ago
Berlin, Berlin, Germany · Wire · Full-time · €70,000 - €95,000 per year
WHO WE ARE
We are looking for a Site Reliability Engineer / Systems Engineer to complement our Deployment Operations Team. In this role, you will build, improve, and manage our automations and deployment infrastructure to ensure the reliability, resilience, availability, and observability of our product. Join us at Wire, the leading end-to-end encrypted...
-
Senior Site Reliability Engineer
1 week ago
Berlin, Berlin, Germany · Kombo · Full-time · €80,000 - €120,000 per year
Senior Site Reliability Engineer (SRE) @Kombo · Berlin (On-site) · Full-time
TL;DR: Join Kombo as one of our first Senior SREs. You'll work on reliability, scale our infrastructure, and help define how SRE is done at Kombo, while staying hands-on. High impact, high autonomy, and the chance to shape (and later lead) our growing platform/SRE function.
Why You...
-
Site Reliability Engineer
6 days ago
Berlin, Berlin, Germany · Zattoo · Full-time · €80,000 - €120,000 per year
YOUR FUTURE, ON DEMAND
The ideal blend of stability and flexibility. A genuinely human employer that cares for people and the planet. True autonomy to shape what comes next, for us and you. This is the perfect platform to take your career where you want. Back in 2005, we pioneered Europe's first TV streaming service. Today, we're the world's first certified...
-
Site Reliability Engineer
6 days ago
Berlin, Berlin, Germany · 1GLOBAL · Full-time · €60,000 - €120,000 per year
1GLOBAL is a technology-driven global mobile communications provider dedicated to empowering enterprises worldwide to unlock the full growth potential of mobile connectivity. With a best-in-class telecom technology platform, a comprehensive suite of globally viable regulatory licenses, and privileged access to the telecom wholesale market, 1GLOBAL is...
-
Site Reliability Engineer
5 days ago
Berlin, Berlin, Germany · Hypoport hub SE · Full-time · €80,000 - €120,000 per year
As an independent company, Hypoport hub SE brings together the corporate functions for the Hypoport network, an alliance of high-performing technology companies serving the credit, real estate, and insurance industries. Hypoport hub SE is a wholly owned subsidiary of the successful...
-
Senior Site Reliability Engineer
1 week ago
Berlin, Berlin, Germany · Ageras · Full-time · €60,000 - €120,000 per year
About the Role
We're looking for a Senior Site Reliability Engineer (SRE) to join our Infrastructure team. This is a long-term position to replace a recent departure and strengthen our capacity as we scale. As part of the team, you'll play a crucial role in maintaining and improving the reliability, security, and scalability of our cloud infrastructure. You'll...
-
Site Reliability Engineer
1 week ago
Berlin, Berlin, Germany · GetYourGuide · Full-time · €90,000 - €120,000 per year
Change the way the world travels
Be part of the GetYourGuide journey and connect people with unforgettable travel experiences worldwide. Since 2009, millions of travelers have booked unique activities with us in over 12,000 cities. Our headquarters in Berlin is supported by 16 other local offices across the globe. Ready to join a diverse community of over...