Site Reliability Engineer
1 day ago
Location: Hybrid – Cologne (Rheinauhafen); 3 days in the office, 2 remote (Tue + Thu)
Team: Engineering · Reports to CTO

Keep the world awake: build reliability at scale

ilert helps thousands of DevOps & IT teams detect, fix, and communicate incidents faster. Our platform is mission-critical: customers rely on us 24/7 to keep their always-on businesses running. As a Site Reliability Engineer at ilert, you’ll own the reliability, performance, and scalability of our core platform across AWS, Kubernetes, Kafka, and more.

Tasks

Build & operate a highly available platform
• Run and evolve our AWS-based infrastructure
• Operate and optimize self-managed Kafka, ClickHouse clusters and our Observability stack
• Ensure resilience, disaster recovery, and capacity planning across the stack

Improve reliability & performance
• Build and maintain SLOs, SLIs, error budgets, and observability dashboards
• Debug production issues across layers (networking, Kubernetes, application, DB)
• Improve performance of our ingestion pipeline

Automation & tooling
• Automate operations with Terraform, Helm, Kubernetes operators, and internal tooling
• Build tooling for safer deploys, blue/green rollouts, and automated verification
• Strengthen incident response workflows through deep collaboration with our AI SRE agent team

Security & compliance
• Implement best practices for workload isolation, secrets management, IAM, and auditability
• Support our ISO27001 posture by automating controls and hardening our infrastructure

Cross-functional impact
• Partner with Backend, AI, and Product teams to design reliable services
• Participate in on-call rotation
• Lead post-incident reviews and drive reliability improvements long-term

Requirements
• 3+ years experience as SRE, Platform Engineer, DevOps Engineer, or Infrastructure Engineer
• Strong hands-on experience with AWS, Kubernetes, Linux internals, networking, performance tuning
• Experience operating self-managed distributed systems, ideally Kafka or ClickHouse
• Strong understanding of observability
• Experience automating infrastructure with Terraform and CI/CD systems
• Fluent English (our working language); German optional

Benefits
• 🚀 Product-centric: 100% focused on solving a mission-critical pain felt by every always-on business
• 🏡 Hybrid freedom: 2 days remote by default; gorgeous Rheinauhafen roof terrace when you’re in town
• 🕒 Focus > meetings: we time-box syncs, favour async docs and protect maker time
• 🌴 28 days off, plus public holidays
• 🚲 Commute perks: subsidised public transport

ilert is a SaaS company for alerting, on-call management and status pages and helps companies to operate always-on services and respond faster to incidents.
-
Cloud Engineer
3 weeks ago
Cologne, Germany · REWE Group · Full-time
Welcome to the Home of IT. REWE digital is the home for everyone who has made IT their cause. You belong here if you are a future thinker, IT specialist, software developer, UX designer, SAP expert, system admin, technician, anything in between, or something else entirely. What matters is that you feel at home in the digital world. As...
-
Data Engineer
2 weeks ago
Cologne, Germany · Sanitätshaus Aktuell AG · Full-time
Are you a fit for us? To you, data is not just zeros and ones but the basis for good decisions? You enjoy turning chaotic legacy systems into clean data structures? You think in pipelines, not in scripts? Error handling and retry logic are a given for you, not optional? You would like to...
-
AI Product Engineer
1 week ago
Cologne, Germany · ilert GmbH · Full-time
Team: Product & Engineering • Reports to the CTO
Location: Hybrid - Cologne (Rheinauhafen) - 3 days in office, 2 days remote (Tue and Thu)
Shape the future of autonomous incident response
We’re on a mission to make downtime invisible. Thousands of DevOps and SRE teams rely on ilert to detect, resolve, and communicate incidents faster. As our first AI...
-
Cloud Engineer
4 weeks ago
Cologne, Germany · Optimus Search Limited · Full-time
About us
Join a leading tech company revolutionizing the healthcare industry with cutting-edge cloud solutions! Our client, a rapidly growing innovator in digital healthcare platforms, is seeking a Cloud Engineer with expertise in ASP.NET Core and Azure to build and manage scalable, secure cloud infrastructures that support vital medical systems. With a...
-
Cloud Engineer
3 weeks ago
Cologne, Germany · Optimus Search · Full-time
About us
Join a leading tech company revolutionizing the healthcare industry with cutting-edge cloud solutions! Our client, a rapidly growing innovator in digital healthcare platforms, is seeking a Cloud Engineer with expertise in ASP.NET Core and Azure to build and manage scalable, secure cloud infrastructures that support vital medical systems. With a...
-
Cloud Engineer
4 weeks ago
Cologne, Germany · Optimus Search · Full-time
About us
Join a leading tech company revolutionizing the healthcare industry with cutting-edge cloud solutions! Our client, a rapidly growing innovator in digital healthcare platforms, is seeking a Cloud Engineer with expertise in ASP.NET Core and Azure to build and manage scalable, secure cloud infrastructures that support vital medical systems. With a...
-
Cloud Engineer
3 days ago
Cologne, Germany · Optimus Search · Full-time
About us
Join a leading tech company revolutionizing the healthcare industry with cutting-edge cloud solutions! Our client, a rapidly growing innovator in digital healthcare platforms, is seeking a Cloud Engineer with expertise in ASP.NET Core and Azure to build and manage scalable, secure cloud infrastructures that support vital medical systems. With a...
-
Founding Sales Engineer
2 weeks ago
Cologne, Germany · ilert GmbH · Full-time
At ilert, we help companies keep their always-on digital services resilient and responsive. Our AI-first platform powers IT teams with comprehensive incident management capabilities to detect, respond, and resolve incidents before customers ever notice. We’ve done the hardest part already: built something people love. Hundreds of customers have come on...