100% remote - AI Infrastructure & Inference Engineer (m/f/d), Focus on GPU & LLM
1 week ago
Job description
On behalf of our client, we are looking for an AI Infrastructure & Inference Engineer (m/f/d) with a focus on GPU & LLM.
Contract term: 5.1.26
Workload: Full-time
Location: Remote
Responsibilities
• Design, implement, and optimize LLM and multimodal inference pipelines across multi-GPU, multi-node, and distributed environments.
• Build request routing and load balancing systems to ensure ultra-low latency, high-throughput services.
• Develop auto-scaling and intelligent resource allocation to meet strict SLAs across multiple data centers.
• Evaluate and balance trade-offs between latency, throughput, and cost efficiency for diverse workloads.
• Implement traffic shaping and multi-tenant orchestration for fair and reliable compute allocation.
• Collaborate with AI researchers, platform engineers, and ML practitioners to bring new model architectures to production.
• Automate system provisioning, deployment pipelines, and operational tasks using modern DevOps and MLOps practices.
• Monitor, profile, and benchmark system-level performance for maximum GPU utilization and uptime.
• Apply best practices in system security, observability (logging/metrics/tracing), and disaster recovery.
• Contribute to open-source ecosystems and internal tooling to push the boundaries of inference performance.
• Maintain comprehensive technical documentation and participate in continuous process improvements.
Required Skills
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• 5+ years of experience in high-performance computing, GPU infrastructure, or distributed systems.
• Deep understanding of multi-GPU orchestration, workload scheduling, and distributed architectures.
• Proficiency in programming (Python or a similar language) and systems automation scripting.
• Strong background in containerization (Docker), orchestration frameworks (Kubernetes), and CI/CD pipelines.
• Familiarity with observability tools such as Prometheus, Grafana, and OpenTelemetry.
• Strong understanding of OS-level performance (multi-threading, networking, memory management).
• Clear communication skills and the ability to work collaboratively across technical teams.
Preferred Skills
• Experience with NVIDIA DGX systems, NIM, TensorRT-LLM, or high-performance inference frameworks.
• Hands-on knowledge of CUDA, NCCL, Triton, MPI, NVLink, or InfiniBand networking.
• Experience deploying GPU clusters in both cloud and bare-metal environments.
• Familiarity with open-source inference ecosystems like SGLang, vLLM, or NVIDIA Dynamo.
• Knowledge of LLM optimization techniques for inference and fine-tuning acceleration.
• Understanding of enterprise security frameworks, compliance standards, and GDPR requirements.
-
AI Engineer
7 hours ago
Remote (Germany) | Cardo AI | Full-time | €80,000 - €120,000 per year
Description
About Cardo AI
Cardo AI builds next-generation technology for asset-based finance, private credit, and structured products. Our platform powers institutional investors, banks, and fintechs with AI-driven data management, reporting, and analytics for complex debt portfolios. For more details, visit our website.
Role Overview
We are looking for a...
-
Staff AI Engineer
7 hours ago
Berlin - Remote in Europe, Germany | Bluefish AI | Full-time | €100,000 - €1,500,000 per year
About the Position
As a Staff AI Engineer, you'll serve as a technical leader for our LLM-powered products at the forefront of marketing and advertising technologies. You'll own critical architectural decisions, set quality bars, and lead multi-team initiatives that drive measurable outcomes. As our Staff AI Engineer, you will lead the vision and execution...
-
Staff AI Engineer
6 days ago
Berlin - Remote in Europe, Germany | Bluefish AI | Full-time | €90,000 - €120,000 per year
About the Position
As a Staff AI Engineer, you'll serve as a technical leader for our LLM-powered products at the forefront of marketing and advertising technologies. You'll own critical architectural decisions, set quality bars, and lead multi-team initiatives that drive measurable outcomes. As our Staff AI Engineer, you will lead the vision and execution...
-
AI Engineer Student
6 hours ago
Remote, Germany | iits-consulting | Full-time | €40,000 - €60,000 per year
About us
At iits, we offer our customers individual software solutions, because every company is unique and has its own goals and challenges. We make this possible through the technical expertise of our teams, who optimize our customers' business processes day in, day out and enable tailor-made digitalization. We have...
-
Student R&D support
1 week ago
Hybrid, Remote, Kaiserslautern, Germany | Lubis Eda | Full-time | €45,000 - €65,000 per year
Your mission
We are looking for a motivated student assistant to support our R&D team in setting up and maintaining large language model (LLM) inference environments and related API services. The role involves hands-on work with modern inference frameworks and GPU-based infrastructures, both cloud-hosted and on-premises.
Setting up, configuring, and...
-
Senior Software Engineer
1 week ago
Munich, Germany (remote) | Mitratech | Full-time | €100,000 - €150,000 per year
At Mitratech, we are a team of innovators focused on building world-class products that simplify operations in the Legal, Risk, Compliance, and HR functions. We are a close-knit, globally dispersed team that thrives in an ecosystem that supports individual excellence and takes pride in its diverse and inclusive work culture centered around great people...
-
Applied AI Engineer
6 days ago
Remote, Germany | Moss | Full-time | €90,000 - €120,000 per year
At Moss, we give finance professionals the power to automate their day-to-day and make forward-thinking decisions. Our team and culture make us unique: we're driven by impact and growth, where every one of us strives to learn and excel. Recognised by Sifted's Rising 100 and LinkedIn's Top Startups, we're here to help propel your career and together, make...
-
AI Engineer
7 days ago
Berlin (Remote), Germany | Acto | Full-time | €90,000 - €120,000 per year
Your Role
We're looking for an experienced (Senior/Staff) AI Engineer (m/f/d) to help build the next generation of agentic AI systems powered by Large Language Models (LLMs). You'll be working at the intersection of reasoning, tool-use, and memory, designing intelligent agents that operate in complex environments, interact with APIs and structured data,...
-
Senior AI Fullstack Engineer
12 hours ago
Germany (Remote); Ireland (Remote); Netherlands (Remote); Portugal (Remote); Spain (Remote); United Kingdom (Remote) | Typeform | Full-time | €80,000 - €120,000 per year
Who we are
Typeform is a refreshingly different form builder. We help over 150,000 businesses collect the data they need with forms, surveys, and quizzes that people enjoy. Designed to look striking and feel effortless to fill out, Typeform drives 500 million responses every year and integrates with essential tools like Slack, Zapier, and Hubspot. Typeform...
-
Cloud & AI Engineer (gn)
1 week ago
Germany, remote | The Quality Group | Full-time | €40,000 - €60,000 per year
Start: immediately | Level: Mid-Senior | Location: Germany, remote | Working hours: Full-time (40 h/week)
As Cloud & AI Engineer (gn), you shape the future of our digital and technical landscape: you develop reliable, automated infrastructure and bring AI workflows into production, with the goal of efficiency, scalability, and innovation in...