ML Infrastructure Engineer
Company: Jobgether
Location: Not specified (Remote)
Type: Full-time
Remote: Yes
Posted: 2026-05-13
About this role
This position is posted by Jobgether on behalf of a partner company. We are currently looking for an ML Infrastructure Engineer in Germany.
Join a cutting-edge AI infrastructure environment focused on powering the next generation of machine learning and large-scale AI workloads. This role offers the opportunity to work at the intersection of GPU performance engineering, deep learning optimization, and cloud-scale infrastructure development. You will contribute directly to benchmarking and optimizing advanced GPU platforms that support training and inference for complex neural networks and AI systems. Working alongside highly skilled engineering and hardware teams, you will help drive performance improvements across compute architectures, software stacks, and distributed AI environments. The position is ideal for engineers passionate about ML systems, large-scale model performance, and infrastructure innovation. With exposure to modern AI frameworks, high-performance GPU ecosystems, and international collaboration, this role provides a strong platform for technical growth and meaningful impact within the AI industry.
Accountabilities
- Benchmark and evaluate GPU platform performance for machine learning and AI workloads across various architectures, frameworks, and software environments.
- Collaborate closely with hardware and engineering teams to profile GPU performance at system and kernel levels and identify optimization opportunities.
- Analyze, debug, and optimize training and inference workloads to improve efficiency, scalability, and overall hardware utilization.
- Conduct acceptance testing for new GPU clusters to validate performance, stability, compatibility, and operational readiness for AI workloads.
- Perform experiments across multiple GPU configurations and interconnect strategies to assess system-level scalability and performance trade-offs.
- Develop internal tools, dashboards, and reporting frameworks to visualize performance metrics, ...