Sr. ML Platform Engineer (Hybrid)

Company: CrowdStrike

Location: India - Bangalore

Type: Full-time

Posted: 2026-06-01

About this role

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed: we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. We work on large-scale distributed systems that process almost 3 trillion events per day, and this traffic is growing daily. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

We're seeking a Sr. Engineer - ML Platform (Infrastructure & Debugging Specialist) to maintain and optimize CrowdStrike's mission-critical ML infrastructure. You'll diagnose complex distributed systems issues and ensure platform reliability for infrastructure processing billions of events daily.

What You'll Do:

Platform Reliability & Debugging:

- Diagnose and resolve issues across Ray, Spark, Airflow, MLflow, JupyterHub, Kubeflow, and SLURM
- Perform root cause analysis on production incidents affecting training and inference pipelines
- Debug performance bottlenecks, resource contention, memory leaks, and scheduling conflicts
- Develop debugging tools and diagnostic frameworks

System Optimization & Performance:

- Profile and optimize Ray clusters and Spark jobs on K8s and Cloud (EMR/Dataproc)
- Troubleshoot JupyterHub spawner issues, kernel crashes, and resource allocation
- Optimize SLURM job scheduling, GPU allocation, and HPC cluster utilization

**Infrastructure & M...
