AI Engineer
Company: Haystack
Location: San Diego, CA (Remote)
Salary: $158.4k - $237.6k per year
Type: Full-time
Remote: Yes
Posted: 2026-06-05
About this role
We're hiring on behalf of a Haystack partner!
The Role
- Lead accuracy-centric architecture and optimization for LLMs, VLMs, and multimodal AI models.
- Drive Day-0 hardware enablement on current and future AI platforms.
- Design and implement quantization strategies that balance model quality, performance, and hardware constraints.
- Analyze and resolve accuracy regressions and numerical stability issues across the inference stack.
- Collaborate with performance engineers to co-optimize kernels and execution strategies.
- Define and implement accuracy evaluation metrics and tooling for robust model deployment.
What You'll Need
- Extensive experience with LLMs and/or VLMs in production or pre-production environments.
- Expert-level understanding of quantization, numerics, and precision tradeoffs.
- Deep knowledge of transformer architectures, attention mechanisms, and mixture-of-experts (MoE) models.
- Proven ability to balance accuracy, performance, and hardware constraints.
- Strong Python skills and experience across compiler, kernel, and hardware abstraction layers.
- A degree in a related field: a Bachelor's with 4+ years of experience, a Master's with 3+ years, or a PhD with 2+ years.
What's On Offer
- Opportunity to work at the forefront of Cloud AI and foundation model inference.
- A senior technical role with broad cross-functional impact.
- Competitive annual discretionary bonus and opportunity for annual RSU grants.
- Comprehensive benefits package designed to support work-life balance.
Apply via Haystack today!