AI Engineer

Company: Haystack

Location: San Diego, CA (Remote)

Salary: $158.4k - $237.6k per year

Type: Full-time

Remote: Yes

Posted: 2026-06-05

About this role

We're hiring on behalf of a Haystack partner!


The Role

  • Lead accuracy-centric architecture and optimization for LLMs, VLMs, and multimodal AI models.
  • Drive day-zero hardware enablement on current and future AI platforms.
  • Design and implement quantization strategies that balance model quality, performance, and hardware constraints.
  • Analyze and resolve accuracy regressions and numerical stability issues across the inference stack.
  • Collaborate with performance engineers to co-optimize kernels and execution strategies.
  • Define and implement accuracy evaluation metrics and tooling for robust model deployment.

What You'll Need

  • Extensive experience with LLMs and/or VLMs in production or pre-production environments.
  • Expert-level understanding of quantization, numerics, and precision tradeoffs.
  • Deep knowledge of transformer architectures, attention mechanisms, and MoEs.
  • Proven ability to balance accuracy, performance, and hardware constraints.
  • Strong Python skills and experience across compiler, kernel, and hardware abstraction layers.
  • A Bachelor's degree with 4+ years of experience, a Master's with 3+ years, or a PhD with 2+ years, in a related field.

What's On Offer

  • Opportunity to work at the forefront of Cloud AI and foundation model inference.
  • A senior technical role with broad cross-functional impact.
  • Competitive annual discretionary bonus and opportunity for annual RSU grants.
  • Comprehensive benefits package designed to support work-life balance.

Apply via Haystack today!
