Principal Site Reliability Engineer (AI-first SRE)

Company: Groupon

Location: Warsaw, Mazowieckie (Remote)

Type: Full-time

Level: lead

Remote: Yes

Posted: 2026-03-04

About this role

Groupon is a marketplace where customers discover new experiences and services every day and local businesses thrive. To date, we have worked with over a million merchant partners worldwide, connecting over 16 million customers with deals across various categories. In a world often dominated by e-commerce giants, we stand out as one of the few platforms uniquely committed to helping local businesses succeed on a performance basis.


Groupon is on a radical journey to transform our business with a relentless pursuit of results. Even with thousands of employees spread across multiple continents, we still maintain a culture that inspires innovation, rewards risk-taking, and celebrates success. The impact here can be immediate due to our scale and the speed of our transformation. We're a "best of both worlds" kind of company: big enough to have the resources and scale, but small enough that a single person has a surprising amount of autonomy and can make a meaningful impact.


About The Role
Groupon is modernizing its global platform — and reliability is at the center of that transformation. We’re looking for a Principal Site Reliability Engineer to lead the evolution from reactive maintenance to predictive, AI-driven resilience.


You’ll design intelligent, self-healing systems that prevent incidents before they happen, ensuring our customers enjoy fast, secure, and reliable experiences across millions of daily interactions.


📍Hybrid work model


Key Responsibilities:

  • Architect and maintain self-healing systems with 99.9%+ availability targets.
  • Use AI/ML to automate infrastructure governance and detect configuration or IaC anti-patterns.
  • Implement adaptive SLIs/SLOs that evolve automatically from real-time data.
  • Build AIOps-based observability and auto-remediation pipelines.
  • Apply predictive modeling to forecast failures before they impact users.
  • Lead chaos, performance, and resilience testing programs.
  • Map platform and ser...
