Senior / Staff Software Engineer - Infrastructure
Company: Boundless Networks, Inc.
Location: Not specified (Remote)
Type: Full-time
Remote: Yes
Posted: 2026-03-18
About this role
### The Role
As an Infrastructure Engineer, you'll build and deploy massive computational infrastructure that positions Boundless as the leading decentralized proving network. You'll architect GPU clusters at unprecedented scale, orchestrate proving across every major blockchain, and manage the complex systems that power billions of cycles of ZK proofs daily. This role demands expertise in both bare-metal optimization and cloud-native architectures.
### What You'll Do
- Build Massive Proving Clusters: Design and deploy proving infrastructure with thousands of GPUs across both on-premises data centers and cloud services (AWS, GCP, Azure)
- Orchestrate Multi-Chain Proving: Build infrastructure that coordinates proving workloads across every major blockchain, ensuring optimal resource allocation and throughput
- Optimize Container Topology: Design and refine the topology of complex containerized services, maximizing efficiency and minimizing latency in proof generation
- Bare Metal Engineering: Work at the hardware level, optimizing GPU performance, managing CUDA installations, and tuning kernel parameters for maximum throughput
- Cloud Infrastructure: Architect highly available, auto-scaling cloud infrastructure that can dynamically respond to proving demand across multiple regions
- Release Management: Manage deployment pipelines and release schedules for complex distributed software, ensuring zero-downtime upgrades
- Performance Monitoring: Build comprehensive monitoring and alerting systems to track GPU utilization, proof generation metrics, and system health
- Cost Optimization: Implement strategies to minimize infrastructure costs while maintaining performance, including spot instance management and resource scheduling
### Requirements
- 5+ years of infrastructure/DevOps experience with 2+ years managing large-scale GPU clusters
- Experience with both on-premises compute operations and cloud platforms (AWS/GCP/Azure)
- Proficiency in infrastructure-as-code tool...