Software Engineer, Data Infrastructure & Acquisition - Boston, MA, USA
Company: RemoteHunter
Location: Remote (location not specified)
Type: Contract
Remote: Yes
Posted: 2026-05-19
About this role
1. About Our Client:
The organization operates in the text-to-speech technology space, aiming to eliminate reading as a barrier to learning. Its products convert reading materials such as PDFs, books, and online documents into audio, helping users read faster and retain more information. The platform has more than 50 million users and offers applications for iOS, Android, Mac, Chrome, and the web. The company is fully distributed, with nearly 200 employees worldwide, including engineers and AI research scientists from leading tech firms and academic programs. The organization has been recognized with awards for its Chrome extension and inclusive design.
2. About the Opportunity:
The Software Engineer, Data Infrastructure & Acquisition is responsible for managing data collection to support AI model training. This role focuses on building high-quality, large-scale datasets through a combination of infrastructure, engineering, and research collaboration. The position contributes directly to advancing the organization’s AI capabilities by improving data quality, scale, and cost efficiency. The engineer will work closely with scientists and leadership to develop the dataset roadmap that powers next-generation consumer and enterprise products.
3. Responsibilities:
• Source new audio data and integrate it into the ingestion pipeline
• Operate and enhance cloud infrastructure for data ingestion using GCP and Terraform
• Collaborate with scientists to optimize cost, throughput, and quality of data acquisition
• Work with the AI team and leadership to plan the dataset strategy for future products
4. Requirements:
• BS, MS, or PhD in Computer Science or a related field
• 5+ years of software development experience
• Proficiency in Bash and Python scripting in Linux environments
• Experience with Docker, Infrastructure-as-Code, and at least one major cloud provider (GCP preferred)
• Familiarity with web crawlers...