Data Engineer - AWS/Databricks - Mid Level
Company: Acuity, Inc.
Location: Reston, VA (Remote)
Type: Full-time
Remote: Yes
Posted: 2026-04-28
Overview
Acuity, Inc. is seeking a highly skilled Data Engineer to join our Engineering Team, helping drive the design and delivery of AWS cloud-scale data platforms for federal clients. This role requires hands-on experience with Spark, Delta Lake, and distributed data pipelines on Databricks. The ideal candidate brings both engineering depth and strategic insight into enterprise data modernization.
Are you ready to use your expertise in IT Modernization, Data Enablement, and Hyperautomation to make a real difference? Join Acuity, Inc., a technology consulting firm that supports federal agencies. We combine industry partnerships and long-term federal experience with innovative technical leadership to support our customers’ critical missions.
Responsibilities
- Build and maintain scalable PySpark-based data pipelines in Databricks notebooks to support ingestion, transformation, and enrichment of structured and semi-structured data.
- Design and implement Delta Lake tables optimized for ACID compliance, partition pruning, schema enforcement, and query performance across large datasets.
- Develop ETL and ELT workflows that integrate multiple source systems into a centralized, query-optimized data warehouse architecture.
- Leverage Spark SQL and DataFrame APIs to implement business rules, dimensional joins, and aggregation logic aligned to warehouse modeling best practices.
- Collaborate with data architects and engineers to implement cloud-native data solutions on AWS using S3, Glue, RDS, and IAM for secure, scalable storage and access control.
- Optimize pipeline performance through intelligent partitioning, caching, broadcast joins, and adaptive query tuning.
- Deploy and version data engineering assets using Git-integrated development workflows and automate deployment with CI/CD tools such as GitLab CI or Jenkins.
- Monitor pipeline health, job execution, and cluster utilization using native Databricks tools and AWS CloudWatch.
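To give candidates a concrete feel for the day-to-day work, the ingestion and Delta Lake responsibilities above might look something like the following minimal PySpark sketch. All names here (the `raw_events` S3 paths, the `event_ts`/`event_id` columns, the bucket) are hypothetical placeholders, not an actual Acuity pipeline; this is an illustration of the pattern, not a definitive implementation.

```python
from datetime import datetime, timezone


def event_partition(ts_iso: str) -> str:
    """Derive a yyyy-mm-dd UTC partition value from an ISO-8601 event timestamp.

    Kept as plain Python (usable as a Spark UDF) so the partitioning rule
    can be unit-tested without a Spark cluster.
    """
    return datetime.fromisoformat(ts_iso).astimezone(timezone.utc).date().isoformat()


def run_pipeline() -> None:
    """Ingest semi-structured JSON events and append them to a partitioned Delta table."""
    # PySpark imports live inside the entry point so the pure helper above
    # stays importable in environments without Spark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("acuity-ingest-demo").getOrCreate()

    raw = spark.read.json("s3://acuity-demo-bucket/raw/events/")  # semi-structured input

    enriched = (
        raw.withColumn("event_date", F.udf(event_partition)(F.col("event_ts")))
           .filter(F.col("event_id").isNotNull())   # basic data-quality gate
           .dropDuplicates(["event_id"])            # idempotent re-runs
    )

    (enriched.write
             .format("delta")
             .mode("append")
             .partitionBy("event_date")             # enables partition pruning on reads
             .save("s3://acuity-demo-bucket/delta/events/"))
```

Partitioning by a date column derived once at ingest time is a common choice because downstream Spark SQL queries that filter on `event_date` can then skip entire partitions rather than scanning the full table.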