Data Engineer - Databricks SME
Company: PlanIT Group, LLC
Location: Raleigh, NC (Remote)
Type: Full-time
Remote: Yes
Posted: 2026-05-12
About this role
Job Description
We are seeking a Senior Data Engineer to support our client with data ingestion, de-duplication, and data tagging for the migration of a large-scale data environment into Databricks.
The ideal candidate will also bring hands-on expertise in end-to-end data pipeline management, including data ingestion from diverse sources, de-duplication of large-scale datasets, and data tagging to support downstream analytics, governance, and machine learning workflows.
Roles and Responsibilities (Including but Not Limited To)
- Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.
- Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.
- Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.
- Assist with operationalizing the deployment and support of cloud services for ETL operations. This includes standardizing and automating processes and workflows, creating documentation and knowledge articles, and assisting Operations staff who have limited cloud experience.
- Deliver written and oral presentations to senior CIO management on the status of current efforts.
- Bring skills and experience in business management, systems engineering, operations research, and management engineering, typically with specialization in a particular technology or business application. Keep abreast of technological developments and industry trends.
- Assist with deployment, configuration, and management of the Azure cloud environment.
- Ass...