As a Data Engineer, the candidate will have the unique opportunity to curate and organize data for a program advancing state-of-the-art modeling and prediction capabilities. The program has three aims: first, to discover and curate data and to develop and curate synthetic data; second, to develop machine learning (ML) models and training pipelines that detect and discern features in scenes; and finally, to inject obfuscations into the data to fool the models.
The delivery teams are driven to explore new ideas and technology, and care deeply about collaboration, feedback, and iteration. The teams follow SAFe agile practices, embrace the “automate-first” Ops ethos (DataOps/DevSecOps/MLOps), use modern tech stacks, and constantly challenge each other to grow and improve.
Assess, transform, organize, and optimize data for use by machine learning algorithms
Generate representative data sets for systems development and data science initiatives
Build data pipelines that support data scientists, engineers, and other stakeholders
Position Type
Full-time
Location
VA
Qualifications
Demonstrated experience building and maintaining ETL pipelines
Demonstrated experience working with Synthetic Aperture Radar (SAR) data
Demonstrated experience with large-scale distributed processing
Experience building data pipelines in ML frameworks. Kubeflow experience is desired
Requirements
TS/SCI clearance required
Experience
4+ years of experience designing data models and data warehouses supporting analytics, using both relational and non-relational distributed data storage systems
Experience with Apache Beam
Experience working in a fast-paced agile environment
Any graduate