Description

Role: Data Engineer (Databricks)
Location: 100% Remote within US (working EST hours)
Client: Health
Type: 12+ month contract
Visa: Citizen, GC, H1B, EAD (no OPT or CPT)
Rate: W2, C2C

Must Haves:
3+ years of experience working with Databricks, PySpark, SQL, Spark clusters, and Jupyter Notebooks.
Experience building data lakes using the Medallion architecture.
Understanding of Delta tables and the Delta Lake file format.
Familiarity with CI/CD pipelines and with Agile methodologies and frameworks.
Strong understanding of ETL processes, data modeling, and data warehousing principles.
Experience with Power BI and other data visualization tools is a plus.
Knowledge of cybersecurity data, specifically vulnerability scan data, is preferred.
Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.

Responsibilities:
Develop, test, and maintain large-scale data processing systems using Databricks, PySpark, SQL, Spark clusters, Delta tables, and the Medallion architecture.
Collaborate with cybersecurity analysts to understand data requirements and deliver high-quality solutions.
Work closely with the data architecture and analytics team to ensure data quality and implement data governance standards.
Design and implement ETL processes, including data cleansing, transformation, and integration, applying an understanding of the Delta Lake file format.
Build and manage data lakes following the Medallion architecture principles.
Monitor and optimize data pipelines, employing CI/CD practices for efficient development and deployment.
Collaborate with other team members to implement data analytics projects, utilizing tools like Jupyter Notebooks.
Adhere to Agile methodologies throughout the development lifecycle to promote iterative and collaborative development.

Education

Bachelor's or Master's degree in Computer Science