ODH Inc. is looking for a Data Pipeline Engineer to join our growing Data Engineering team and participate in the design and build of data ingestion and transformation pipelines based on the specific needs driven by Product Owners and Analytics consumers. The candidate should have strong knowledge of and interest in data processing, as well as a background in data engineering. The candidate will work directly with senior data engineers, solution architects, DevOps engineers, product owners, and data consumers to deliver data products in a collaborative and agile environment, and will continuously integrate and push code into our cloud production environments.
Job Description:
As a key contributor to the data engineering team, the candidate is expected to:
Build and deploy modular data pipeline components such as Apache Airflow DAGs, AWS Glue jobs, and AWS Glue crawlers through a CI/CD process.
Translate business or functional requirements into actionable technical build specifications.
Collaborate with other technology teams to extract, transform, and load data from a wide variety of data sources.
Work closely with product teams to deliver data products in a collaborative and agile environment.
Perform data analysis and onboarding activities as new data sources are added to the platform.
Apply data modeling techniques and concepts to support data consumers in designing the most efficient method of storing and retrieving data.
Evaluate innovative technologies and tools while establishing standard design patterns and best practices for the team.
Qualifications:
Required:
Experience in AWS data processing, analytics, and storage services such as Simple Storage Service (S3), Glue, Athena, and Lake Formation
Experience in extracting and delivering data from various databases such as MongoDB, DynamoDB, Snowflake, Redshift, Postgres, and RDS
Coding experience with Python, SQL, YAML, and Spark programming (PySpark)
Hands-on experience with Apache Airflow as a pipeline orchestration tool
Experience in AWS serverless services such as Fargate, SNS, SQS, and Lambda
Experience in containerized workloads and using cloud services such as AWS ECS, ECR, and Fargate to scale and organize these workloads
Experience in data modeling and working with analytics teams to design efficient data structures
Applied knowledge of working in Agile, Scrum, or DevOps environments and teams
Applied knowledge of modern software delivery methods such as TDD, BDD, and CI/CD
Applied knowledge of Infrastructure as Code (IaC)
Experience with the development lifecycle (development, testing, documentation, and versioning)