Job Description:
Position Overview: As a Lead Data Engineer, you will be a critical member of our data engineering team, responsible for designing, building, and optimizing data pipelines and infrastructure. Your primary focus will be on using your Java/Python skills to create scalable and efficient data solutions that make data accessible and reliable for a range of business needs.
Key Responsibilities:
Strong documentation skills.
Coding: Proficiency in Java.
Cluster Computing Frameworks: Proficiency in Spark and Spark SQL.
AWS Data Services: Proficiency in Lake Formation, Glue ETL (or) EMR, S3, Glue Catalog, Athena, Kinesis (or) MSK, Airflow (or) Lambda + Step Functions + EventBridge
Data De/Serialization: Expertise in at least 2 of the formats: Parquet, Iceberg, Avro, JSON_LE
DevOps: Linux Scripting, Jenkins, Git, CI/CD, Jira, TDD
AWS Data Security: Good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, and secrets management
IaC: Terraform
Qualifications:
Minimum of 9 years of experience as a Data Engineer and Analyst on any cloud platform.
Expertise in Terraform and other IaC-based services is a must.
Strong understanding of cloud-based services and delivery models.
Hands-on experience with network configuration, Kubernetes, and container orchestration.
Familiarity with DevOps practices and CI/CD solutions.
Experience with orchestration frameworks (Airflow) and serverless/FaaS cloud services.
Any Graduate