Description

Job Description: 

Position Overview: As a Lead Data Engineer, you will be a critical member of our data engineering team, responsible for designing, building, and optimizing data pipelines and infrastructure. Your primary focus will be on using your Java/Python skills to create scalable and efficient data solutions that make data accessible and reliable for a range of business needs.

Key Responsibilities: 

Documentation: Strong documentation skills.

Coding: Proficiency in Java.

Cluster Computing Frameworks: Proficiency in Spark and Spark SQL. 

AWS Data Services: Proficiency in Lake Formation, Glue ETL (OR) EMR, S3, Glue Catalog, Athena, Kinesis (OR) MSK, Airflow (OR) Lambda + Step Functions + EventBridge.

Data De/Serialization: Expertise in at least two of the following formats: Parquet, Iceberg, Avro, JSON_LE.

DevOps: Linux scripting, Jenkins, Git, CI/CD, Jira, TDD.

AWS Data Security: Good understanding of security concepts such as Lake Formation, IAM, service roles, encryption, KMS, and secrets management.

IaC: Terraform.

Qualifications: 

At least 9 years of experience as a Data Engineer and Analyst on any cloud platform.

Expertise in Terraform and other IaC-based services is a must.

Strong understanding of cloud-based services and delivery models.

Hands-on experience with network configuration, Kubernetes, and container orchestration.

Familiarity with DevOps practices and CI/CD solutions.

Experience with orchestration frameworks (Airflow) and serverless/FaaS cloud services.

Education

Any Graduate