Description

Must-Have Skills:

Spark, AWS, data lakes, data pipelining, Python

Job Description:
As an AWS Data Engineer, you will play a crucial role in designing, developing, and maintaining our data infrastructure. You will be responsible for onboarding data sources into our data lake, building robust data pipelines, and ensuring data quality and governance. Your expertise in AWS services and data engineering best practices will be essential in driving our data initiatives forward.
Key Responsibilities:
• Data Onboarding: Onboard various data sources into the data lake, ensuring seamless integration and data consistency.
• Data Pipeline Development: Design, develop, and maintain scalable and efficient data pipelines using AWS services such as Lambda, Step Functions, and EMR.
• Data Registration: Register data sources and manage metadata to ensure data discoverability and accessibility (see the boto3 sketch after this list).
• Data Quality Management: Implement data quality checks and transformations to ensure the accuracy and reliability of data (see the PySpark sketch after this list).
• Data Governance: Comply with data governance principles and best practices to ensure data security, privacy, and compliance.
• Infrastructure as Code: Utilize Terraform scripting to manage and automate AWS infrastructure.
• Data Processing: Leverage Spark and other big data technologies to process and analyze large datasets.
• Orchestration: Use Airflow and Step Functions to orchestrate complex data workflows (see the Airflow sketch after this list).
• Data Modeling: Work with Snowflake, Iceberg table formats, and other data modeling tools to design and optimize data storage solutions (see the Iceberg sketch after this list).
• Collaboration: Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver high-quality data solutions.
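
For illustration only, a minimal boto3 sketch of the kind of onboarding and registration work described above. The bucket, database, table, and column names are placeholders, not references to any actual system:

```python
# Hypothetical sketch: register an S3 location with Lake Formation and
# catalog a table in Glue so it becomes discoverable in the data lake.
import boto3

lakeformation = boto3.client("lakeformation")
glue = boto3.client("glue")

# Register the S3 path with Lake Formation (service-linked role for brevity).
lakeformation.register_resource(
    ResourceArn="arn:aws:s3:::example-data-lake/raw/orders",  # placeholder
    UseServiceLinkedRole=True,
)

# Create a Glue database, then register the table metadata under it.
glue.create_database(DatabaseInput={"Name": "raw"})
glue.create_table(
    DatabaseName="raw",
    TableInput={
        "Name": "orders",
        "StorageDescriptor": {
            "Location": "s3://example-data-lake/raw/orders/",
            "InputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "OutputFormat": "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
            "SerdeInfo": {
                "SerializationLibrary": "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
            },
            "Columns": [
                {"Name": "order_id", "Type": "string"},
                {"Name": "order_ts", "Type": "timestamp"},
            ],
        },
    },
)
```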
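Likewise, a PySpark sketch of the kind of data-quality gate this role involves. The rules, paths, and column names are assumptions chosen for illustration:

```python
# Illustrative data-quality check: validate that keys are present and unique
# before promoting a dataset from the raw zone to the curated zone.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-quality-check").getOrCreate()
df = spark.read.parquet("s3://example-data-lake/raw/orders/")  # placeholder

# Rule 1: no null order IDs.
null_ids = df.filter(F.col("order_id").isNull()).count()

# Rule 2: order_id must be unique.
duplicate_ids = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

if null_ids or duplicate_ids:
    raise ValueError(
        f"Quality check failed: {null_ids} null ids, {duplicate_ids} duplicate ids"
    )

# Checks passed: publish the validated data to the curated zone.
df.write.mode("overwrite").parquet("s3://example-data-lake/curated/orders/")
```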
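A sketch of Iceberg-based data modeling with Spark SQL, assuming a Spark session already configured with the Iceberg extensions and a catalog named "lake" (catalog, schema, and table names are all assumptions):

```python
# Hypothetical Iceberg model: a partitioned fact table plus an idempotent
# merge from a staging table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-model").getOrCreate()

# Iceberg's hidden partitioning (days transform) keeps the table scannable
# without exposing partition columns to consumers.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.curated.orders (
        order_id    string,
        customer_id string,
        amount      decimal(12,2),
        order_ts    timestamp
    )
    USING iceberg
    PARTITIONED BY (days(order_ts))
""")

# Idempotent upsert from staging into the curated model.
spark.sql("""
    MERGE INTO lake.curated.orders t
    USING lake.staging.orders s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```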
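Finally, a minimal Airflow (2.4+) sketch of how such steps might be chained into a workflow; the DAG id, schedule, and task bodies are illustrative stubs:

```python
# Hypothetical daily onboarding DAG: ingest, run quality checks, publish.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_raw():
    ...  # stub: land the source extract in the raw zone


def run_quality_checks():
    ...  # stub: submit the PySpark quality job (e.g., to EMR) and fail on errors


def publish_curated():
    ...  # stub: promote validated data and update catalog metadata


with DAG(
    dag_id="orders_onboarding",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_raw", python_callable=ingest_raw)
    check = PythonOperator(task_id="quality_checks", python_callable=run_quality_checks)
    publish = PythonOperator(task_id="publish_curated", python_callable=publish_curated)

    # Each task runs only after the previous one succeeds.
    ingest >> check >> publish
```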
Required Skills and Qualifications:
• AWS Services: Proficiency with AWS Lake Formation, Step Functions, Lambda, EC2, EMR, and EKS.
• Scripting and Programming: Strong experience with Python and Terraform scripting.
• Data Tools: Experience with Jupyter Notebook, RDS, Snowflake, and Iceberg table formats.
• Big Data Technologies: Expertise in Spark, along with pipeline orchestration and transformation tools such as Airflow and dbt.
• Data Engineering: Solid understanding of data engineering principles, including ETL processes, data warehousing, and data modeling.
• Data Governance: Knowledge of data governance principles and best practices.
• Problem-Solving: Strong analytical and problem-solving skills with the ability to troubleshoot and resolve data-related issues.
• Communication: Excellent communication skills with the ability to collaborate effectively with cross-functional teams.
Preferred Qualifications:
• Certifications: AWS Certified Data Engineer – Associate, AWS Certified Data Analytics – Specialty, AWS Certified Solutions Architect, or other relevant certifications.
• Experience: Previous experience in a similar role within a fast-paced, data-driven environment.

Education

Any Graduate