Key Responsibilities:
• Design, develop, and maintain ETL processes to support data extraction, transformation, and loading between source and target data stores.
• Optimize and automate data pipelines to ensure high performance and reliability.
• Develop data workflows and pipelines using AWS Glue for data extraction and transformation tasks (see the Glue job sketch after this list).
• Manage and maintain Glue jobs, crawlers, and workflows, ensuring efficient data processing.
• Implement serverless functions for on-demand ETL processes and data integrations.
• Use Lambda for task automation and event-driven data processing as required (see the Lambda handler sketch after this list).
• Set up and manage DataSync for efficient and secure data transfers between on-premises data sources and AWS cloud storage.
• Use S3 for data storage and management, ensuring data availability, security, and optimized storage.
• Build and maintain data pipelines for AWS Aurora (PostgreSQL) and DynamoDB, ensuring data accuracy and consistency (see the DynamoDB load sketch after this list).
• Implement data archiving, backup, and disaster recovery processes for both databases.
• Extract data from MS SQL Server and integrate it into cloud storage for analytics and reporting (see the extraction sketch after this list).
• Perform data transformations and develop ETL jobs to support data integration between on-premises and cloud databases.
• Ensure data accuracy, integrity, and compliance with internal and external data policies.
• Implement data validation checks and monitor ETL job status to ensure reliable data flow (see the validation sketch after this list).
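
The sketches below give a flavor of the day-to-day work in this role. They are minimal, illustrative examples only: all database, table, bucket, and connection names are placeholders, not details of this position. First, a minimal AWS Glue job in Python that reads a table registered by a Glue crawler, applies a column mapping, and writes Parquet to S3:

    import sys
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glue_context = GlueContext(sc)
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the source table registered by a Glue crawler (placeholder names).
    source = glue_context.create_dynamic_frame.from_catalog(
        database="sales_db",   # hypothetical catalog database
        table_name="orders",   # hypothetical crawled table
    )

    # Rename/cast columns during the transform step.
    mapped = ApplyMapping.apply(
        frame=source,
        mappings=[
            ("order_id", "int", "order_id", "long"),
            ("order_ts", "string", "order_ts", "timestamp"),
        ],
    )

    # Land the transformed data in S3 as Parquet (placeholder bucket).
    glue_context.write_dynamic_frame.from_options(
        frame=mapped,
        connection_type="s3",
        connection_options={"path": "s3://example-data-lake/orders/"},
        format="parquet",
    )
    job.commit()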
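
Next, a sketch of event-driven processing with Lambda: a handler triggered by an S3 object-created notification that inspects the newly landed object. Downstream steps (triggering a Glue job, running validation) would hang off this; the logging shape is illustrative.

    import json
    import urllib.parse
    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # Each record corresponds to one S3 object-created notification.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

            # Fetch the newly landed object; further processing goes here.
            obj = s3.get_object(Bucket=bucket, Key=key)
            print(json.dumps({
                "bucket": bucket,
                "key": key,
                "bytes": obj["ContentLength"],
            }))

        return {"status": "ok", "records": len(event["Records"])}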
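
A sketch of loading records into DynamoDB in batches with boto3; the table name and item shape are hypothetical. batch_writer buffers puts and retries unprocessed items automatically, which suits bulk pipeline loads.

    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("orders")  # hypothetical table keyed on order_id

    records = [
        {"order_id": "1001", "status": "shipped", "amount": 42},
        {"order_id": "1002", "status": "pending", "amount": 7},
    ]

    # Buffered batch write; unprocessed items are retried automatically.
    with table.batch_writer() as batch:
        for item in records:
            batch.put_item(Item=item)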
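
A sketch of extracting from MS SQL Server and staging the result in S3 as CSV, assuming pyodbc connectivity; in practice credentials would come from a secrets store, not a literal connection string.

    import csv
    import io
    import boto3
    import pyodbc  # assumes the ODBC Driver for SQL Server is installed

    # Placeholder connection details; use a secrets store in real pipelines.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=onprem-sql.example.com;DATABASE=erp;UID=etl;PWD=***"
    )

    buffer = io.StringIO()
    writer = csv.writer(buffer)
    cursor = conn.cursor()
    cursor.execute("SELECT order_id, order_ts, amount FROM dbo.orders")
    writer.writerow([col[0] for col in cursor.description])  # header row
    for row in cursor:
        writer.writerow(row)

    # Stage the extract in S3 for downstream Glue/analytics jobs.
    boto3.client("s3").put_object(
        Bucket="example-data-lake",
        Key="staging/orders.csv",
        Body=buffer.getvalue().encode("utf-8"),
    )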
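
Finally, a sketch of a post-load validation check that compares source and target row counts and fails the pipeline on drift; the counts and tolerance are illustrative.

    def validate_row_counts(source_count: int, target_count: int,
                            tolerance: float = 0.0) -> None:
        """Fail the pipeline if the load dropped more rows than allowed."""
        if source_count == 0:
            raise ValueError("source returned zero rows; refusing to validate")
        drift = abs(source_count - target_count) / source_count
        if drift > tolerance:
            raise ValueError(
                f"row-count drift {drift:.2%} exceeds tolerance "
                f"({source_count} source vs {target_count} target rows)"
            )

    # Example: counts would come from the source query and the loaded table.
    validate_row_counts(source_count=10_000, target_count=10_000)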
Required Qualifications:
• 5+ years of experience in ETL development and data engineering.
• Strong experience with AWS Glue, Lambda, DataSync, S3, Aurora (PostgreSQL), DynamoDB, and MS SQL Server.
• Proficiency in SQL and data modeling techniques.
• Experience with Python or other scripting languages for ETL automation.
• Strong analytical and problem-solving skills.
Preferred Qualifications:
• AWS certifications (e.g., AWS Certified Data Analytics, AWS Certified Solutions Architect).
• Knowledge of data warehousing and data lake architectures.
Education: Any graduate (bachelor's degree in any discipline).