Description

Job Overview:

We are seeking a talented and experienced AWS Redshift Data Engineer / Consultant to join our team in designing, developing, and optimizing data pipelines and ETL processes for our AWS Redshift-based data lakehouse. In this role, you will collaborate closely with cross-functional teams, leveraging your expertise in SQL, Redshift stored procedures, AWS DMS, Airflow, Python scripting, and other pertinent AWS services to ensure the seamless ingestion, integration, transformation, and orchestration of data. Your experience with complex ETL pipelines, Change Data Capture (CDC), and Slowly Changing Dimension (SCD) strategies will be instrumental in creating a scalable, high-performance data environment. By adhering to best practices and industry standards, you will collaborate with our engineering and data teams to design forward-thinking solutions.

Key Responsibilities:

  • Collaborate with data engineering and development teams to design, develop, test, and maintain robust and scalable ELT/ETL pipelines using SQL scripts, Redshift stored procedures, and other AWS tools and services.
  • Partner with our engineering and data teams to understand business requirements and data integration needs, and translate them into effective data solutions that yield high-quality outcomes.
  • Architect, implement, and manage end-to-end data pipelines, ensuring data accuracy, quality, reliability, performance, and timeliness.
  • Employ AWS DMS and other services for efficient data ingestion from on-premises databases into Redshift.
  • Design and implement ETL processes, encompassing Change Data Capture (CDC) and Slowly Changing Dimension (SCD) logic, to seamlessly integrate data from diverse source systems.
  • Provide expertise in Redshift database optimization, performance tuning, and query optimization.
  • Design and implement efficient orchestration workflows using Airflow, ensuring seamless coordination of complex ETL processes.
  • Integrate Redshift with other AWS services, such as AWS DMS, AWS Glue, AWS Lambda, Amazon S3, Airflow, and more, to build end-to-end data pipelines.
  • Perform data profiling and analysis to troubleshoot data-related issues and build solutions to address them.
  • Proactively identify opportunities to automate tasks and develop reusable frameworks.
  • Work closely with the version control team to maintain a well-organized and documented repository of code, scripts, and configurations using Git.
  • Provide technical guidance and mentorship to fellow developers, sharing insights into best practices, tips, and techniques for optimizing Redshift-based data solutions.


Qualifications and Skills:

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • Extensive hands-on experience designing, developing, and maintaining data pipelines and ETL processes on AWS Redshift, including data lakes and data warehouses.
  • Proficiency in SQL programming and Redshift stored procedures for efficient data manipulation and transformation.
  • Hands-on experience with AWS services such as AWS DMS, Amazon S3, AWS Glue, Redshift, Airflow, and other pertinent data technologies.
  • Strong understanding of ETL best practices, data integration, data modeling, and data transformation.
  • Experience with complex ETL scenarios, such as CDC and SCD logic, and integrating data from multiple source systems.
  • Demonstrated expertise in AWS DMS for seamless ingestion from on-prem databases to AWS cloud.
  • Proficiency in Python programming with a focus on developing efficient Airflow DAGs and operators.
  • Experience in converting Oracle scripts and Stored Procedures to Redshift equivalents.
  • Familiarity with version control systems, particularly Git, for maintaining a structured code repository.
  • Proficiency in identifying and resolving performance bottlenecks and fine-tuning Redshift queries.
  • Strong coding and problem-solving skills, and attention to detail in data quality and accuracy.
  • Ability to work collaboratively in a fast-paced, agile environment and effectively communicate technical concepts to non-technical stakeholders.
  • Proven track record of delivering high-quality data solutions within designated timelines.
  • Experience working with large-scale, high-volume data environments.
  • The ideal candidate possesses several years of hands-on experience working with Redshift and other AWS services, and a proven track record of delivering high-performing, scalable data platforms and solutions within the AWS cloud.

Education

Bachelor’s Degree