Description

Required Experience:

• Data warehousing experience with dimensional and Data Vault data models.

• Proficient in SQL, PL/SQL, and Python.

• Hands-on experience creating data pipeline ETL jobs using AWS Glue with PySpark.

• Experience creating and testing pipeline jobs locally using AWS Glue interactive sessions.

• Proficient with the PySpark DataFrame API and Spark SQL.

• Performance tuning of PySpark jobs.

• Experience using AWS Athena to perform data analysis on data lake data populated into the AWS Glue Data Catalog through AWS Glue crawlers.

• Knowledge of AWS services, e.g. DMS, S3, RDS, Redshift, and Step Functions.

• ETL development experience with tools such as SAP BODS and Informatica.

• Proficient in writing complex SQL queries to perform data analysis.

• Good understanding of version control tools such as Git, GitHub, and TortoiseHg.

Description of duties:

• Work with one or more scrum teams to deliver product stories according to priorities set by the customer and the Product Owners.

• Interact with stakeholders.

• Provide knowledge transfer to other team members.

Education

Any Graduate