Required Experience:
• Data warehousing experience with dimensional and Data Vault data models.
• Proficient in SQL, PL/SQL, and Python.
• Hands-on experience creating ETL data pipeline jobs using AWS Glue with PySpark (a minimal sketch follows this list).
• Creating and testing pipeline jobs locally using AWS Glue interactive sessions.
• Proficient with the PySpark DataFrame API and Spark SQL.
• Performance tuning of PySpark jobs.
• Using AWS Athena to perform data analysis on data lake tables populated into the AWS Glue Data Catalog through AWS Glue crawlers.
• Knowledge of AWS services such as DMS, S3, RDS, Redshift, and Step Functions.
• ETL development experience with tools such as SAP BODS and Informatica.
• Proficient in writing complex SQL queries for data analysis.
• Good understanding of version control tools such as Git, GitHub, and TortoiseHg.
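
For context, the sketch below illustrates the kind of AWS Glue PySpark job the requirements above describe: reading a table registered by a Glue crawler in the Data Catalog, transforming it with the DataFrame API, and writing curated output to S3 for querying in Athena. The database, table, column, and bucket names are placeholders for illustration only, not details from this posting.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext
from pyspark.sql import functions as F

# Standard Glue job bootstrapping; JOB_NAME is supplied by the Glue runtime.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Read a table that a Glue crawler has registered in the Data Catalog
# (database and table names are hypothetical).
dyf = glueContext.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# Switch to the PySpark DataFrame API for transformations.
df = dyf.toDF()
daily = (
    df.filter(F.col("status") == "COMPLETED")
      .groupBy("order_date")
      .agg(F.sum("amount").alias("total_amount"))
)

# Write results to S3 as Parquet; Athena can query this location once it
# is added to the Data Catalog (bucket and path are placeholders).
daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_orders/")

job.commit()
```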
Description of duties:
• Work with one or more Scrum teams to deliver product stories according to priorities set by the customer and the Product Owners.
• Interact with stakeholders.
• Provide knowledge transfer to other team members.
Education: Any graduate.