Designing and implementing high-performance data ingestion pipelines from multiple sources using Databricks on AWS
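For illustration, a minimal PySpark sketch of incremental ingestion from S3 with Databricks Auto Loader; it assumes the ambient spark session of a Databricks notebook, and the bucket paths and table name are hypothetical placeholders:

    raw_stream = (
        spark.readStream.format("cloudFiles")  # Databricks Auto Loader source
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/orders")
        .load("s3://example-bucket/landing/orders/")
    )

    (raw_stream.writeStream
        .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders_bronze")
        .trigger(availableNow=True)  # drain the current backlog, then stop
        .toTable("bronze.orders"))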
Developing scalable and reusable frameworks for ingesting large data sets and moving them from the Bronze to the Silver layer and from the Silver to the Gold layer in Databricks
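As a sketch of what the Bronze-to-Silver and Silver-to-Gold hops can look like in PySpark (the table and column names are assumptions, not a prescribed schema):

    from pyspark.sql import functions as F

    # Bronze -> Silver: enforce types, drop replayed records, apply validity rules
    bronze = spark.read.table("bronze.orders")
    silver = (
        bronze.dropDuplicates(["order_id"])
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .filter(F.col("order_id").isNotNull())
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

    # Silver -> Gold: business-level aggregate ready for consumption
    gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("lifetime_value"))
    gold.write.format("delta").mode("overwrite").saveAsTable("gold.customer_value")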
Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring data quality and consistency are maintained at all times
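One way to enforce quality before promoting data is a row-level validation gate; in this sketch the 1% tolerance, column rules, and table names are illustrative assumptions:

    from pyspark.sql import functions as F

    staged = spark.read.table("bronze.orders")
    total = staged.count()
    bad = staged.filter(F.col("order_id").isNull() | (F.col("amount") < 0)).count()

    if total == 0 or bad / total > 0.01:  # assumed 1% tolerance, not a standard
        raise ValueError(f"Quality gate failed: {bad} of {total} rows invalid")

    staged.write.format("delta").mode("append").saveAsTable("silver.orders")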
Working with event-based/streaming technologies to ingest and process data
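A minimal Spark Structured Streaming sketch for Kafka ingestion; the broker address, topic, and checkpoint path are placeholders:

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1.example.com:9092")
        .option("subscribe", "orders")
        .option("startingOffsets", "latest")
        .load()
        .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS payload")
    )

    (events.writeStream
        .format("delta")
        .option("checkpointLocation", "s3://example-bucket/_checkpoints/events")
        .toTable("bronze.events"))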
Proficient in writing Spark code in Scala, Python, or SQL
Pro-code skills (hands-on programming rather than low-code/no-code tooling)
Experience with cloud-based architectures in AWS and Azure
Experience with Big Data components such as Kafka, Spark SQL, DataFrames, and Hive
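For example, the DataFrame API and Spark SQL can be mixed over a Hive-metastore table (database and table names are illustrative):

    df = spark.table("silver.orders")       # Hive metastore table as a DataFrame
    df.createOrReplaceTempView("orders_v")

    top_customers = spark.sql("""
        SELECT customer_id, SUM(amount) AS total_spend
        FROM orders_v
        GROUP BY customer_id
        ORDER BY total_spend DESC
        LIMIT 10
    """)
    top_customers.show()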
Experience with databases and data warehousing
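A sketch of pulling a relational or warehouse table into Spark over JDBC; the URL, credentials, and table name are assumptions, with the password read from a Databricks secret scope via dbutils:

    customers = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://db.example.com:5432/sales")
        .option("dbtable", "public.customers")
        .option("user", "etl_user")
        .option("password", dbutils.secrets.get("etl", "db_password"))
        .load()
    )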
Experience reading and writing data in AWS Databricks
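For instance, reading Parquet from S3 and writing Delta back from a Databricks cluster, assuming the cluster's instance profile grants S3 access (all paths are hypothetical):

    daily = spark.read.parquet("s3://example-bucket/exports/daily/")
    daily.write.format("delta").mode("append").save("s3://example-bucket/delta/daily/")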
Working within an Agile delivery/DevOps methodology to deliver proof-of-concept and production implementations in iterative sprints
Experience working with large-scale data processing services such as Hadoop