Design and develop scalable data architectures and systems that extract, store, and process large amounts of data
Develop and architect scalable data pipelines and ETL processes using programming languages such as Python and Scala, frameworks like Spark, AWS services like Glue and Lambda, NoSQL databases, and serverless workflow orchestration with Step Functions (see the illustrative sketch below)
Collaborate with Software Engineers, Business Analysts, Data Analysts and/or Product Owners to understand their requirements and provide efficient solutions for data exploration, analysis, and visualization
Automate integrations and data pipelines, applying data governance standards and a variety of testing techniques
Perform unit tests and conduct reviews with other team members to make sure your code is rigorously designed, elegantly coded, and effectively tuned for performance
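For context, a minimal sketch of the kind of Spark ETL work described in the responsibilities above. The bucket paths, table schema, and column names are illustrative assumptions, not details from this posting.

```python
# Illustrative sketch only: a small PySpark ETL job (extract from S3,
# transform, load back to S3). All names and the schema are hypothetical.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def transform(orders: DataFrame) -> DataFrame:
    """Example transform: keep completed orders and add a revenue column."""
    return (
        orders
        .filter(F.col("status") == "COMPLETED")
        .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
    )


if __name__ == "__main__":
    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    # Extract: read raw data from a hypothetical S3 location.
    raw = spark.read.parquet("s3://example-raw-bucket/orders/")

    # Transform: business logic kept in a pure function so it stays unit-testable.
    curated = transform(raw)

    # Load: write the curated dataset back out, partitioned by date.
    curated.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-curated-bucket/orders/"
    )

    spark.stop()
```

Keeping the transformation in a pure function like `transform` is one way to support the unit-testing and code-review responsibility above, since it can be exercised against a local SparkSession without touching S3.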
4+ years of experience with a public cloud (AWS: Step Functions, Glue, Lambda, DynamoDB)
4+ years of experience with distributed data/computing tools (Spark)
2+ years of experience with Agile engineering practices
Utilize programming languages like JavaScript, Java, HTML/CSS, TypeScript, SQL, Python, and Scala, NoSQL databases, and serverless workflow orchestration using Step Functions
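For the serverless pieces named above (Lambda, DynamoDB, Step Functions), a minimal sketch of a Lambda task handler that could sit inside a Step Functions workflow. The table name, environment variable, and event fields are illustrative assumptions.

```python
# Illustrative sketch only: an AWS Lambda handler used as a Step Functions
# task, persisting one record to DynamoDB via boto3. Names are hypothetical.
import os

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ.get("TABLE_NAME", "example-orders-table"))


def handler(event, context):
    """Persist one record passed in as the Step Functions state input."""
    table.put_item(
        Item={
            "order_id": event["order_id"],
            "status": event.get("status", "RECEIVED"),
        }
    )
    # The returned dict becomes the output of this state in the workflow.
    return {"order_id": event["order_id"], "persisted": True}
```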