Must-Have Hands-On Experience
Minimum 10 years of programming experience
Cloud - AWS
Spark - performance tuning
Machine learning operations (MLOps)
Key Responsibilities
- Design and develop Spark data processing pipelines that apply complex transformations to large volumes of data (see the Spark sketch after this list)
- Help teams with performance tuning of complex Spark jobs that must process huge volumes of data within tight time windows
- Develop and enhance common data processing frameworks, focusing on efficiency, scalability, and code reusability
- Implement and optimize streaming data processing pipelines using Kafka and Flink (see the streaming sketch after this list)
- Provide mentorship and technical guidance to junior team members, promoting best practices in data processing and streaming
- Stay current with the latest trends and technologies in Java, Spark, streaming and big data processing, and AWS services
- Lead code reviews, ensuring high standards for code quality and engineering practice
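For a sense of the pipeline and tuning work above, here is a minimal sketch of a Spark batch job in Java; the S3 paths, table and column names, and partition counts are hypothetical, and the right values depend on the workload:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.broadcast;
import static org.apache.spark.sql.functions.col;

public class OrderEnrichmentJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("order-enrichment")
                // Tune shuffle parallelism to the data volume rather than keeping the default of 200.
                .config("spark.sql.shuffle.partitions", "400")
                .getOrCreate();

        Dataset<Row> orders = spark.read().parquet("s3://bucket/orders/");        // large fact table (hypothetical path)
        Dataset<Row> customers = spark.read().parquet("s3://bucket/customers/");  // small dimension table

        // Broadcast the small side so the join avoids a full shuffle of the large table.
        Dataset<Row> enriched = orders.join(broadcast(customers), "customer_id")
                .filter(col("status").equalTo("COMPLETED"));

        // Coalesce before writing to avoid producing a flood of small output files.
        enriched.coalesce(64).write().mode("overwrite").parquet("s3://bucket/enriched-orders/");

        spark.stop();
    }
}
```

Broadcasting the small dimension table and sizing `spark.sql.shuffle.partitions` to the data volume are two of the most common levers when tuning joins at scale.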
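Likewise, a minimal sketch of a Flink streaming job consuming from Kafka (assumes the flink-connector-kafka dependency; the broker addresses, topic, and group id are placeholders):

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ClickstreamJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Consume raw events from Kafka (addresses and topic are placeholders).
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("broker-1:9092,broker-2:9092")
                .setTopics("clickstream-events")
                .setGroupId("clickstream-job")
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> events =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        // Placeholder transformation: uppercase each event payload and print it.
        events.map(String::toUpperCase).print();

        env.execute("clickstream-job");
    }
}
```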
Qualifications
- Bachelor’s or Master’s degree in computer science, engineering, or a related field
- 10+ years of relevant professional experience in software development, with a focus on Java, Spark, and high-volume streaming and batch data processing
- Proven experience in performance tuning of Spark jobs in large-scale data environments
- Strong background in streaming data technologies and real-time data processing using Kafka, Spark, and Flink
- Experience building common frameworks/libraries, particularly for data processing
- Exceptional problem-solving skills and algorithmic thinking
- Experience with the AWS cloud platform and its services (including SNS, SQS, EventBridge, Lambda, Glue, Lake Formation, etc.; see the SQS sketch after this list)
- Knowledge of Docker, Kubernetes and other containerization and orchestration tools
- Candidates slightly short of the above experience may still apply if they have MLOps experience, with specific expertise in implementing and managing offline/online/in-line feature stores using vendor products such as SageMaker, Tecton, Feast, or Databricks (see the feature-store sketch below)
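As an illustration of the AWS integration work above, a minimal sketch that publishes a message to SQS with the AWS SDK for Java v2; the queue URL and payload are placeholders:

```java
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class SqsPublisher {
    public static void main(String[] args) {
        // Credentials resolve from the default provider chain (env vars, profile, IAM role).
        try (SqsClient sqs = SqsClient.builder().region(Region.US_EAST_1).build()) {
            SendMessageRequest request = SendMessageRequest.builder()
                    .queueUrl("https://sqs.us-east-1.amazonaws.com/123456789012/ingest-events") // placeholder
                    .messageBody("{\"eventType\":\"fileArrived\",\"key\":\"orders/2024-01-01.parquet\"}")
                    .build();
            sqs.sendMessage(request);
        }
    }
}
```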
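And for the feature-store track, a minimal sketch of an online feature lookup against SageMaker Feature Store using the AWS SDK for Java v2; the feature group name and record identifier are hypothetical:

```java
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sagemakerfeaturestoreruntime.SageMakerFeatureStoreRuntimeClient;
import software.amazon.awssdk.services.sagemakerfeaturestoreruntime.model.GetRecordRequest;
import software.amazon.awssdk.services.sagemakerfeaturestoreruntime.model.GetRecordResponse;

public class OnlineFeatureLookup {
    public static void main(String[] args) {
        try (SageMakerFeatureStoreRuntimeClient client =
                     SageMakerFeatureStoreRuntimeClient.builder().region(Region.US_EAST_1).build()) {

            // Fetch the latest online record for one entity (names are hypothetical).
            GetRecordRequest request = GetRecordRequest.builder()
                    .featureGroupName("customer-features")
                    .recordIdentifierValueAsString("customer-42")
                    .build();

            GetRecordResponse response = client.getRecord(request);
            response.record().forEach(f ->
                    System.out.println(f.featureName() + " = " + f.valueAsString()));
        }
    }
}
```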