Description

Must-Have Hands-On Experience

  • Minimum 10 years of programming experience
  • Cloud: AWS
  • Spark performance tuning
  • Machine learning operations (MLOps)

Key Responsibilities

  • Design and develop Spark data processing pipelines that apply complex transformations to large volumes of data
  • Help teams performance-tune complex Spark jobs that process large volumes of data within tight time windows
  • Develop and enhance common data processing frameworks with a focus on efficiency, scalability, and code reusability
  • Implement and optimize streaming data processing pipelines using Kafka and Flink
  • Provide mentorship and technical guidance to junior team members, promoting best practices in data processing and streaming
  • Stay current with the latest trends and technologies in Java, Spark, streaming and big data processing, and AWS services
  • Lead code reviews to ensure high coding standards and practices

Qualifications

  • Bachelor’s or Master’s degree in computer science, engineering, or a related field
  • 10+ years of relevant professional experience in software development, with a focus on Java, Spark, and high-volume streaming and batch data processing
  • Proven experience in performance tuning of Spark jobs in large-scale data environments
  • Strong background in streaming data technologies and real-time data processing using Kafka, Spark, and Flink
  • Experience building common frameworks/libraries, particularly for data processing
  • Exceptional problem-solving skills and algorithmic thinking
  • Experience with the AWS cloud platform and services (including SNS, SQS, EventBridge, Lambda, Glue, Lake Formation, etc.)
  • Knowledge of Docker, Kubernetes, and other containerization and orchestration tools
  • Candidates slightly short of the above experience may still apply if they have MLOps experience, with specific expertise in implementing and managing offline, online, and in-line feature stores using vendor products such as SageMaker, Tecton, Feast, or Databricks

Education

Any Graduate