Description

Responsibilities

• Design, develop, and maintain data processing pipelines using Apache Spark.

• Collaborate with data engineers, data scientists, and business analysts to understand data requirements and deliver solutions that meet business needs.

• Write efficient Spark code to process, transform, and analyze large datasets.

• Optimize Spark jobs for performance, scalability, and resource utilization.

• Integrate Hadoop, Hive, Spring, Hibernate, Kafka, and ETL processes into Spark applications.

• Troubleshoot and resolve issues related to data pipelines and Spark applications.

• Monitor and manage Spark clusters to ensure high availability and reliability.

• Implement data quality and validation processes to ensure the accuracy and consistency of data.

• Stay up to date with industry trends and best practices related to Spark, big data technologies, Python, and AWS services.

• Document technical designs, processes, and procedures related to Spark development.

• Provide technical guidance and mentorship to junior developers on Spark-related projects.

Qualifications

• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

• Proven experience (10+ years) as a Spark Developer or in a similar role working with big data technologies.

• Strong proficiency in Apache Spark, including Spark SQL, Spark Streaming, and Spark MLlib.

• Proficiency in programming languages such as Scala or Python for Spark development.

• Experience with data processing and ETL concepts, data warehousing, and data modeling.

Education

Bachelor's degree