Responsibilities
• Design, develop, and maintain data processing pipelines using Apache Spark.
• Collaborate with data engineers, data scientists, and business analysts to understand data requirements and deliver solutions that meet business needs.
• Write efficient Spark code to process, transform, and analyze large datasets.
• Optimize Spark jobs for performance, scalability, and resource utilization.
• Integrate Hadoop, Hive, Spring, Hibernate, Kafka, and ETL processes into Spark applications.
• Troubleshoot and resolve issues related to data pipelines and Spark applications.
• Monitor and manage Spark clusters to ensure high availability and reliability.
• Implement data quality and validation processes to ensure accuracy and consistency of data.
• Stay up-to-date with industry trends and best practices related to Spark, big data technologies, Python, and AWS services.
• Document technical designs, processes, and procedures related to Spark development.
• Provide technical guidance and mentorship to junior developers on Spark-related projects.
Qualifications
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• Proven experience (10+ years) as a Spark Developer or in a similar role working with big data technologies.
• Strong proficiency in Apache Spark, including Spark SQL, Spark Streaming, and Spark MLlib.
• Proficiency in programming languages such as Scala or Python for Spark development.
• Experience with data processing and ETL concepts, data warehousing, and data modeling.