Job Description

Requirements

  • Experience with big data technologies: Apache Spark (Java or PySpark) for large-scale data processing.
  • Proficiency in SQL and databases: querying, managing, and manipulating data sets.
  • Knowledge of cloud platforms: data storage, processing, and deployment in a scalable environment (Azure).
Responsibilities

  • Design and implement scalable data processing pipelines using Apache Spark.
  • Develop and optimize Spark jobs for data transformation, aggregation, and analysis (see the illustrative sketch after this list).
  • Work with large datasets to extract, process, and analyze data from a variety of sources.
  • Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver solutions.
  • Implement data integration solutions that connect disparate data sources.
  • Ensure data quality, integrity, and consistency throughout the data processing pipeline.
  • Monitor and troubleshoot performance issues in Spark jobs and cluster environments.
  • Stay current with the latest developments in big data technologies and best practices.
  • Document technical designs, processes, and procedures.
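
To illustrate the kind of Spark work these responsibilities describe, below is a minimal PySpark sketch of an extract-transform-load job. It is a sketch only: the input path, output path, and the `region` and `amount` column names are hypothetical examples, not part of any actual requirement.

    # Minimal PySpark ETL sketch. The dataset, its columns (`region`,
    # `amount`), and the paths below are hypothetical examples.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sales-aggregation").getOrCreate()

    # Extract: read a large dataset from storage (e.g., an Azure Data Lake path).
    df = spark.read.parquet("/data/sales")

    # Transform: drop invalid rows, then aggregate amounts per region.
    result = (
        df.filter(F.col("amount") > 0)
          .groupBy("region")
          .agg(F.sum("amount").alias("total_amount"))
    )

    # Load: write the aggregated output for downstream consumers.
    result.write.mode("overwrite").parquet("/data/sales_by_region")

    spark.stop()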


Education

Any Graduate