Description

Experience with Big Data Technologies

  • Spark (Java or PySpark) for handling large-scale data processing.
  • Proficiency in SQL and database querying for managing and manipulating data sets.
  • Knowledge of cloud platforms (Azure) for data storage, processing, and deployment in a scalable environment.
  • Design and implement scalable data processing pipelines using Apache Spark.
  • Develop and optimize Spark jobs for data transformation, aggregation, and analysis.
  • Work with large datasets to extract, process, and analyze data from various sources.
  • Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver solutions.
  • Implement data integration solutions to connect disparate data sources.
  • Ensure data quality, integrity, and consistency throughout the data processing pipeline.
  • Monitor and troubleshoot performance issues in Spark jobs and cluster environments.
  • Stay current with the latest developments in big data technologies and best practices.
  • Document technical designs, processes, and procedures.
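The SQL querying and aggregation responsibilities above can be sketched with a small example. This uses Python's built-in sqlite3 purely as a stand-in for a real data store (in a Spark job, the same query would typically run through `spark.sql` or the DataFrame API); the `sales` table, its columns, and its values are illustrative assumptions, not part of the posting.

```python
import sqlite3

# In-memory database standing in for a real data source; the table,
# columns, and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("north", 250.0), ("south", 75.0)],
)

# Aggregation: total sales per region — the kind of GROUP BY
# transformation the data-processing bullets describe.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('north', 350.0), ('south', 75.0)]
conn.close()
```

In Spark, the equivalent aggregation over a DataFrame would be expressed as `df.groupBy("region").sum("amount")`, with the engine distributing the work across the cluster.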

Education

Any Graduate