Description


Job Description:

  • Must be strong in Spark/Big Data, ETL, and PySpark.
  • Your expertise in Python, PySpark, ETL processes, and CI/CD (Jenkins or GitHub), along with experience in both streaming and batch workflows, will be essential to ensuring the efficient flow and processing of data for our clients.
  • Collaborate with cross-functional teams to understand data requirements and design robust data architecture solutions
  • Implement ETL processes to extract, transform, and load data from various sources (a minimal illustrative sketch follows this list)
  • Ensure data quality, integrity, and consistency throughout the ETL pipeline
  • Utilize your expertise in Python and PySpark to develop efficient data processing and analysis scripts
  • Optimize code for performance and scalability, keeping up-to-date with the latest industry best practices
  • Integrate data from different systems and sources to provide a unified view for analytical purposes
  • Collaborate with data analysts to implement solutions that meet their data integration needs
  • Design and implement streaming workflows using PySpark Streaming or other relevant technologies
  • Develop batch processing workflows for large-scale data processing and analysis
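For illustration only, the following is a minimal PySpark batch ETL sketch of the kind of extract-transform-load work described in the bullets above. The source path, bucket, table, and column names (e.g. "orders.csv", "order_amount") are hypothetical placeholders and are not taken from this posting.

    # Minimal PySpark batch ETL sketch (illustrative; paths and columns are hypothetical)
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: read raw records from a hypothetical CSV source
    raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders.csv")

    # Transform: enforce types, drop incomplete rows, derive a daily aggregate
    orders = (
        raw.withColumn("order_amount", F.col("order_amount").cast("double"))
           .dropna(subset=["customer_id", "order_amount"])
    )
    daily_totals = orders.groupBy("order_date").agg(
        F.sum("order_amount").alias("total_amount")
    )

    # Load: write the result to a partitioned Parquet target
    daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-bucket/curated/daily_totals"
    )

    spark.stop()

A streaming equivalent would follow the same extract-transform-load shape using Spark Structured Streaming (readStream/writeStream) rather than the batch reader shown here.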

Education

Any Graduate