Description

Must-Have Skills for Leads (2 positions):

  • Python & PySpark: Strong proficiency in both Python and PySpark (the workload is roughly 70% PySpark, 30% Java).
  • Java: Solid understanding and experience in Java development.

Must-Have Skills for Other Engineers (3 positions):

  • Python & PySpark: Strong proficiency in Python and PySpark.

Additional Essential Skills (for all positions):

  • SQL: Strong SQL skills for data querying and manipulation.
  • AWS: Extensive experience with AWS cloud services, particularly its data-related offerings.
  • Data Lake: Experience working with and building data lakes.
  • Databricks (Plus): Candidates who list Databricks on their resume must be able to articulate that experience.

Responsibilities:

  • Design, develop, and maintain data pipelines using PySpark and Java.
  • Migrate data from legacy platforms to a new AWS-based platform.
  • Write and optimize complex Spark transformations and SQL queries.
  • Work collaboratively with data scientists and other stakeholders.
  • Ensure data quality, integrity, and security.

Qualifications:

  • Strong understanding of Big Data concepts and technologies.
  • Hands-on experience with data processing and transformation using PySpark.
  • Proficiency in Python or Java development.
  • Experience with AWS cloud services, especially those related to data storage and processing (e.g., S3, EMR, Redshift).
  • Excellent problem-solving and communication skills.

Education

Any Graduate