Description

Job Title: Spark Developer

Location: Plano, TX

Duration: ~10 months

Visa: W2 candidates

Senior resume needed: 14+ years of experience

Job Description: Spark Developer

Apache Spark proficiency (3-4 years):

The developer must have experience using Apache Spark to build data pipelines, data science models, and/or machine learning models.

Must have processed large datasets (preferably petabyte-scale) using Spark.

Must have good hands-on experience in performance tuning Spark applications and data science models (regression models).

Must have strong troubleshooting and debugging skills for Spark applications, including diagnosing failures and/or long-running applications.

Must have the skills to optimize Spark code for scalability, performance, and maintainability.

AWS EMR proficiency (2-3 years):

Must have experience with Docker-compatible EMR.

Experience using compute-intensive, memory-intensive, and GPU instances.

Must have a solid understanding of the different AWS instance types and when to use each.

SQL (3-5 years):

Must have experience working with SQL queries and databases, preferably Snowflake.

Python, Scala, or R coding:

Deep expertise in Python and PySpark.

Experience with the R language is nice to have.

Data analysis:

Candidates should have experience analyzing big data and be comfortable working with data scientists.

GitHub experience:

Developers must be proficient at using GitHub to manage and share code and build an organized codebase.

Communication, collaboration, and problem-solving skills:

Apache Spark developers need to work well with others, quickly develop innovative solutions to problems, and communicate their ideas clearly and professionally.

Education

Any graduate