Job Title: Spark Developer
Location: Plano, TX
Duration: ~10 months
Visa: W2 candidates
Senior resume required: 14+ years of experience
Job Description: Spark Developer
Apache Spark proficiency (3-4 years):
Must have experience using Apache Spark to build data pipelines, data science models, and/or machine learning models.
Must have processed large datasets (preferably at petabyte scale) using Spark.
Must have good hands-on experience in performance tuning of Spark applications and data science models (regression models).
Must have strong troubleshooting and debugging skills for Spark applications, including diagnosing failures and long-running applications.
Must be able to optimize Spark code for scalability, performance, and maintainability.
AWS EMR proficiency (2-3 years):
Must have experience with Docker-compatible EMR.
Must have experience using compute-intensive, memory-intensive, and GPU instances.
Must have a solid understanding of the different AWS instance types and when to use each.
SQL (3-5 years):
Must have experience working with SQL queries and databases, preferably Snowflake.
Python, Scala, or R coding:
Deep expertise in Python and PySpark.
Experience with the R language is nice to have.
Data analysis:
Candidates should have experience analyzing big data and be comfortable working with data scientists.
GitHub experience:
Developers must be proficient at using GitHub to manage and share code and build an organized codebase.
Communication, collaboration, and problem-solving skills:
Apache Spark developers need to work well with others, quickly develop innovative solutions to problems, and communicate their ideas clearly and professionally.
Education: Any graduate