Job Description
Hands-on PySpark SME with experience across multiple projects on data platforms comprising Hadoop, Teradata Data Warehouse, Ab Initio, Informatica, Java Spark (DPL), SSIS, AWS Lake Formation (S3), and Snowflake
Ability to design, build, and unit test applications on the Spark framework using Python
Build PySpark-based applications for both batch and streaming requirements, which calls for in-depth knowledge of the Hadoop ecosystem and NoSQL databases
Develop and execute data pipeline testing processes and validate business rules and policies
Optimize performance of Spark applications on Hadoop by tuning configurations around SparkContext, Spark SQL, DataFrames, and pair RDDs
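The tuning work above typically revolves around a handful of Spark configuration properties. As a minimal illustrative sketch only (the specific values, application name, and file path are assumptions, not project requirements), a submission might look like:

```shell
# Illustrative spark-submit with common performance-related settings
# (values are placeholders; actual tuning depends on cluster size and workload)
spark-submit \
  --master yarn \
  --conf spark.executor.memory=4g \
  --conf spark.executor.cores=4 \
  --conf spark.sql.shuffle.partitions=200 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  my_pipeline.py
```

Shuffle partition counts, executor sizing, and serializer choice are the usual first levers for both DataFrame and pair-RDD workloads.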
ANY GRADUATE