Description

Responsibilities

Experienced with Scala, Spark, Databricks, Azure Data Factory, Hadoop, and the related ecosystem; prior hands-on work with AWS, Azure, or GCP

Responsible for ingesting data from files, streams, and databases, and for processing the data with Python, PySpark, and Scala

Develop programs in Spark, Scala, and Hive for data cleaning and processing (an illustrative sketch follows this list)

Responsible for designing and developing distributed, high-volume, high-velocity, multi-threaded event processing systems

Develop efficient software code leveraging Spark and Big Data technologies for the various use cases built on the platform

Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Spark, Scala, and Hive on any of the major cloud platforms
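
Purely as an illustrative aside, the sketch below shows the kind of Spark-with-Scala batch ingestion and cleaning work the responsibilities above describe. It is not part of the role description, and every path, table, and column name in it is a hypothetical placeholder.

```scala
// Minimal sketch of a Spark batch job in Scala: ingest raw event files,
// clean them, and expose the result to downstream consumers.
// All names (paths, tables, columns) are hypothetical placeholders.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, trim}

object CleanEventsJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clean-events") // hypothetical application name
      .getOrCreate()

    // Ingest: read raw events from files (JSON/Parquet readers work the same way).
    val raw = spark.read
      .option("header", "true")
      .csv("/data/raw/events/") // hypothetical input path

    // Clean: drop rows missing the key, trim a string column, deduplicate.
    val cleaned = raw
      .filter(col("event_id").isNotNull) // hypothetical column names
      .withColumn("event_type", trim(col("event_type")))
      .dropDuplicates("event_id")

    // Expose: persist as a table that Hive-aware consumers can query.
    cleaned.write
      .mode("overwrite")
      .saveAsTable("analytics.events_clean") // hypothetical database.table

    // Stream ingestion (cf. the event-processing item) would instead start
    // from spark.readStream, e.g. format("kafka") with suitable options.
    spark.stop()
  }
}
```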

Required Technical and Professional Expertise

Minimum of 4-7 years of experience in Big Data technologies

Minimum of 4 years of experience in Python, Spark/PySpark, Scala, and Hive programming

Experience developing applications on Big Data and Cognitive technologies, including API development

Application development background, along with knowledge of analytics libraries and open-source Natural Language Processing, statistical, and big data computing libraries

Demonstrated ability in solutioning covering data ingestion, data cleansing, ETL, data mart creation, and exposing data to consumers

Education

Any Graduate