Experienced with Scala, Spark, Databricks, Azure Data Factory, Hadoop, and the related ecosystem; hands-on experience with AWS, Azure, and GCP
Responsible for ingesting data from files, streams, and databases, and processing it with Python, PySpark, and Scala (see the batch ingestion sketch after this list)
Develop Spark, Scala, and Hive programs for data cleaning and processing
Responsible for designing and developing distributed, high-volume, high-velocity, multi-threaded event processing systems (see the streaming sketch after this list)
Develop efficient software code leveraging Spark and Big Data technologies for the various use cases built on the platform
Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Spark, Scala, and Hive on any major cloud platform
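
The ingestion and cleansing responsibilities above map to a typical batch pipeline: read raw files, cleanse, and persist for downstream use. Below is a minimal Spark/Scala sketch, assuming a metastore-backed session (enableHiveSupport); the source path, the schema fields (id, ts, amount), and the table name analytics.events_clean are hypothetical.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object BatchIngest {
      def main(args: Array[String]): Unit = {
        // Hive support lets saveAsTable register the result in the metastore
        val spark = SparkSession.builder()
          .appName("batch-ingest")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical source path; assumed JSON events with id, ts, amount
        val raw = spark.read.json("s3a://landing/raw_events.json")

        // Basic cleansing: drop malformed rows, deduplicate, derive a partition column
        val cleaned = raw
          .filter(col("id").isNotNull && col("ts").isNotNull)
          .dropDuplicates("id")
          .withColumn("event_date", to_date(col("ts")))

        // Persist as a partitioned table for downstream consumers
        cleaned.write
          .mode("overwrite")
          .partitionBy("event_date")
          .saveAsTable("analytics.events_clean")

        spark.stop()
      }
    }

Partitioning by event_date lets downstream queries prune partitions instead of scanning the full table.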
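For the distributed, high-velocity event processing responsibility, one common shape is Spark Structured Streaming over Kafka. A minimal sketch follows, assuming the spark-sql-kafka connector is on the classpath; the broker address, topic name, and JSON field names are illustrative assumptions.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object StreamProcessor {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("event-stream").getOrCreate()
        import spark.implicits._

        // Hypothetical Kafka broker and topic
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(value AS STRING) AS json")

        // Pull assumed fields out of the JSON payload
        val parsed = events.select(
          get_json_object($"json", "$.type").as("type"),
          get_json_object($"json", "$.ts").cast("timestamp").as("ts"))

        // Windowed counts per event type; the watermark bounds streaming state
        val counts = parsed
          .withWatermark("ts", "10 minutes")
          .groupBy(window($"ts", "5 minutes"), $"type")
          .count()

        // Checkpointing lets the query restart from where it left off
        val query = counts.writeStream
          .outputMode("update")
          .format("console")
          .option("checkpointLocation", "/tmp/checkpoints/events")
          .start()

        query.awaitTermination()
      }
    }
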
Required Technical and Professional Expertise
4-7 years of experience in Big Data technologies
Minimum 4 years of experience in Python, Spark/PySpark, Scala, and Hive programming
Experience developing applications on Big Data and cognitive technologies, including API development
Application development background, with knowledge of analytics libraries, open-source Natural Language Processing tools, and statistical and Big Data computing libraries
Demonstrated ability to design end-to-end solutions covering data ingestion, data cleansing, ETL, data mart creation, and exposing data to consumers (see the data mart sketch below)
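
As an illustration of that last point, a data mart step might aggregate a cleansed table into a consumer-facing table plus a view. A minimal Spark SQL sketch; the mart schema and column names are hypothetical and carry over from the ingestion sketch above.

    import org.apache.spark.sql.SparkSession

    object DataMartBuild {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("data-mart")
          .enableHiveSupport()
          .getOrCreate()

        spark.sql("CREATE DATABASE IF NOT EXISTS mart")

        // Aggregate the cleansed table into a mart table (hypothetical names)
        spark.sql("""
          CREATE TABLE IF NOT EXISTS mart.daily_revenue
          USING parquet
          AS
          SELECT event_date, SUM(amount) AS revenue, COUNT(*) AS events
          FROM analytics.events_clean
          GROUP BY event_date
        """)

        // A view gives consumers a stable contract over the mart table
        spark.sql("""
          CREATE OR REPLACE VIEW mart.v_daily_revenue AS
          SELECT event_date, revenue FROM mart.daily_revenue
        """)

        spark.stop()
      }
    }

Exposing the mart through a view rather than the raw table means the underlying storage can be rebuilt or repartitioned without breaking consumers.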