Responsibilities
Create Scala/Spark jobs for data transformation and aggregation
Produce unit tests for Spark transformations and helper methods
Write Scaladoc-style documentation for all code
Design data processing pipelines
What you'll be doing
Building distributed, highly parallelized Big Data processing pipelines that process massive amounts of both structured and unstructured data in near real time
Leveraging Spark to enrich and transform corporate data to enable searching, data visualization, and advanced analytics
Working closely with analysts and business stakeholders to develop analytics models
Delivering continuously on Hadoop and other Big Data platforms
Automating processes where possible so that they are repeatable and reliable
Working closely with the QA team
As an expert Data Engineer, you are expected to have
Excellent programming language skills: Spark, Scala, Bash/Unix scripting, SQL
Excellent understanding of Hadoop and HDFS Architecture
Excellent understanding of file types and their pros and cons
Excellent understanding of build tools, especially SBT/Gradle
Very good understanding of, and good experience in, implementing CI/CD, e.g. building and maintaining Jenkins pipelines
Working experience in SAFe/Agile
As an expert, you are also responsible for juniors writing maintainable code
You are collaborative and meet your expectations through communication and teamwork
You are curious and responsive, and can understand the needs of others to ensure delivery of the desired results
You work to a high standard of quality and always strive to do things better
Skills
Spark and Scala (Scala with a focus on programming)
Apache Spark 2.x
Spark query tuning and performance optimization
Understanding of Hadoop architecture
Experience working with Hive SQL and HDFS
Deep understanding of distributed systems, e.g. CAP theorem, partitioning, replication, consistency, and consensus
Good to have experience in Scala
Experience working with tools like Jenkins, Jira, Bitbucket, and Git
Experience in writing shell scripts and working with Linux platforms
As an advantage, familiarity with one or more of the technologies below
Experience working in Agile/SAFe
Experience or certification with AWS Cloud
Any Graduate