Description

Responsibilities

Create Scala/Spark jobs for data transformation and aggregation

Produce unit tests for Spark transformations and helper methods

Write Scaladoc-style documentation for all code

Design data processing pipelines

What you'll be doing

Building distributed, highly parallelized big data processing pipelines that process massive amounts of structured and unstructured data in near real time

Leveraging Spark to enrich and transform corporate data to enable search, data visualization, and advanced analytics

Working closely with analysts and business stakeholders to develop analytics models

Delivering continuously on Hadoop and other big data platforms

Automating processes where possible so that they are repeatable and reliable

Working closely with the QA team

As an expert Data Engineer, you are expected to have

Excellent programming language skills: Spark, Scala, Bash/Unix scripting, SQL

Excellent understanding of Hadoop and HDFS Architecture

Excellent understanding of file types and their pros and cons

Excellent understanding of build tools, especially SBT/Gradle

Very good understanding of, and good experience in implementing, CI/CD, e.g. building and maintaining Jenkins pipelines

Working experience in SAFe/Agile

As an expert, you are also responsible for juniors writing maintainable code

Are collaborative and achieve your expectations through communication and teamwork

Are curious, responsive, and can understand the needs of others to ensure delivery of the desired results

Deliver quality work and always strive to do things better

Skills

Spark and Scala, with a focus on programming

Apache Spark 2.x

Spark query tuning and performance optimization

Understanding of Hadoop architecture

Experience working with Hive SQL and HDFS

Deep understanding of distributed systems, e.g. CAP theorem, partitioning, replication, consistency, and consensus

Good to have experience in Scala

Experience working with tools like Jenkins, Jira, Bitbucket, and Git

Experience in writing shell scripts and working with Linux platforms

As an advantage, familiarity with one or more of the technologies below

Experience working in Agile/SAFe

Experience/certification with AWS Cloud

Education

Any Graduate