Description

Key Responsibilities:

Lead the design and build of an enterprise-level, scalable, fault-tolerant streaming data platform that provides meaningful and timely insights

Lead a group of engineers building data pipelines using big data technologies (Spark, Kafka, AWS) on medium- to large-scale datasets

Build the next generation of distributed streaming data pipelines and analytics data stores using streaming frameworks (e.g., Spark Streaming) and programming languages such as Java, Scala, and Python

Work with a team of developers with deep experience in Java and/or Python, AWS Glue, AWS Lambda, Spark, and ETL data pipelines

Influence best practices for data pipeline design, data architecture, and the processing of structured and unstructured data

Collaborate with and across Agile teams to design, develop, test, implement, and support technical solutions using full-stack development tools and technologies (Java and/or Python, Angular, Node.js, TypeScript)

Minimum Qualifications:

Experience building data pipelines that gather the data required to build and evaluate data models, using tools such as Apache Spark, AWS Glue, or other distributed data processing frameworks

Experience developing data applications in an Amazon Web Services (AWS) cloud environment using Java, Python, and/or Scala

Experience with data movement technologies (ETL/ELT), messaging/streaming technologies (AWS SQS, Kinesis, Kafka), and API and in-memory technologies

Strong knowledge of developing highly scalable distributed systems using open-source technologies

Experience building and maintaining large-scale data processing systems in an Agile environment

Education

Any Graduate