Description

10+ years’ experience in large-scale software development and Big Data technologies
Programming skills in Java/Scala, Python, shell scripting, and SQL
Development skills with Spark, MapReduce, and Hive
Strong skills in developing RESTful APIs
Design and implement distributed data processing pipelines using Spark, Hive, Python, and other tools and languages prevalent in the Hadoop ecosystem.

You will have the opportunity to own both the design and the implementation of these pipelines.

You will collaborate with product managers, data scientists, and engineers to accomplish your tasks.
Publish RESTful APIs, documented with OpenAPI specifications, to enable real-time data consumption; this allows many teams to consume the data being produced.
Explore and build proofs of concept using open-source NoSQL technologies such as HBase, DynamoDB, and Cassandra, and distributed stream-processing frameworks such as Apache Spark, Flink, and Kafka Streams.
Take part in DevOps by building utilities, user-defined functions, and frameworks that better enable data-flow patterns.
Work with architecture and engineering leads and other teammates to ensure high-quality solutions through code reviews and documentation of engineering best practices.
Experience with business rule management systems such as Drools will also come in handy.

Education

Bachelor's degree