Responsibilities:
Collaborate with and across Agile teams to design, develop, test, implement, and support technical solutions using data-intensive development tools and technologies
Work with a team of developers with deep experience in distributed microservices and full-stack systems
Utilize programming languages like Python and powerful computing technologies such as Spark to perform real-time transformation and consumption of data (see the sketch after this list)
Conduct reviews with other team members to ensure code is rigorously designed, elegantly written, and effectively tuned for performance
Drive automated CI/CD release pipelines
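For candidates unfamiliar with the stack, here is a minimal sketch of the kind of PySpark transformation work this role involves; the source path, schema, and column names are hypothetical:

    # Minimal PySpark sketch: read events, derive a field, aggregate.
    # The path and column names below are illustrative only.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("example-transform").getOrCreate()

    # Hypothetical source of raw JSON event records
    events = spark.read.json("s3://example-bucket/events/")

    # Derive a date column and count events per day and type
    daily_counts = (
        events
        .withColumn("event_date", F.to_date("event_timestamp"))
        .groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count"))
    )

    daily_counts.write.mode("overwrite").parquet("s3://example-bucket/daily_counts/")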
Basic Qualifications:
Bachelor's Degree
At least 6 years of experience in big data technologies
At least 5 years of experience in practical coding in Python
At least 4 years of experience in PySpark, Spark, Hadoop, EMR, and other cluster computing environments for data
At least 3 years of experience with cloud computing (AWS, Microsoft Azure, Google Cloud)
Familiarity with Jenkins, unit testing, and sound CI/CD practices is a must-have (a minimal testing sketch follows this list)
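To make the unit-testing expectation concrete, here is a minimal pytest-style sketch; the function and test values are hypothetical:

    # Minimal unit-test sketch; the function and values are illustrative only.
    def normalize_event_type(raw: str) -> str:
        """Lowercase and strip an event-type string."""
        return raw.strip().lower()

    def test_normalize_event_type():
        assert normalize_event_type("  CLICK ") == "click"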
Preferred Qualifications:
7+ years of experience in application development using Python
4+ years of experience with a public cloud (AWS, Microsoft Azure, Google Cloud)
4+ years experience with Distributed data/computing tools (EMR, Hadoop, Spark, MapREduce, and/or Kafka)
4+ years of experience working on real-time data and streaming applications, Spark Structured Streaming in particular (see the sketch after this list). Familiarity with data formats such as Delta is a big plus.
4+ years of experience with NoSQL implementations (DynamoDB, MongoDB, Cassandra)
4+ years of data warehousing experience (Redshift or Snowflake)
4+ years of experience with UNIX/Linux including basic commands and shell scripting
2+ years of experience with Agile engineering practices
Experience with Databricks is a plus
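As a point of reference for the streaming qualification above, here is a minimal Spark Structured Streaming sketch; the Kafka brokers, topic, schema, and output paths are hypothetical:

    # Minimal Structured Streaming sketch: consume Kafka, parse JSON, write out.
    # Brokers, topic, schema, and paths are illustrative only.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = SparkSession.builder.appName("example-stream").getOrCreate()

    schema = StructType([
        StructField("user_id", StringType()),
        StructField("event_type", StringType()),
        StructField("event_timestamp", TimestampType()),
    ])

    # Read from a hypothetical Kafka topic and parse the JSON payload
    stream = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "events")
        .load()
        .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    # Append parsed records to storage; Delta would use .format("delta")
    query = (
        stream.writeStream
        .format("parquet")
        .option("path", "s3://example-bucket/stream-out/")
        .option("checkpointLocation", "s3://example-bucket/checkpoints/")
        .outputMode("append")
        .start()
    )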