What You Will Do
As a member of the data platform team, you will be responsible for architecting, developing, and operating the next generation of the Moveworks Data Platform. As Moveworks grows rapidly, the data platform team is tasked with designing and developing scalable, reliable, and secure data platforms, processing pipelines, and services. These power Moveworks’ cutting-edge NLP and Conversational AI technologies while meeting first-class enterprise standards for data governance, security, and privacy.
Design, build, and operate high-performance, scalable batch and stream data processing infrastructure and solutions to support day-to-day ML operations, including training, serving, evaluation, and experimental systems.
Design and develop Moveworks’ foundational data models, data warehouse, and real-time and offline processing pipelines using Spark on AWS EMR, Apache Kafka, AWS Athena, Snowflake, Airflow, Apache Hudi, etc.
Work closely with machine learning and data science teams to understand their data needs, influence the data team’s roadmap, and lead and execute projects.
Build a data lake and implement a data cataloging platform for easy data discovery and availability.
Architect and implement data anonymization and data access control frameworks that support policy-based masking and access to data for different use cases.
Build out platform and data services/APIs that make data available to stakeholders and to customer-facing data products.
What You Bring To The Table
5+ years of experience as a software engineer
Experience with Python, Golang, Java, or C++
Experience with cloud infrastructure such as AWS, GCP, or Azure
Experience with relational or non-relational data stores such as Postgres, AWS DataLake/S3, DynamoDB, or Snowflake
BS or higher in Computer Science or a related field.