Architect and develop large-scale, distributed data processing pipelines using technologies like Apache Spark and Apache Beam, with Apache Airflow for orchestration
Design and implement efficient data ingestion, transformation, and storage solutions for structured and unstructured data
Partner closely with Engineering Leaders, Architects, and Product Managers to understand business requirements and provide technical solutions within a larger roadmap
Build and optimize real-time and batch data processing systems, ensuring high availability, fault tolerance, and scalability
Collaborate with data engineers, analysts, and scientists to understand business requirements and translate them into technical solutions
Contribute to the development and maintenance of CI/CD pipelines, ensuring efficient and reliable deployments
Collaborate with cross-functional teams to ensure the successful delivery of projects and initiatives
What You'll Bring
Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
Minimum of 10 years of experience in backend software development, with a strong focus on data engineering and big data technologies
Proven expertise in Apache Spark, Apache Beam, and Airflow, with a deep understanding of distributed computing and data processing frameworks
Proven experience building enterprise-grade software in a cloud-native environment (GCP or AWS) using cloud services such as GCS/S3, Dataflow/Glue, Dataproc/EMR, Cloud Functions/Lambda, BigQuery/Athena, Bigtable/DynamoDB
Experience with cloud platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes)
Experience with stream and data processing technologies such as Kafka, Spark, Google BigQuery, Google Dataflow, and HBase
Familiarity with designing CI/CD pipelines using Jenkins, GitHub Actions, or similar tools
Experience with graph and vector databases or processing frameworks
Strong knowledge of data modeling, data warehousing, and data integration best practices
Familiarity with streaming data processing, real-time analytics, and machine learning pipelines
Excellent problem-solving, analytical, and critical thinking skills