Description

For MarketingOS, a robust identity graph would make our Advertising, Websights, and Intent products stickier and more valuable to our customers, and would open up opportunities in personalization and monetization. A full range of identifiers associated with both devices and people will be inventoried and considered for inclusion in the identity resolution service, covering both visitor identification and advertising use cases. From an Advertising perspective, the identity graph will be used to identify and map identities, activate marketing campaigns across channels (display, social, email, and other platforms), and report and attribute results from website engagement (Websights) activity through to CRM conversion (closed/won deals).
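At its core, the identity resolution described above means linking pairs of observed identifiers (cookies, emails, device IDs) into clusters that each represent one person or household. A minimal sketch of that idea, using union-find to compute connected components over identifier pairs (all identifier values and the `resolve` helper below are invented for illustration, not part of any existing service):

```python
# Hypothetical sketch: cluster linked device/person identifiers into a
# single identity using union-find (connected components).

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen identifiers as their own root.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps lookups near-constant time.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def resolve(pairs):
    """Group linked identifiers (cookie IDs, emails, device IDs) into clusters."""
    uf = UnionFind()
    for a, b in pairs:
        uf.union(a, b)
    clusters = {}
    for node in uf.parent:
        clusters.setdefault(uf.find(node), set()).add(node)
    return list(clusters.values())

# Example: a cookie linked to an email, and that email linked to a device ID,
# all collapse into one identity cluster.
links = [("cookie:abc", "email:a@x.com"), ("email:a@x.com", "device:123")]
print(resolve(links))  # one cluster containing all three identifiers
```

In production this pairwise-linking step would run at scale (e.g. as a connected-components job in Spark or Beam) rather than in memory, but the clustering logic is the same.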

What You'll Do

Architect and develop large-scale, distributed data processing pipelines using technologies like Apache Spark, Apache Beam, and Apache Airflow for orchestration

Design and implement efficient data ingestion, transformation, and storage solutions for structured and unstructured data

Partner closely with Engineering Leaders, Architects, and Product Managers to understand business requirements and provide technical solutions within a larger roadmap

Build and optimize real-time and batch data processing systems, ensuring high availability, fault tolerance, and scalability

Collaborate with data engineers, analysts, and scientists to understand business requirements and translate them into technical solutions

Contribute to the development and maintenance of CI/CD pipelines, ensuring efficient and reliable deployments

What You'll Bring

Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field

Proven expertise in Apache Spark, Apache Beam, and Airflow, with a deep understanding of distributed computing and data processing frameworks

Proven experience building enterprise-grade software in a cloud-native environment (GCP or AWS) using cloud services such as GCS/S3, Dataflow/Glue, Dataproc/EMR, Cloud Functions/Lambda, BigQuery/Athena, and Bigtable

Experience with cloud platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes)

Experience with stream and batch data processing technologies such as Kafka, Spark, Google BigQuery, and Google Dataflow

Familiarity with designing CI/CD pipelines using Jenkins, GitHub Actions, or similar tools

Experience with SQL, particularly performance optimization
