Description

Key Responsibilities:

  • Build high-quality, low-latency, cost-effective, large-scale data pipelines for ingesting internal and external data.
  • Implement processes for data wrangling, assimilation, and standardization in production.
  • Deploy, fine-tune, and execute machine learning models in production.
  • Build end-to-end data and computing infrastructure for large-scale data and machine-learning product operations.
  • Passionately advocate for data and analytics capabilities and products; create a vision of success that can be vetted among colleagues, then articulate that vision, negotiate, persuade, and gain support for strategic initiatives that fit within the overall company strategy.
  • Support data and analytics projects and proofs of concept with third-party providers.
  • Ensure analytics alignment and reliable engine output for internal and external stakeholders.
  • Keep critical intellectual property (IP) and know-how about company content within the organization.
  • Establish partnerships and/or adopt open-source technologies for data and analytics.
  • Work in conjunction with the Publication Technology Operation Unit to create and manage analytics capabilities and products.
  • Collaborate on ongoing projects to deliver solutions used by our customers, volunteers, and other Operation Units.

 

Qualifications

  • Master's degree preferred, in a technical field such as Computer Science, Information Technology, or Business Management.
  • Good understanding of data structures and algorithms, ETL processing, large-scale data and machine-learning production, data and computing infrastructure, and automation and workflow orchestration.
  • Hands-on experience in Python, PySpark, SQL, and shell scripting, or similar programming languages.
  • Hands-on experience using cloud-based technologies throughout data and machine-learning product development.
  • Hands-on experience with code versioning, automation, and workflow orchestration tools such as GitHub, Ansible, SLURM, Airflow, and Terraform.
  • Good understanding of data warehousing concepts such as data migration and data integration on Amazon Web Services (AWS) or a similar platform.
  • Excellent debugging and code-reading skills.
  • Commitment to documentation and structured programming to support sustainable development.
  • Ability to describe challenges and solutions in both technical and business terms.
  • Ability to develop and maintain excellent working relationships at all organizational levels.

Education

Master’s degree