Description

This role will provide expertise to support the development of a Big Data / Data Lake system architecture that supports enterprise data operations for the District of Columbia government, including Internet of Things (IoT) / Smart City projects, the enterprise data warehouse, the open data portal, and data science applications. This is an exciting opportunity to work as part of a collaborative senior data team supporting DC's Chief Data Officer.

This architecture includes Databricks, Microsoft Azure platform tools (including Data Lake and Synapse), Apache platform tools (including Hadoop, Hive, Impala, Spark, Sedona, and Airflow), and data pipeline/ETL development tools (including StreamSets, Apache NiFi, and Azure Data Factory). The platform will be designed for District-wide use and for integration with other OCTO Enterprise Data tools such as Esri, Tableau, MicroStrategy, API gateways, and Oracle databases and integration tools.

Required Skillsets

Experience implementing Big Data storage and analytics platforms such as Databricks and Data Lakes - 5 Years
Knowledge of Big Data and Data Architecture and Implementation best practices - 5 Years
Knowledge of architecture and implementation of networking, security, and storage on cloud platforms such as Microsoft Azure - 5 Years
Experience with deployment of data tools and storage on cloud platforms such as Microsoft Azure - 5 Years
Knowledge of data-centric systems for the analysis and visualization of data, such as Tableau, MicroStrategy, ArcGIS, Kibana, and Oracle - 10 Years
Experience querying structured and unstructured data sources, including SQL and NoSQL databases - 5 Years
Experience modeling and ingesting data into and between various data systems through the use of data pipelines - 5 Years
Experience implementing Apache data products such as Spark, Sedona, Airflow, Atlas, NiFi, Hive, and Impala - 5 Years
Experience with API / Web Services (REST/SOAP) - 3 Years
Experience with complex event processing and real-time streaming data - 3 Years
Experience with deployment and management of data science tools and modules such as JupyterHub - 3 Years
Experience with ETL, data processing, and analytics using languages such as Python, Java, or R - 3 Years
Experience with Cloudera Data Platform - 3 Years (highly desired)
16+ years planning, coordinating, and monitoring project activities
16+ years leading projects and ensuring they comply with established standards and procedures
Bachelor's degree in IT or a related field, or equivalent experience

Education

ANY GRADUATE