Description

Role & Responsibilities:

  • Work with cloud engineers and customers to solve big data problems by developing utilities for migration, storage, and processing on Azure Cloud.
  • Design and build cloud migration strategies for cloud and on-premises applications.
  • Diagnose and troubleshoot complex distributed-systems problems, and develop solutions with significant impact at massive scale.
  • Build tools to ingest, and jobs to process, several terabytes or petabytes of data per day.
  • Design and develop next-generation storage and compute solutions for several large customers.
  • Define the data architecture for data science teams, and participate in review and walkthrough sessions for model fit and model productionization.
  • Provide thought leadership on data integrity and quality for data science workloads.
  • Be involved in proposals and RFPs, providing effort estimates, solution designs, etc.
  • Communicate with a wide range of teams, including Infrastructure, Network, Engineering, DevOps, and SiteOps, as well as cloud customers.
  • Build advanced tooling for automation, testing, monitoring, administration, and data operations across multiple cloud clusters.
  • Demonstrate a strong understanding of data modeling and governance.

Must have:
  • 8+ years of hands-on experience with data structures, distributed systems, Hadoop and Spark, and SQL and NoSQL databases
  • Strong software development skills in at least one of Python, Java, or Scala, plus solid SQL
  • Experience building and deploying cloud-based solutions at scale
  • Experience developing big data solutions (migration, storage, processing)
  • Experience building and supporting large-scale systems in a production environment
  • Experience designing and developing ETL pipelines
  • Modern data warehouse design skills on Azure
  • Requirements gathering and understanding of the problem statement
  • End-to-end ownership of the entire delivery of a project
  • Design and documentation of the solution
  • Knowledge of RDBMS and NoSQL databases
  • Cloud platforms: Azure (GCP and AWS good to have)
  • Hadoop distributions: any of Apache Hadoop, CDH, HDP, EMR, Google Dataproc, or HDInsight
  • Distributed processing frameworks: one or more of MapReduce, Apache Spark, Apache Storm, or Apache Flink
  • Database/warehouse: Hive, HBase, and at least one cloud-native service
  • Orchestration frameworks: any of Airflow, Oozie, Apache NiFi, or Google Dataflow
  • Message/event solutions: any of Kafka, Kinesis, or Cloud Pub/Sub
  • Exposure to at least one reporting tool (Power BI, Tableau, or Looker)
  • Ability to enable best practices for data handling

Education

Any graduate