Description

Strong hands-on experience in PySpark and Apache Spark.

Strong hands-on experience with the Medallion architecture.

Experience in migrating native Spark workloads to Databricks.

Experience in building data governance solutions with tools such as Unity Catalog and Azure Purview.

Highly experienced in usability optimization (auto compaction, Z-ordering, vacuuming), cost optimization, and performance optimization.

Build a strong orchestration layer using Databricks Workflows and ADF.

Build CI/CD for Databricks in Azure DevOps.

Process near-real-time data through Auto Loader and DLT pipelines.

Implement a security layer in Delta Lake.

Implement massively parallel processing layers in Spark SQL and PySpark.

Implement cost-effective infrastructure in Databricks.

Experience in extracting logic and data from on-prem layers, SAP, and ADLS into PySpark/ADLS using ADF/Databricks.

Hands-on experience in Azure Synapse Analytics, Azure Data Factory, Databricks, Azure Storage, Azure Key Vault, SQL pools, CI/CD pipeline design, and other Azure services such as Functions and Logic Apps.

Linked services, various runtimes, datasets, pipelines, and activities.

Strong hands-on experience with various activities such as control flow logic and conditions (For Each, If, Switch, Until), Lookup, Stored Procedure, Script, Validation, Copy Data, Data Flow, Azure Functions, Notebooks, and SQL pool stored procedures.

Strong hands-on experience in deploying code throughout the landscape (Dev -> QA -> Prod) using GitHub and CI/CD pipelines.

Strong hands-on experience in creating SQL stored procedures.

Functions, stored procedures, how to call one stored procedure from another, and how to process data record by record.

Dynamic SQL

Must have a strong background in Python libraries such as PySpark, Pandas, NumPy, pymysql, and Oracle libraries.

Must have strong hands-on experience in retrieving data through APIs.

Must be able to install libraries and help users troubleshoot issues.

Must know how to retrieve data through stored procedures via Python.

Should be able to debug Python code.

Hands-on experience in Spark pools and PySpark.

Should be able to merge data/delta loads through notebooks.

Must have a strong background in Python libraries and PySpark.

Education

Any Graduate