Job Description:
Technical/Functional Skills:

  • Proficiency in working with the Databricks Unified Analytics Platform, including notebooks, clusters, jobs, and libraries.
  • Strong programming skills in languages commonly used with Databricks, such as Python, Scala, and SQL.
  • Experience in designing and implementing ETL processes within Databricks using Spark SQL, DataFrame API, and structured streaming.
  • Mastery of Spark DataFrames and Datasets for efficient, structured data manipulation.
  • Knowledge of configuring and optimizing Databricks runtime settings for performance and resource utilization.
  • Proficiency in creating and managing clusters in Databricks, optimizing configurations based on workload requirements.
  • Integration skills to connect Databricks with various data sources and sinks, including cloud storage, databases, and streaming platforms.
  • Knowledge of Databricks security features, including access controls, encryption, and integration with identity providers.
  • Experience with structured streaming in Databricks for real-time data processing and analytics.

Roles & Responsibilities:

  • Should be able to use version control systems, such as Git, to manage codebase changes and collaborate with a development team.
  • Should be able to collaborate using Databricks notebooks, including sharing, versioning, and commenting on code.
  • Monitor Databricks workloads, interpret logs, and optimize performance using built-in and external monitoring tools.
  • Create scripts to automate tasks within Databricks, leveraging its REST APIs and the Databricks CLI.
  • Implement cluster autoscaling in Databricks to optimize resource utilization.
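The automation and autoscaling responsibilities above can be sketched together: a script builds a Databricks Jobs API (2.1) request whose cluster spec uses autoscaling bounds instead of a fixed worker count. The payload shape follows the public Jobs API; the job name, notebook path, runtime, node type, and worker bounds are placeholder assumptions.

```python
# Sketch: composing a Jobs API 2.1 create-job payload with an autoscaling
# cluster. All concrete values below are illustrative placeholders.
import json

payload = {
    "name": "nightly-etl",  # hypothetical job name
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/etl/ingest"},  # placeholder
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",  # example runtime
                "node_type_id": "i3.xlarge",          # example node type
                # Autoscaling bounds let Databricks grow and shrink the
                # cluster with load instead of pinning a fixed size.
                "autoscale": {"min_workers": 2, "max_workers": 8},
            },
        }
    ],
}

body = json.dumps(payload)
# This body would be POSTed to the workspace's /api/2.1/jobs/create endpoint
# with a bearer token, via an HTTP client or the Databricks CLI.
```

Keeping the payload as data makes the same script reusable for many jobs, and the autoscale block is where the resource-utilization tuning mentioned above happens.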


Education

Any Graduate