Databricks Architect

Resourcesys
Atlanta, GA, USA

Job Description

Key Responsibilities:

Architect and Design Solutions:
Lead the architecture and design of Databricks-based data solutions that support data engineering, machine learning, and real-time analytics.
Data Pipeline Design:
Design and implement ETL (Extract, Transform, Load) pipelines using Databricks, Apache Spark, and other big data tools to process and integrate large-scale data from multiple sources.
Collaborate with Stakeholders:
Work with business and data teams to understand requirements, identify opportunities for automation, and design solutions that improve data workflows.
Optimize Data Architecture:
Create highly optimized, scalable, and cost-effective architectures for processing large data sets and managing big data workloads using Databricks, Delta Lake, and Apache Spark.
Implement Best Practices:
Define and promote best practices for Databricks implementation, including data governance, security, performance optimization, and monitoring.
Manage Databricks Clusters:
Manage and optimize Databricks clusters for performance, cost, and reliability. Troubleshoot performance issues and optimize the use of cloud resources.
Data Governance and Security:
Implement best practices for data governance, security, and compliance on the Databricks platform to ensure that data processing and storage meet organizational and regulatory standards.
Automation and Optimization:
Automate repetitive tasks, streamline data processes, and optimize data workflows to improve efficiency and reduce operational costs.
Mentorship and Training:
Mentor and provide guidance to junior engineers, ensuring the team follows best practices in the development of data pipelines and analytics solutions.
Keep Up-to-Date with Trends:
Stay current with emerging technologies in the big data and cloud space, and recommend new solutions or improvements to existing processes.

Required Skills & Qualifications:

Technical Expertise:
- Extensive experience with Databricks, Apache Spark, and cloud platforms (AWS, Azure, or GCP).
- Proficiency in programming languages such as Python, Scala, or SQL.
- Strong understanding of distributed computing, data modeling, and data storage technologies.
- Hands-on experience with Delta Lake, Spark SQL, and MLlib.
Experience with Cloud Services:
- Expertise in deploying and managing data platforms and workloads on cloud environments like AWS, Azure, or GCP.
- Familiarity with cloud-native services like S3, Redshift, Azure Blob Storage, and BigQuery.
Data Engineering Skills:
- Experience designing, building, and optimizing ETL data pipelines.
- Familiarity with data warehousing concepts, OLAP, and OLTP systems.
Machine Learning (ML) Knowledge:
- Experience in integrating machine learning workflows with Databricks, building models, and automating model deployment.
Leadership and Collaboration:
- Strong leadership and communication skills to interact with both technical and non-technical stakeholders.
- Experience in leading cross-functional teams and mentoring junior team members.

Preferred Skills:

Advanced Databricks Knowledge:
In-depth experience with Databricks components, such as notebooks, jobs, and collaboration features.
DevOps & CI/CD:
Experience with DevOps practices, automation, and CI/CD pipelines in data engineering.
Data Governance:
Strong knowledge of data governance principles, such as metadata management, data lineage, and data quality.
Certifications:
- Databricks Certified Associate Developer for Apache Spark.
- Cloud certifications (e.g., AWS Certified Solutions Architect, Azure Solutions Architect Expert).

Education & Experience:

Education:
Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field (or equivalent work experience).
Experience:
5+ years of experience in data architecture and data engineering on cloud platforms (preferably with Databricks and Apache Spark).

Key Skills

Extract Transform Load (ETL), Python, Scala, GCP, AWS, Azure, OLAP

Education

Bachelor's Degree

Posted On: 23-Dec-2024
Experience: 5+ years
Availability: Remote
Openings: 1
Category: Databricks Architect
Tenure: Contract - Corp-to-Corp Position