Job Description:
Results-oriented professional with 8+ years of experience in Big Data development and data administration, proposing effective
solutions through an analytical approach, with a track record of building large-scale systems using Big Data technologies.
Proficient in cloud platforms such as Microsoft Azure and AWS, harnessing their capabilities for scalable and secure data storage and processing.
Expertise in diverse data processing frameworks including Spark, Apache Flink, Apache NiFi, and Hadoop MapReduce, ensuring efficient data
manipulation and analysis.
Skilled in using Apache Hadoop to work with Big Data and analyse large data sets.
Hands-on experience in ecosystems like Hive, Sqoop, MapReduce, Flume, and Oozie.
Proficient in containerization strategies leveraging Azure Containers and Kubernetes, enhancing the deployment and scalability of applications.
Work with Data Lakes and Big Data ecosystems (Hadoop, Spark, Hortonworks, Cloudera).
Track record of results in an Agile methodology using data-driven analytics.
Load and transform large sets of structured, semi-structured, and unstructured data, working with data on Amazon Redshift, Apache Cassandra,
and HDFS in a Hadoop Data Lake.
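As a minimal sketch of the semi-structured transformation work described above, the pure-Python example below (standard library only; the JSONL payload and field names are made up for illustration) flattens nested JSON records into flat rows of the kind staged for Redshift or a Hadoop Data Lake:

```python
import json

# Hypothetical semi-structured input: one JSON document per line (JSONL),
# as commonly staged before loading into a warehouse or data lake.
raw_lines = [
    '{"id": 1, "user": {"name": "a"}, "tags": ["x", "y"]}',
    '{"id": 2, "user": {"name": "b"}, "tags": []}',
]

def flatten(record: dict) -> dict:
    """Flatten nested fields into a single flat row suited to columnar storage."""
    return {
        "id": record["id"],
        "user_name": record["user"]["name"],
        "tag_count": len(record["tags"]),
    }

rows = [flatten(json.loads(line)) for line in raw_lines]
```

In a production pipeline the same flattening step would typically run as a Spark transformation over HDFS or S3 rather than an in-memory list.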
Proficient in managing databases such as Microsoft SQL Server and NoSQL (Cassandra) to ensure effective data organization and accessibility.
Skilled with BI tools such as Tableau and Power BI for data interpretation, modelling, analysis, and reporting, with the ability to help direct
planning based on insights.
Skilled in HDFS, Spark, Hive, Sqoop, HBase, Flume, Oozie, and Zookeeper.
Strong scripting skills in SQL, Python, and Scala, facilitating the development of efficient data pipelines and analytics workflows.
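A minimal sketch of combining SQL and Python in a single pipeline step, using the standard-library sqlite3 module in place of a production warehouse (the `sales` table and its rows are invented for illustration):

```python
import sqlite3

# In-memory SQLite stands in for a production SQL engine; schema and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("east", 50.0), ("west", 75.0)])

# Aggregate in SQL, then consume the result set in Python.
totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"))
conn.close()
```

The same pattern, pushing aggregation into SQL and keeping orchestration in Python, carries over directly to SQL Server or Snowflake connections.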
Apply in-depth understanding/knowledge of Hadoop architectures and various components such as HDFS, MapReduce, Spark, and Hive.
Create Spark Core ETL processes and automate them using a workflow scheduler.
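The extract-transform-load pattern behind such scheduled ETL jobs can be sketched in plain Python; the toy runner below stands in for a real scheduler such as Oozie or Airflow, and the task bodies and data are illustrative only:

```python
# Extract, transform, and load steps as plain functions; a scheduler would
# normally trigger these as DAG tasks in dependency order.
def extract():
    # Stand-in for reading from a source system (e.g. HDFS, Kafka, a database).
    return [3, 1, 2]

def transform(data):
    # Stand-in for a Spark transformation stage.
    return sorted(x * 10 for x in data)

def load(data, sink):
    # Stand-in for writing to a warehouse or data lake sink.
    sink.extend(data)

def run_pipeline(sink):
    """Run extract -> transform -> load in order, as a scheduler DAG would."""
    load(transform(extract()), sink)

warehouse = []
run_pipeline(warehouse)
```

In practice each function would be a separate scheduler task so failures can be retried per stage rather than restarting the whole pipeline.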
Proficiency in reporting tools like SSRS for generating comprehensive and visually appealing reports.
Experienced in utilizing various data-related technologies including Apache Kafka, Apache Beam, Apache Avro, Airflow, Snowflake, SSMS, ERD,
Azure ADLS, and Trifacta to optimize data workflows and analytics processes.
Experienced in applying transformation and analysis techniques to live data streams from big data sources using Spark and Scala.
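The core idea behind such stream processing, grouping events into fixed time windows and aggregating within each, can be illustrated without a cluster; the pure-Python sketch below uses an invented `(timestamp, value)` event list and mirrors the windowed aggregations Spark applies to live streams:

```python
from collections import defaultdict

# Simulated event stream: (epoch_seconds, value) pairs; data is illustrative.
events = [(0, 1), (3, 2), (7, 4), (11, 8)]

def windowed_sums(stream, window=5):
    """Group events into fixed, non-overlapping time windows (keyed by the
    window's start second) and sum the values in each window."""
    sums = defaultdict(int)
    for ts, value in stream:
        sums[ts // window * window] += value
    return dict(sums)

result = windowed_sums(events)
```

Spark Structured Streaming expresses the same computation declaratively with a window function over an event-time column, handling late data and state management for you.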
Any graduate