DESCRIPTION:
NO. OF POSITIONS: 2
JOB DESCRIPTION:
REQUIREMENT GATHERING AND ANALYSIS:
1. WORKS WITH THE BUSINESS TO IDENTIFY NEW DATA NEEDS AND DOCUMENTS REQUIREMENTS.
2. PARTICIPATES IN REQUIREMENTS ANALYSIS, DATA ASSESSMENTS, BUSINESS PROCESS REENGINEERING AND HAS EXPERIENCE WITH DATA WAREHOUSING CONCEPTS.
3. WORKS CLOSELY WITH INTERNAL OR EXTERNAL PARTNERS TO OBTAIN SUPPORTING INFORMATION TO CREATE FINANCIAL MODELS, REPORTING, AND DATA EXCHANGES.
4. PROVIDES BUSINESS AND OPERATIONAL SUPPORT TO ENSURE PROCESSES SUPPORTED WITH DATA ARE PROPERLY DOCUMENTED AND EFFICIENTLY EXECUTABLE.
5. PERFORMS ANALYSIS ON DATASETS TO DETERMINE THEIR QUALITY, COVERAGE, AND COMPLEXITY.
6. ANALYZES LARGE AMOUNTS OF TRANSACTIONAL AND BEHAVIORAL DATA, USING STATISTICAL MODELING AND PATTERN ANALYSIS TECHNIQUES, TO IDENTIFY CUSTOMER AND PROSPECT INSIGHTS.
DESIGN:
1. ANALYZE FUNCTIONAL SPECIFICATIONS AND PREPARE TECHNICAL DESIGN SPECIFICATIONS.
2. PARTICIPATE IN THE PLANNING AND DESIGN OF NEW REPORTING AND ANALYSIS PRODUCTS FOR INTERNAL AND EXTERNAL USE.
3. CREATE HIGH-LEVEL AND LOW-LEVEL DESIGN DOCUMENTS FOR PROJECTS.
4. ENHANCE THE PERFORMANCE OF QUERIES AND DAILY SPARK JOBS THROUGH EFFICIENT DESIGN OF PARTITIONED HIVE TABLES AND SPARK LOGIC.
5. APPLY DESIGN PATTERNS AND OO DESIGN CONCEPTS TO IMPROVE THE EXISTING CODE BASE.
6. ANALYZE BUSINESS RULES AND FILE FORMATS TO DESIGN MAPS AND TRANSFORMATIONS IN DATASTAGE.
7. DESIGN APPROPRIATE PARTITIONING/BUCKETING SCHEMES TO ALLOW FASTER DATA RETRIEVAL DURING ANALYSIS USING HIVE.
IMPLEMENTATION OR CODING:
1. BUILD A DATA QUALITY FRAMEWORK, WHICH CONSISTS OF A COMMON SET OF MODEL COMPONENTS AND PATTERNS THAT CAN BE EXTENDED TO IMPLEMENT COMPLEX PROCESS CONTROLS AND DATA QUALITY MEASUREMENTS USING HADOOP.
2. WORK EXTENSIVELY WITH SQOOP TO MOVE DATA FROM MYSQL TO HDFS.
3. EXTENSIVE EXPERIENCE WITH THE EXTRACTION, TRANSFORMATION, AND LOADING (ETL) PROCESS USING IBM DATASTAGE.
4. DEVELOP AND TEST WEB PAGES, USE VISUALIZATION FOR GRAPHS AND CHARTS, AND WORK WITH SQL SCRIPTS, MAPREDUCE, QUERY OPTIMIZATION, AND DATA SCHEMAS TO MANIPULATE DATA FOR DATA LOADS AND EXTRACTS.
5. CREATE SSIS PACKAGES TO LOAD DATA INTO THE DATA WAREHOUSE USING TASKS SUCH AS EXECUTE SQL TASK AND DATA FLOW TASK; EXPERTISE IN DESIGNING ETL (EXTRACT, TRANSFORM, AND LOAD) FLOWS.
6. EXPERIENCE IN WRITING QUERIES IN HQL (HIVE QUERY LANGUAGE) TO PERFORM DATA ANALYSIS.
7. USE SQOOP TO IMPORT DATA FROM RELATIONAL DATABASES (RDBMS) INTO HDFS AND HIVE, STORING IT IN DIFFERENT FORMATS SUCH AS TEXT, AVRO, PARQUET, SEQUENCE FILE, AND ORC, ALONG WITH COMPRESSION CODECS SUCH AS SNAPPY AND GZIP.
8. HANDS-ON EXPERIENCE WORKING WITH APACHE SPARK AND HADOOP ECOSYSTEM COMPONENTS SUCH AS MAPREDUCE (MRV1 AND YARN), SQOOP, HIVE, OOZIE, FLUME, KAFKA, AND ZOOKEEPER, AND WITH DATABASES SUCH AS MYSQL.
9. ENSURE THAT EXTERNAL AND INTERNAL REGULATIONS AND POLICIES GOVERNING DATA MANAGEMENT ARE MET, INCLUDING REGULATIONS CONCERNING SECURITY, AUDITABILITY, AND PRIVACY.
TESTING:
1. EXPERTISE IN PREPARING TEST SCRIPTS AND TEST SCENARIOS USING BUSINESS REQUIREMENT SPECIFICATIONS AND FUNCTIONAL REQUIREMENT SPECIFICATIONS.
2. GOOD EXPERIENCE IN REQUIREMENT GATHERING AND TEST PLAN PREPARATION.
3. EXPERIENCE IN LOAD TESTING, STRESS TESTING, VOLUME TESTING, ENDURANCE TESTING, AND DB FAILOVER TESTING.
4. ANALYZE PERFORMANCE TEST RESULTS AND PUBLISH THEM TO THE CONCERNED STAKEHOLDERS WITH DETAILED MONITORING STATS, OBSERVATIONS, AND RECOMMENDATIONS.
5. PREPARE THE DIFFERENT TEST CYCLES IN THE TEST PLAN ACCORDING TO PROJECT REQUIREMENTS.
6. EXECUTE PERFORMANCE TEST RUNS (LOAD, STRESS, ENDURANCE, AND DB FAILOVER TESTS), AT ENTERPRISE VOLUMES WHEN POSSIBLE, AGAINST AGREED-UPON NON-FUNCTIONAL END-USER REQUIREMENTS.
7. OBSERVE PERFORMANCE MONITORS SUCH AS CPU AND MEMORY UTILIZATION, THROUGHPUT, QUEUE LENGTH, THREAD COUNT, HITS PER SECOND, AND RESPONSE TIMES DURING TESTING.
8. RESPONSIBLE FOR HANDLING THE CLIENT COMMUNICATION THROUGHOUT THE TEST CYCLE.
9. DEVELOP UNIT TEST CASES TO TEST MAP AND REDUCE FUNCTIONS USING THE MRUNIT TESTING FRAMEWORK.
DEPLOYMENT AND MAINTENANCE:
1. WORK WITH THE DATA ENGINEERING PLATFORM TEAM TO PLAN AND DEPLOY NEW HADOOP ENVIRONMENTS AND EXPAND EXISTING HADOOP CLUSTERS.
2. DEPLOY DATA OBJECTS IN THE PRODUCTION REPOSITORY.
3. PARTICIPATE IN PERFORMANCE TUNING AND DEBUGGING OF ISSUES DURING THE TESTING AND DEPLOYMENT PHASES.
4. CONDUCT DATA ANALYSIS TO RESEARCH THE ROOT CAUSES OF DATA GAPS AND DISCREPANCIES AND TO DERIVE INSIGHTS.
EDUCATION:
1. BACHELOR’S DEGREE OR HIGHER IN COMPUTERS OR A RELATED FIELD.
Any Graduate