Name of the Program: BIG DATA Specialist
Eligibility:
• 6 months at Genpact as of 31 January 2016
• 1 year of experience in IT programming (Java & Linux basics) with exposure to the analytics domain and knowledge of analytical tools
• Should not be actively enrolled in any other E@W program at the time of enrollment in the BIG DATA Specialist Program
Course content: Covers the following modules (Please ask for the detailed content separately):
• MODULE 1- INTRODUCTION TO BIG DATA (Introduction to Big Data & Hadoop)
• MODULE 2- COMPREHENSIVE REVIEW OF BIG DATA TECHNOLOGIES (Big Data Technologies; Apache Pig & Hive; Apache Sqoop and Flume; Apache HBase and Oozie; Hadoop Distributions and Impala)
• MODULE 3- BIG DATA ANALYSIS WITH R AND HADOOP (Data Analysis with R and Hadoop; Data Analysis using R; Handling Big Data using R and Hadoop)
• MODULE 4- BIG DATA APPLICATIONS (Big Data Visualizations on Hadoop; Big Data Solution Design and Deployment; Case Studies)

Program Duration: 6 Months (16 weeks access to live classes + Certification Exam)
Methodology: Live online ILT classes on any one day of the weekend, with access to a lab for hands-on practice (schedules to be shared upon enrollment).

Hadoop-Big Data Specialist

This course is designed for professionals with good knowledge of IT programming (Java and Linux basics), exposure to the analytics domain, and knowledge of analytical tools, who aspire to build expertise in big data technologies such as Hadoop, big data analytics with R and Hadoop, and data visualization.

Key Features of the Course

  • Knowledge of entire data analytics life cycle.
  • Deep understanding of applications of Big Data in various industries.
  • Basic and advanced data analytics and visualization techniques.
  • Hands-on experience working with R, RHadoop, and the Hadoop ecosystem.

Course Batches

    Batch1 Genpact

Course Curriculum

Section 1
The Big Data Paradigm 00:13:31
Serial vs Distributed Computing 00:10:00
What is Hadoop 00:07:06
HDFS or the Hadoop Distributed File System 00:10:28
MapReduce Introduced 00:11:15
Yarn 00:03:28
Hadoop Install Modes 00:08:11
Setup a Virtual Linux Instance (For Windows users) 00:15:10
Hadoop Standalone mode Install 00:09:12
Hadoop Pseudo-Distributed mode Install 00:14:07
Path and other Environment Variables 00:08:07
The basic philosophy underlying MapReduce 00:08:22
MapReduce – Visualized And Explained 00:08:31
MapReduce – Digging a little deeper at every step 00:10:05
Hello World in MapReduce 00:10:10
The Mapper 00:00:00
The Reducer 00:07:20
The Job 00:12:10
Get comfortable with HDFS 00:10:28
Juicing your MapReduce – Combiners, Shuffle and Sort and The Streaming API 00:14:11
Section 5: Advanced MapReduce. Conti...
Parallelize the reduce phase – use the Combiner 00:14:21
Not all Reducers are Combiners 00:14:21
How many mappers and reducers does your MapReduce have 00:08:29
Parallelizing reduce using Shuffle And Sort 00:14:30
MapReduce is not limited to the Java language – Introducing the Streaming API 00:00:05
Python for MapReduce 00:12:08
HDFS – Protecting against data loss using replication 00:15:16
HDFS – Name nodes and why they’re critical 00:00:06
HDFS – Checkpointing to back up name node information 00:00:00
The heart of search engines – The Inverted Index 00:14:21
Generating the inverted index using MapReduce 00:10:12
Custom data types for keys – The Writable Interface 00:10:00
Represent a Bigram using a WritableComparable 00:13:05
MapReduce to count the Bigrams in input text 00:08:13
Test your MapReduce job using MRUnit 00:13:22
Introducing the File Input Format 00:12:22
Text And Sequence File Formats 00:10:06
Data partitioning using a custom partitioner 00:07:00

GITS Academy. All rights reserved.