Name of the Program: BIG DATA Specialist
• 6 months in Genpact as of 31st January 2016
• 1 year of experience in IT programming (Java & Linux basics) with exposure to the analytics domain and knowledge of analytical tools.
• Should not be actively enrolled in any other E@W program at the time of enrollment in the BIG DATA Specialist Program
Course content: Covers the following modules (Please ask for the detailed content separately):
• MODULE 1- INTRODUCTION TO BIG DATA (Introduction to Big Data & Hadoop)
• MODULE 2- COMPREHENSIVE REVIEW OF BIG DATA TECHNOLOGIES (Big Data Technologies; Apache Pig & Hive; Apache Sqoop and Flume; Apache HBase and Oozie; Hadoop Distributions and Impala)
• MODULE 3- BIG DATA ANALYSIS WITH R AND HADOOP (Data Analysis with R and Hadoop; Data Analysis using R; handling Big Data using R and Hadoop)
• MODULE 4- BIG DATA APPLICATIONS (Big Data Visualizations on Hadoop; Big Data Solution Design and Deployment; Case Studies)
Program Duration: 6 Months (16 weeks access to live classes + Certification Exam)
Methodology: Live online ILT classes on any one day of the weekend, with access to a lab for hands-on practice (schedules to be shared upon enrollment).
Key Features of the Course
- Knowledge of entire data analytics life cycle.
- Deep understanding of applications of Big Data in various industries.
- Basic and advanced data analytics and visualization techniques.
- Hands-on experience working with R, RHadoop and the Hadoop ecosystem.
Section 1
- The Big Data Paradigm 00:13:31
- Serial vs Distributed Computing 00:10:00
- What is Hadoop 00:07:06
- HDFS or the Hadoop Distributed File System 00:10:28
- MapReduce Introduced 00:11:15
Section 2
- Yarn 00:03:28
- Hadoop Install Modes 00:08:11
- Setup a Virtual Linux Instance (For Windows users) 00:15:10
- Hadoop Standalone mode Install 00:09:12
- Hadoop Pseudo-Distributed mode Install 00:14:07
Section 3
- Path and other Environment Variables 00:08:07
- The basic philosophy underlying MapReduce 00:08:22
- MapReduce – Visualized And Explained 00:08:31
- MapReduce – Digging a little deeper at every step 00:10:05
Section 4
- Hello World in MapReduce 00:10:10
- The Mapper 00:00:00
- The Reducer 00:07:20
- The Job 00:12:10
- Get comfortable with HDFS 00:10:28
- Juicing your MapReduce – Combiners, Shuffle and Sort and The Streaming API 00:14:11
Section 5: Advance MapReduce. Conti...
- Parallelize the reduce phase – use the Combiner 00:14:21
- Not all Reducers are Combiners 00:14:21
- How many mappers and reducers does your MapReduce have 00:08:29
- Parallelizing reduce using Shuffle And Sort 00:14:30
- MapReduce is not limited to the Java language – Introducing the Streaming API 00:00:05
- Python for MapReduce 00:12:08
- HDFS – Protecting against data loss using replication 00:15:16
- HDFS – Name nodes and why they’re critical 00:00:06
- HDFS – Checkpointing to backup name node information 00:00:00
- The heart of search engines – The Inverted Index 00:14:21
- Generating the inverted index using MapReduce 00:10:12
- Custom data types for keys – The Writable Interface 00:10:00
- Represent a Bigram using a WritableComparable 00:13:05
- MapReduce to count the Bigrams in input text 00:08:13
- Test your MapReduce job using MRUnit 00:13:22
- Introducing the File Input Format 00:12:22
- Text And Sequence File Formats 00:10:06
- Data partitioning using a custom partitioner 00:07:00