Upon completion of the course, attendees should be prepared to clear the Hadoop developer and Hadoop administrator certifications from Cloudera or Hortonworks. Certification is a great differentiator: it helps establish individuals as leaders in their field and gives customers tangible evidence of skills and expertise.
Topics - What is Hadoop?, The Hadoop Distributed File System, How Hadoop MapReduce Works, Anatomy of a Hadoop Cluster, Master Daemons, NameNode, JobTracker, Secondary NameNode, Slave Daemons, DataNode, TaskTracker.
Topics - Blocks and Splits, Input Splits, HDFS Splits, Data Replication, Hadoop Rack Awareness, Data high availability, Data Integrity, Cluster architecture and block placement, Accessing HDFS, Java Approach, CLI Approach, Programming Practices, Developing MapReduce Programs in Local Mode (running without HDFS and MapReduce), Pseudo-distributed Mode (running all daemons on a single node), Fully distributed Mode (running daemons on dedicated nodes).
Topics - Make a fully distributed Hadoop cluster on a single laptop/desktop, NameNode in Safe mode, Metadata Backup, Integrating Kerberos security in Hadoop.
Topics - Examining a Sample MapReduce Program (with several examples), Basic API Concepts, The Driver Code, The Mapper, The Reducer, Hadoop's Streaming API.
Topics - The configure and close Methods, Sequence Files, Record Reader, Record Writer, Role of Reporter, Output Collector, Processing XML files, Counters, Directly Accessing HDFS, ToolRunner, Using The Distributed Cache.
Topics - Sorting and Searching, Indexing, Classification/Machine Learning, Term Frequency - Inverse Document Frequency, Word Co-Occurrence, Hands-On Exercise: Creating an Inverted Index, Identity Mapper, Identity Reducer, Exploring well-known problems using MapReduce applications.
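As a rough preview of the inverted-index exercise, the sketch below simulates the map, shuffle and reduce steps in plain Python rather than with the Hadoop API. The function names and the three sample documents are made up for illustration; the course exercise itself may be structured differently.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per distinct term in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs):
    """Group values by key, mimicking what the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(term, doc_ids):
    """Build the sorted posting list for a single term."""
    return term, sorted(set(doc_ids))


def build_inverted_index(docs):
    """Run the three simulated phases over a dict of {doc_id: text}."""
    pairs = [pair for doc_id, text in docs.items() for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


# Hypothetical sample documents, not taken from the course material.
docs = {
    "d1": "hadoop stores data",
    "d2": "hadoop processes data",
    "d3": "pig runs on hadoop",
}
index = build_inverted_index(docs)
```

Here `index["data"]` lists the documents that contain the word "data". In a real job the mapper and reducer would be separate classes and the grouping step would be handled by the framework.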
Topics - Testing with MRUnit, Logging, Other Debugging Strategies.
Topics - A Recap of the MapReduce Flow, The Secondary Sort, Customized Input Formats and Output Formats.
Topics - Counters, Skipping Bad Records, Rerunning failed tasks with IsolationRunner.
Topics - Reducing network traffic with a combiner, Partitioners, Using Compression, Reusing the JVM, Running with speculative execution, Refactoring code and rewriting algorithms, Parameters affecting Performance, Other Performance Aspects.
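To make the combiner topic concrete, here is a minimal plain-Python sketch (not Hadoop API code) that counts how many key/value pairs would cross the network for a word count with and without local pre-aggregation. The helper names and the two input splits are assumptions chosen for illustration only.

```python
from collections import Counter


def map_words(line):
    """Mapper: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.lower().split()]


def combine(pairs):
    """Combiner: sum counts locally on one mapper's output before the shuffle."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def reduce_counts(pairs):
    """Reducer: sum counts for each word across all mappers."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


# Two made-up input splits, each handled by its own simulated mapper.
splits = ["to be or not to be", "to do or not to do"]

shuffled_without = [p for s in splits for p in map_words(s)]
shuffled_with = [p for s in splits for p in combine(map_words(s))]

print(len(shuffled_without), "pairs shuffled without a combiner")
print(len(shuffled_with), "pairs shuffled with a combiner")
```

Both paths give the same final counts; only the number of intermediate pairs differs. This works because addition is associative and commutative, which is the usual condition for reusing reducer logic as a combiner.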
Topics - HBase concepts, HBase architecture, Region server architecture, File storage architecture, HBase basics, Column access, Scans, HBase use cases, Install and configure HBase on a multi-node cluster, Create database, Develop and run sample applications, Access data stored in HBase using clients like Java, Python and Perl, HBase and Hive Integration, HBase admin tasks, Defining Schema and basic operations.
Topics - Hive concepts, Hive architecture, Install and configure Hive on a cluster, Create database, Access it from a Java client, Buckets, Partitions, Joins in Hive, Inner Joins, Outer Joins, Hive UDF, Hive UDAF, Hive UDTF, Develop and run sample applications in Java/Python to access Hive.
Topics - Pig basics, Install and configure Pig on a cluster, Pig vs. MapReduce and SQL, Pig vs. Hive, Write sample Pig Latin scripts, Modes of running Pig, Running in Grunt shell, Programming in Eclipse, Running as Java program, Pig UDFs, Pig Macros.
Topics - Flume and Chukwa concepts, Use cases of Thrift, Avro and Scribe, Install and configure Flume on a cluster, Create a sample application to capture logs from Apache using Flume.
Topics - NameNode High Availability, NameNode federation, Fencing, YARN.
Topics - Hadoop disaster recovery, Suitable use cases for Hadoop.
Topics - Documents or tests, Hadoop Project: a real-time project on which students can practice.
Rithisha Information Systems Pvt. Ltd. is committed to providing the highest quality, needs-based training interventions to its clients, both locally and internationally. Rithisha is renowned for superior training programs delivered by an enviable team of qualified, expert and highly experienced trainers in the area of Information Technology. Rithisha provides organizations and individuals with a complete and comprehensive suite of training offerings, including online and classroom training.
Goals & Objectives
- Understands Big Data and Hadoop Basics.
- Understands Hadoop Architecture and Hadoop Cluster.
- Understands the HDFS Architecture.
- Learns the MapReduce framework.
- Learns how to install Hadoop on single-node and multi-node clusters.
- Should be able to clear Cloudera or Hortonworks certification.