Hadoop- MapReduce Concept

Hadoop supports the MapReduce model, which was introduced by Google.

MapReduce processes data in two steps.

Map: It is an ingestion and transformation step. Initially, all input records are processed in parallel.

Reduce: It is an aggregation and summarization step. All associated records are processed together by a single entity.
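
The two steps can be illustrated with a small word count. This is only a sketch in plain Python, not Hadoop's actual API: the function names (map_phase, shuffle, reduce_phase, run_job) are made up for illustration, and everything runs in one process, whereas Hadoop would spread the map and reduce work across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one input record (a line of text) into (word, 1) pairs."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group the mapped values by key so associated records end up together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: aggregate all values that belong to a single key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence (hypothetical single-process driver)."""
    # Each record is mapped independently; this is the part Hadoop runs in parallel.
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(mapped)
    # Each key is reduced on its own, by a "single entity" in the terms used above.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = ["hadoop stores data", "hadoop processes data"]
    print(run_job(lines))
```

For the two sample lines, "hadoop" and "data" each get a count of 2, while "stores" and "processes" each get a count of 1. The grouping step in the middle is what guarantees that every value for a given word reaches the same reduce call.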

The Hadoop framework is open-source software from Apache. The project includes the following sub-projects:

  • Hadoop Core, the flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing model.
  • HBase builds on Hadoop Core to provide a scalable, distributed database.
  • Pig is a high-level data-flow language and execution framework for parallel computation. It is built on top of Hadoop Core.

  • ZooKeeper is a highly available and reliable coordination system. Distributed applications use ZooKeeper to store and mediate updates for critical shared state.

  • Hive is a data warehouse infrastructure built on Hadoop Core that provides data summarization, ad hoc querying, and analysis of datasets.

HDFS: Hadoop Distributed File System.

Keep watching this space for more information on Hadoop.


One thought on “Hadoop- MapReduce Concept”

  1. I’ve seen your blog post about “Mainframe-How to Modernize Batch Process”. I’m contributing to an open source project whose goal is to reproduce a batch execution environment (like on MF) on open systems, in the cloud. It’s called “JEM, the BBE” and you can find it here: http://www.pepstock.org.
    Hadoop integration is planned as well!
    I hope you find it interesting!
