Introduction to the Apache Hadoop Ecosystem

Apache Hadoop

In the last article we looked at what big data is. In this article we will look at a big data framework called “Hadoop”. Hadoop is a software library that lets users distribute and process large amounts of data across clusters of commodity servers. The project includes the following modules:

Hadoop Common – Common utilities for working with Hadoop.

Hadoop Distributed File System (HDFS) – A distributed file system that provides high-throughput access to data.

Hadoop YARN (Yet Another Resource Negotiator) – A framework for job scheduling and cluster resource management.

Hadoop MapReduce (MRv2) – A programming model for parallel processing of large data sets.
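
To make the MapReduce programming model a little more concrete, here is a minimal sketch of a word count. It is not Hadoop code and does not use any Hadoop API: it only simulates the map, shuffle and reduce phases inside a single Python process. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen just for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Assumption: on a real cluster, different lines (or blocks) would be
    # mapped on different nodes; here everything runs in one process.
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big clusters", "hadoop processes big data"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps would run in parallel on many nodes and the framework would handle the shuffle, but the shape of the computation is the same: map emits key/value pairs, the shuffle groups them by key, and reduce combines each group.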

In addition to the modules above, there are other Hadoop-related projects:

ZooKeeper – Coordination service for distributed applications.

Pig – A high-level data-flow language and execution framework for parallel computation.

HBase – A scalable, distributed, column-oriented database.

Hive – A data warehouse infrastructure that provides data summarization and ad hoc querying.

Oozie – A workflow scheduler system to manage Apache Hadoop jobs.

Flume – A tool for collecting and moving unstructured data, such as log data, into Apache Hadoop.

Sqoop – A tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

The Apache Hadoop ecosystem is depicted below.

[Figure: Apache Hadoop ecosystem]

In the next article we will introduce the Hadoop Distributed File System.


I am Siva Prasad Rao Janapati, and I work as a Technical Architect. I have hands-on experience with ATG Commerce (DAS/DPS/DCS), Mozu commerce, Broadleaf Commerce, Java, JEE, Spring, Play, JPA, Hibernate, Velocity, JMS, Jboss, Weblogic, Tomcat, Jetty, Apache, Apache Solr, Spring Batch, JQuery, NodeJS, SOAP, REST, MySQL, Oracle, Mongo DB, Memcached, HazelCast, Git, SVN, CVS, Ant, Maven, Gradle, Amazon Web services, Rackspace, Quartz, JMeter, Junit, Open NLP, Facebook Graph, Twitter4J, YouTube Gdata, Bazzarvoice, Yotpo, 4-Tell, Alatest, Shopzilla, Linkshare. I have hands-on experience with both open source and commercial technologies.

