
Installing Cloudera CDH 5 on CentOS Linux.

In this article, we will discuss installing Cloudera CDH 5 on CentOS Linux.
Ways to Install CDH 5:
CDH 5 can be installed in the following ways:

  • Automated: using Cloudera Manager. This is the method Cloudera recommends.
  • Manual: using the Cloudera repository.

Here we are going to install CDH 5 using the manual method (the Cloudera repository).
Visit this link to download repository:
http://www.cloudera.com/documentation/cdh/5-0-x/CDH5-Installation-Guide/cdh5ig_cdh5_install.html#topic_4_4_1_unique_1
Download and install the CDH 5 repository package for your CentOS system. Here it has been downloaded for 64-bit CentOS.
Move the package from the Downloads folder to the home directory, then return to the home directory and install it:

sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm

Now add the repository key, which lets you verify that you are downloading genuine packages [optional]:

sudo rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera


Now install CDH 5 with YARN.

Note: Before installing the YARN daemons, clean the yum cache:

sudo yum clean all

Then install the YARN ResourceManager:

sudo yum install hadoop-yarn-resourcemanager

Install the HDFS NameNode:

sudo yum install hadoop-hdfs-namenode

Install the Secondary NameNode:

sudo yum install hadoop-hdfs-secondarynamenode

On the DataNodes and all other cluster hosts, install the NodeManager, DataNode, and MapReduce packages:

sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce

Install one MapReduce JobHistory Server and one YARN proxy server:

sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver

Add the property below to core-site.xml, between the <configuration> tags:

vi /etc/hadoop/conf/core-site.xml

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://acd.acadgild.net:8020</value>
</property>
Next, add the properties below to hdfs-site.xml, again between the <configuration> tags:

vi /etc/hadoop/conf/hdfs-site.xml

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/1/dfs/nn,file:///nfsmount/dfs/nn</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///data/1/dfs/dn,file:///data/2/dfs/dn,file:///data/3/dfs/dn,file:///data/4/dfs/dn</value>
</property>
Create the NameNode and DataNode directories (matching the paths configured above) and set their permissions by running the following commands one by one:

  • sudo mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
  • sudo mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
  • sudo chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
  • sudo chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
  • sudo chmod go-rx /data/1/dfs/nn /nfsmount/dfs/nn
  • Reboot the system.
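The directory steps above can also be collected into one small script. This is a sketch, not part of the original walkthrough: the ROOT variable is an assumption that defaults to a scratch path so the steps can be rehearsed without root; on a real cluster node you would create the directories directly under / as root and also run the chown shown in the comment.

```shell
#!/bin/sh
# Sketch of the NameNode/DataNode directory setup above.
# ROOT is an assumption for safe rehearsal; on a real node the
# directories live directly under / and you must run as root.
ROOT="${ROOT:-/tmp/hdfs-dirs-demo}"

NN_DIRS="$ROOT/data/1/dfs/nn $ROOT/nfsmount/dfs/nn"
DN_DIRS="$ROOT/data/1/dfs/dn $ROOT/data/2/dfs/dn $ROOT/data/3/dfs/dn $ROOT/data/4/dfs/dn"

# Create the NameNode and DataNode storage directories
mkdir -p $NN_DIRS $DN_DIRS

# NameNode metadata directories must be accessible to their owner only
chmod 700 $NN_DIRS

# On a real node, also hand the directories to the hdfs user
# (requires root and the hdfs user created by the packages):
#   chown -R hdfs:hdfs $NN_DIRS $DN_DIRS
```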

After the reboot, run jps to list the running daemons; you should see entries such as NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager, and JobHistoryServer:

jps

All set. The Cloudera cluster has now been configured, and its daemons will start automatically after every reboot.
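If the daemons do not come up on their own, every installed Hadoop init script can be started in one loop. The sketch below is an assumption, not part of the original article: it relies on the hadoop-* init-script naming used by the packages installed above, and the function name and its two parameters (init-script directory, start command) are made up here so the loop can be dry-run safely.

```shell
#!/bin/sh
# Start every Hadoop service found under an init.d directory.
# $1 = init-script directory (default /etc/init.d)
# $2 = command used to start each service
#      (default "sudo service"; pass "echo" for a dry run)
start_hadoop_services() {
  init_dir="${1:-/etc/init.d}"
  runner="${2:-sudo service}"
  # The CDH packages install init scripts named hadoop-*
  for svc in $(cd "$init_dir" 2>/dev/null && ls hadoop-* 2>/dev/null); do
    $runner "$svc" start
  done
}

# On a cluster node this starts the NameNode, DataNode, YARN daemons, etc.
# (a no-op on machines with no hadoop-* init scripts)
start_hadoop_services
```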
Keep visiting our site www.acadgild.com for more updates on Bigdata and other technologies.
