In this article we discuss how to install Cloudera CDH 5 on CentOS Linux.
Ways to Install CDH 5
CDH 5 can be installed in the following ways:
- Automated: using Cloudera Manager. This is the recommended method.
- Manual: using the Cloudera repository.
Here we install CDH 5 using the manual method (the Cloudera repository).
Visit this link to download the repository package:
http://www.cloudera.com/documentation/cdh/5-0-x/CDH5-Installation-Guide/cdh5ig_cdh5_install.html#topic_4_4_1_unique_1
Download the CDH 5 repository package for your CentOS system. Here I downloaded the one for 64-bit CentOS.
Go to the Downloads folder and move the package to your home directory.
Return to the home directory and install the repository package:
sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
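Before going further, it can help to confirm that the package actually dropped a repository definition where yum looks for it. The sketch below is only an illustration: the helper name cloudera_repo_installed is hypothetical, and it assumes the repo file name starts with "cloudera" and lives in /etc/yum.repos.d unless another directory is passed in.

```shell
# Hypothetical helper: succeeds if a Cloudera repo definition exists in the
# given directory (defaults to the standard yum location).
cloudera_repo_installed() {
    repo_dir="${1:-/etc/yum.repos.d}"
    # Assumption: the repo file name starts with "cloudera" and ends in ".repo".
    # If nothing matches, ls fails and the function returns non-zero.
    ls "$repo_dir"/cloudera*.repo >/dev/null 2>&1
}

# Example use (not executed here):
#   cloudera_repo_installed && echo "Cloudera repository is configured"
```

If the check fails, re-run the localinstall command above and look at its output before continuing.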
Next, add the repository key, which lets you verify that you are downloading genuine packages. [Optional]
sudo rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
Now install CDH 5 with YARN.
Note: Before installing the YARN daemons, clean the yum cache:
sudo yum clean all
Add the properties below to hdfs-site.xml:
sudo vi /etc/hadoop/conf/hdfs-site.xml
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data/1/dfs/dn,file:///data/2/dfs/dn,file:///data/3/dfs/dn,file:///data/4/dfs/dn</value>
</property>
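After editing the file, you may want to read a value back to make sure it was saved as intended. The following is a minimal, line-oriented sketch and not a real XML parser: the function name hadoop_property is hypothetical, and it assumes each <name> and <value> element sits on a single line, as in the snippet above. Once the Hadoop packages are installed, hdfs getconf -confKey is the more reliable way to query a key.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a Hadoop-style XML config file.
#   $1 = property name, $2 = path to the config file
hadoop_property() {
    awk -v name="$1" '
        # Literal substring match, so dots in the name are not treated as regex.
        index($0, "<name>" name "</name>") { found = 1 }
        # First <value> line at or after the matching <name> line.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$2"
}

# Example use (not executed here):
#   hadoop_property dfs.datanode.data.dir /etc/hadoop/conf/hdfs-site.xml
```

An empty result means the property was not found in the file.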
Create the NameNode and DataNode directories and set their permissions. HDFS only uses the directories named in dfs.namenode.name.dir and dfs.datanode.data.dir, so the paths here should match those values.
Run the following commands one by one:
sudo mkdir -p /data/1/dfs/nn /nfsmount/dfs/nn
sudo mkdir -p /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
sudo chown -R hdfs:hdfs /data/1/dfs/nn /nfsmount/dfs/nn /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn
sudo chmod 700 /data/1/dfs/nn /nfsmount/dfs/nn
sudo chmod go-rx /data/1/dfs/nn /nfsmount/dfs/nn
Reboot the system.
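The directory lists in the mkdir and chown commands repeat what is already written in the configuration as comma-separated file:// URIs. As a sketch of how the two can be kept in sync, the hypothetical helper below turns such a value into one local path per line. It assumes every entry uses the file:// scheme and that no path contains a comma or whitespace.

```shell
# Hypothetical helper: convert "file:///a,file:///b" into one path per line.
uris_to_paths() {
    # Split on commas, then drop the leading "file://" scheme from each entry.
    printf '%s\n' "$1" | tr ',' '\n' | sed 's|^file://||'
}

# Example use (not executed here):
#   for dir in $(uris_to_paths "file:///data/1/dfs/dn,file:///data/2/dfs/dn"); do
#       sudo mkdir -p "$dir"
#       sudo chown -R hdfs:hdfs "$dir"
#   done
```

Deriving the paths from the configured value avoids typos between the XML file and the commands.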
Now run jps to see all running daemons:
jps
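Reading the jps listing by eye is easy to get wrong, so a small check can report which daemons are absent. This is only a sketch: missing_daemons is a hypothetical helper, and the daemon names in the example (NameNode, DataNode, ResourceManager, NodeManager) are an assumption about what a single-node YARN setup runs. Adjust them to the roles you installed. Note also that jps only lists JVMs the current user is allowed to inspect.

```shell
# Hypothetical helper: read jps output on stdin and print every expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <name>", so compare the whole second field;
        # this keeps "NameNode" from matching "SecondaryNameNode".
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' ||
            echo "$daemon"
    done
}

# Example use (not executed here):
#   jps | missing_daemons NameNode DataNode ResourceManager NodeManager
```

No output means every expected daemon was found in the listing.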
All set. The Cloudera cluster is now configured and will start automatically after every reboot.