Hadoop – Downgrading from YARN to MRv1 (Cloudera CDH4)

Hadoop offers two versions of MapReduce: the older MRv1 and the newer YARN (MRv2), and sometimes you need to move between the two.  Using RPMs or other packages with a Cloudera CDH installation makes this mostly easy, but a successful downgrade from YARN to MRv1 still takes some work.  The Cloudera installation guide walks you through going from MRv1 to YARN; the instructions here cover the other direction, from YARN to MRv1.

I recently had to go through this downgrade, and I have documented my steps below.  I am using CentOS with yum/RPMs; other distributions should be similar.  Please let me know if you have any recommended changes to these steps:

# remove YARN configuration
sudo yum remove hadoop-conf-pseudo
 
# stop YARN
sudo service hadoop-yarn-resourcemanager stop 
sudo service hadoop-yarn-nodemanager stop
sudo service hadoop-mapreduce-historyserver stop
 
# stop HDFS
for x in $(cd /etc/init.d ; ls hadoop-hdfs-*) ; do sudo service $x stop ; done
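The stop/start steps loop over the Hadoop init scripts in /etc/init.d; the script names have to be collected with command substitution, $( ) or backticks, and sudo belongs on the service command inside the loop rather than on the loop itself.  A minimal sketch of the pattern, run against a throwaway directory with dummy script names so it is safe to try anywhere (the real loops use /etc/init.d):

```shell
# demonstrate the init-script loop with dummy files instead of /etc/init.d
# (hypothetical names; nothing here touches real services)
tmpdir=$(mktemp -d)
touch "$tmpdir/hadoop-hdfs-namenode" "$tmpdir/hadoop-hdfs-datanode"
# $( ) collects the matching names; sudo would wrap only the service command
for x in $(cd "$tmpdir" ; ls hadoop-hdfs-*) ; do
  echo "would run: sudo service $x stop"
done
rm -rf "$tmpdir"
```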
 
# Install MRv1
sudo yum install hadoop-0.20-conf-pseudo
 
# remove the cache dir (warning: this deletes existing HDFS data; the namenode is reformatted next)
sudo rm -rf /var/lib/hadoop-hdfs/cache/
 
# format namenode
sudo -u hdfs hdfs namenode -format 
 
# start HDFS
for x in $(cd /etc/init.d ; ls hadoop-hdfs-*) ; do sudo service $x start ; done
 
# make the /tmp directory in HDFS and set permissions
sudo -u hdfs hadoop fs -mkdir /tmp
sudo -u hdfs hadoop fs -chmod -R 1777 /tmp 
 
# create the mapred staging directory in HDFS and hand it to the mapred user
sudo -u hdfs hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging
sudo -u hdfs hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred
 
# create the local mapred directory (plain mkdir on the local filesystem, not hadoop fs)
sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/local/
sudo chown -R mapred  /var/lib/hadoop-hdfs/cache/mapred
 
# check dir structure
sudo -u hdfs hadoop fs -ls -R / 
 
# start MRv1
for x in $(cd /etc/init.d ; ls hadoop-0.20-mapreduce-*) ; do sudo service $x start ; done
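Before running a test job, it can be worth confirming that the MRv1 daemons actually came up.  One quick way (assuming a JDK is installed) is jps, which lists the running Java processes; on a healthy pseudo-distributed node you should see JobTracker and TaskTracker alongside the HDFS daemons:

# optional sanity check: look for JobTracker, TaskTracker, NameNode,
# DataNode and SecondaryNameNode in the output
sudo jps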
 
# make a home directory in HDFS for your user (replace cloudera with your username)
sudo -u hdfs hadoop fs -mkdir /user/cloudera
sudo -u hdfs hadoop fs -chown cloudera /user/cloudera
 
# test
hadoop fs -mkdir input
hadoop fs -put /etc/hadoop/conf/*.xml input
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep input output 'dfs[a-z.]+'
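If the job completes cleanly, its results land in the output directory in HDFS.  To confirm the downgrade end to end, list that directory and inspect the part files (assuming the grep job above wrote to output):

# verify the job output
hadoop fs -ls output
hadoop fs -cat output/part-00000 | head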
 