Monday, January 19, 2015

Patching Cloudera Manager from 5.0.2 to 5.2.0 (Cloudera Hadoop Patching/Version upgrade part 1 of 2)

Let's say we have a Hadoop cluster with the nodes below running CDH 5.0.2. We want to upgrade the cluster to CDH 5.2.0.

192.168.56.201 dvhdmgt1.example.com dvhdmgt1  # Management node hosting Cloudera Manager
192.168.56.202 dvhdnn1.example.com  dvhdnn1 # Name node
192.168.56.203 dvhdjt1.example.com  dvhdjt1 # Jobtracker/Resource Manager
192.168.56.101 dvhddn01.example.com dvhddn01 # Datanode1
192.168.56.102 dvhddn02.example.com dvhddn02 # Datanode2
192.168.56.103 dvhddn03.example.com dvhddn03 # Datanode3

This is a two-step process:
Part 1: Upgrade Cloudera Manager


Ref 1:
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_mc_upgrade_to_cdh52_using_parcels.html

Ref 2: For rolling upgrade:
[We had some issues as described in step 13, so we are not following this; we will follow the link above instead. Anyhow, even for a rolling upgrade we need to stop some important services such as Impala and Hive. To me both approaches are much the same and need downtime (not that much difference).]
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_mc_rolling_upgrade.html


-----------------------------
Upgrading CDH 5 Using Parcels
-----------------------------

*** First we need to upgrade Cloudera Manager to 5.2.0, and then follow the steps below to upgrade CDH.
*** Review Ref 1 again before starting.
-- ----------------
1. Before You Begin
-- ----------------
1.1 Make sure there are no Oozie workflows in RUNNING or SUSPENDED status; otherwise the Oozie database upgrade will fail and you will have to reinstall the old CDH version to complete or kill those running workflows.
We can use the Oozie web GUI to check this:
http://192.168.56.201:11000/oozie/
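Alternatively, a quick check from the command line; a minimal sketch assuming the Oozie client is installed and pointing at the server above:
$ # both lists should come back empty before the upgrade
$ oozie jobs -oozie http://192.168.56.201:11000/oozie -jobtype wf -filter status=RUNNING
$ oozie jobs -oozie http://192.168.56.201:11000/oozie -jobtype wf -filter status=SUSPENDED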
1.2 Run the "Host Inspector" and fix every issue.
1.2.1 In Cloudera Manager, click the Hosts tab.
1.2.2 Click Host Inspector. Cloudera Manager begins several tasks to inspect the managed hosts.
1.2.3 After the inspection completes, click Download Result Data or Show Inspector Results to review the results.
1.2.4 Click "Show Inspector Results" to check the results.
1.2.5 If there are any validation errors, consult Cloudera Support before proceeding further.
1.3 Run hdfs fsck / and hdfs dfsadmin -report and fix any issues
# su - hdfs
$ hdfs fsck /
$ hdfs dfsadmin -report
$ hbase hbck
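It can also help to keep the pre-upgrade reports for later comparison; a small sketch (the output file locations are just examples):
$ hdfs fsck / > /tmp/fsck_before_upgrade.txt 2>&1
$ hdfs dfsadmin -report > /tmp/dfsadmin_before_upgrade.txt 2>&1
$ # anything matching below should be investigated before the upgrade
$ grep -iE 'corrupt|missing|under-replicated' /tmp/fsck_before_upgrade.txt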
1.4 Enable maintenance mode on your cluster
1.4.1 Click (down arrow) to the right of the cluster name and select Enter Maintenance Mode.
1.4.2 Confirm that you want to do this.

-- -----------------------------------------
2. Back up the HDFS Metadata on the NameNode
-- -----------------------------------------
   2.1 Stop the cluster. It is particularly important that the NameNode role process is not running so that you can make a consistent backup.
   2.2 CM > HDFS > Configuration 
   2.3 In the Search field, search for "NameNode Data Directories". This locates the NameNode Data Directories property.
   2.4 From the command line on the NameNode host, back up the directory listed in the NameNode Data Directories property. 
For example, if the data directory is /mnt/hadoop/hdfs/name, do the following as root:
# cd /mnt/hadoop/hdfs/name
# tar -cvf /root/nn_backup_data.tar .
Note:  If you see a file containing the word lock, the NameNode is probably still running. Repeat the preceding steps, starting by shutting down the CDH services.
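A quick sanity check on the backup; a sketch assuming the example paths above (the lock file is normally named in_use.lock in the NameNode data directory):
# ls /mnt/hadoop/hdfs/name | grep -i lock    # expect no output; a lock file means the NameNode is still running
# tar -tvf /root/nn_backup_data.tar | head   # confirm the archive is readable and lists the expected files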
-- ---------------------------------------------
3. Download, Distribute, and Activate the Parcel
-- ---------------------------------------------
Note: Before starting the steps below, enable internet access on the node where Cloudera Manager is installed.
3.1 In the Cloudera Manager Admin Console, click the Parcels indicator in the top navigation bar
3.2 Click Download for the version(s) you want to download.
3.3 When the download has completed, click Distribute for the version you downloaded.
3.4 When the parcel has been distributed and unpacked, the button will change to say Activate.
3.5 Click Activate. You are asked if you want to restart the cluster. *** Do not restart the cluster at this time.
3.6 Click Close.
    - If some services fail to start, ignore it for the time being.
    - Follow the steps below and then check again.

-- ---------------------
4. Upgrade HDFS Metadata
-- ---------------------
4.1 Start the ZooKeeper service.
4.2 Go to the HDFS service.
4.3 Select Actions > Upgrade HDFS Metadata.

-- -----------------------------------
5. Upgrade the Hive Metastore Database
-- -----------------------------------
   5.1 Back up the Hive metastore database.
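A minimal backup sketch for 5.1, assuming the metastore is on MySQL/MariaDB with the default database name "metastore" (adjust host, user, and database name to your environment; use pg_dump for a PostgreSQL-backed metastore):
# mysqldump -u root -p --databases metastore > /root/hive_metastore_backup_$(date +%Y%m%d).sql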
   5.2 Go to the Hive service.
   5.3 Select Actions > Upgrade Hive Metastore Database Schema and click Upgrade Hive Metastore Database Schema to confirm.
   5.4 If you have multiple instances of Hive, perform the upgrade on each metastore database.

-- --------------------------
6. Upgrade the Oozie ShareLib
-- --------------------------
   6.1 Go to the Oozie service.
   6.2 Select Actions > Install Oozie ShareLib and click Install Oozie ShareLib to confirm.
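The installed ShareLib can be verified in HDFS afterwards; a sketch assuming the CDH default location (/user/oozie/share/lib), which may differ if customized:
$ hdfs dfs -ls /user/oozie/share/lib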
-- -------------
7. Upgrade Sqoop
-- -------------
   7.1 Go to the Sqoop service.
   7.2 Select Actions > Upgrade Sqoop and click Upgrade Sqoop to confirm.
-- -----------------------
8. Upgrade Sentry Database
-- -----------------------
Required if you are updating from CDH 5.0 to 5.1 or later.

   8.1 Back up the Sentry database.
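A backup sketch for 8.1, assuming a MySQL/MariaDB-backed Sentry database named "sentry" (adjust to your environment):
# mysqldump -u root -p --databases sentry > /root/sentry_db_backup_$(date +%Y%m%d).sql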
   8.2 Go to the Sentry service.
   8.3 Select Actions > Upgrade Sentry Database Tables and click Upgrade Sentry Database Tables to confirm.

-- -------------
9. Upgrade Spark
-- -------------
Required if you are updating from CDH 5.0 to 5.1 or later.

9.1 Go to the Spark service.
9.2 Select Actions > Upload Spark Jar and click Upload Spark Jar to confirm.
9.3 Select Actions > Create Spark History Log Dir and click Create Spark History Log Dir to confirm.
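Both actions write into HDFS, so they can be spot-checked from the command line; a sketch assuming the CDH defaults (/user/spark/share/lib for the jar and /user/spark/applicationHistory for the history directory), which may differ in your configuration:
$ hdfs dfs -ls /user/spark/share/lib
$ hdfs dfs -ls /user/spark/applicationHistory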

--- -------------------
10. Restart the Cluster
--- -------------------
  - CM > Cluster Name > Start

--- ---------------------------------
11. Deploy Client Configuration Files
--- ---------------------------------
   - On the Home page, click (down arrow) to the right of the cluster name and select Deploy Client Configuration.
   - Click the Deploy Client Configuration button in the confirmation pop-up that appears.
  
--- ----------------------------------
12. Finalize the HDFS Metadata Upgrade
--- ----------------------------------
After ensuring that the CDH 5 upgrade has succeeded and that everything is running smoothly, finalize the HDFS metadata upgrade. It is not unusual to wait days or even weeks before finalizing the upgrade.
   - Go to the HDFS service.
   - Click the Instances tab.
   - Click the NameNode instance.
   - Select Actions > Finalize Metadata Upgrade and click Finalize Metadata Upgrade to confirm.

--- -------------
13. Common issues
--- -------------
13.1 If the HDFS NameNode fails to start with the error below in the log:
"File system image contains an old layout version -55."
   - CM > HDFS > Action > Stop
   - CM > HDFS > Action > Upgrade HDFS Metadata
13.2 Impala showing "This Catalog Server is not connected to its StateStore"
   - CM > Hue > Action > Stop
   - CM > Impala > Action > Stop
   - CM > Hive > Action > Stop
   - CM > Hive > Action > Upgrade Metastore Schema
   - CM > Hive > Action > Update Metastore NameNodes
   - CM > Hive > Action > Start
   - CM > Impala > Action > Start
   - CM > Hue > Action > Start
13.3 Hue not showing the databases list:
   - Use Internet Explorer; it seems this does not work in Chrome.

--- --------------------------------------
14. Test the upgraded CDH working properly
--- --------------------------------------
    We can follow the steps we did in "Step 4 & 5" of install_CM_CDH_5.0.2.txt.
Note: the example jar file is now /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.2.0.jar
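For 14.6 below, a quick smoke test with that example jar; a minimal sketch (run as a user that is allowed to submit YARN jobs, e.g. hdfs):
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.2.0.jar pi 10 100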
14.1 Check from Cloudera Manager; there should not be any alarms or warnings.
14.2 Run the Host Inspector and check for any alarms or warnings. Fix them if possible.
14.3 Check that the existing data in Impala is accessible
  - Also check an analytic query like the one below (for Impala 2.0 and above):
SELECT qtr_hour_id, date_id,
 count(*) OVER (PARTITION BY date_id, qtr_hour_id) AS how_many_qtr
FROM f_ntw_actvty_http;
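The query can be run from impala-shell on any node running an Impala daemon; a sketch, where the daemon host and table are taken from the examples above:
$ impala-shell -i dvhddn01.example.com -q "SELECT qtr_hour_id, date_id, count(*) OVER (PARTITION BY date_id, qtr_hour_id) AS how_many_qtr FROM f_ntw_actvty_http;"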
14.4 Check that importing data from Netezza works properly
14.5 Check that exporting data to Netezza works properly
14.6 Run example YARN jobs and check that they succeed (a sample run is shown after the note above)
14.7 Run TeraSort and check the output, for example as sketched below:
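A TeraSort sketch using the same examples jar (the row count and HDFS paths are just examples):
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.2.0.jar teragen 10000000 /tmp/terasort_in
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.2.0.jar terasort /tmp/terasort_in /tmp/terasort_out
$ hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.2.0.jar teravalidate /tmp/terasort_out /tmp/terasort_validate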
14.8 Check that Mahout works properly and can use data stored on HDFS
14.9 Check that RHadoop works properly and can use data stored on HDFS
14.10 Wait for 1 or 2 days and monitor that the daily jobs keep working fine
14.11 Change all the environment variables pointing to the latest CDH


--- ---------------------
15. Rollback to CDH 5.0.2
--- ---------------------
Below is the feedback from Cloudera Support regarding rollback.
We can prepare manual rollback steps while doing a sample upgrade on a test environment.

There is no true "rollback" for CDH.  While it is true that you can deactivate the new parcel and reactivate the old, or remove the new packages and re-install the old, an upgrade does not only constitute a change in the binaries and libraries for CDH.  Some components store metadata in databases, and the upgrade process will usually modify the database schema in these databases -- for example, the Hive Metastore, the Hue database, the Sqoop2 metastore, and the Oozie database.  If an upgrade of CM is involved, the CM database is also upgraded.

As such, there is no "rollback" option.  What Cloudera recommends is that all databases be backed up prior to the upgrade taking place (you will note this warning in various places in the upgrade documentation).  If necessary, a point-in-time restore can be performed, but there is no automated way to do this -- it is a highly manual process.

This is why we recommend thoroughly testing the upgrade process in an environment closely matching your production system.  Then, during the actual production upgrade, take backups of metadata stores as noted in the upgrade documentation, and if an issue does occur during the upgrade, the backups can be used to roll-back and then retry the failed upgrade steps for that particular component.

