We have prepared the action plan below for our implementation:
-- -------------------------------------------------------------------------------------------
-- BUNDLE Patch for Base Bug 10110978, Grid Infrastructure 11.2.0.1 Bundle Patch 4 for Exadata
-- -------------------------------------------------------------------------------------------

-- -------------
-- Prerequisites
-- -------------

1. Install the latest OPatch [already installed for BP8] and dismount all DBFS file systems.
2. Verify the OUI Inventory:
** Log on to each of the 4 DB machines and follow the steps below:
For GI HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/11.2.0/grid
For DB HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/oracle/product/11.2.0/dbhome_1

This should list the components and the list of nodes. [If the command errors out, contact Oracle Support]
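To see at a glance whether the bundle patch is already registered in a given home, one illustrative check (our addition, not part of the readme) is to grep the inventory output for the patch number:

$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -oh /u01/app/11.2.0/grid | grep 10110978
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -oh /u01/app/oracle/product/11.2.0/dbhome_1 | grep 10110978

No output simply means the patch is not yet recorded in that home.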
3. Create a PATCH_TOP location:
** Log on to all 4 DB machines and follow the steps below:
$ mkdir /u01/app/patch/p10110978_112010_LINUX.X64
$ export PATCH_TOP=/u01/app/patch/p10110978_112010_LINUX.X64
$ unzip -d $PATCH_TOP p10110978_112010_LINUX.X64.zip

[** keep the $PATCH_TOP directory empty except for the two patch folders]

4. When you run the "opatch auto" command, it prompts you to enter the OCM (Oracle Configuration Manager) response file path. If the OCM (NOT OCR) response file has already been created in your environment, enter the absolute file path of the OCM response file. Otherwise, you can create the OCM response file with the command:
$ /u01/app/11.2.0/grid/OPatch/ocm/bin/emocmrsp --- created at /home/oracle/BP8/ocm.rsp

As root, run the following [ID 1128528.1]:
$ mkdir /u01/app/11.2.0/grid/crs/admin
$ cd /u01/app/11.2.0/grid/crs/
$ chown oracle:oinstall admin
$ touch /u01/app/11.2.0/grid/install/params.crs
$ cd /u01/app/11.2.0/grid/install/
$ chown oracle:oinstall params.crs
$ chmod 755 params.crs

-- ------------
-- Installation
-- ------------

5. Apply the patch [from GRID_HOME] to the Grid Home and all applicable RDBMS homes.
** As in our case both the DB and GI homes are owned by the same user, we will go for "opatch auto".
** For each DB node, run the commands below as the *root* user:
# export ORACLE_HOME=/u01/app/11.2.0/grid
# export PATH=$PATH:$ORACLE_HOME/OPatch
# which opatch
/u01/app/11.2.0/grid/OPatch/opatch
# export PATCH_TOP=/u01/app/patch/p10110978_112010_LINUX.X64
# cd $PATCH_TOP
# opatch auto

6. On success, you can determine whether the patch has been installed by using the following commands:
** For each of the 4 DB machines, follow the steps below:

For GI HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/11.2.0/grid
For DB HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/oracle/product/11.2.0/dbhome_1

This should list the components and the list of nodes.
-- --------------
-- Deinstallation
-- --------------

7. ** For each DB node, follow all the steps above except "opatch auto" in step 5.
** Instead of "opatch auto" in step 5, use the command below:
$ opatch auto -rollback $PATCH_TOP
According to the readme and companion documents, we have prepared the action plan below. (We need to monitor Oracle MOS Note 888828.1 for the supported Exadata patches.)
** This patch is RAC Rolling Installable in our case, as BP5 9870547 is currently installed.
1. Install the latest version of OPatch and dismount all DBFS file systems:
** Log on to each of the 4 DB machines and follow the steps below:

For GI HOME:
------------
$ echo $ORACLE_HOME
/u01/app/11.2.0/grid
$ mv /u01/app/11.2.0/grid/OPatch /u01/app/11.2.0/grid/OPatch_old
$ unzip p6880880_112000_Linux-x86-64.zip -d /u01/app/11.2.0/grid/
$ cd $ORACLE_HOME/OPatch
$ ./opatch version [to check that the version is 11.2.0.1.4]
For DB HOME:
------------
$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_1
$ mv /u01/app/oracle/product/11.2.0/dbhome_1/OPatch /u01/app/oracle/product/11.2.0/dbhome_1/OPatch_old
$ unzip p6880880_112000_Linux-x86-64.zip -d /u01/app/oracle/product/11.2.0/dbhome_1/
$ cd $ORACLE_HOME/OPatch
$ ./opatch version [to check that the version is 11.2.0.1.4]
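To confirm the OPatch version on all 4 DB nodes in one pass, a sketch like the following can be used; it assumes ssh equivalence for the oracle user and a dbs_group file listing the DB node host names, neither of which is part of the readme:

$ dcli -g dbs_group -l oracle '/u01/app/11.2.0/grid/OPatch/opatch version'
$ dcli -g dbs_group -l oracle '/u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch version'

Each node should report OPatch version 11.2.0.1.4.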
2. Verify the OUI Inventory:
** Log on to each of the 4 DB machines and follow the steps below:
For GI HOME:
------------
$ echo $ORACLE_HOME
/u01/app/11.2.0/grid
$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory
For DB HOME:
------------
$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_1
$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory
[If the command errors out, contact Oracle Support]
4. Determine whether any currently installed one-off patches conflict with the DBM Bundle Patch 8 (10389035):
** For both the RDBMS and GI homes on all 4 nodes, run the check below:
$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $PATCH_TOP/10389035

- Conflict with a patch already applied to the ORACLE_HOME:
  In this case, stop the patch installation and contact Oracle Support Services.
- Conflict with a subset patch already applied to the ORACLE_HOME:
  In this case, continue with the patch installation, because the new patch includes the fixes of the subset patch already applied.
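For clarity, a minimal sketch of running this check explicitly against each home with the paths used in this plan might look as follows (the -oh option only points OPatch at the home being checked):

$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ /u01/app/11.2.0/grid/OPatch/opatch prereq CheckConflictAgainstOHWithDetail \
    -phBaseDir $PATCH_TOP/10389035 -oh /u01/app/11.2.0/grid
$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch prereq CheckConflictAgainstOHWithDetail \
    -phBaseDir $PATCH_TOP/10389035 -oh /u01/app/oracle/product/11.2.0/dbhome_1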
-- ------------
-- Installation
-- ------------

** For all 4 nodes repeat the steps below [steps 5 to 7]:

5. Install the patch on GI_HOME

- As oracle, run:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop instance -d db_unique_name -n node_name
- As root, run:
# /u01/app/11.2.0/grid/bin/crsctl stop crs
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -unlock

- As oracle, run:
$ export ORACLE_HOME=/u01/app/11.2.0/grid
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/11.2.0/grid/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ cd $PATCH_TOP/10389035
$ opatch apply -local

- As root, run:
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -patch
# /u01/app/11.2.0/grid/bin/crsctl start crs
- As oracle, run:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl start instance -d db_unique_name -n node_name
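Before moving on to the DB home, it is worth confirming that the clusterware stack and the database are fully back. A simple check (standard crsctl/srvctl usage, not prescribed by the readme):

# /u01/app/11.2.0/grid/bin/crsctl check crs
$ $ORACLE_HOME/bin/srvctl status database -d db_unique_name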
6. Install the patch on DB_HOME [start a new session to unset the ENV variables from step 5]

- As oracle, run:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s STAT_FILE_LOCATION -n NODE_NAME
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ cd $PATCH_TOP/10389035
$ opatch apply -local
$ $ORACLE_HOME/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s STAT_FILE_LOCATION -n NODE_NAME
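As a quick sanity check after the restart, the inventory can be grepped to confirm 10389035 is now registered in the DB home (an illustrative check, not from the readme):

$ $ORACLE_HOME/OPatch/opatch lsinventory | grep 10389035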
-- ---------------------------------------------
-- Postinstallation only on a ***single instance
-- ---------------------------------------------
8. Navigate to the $ORACLE_HOME/cfgtoollogs/catbundle directory (if $ORACLE_BASE is defined then the logs will be created under $ORACLE_BASE/cfgtoollogs/catbundle), and check the catbundle apply log files for any errors, e.g. with "grep ^ORA | sort -u". If there are errors, refer to "Known Issues". Here, the format of the <TIMESTAMP> in the log file names is YYYYMMMDD_HH_MM_SS.
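A minimal sketch of that check (the catbundle_*.log wildcard is our assumption about the log file names, not taken from the readme):

$ cd $ORACLE_HOME/cfgtoollogs/catbundle
$ grep ^ORA catbundle_*.log | sort -u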
We have just upgraded our Exadata Storage Server software from 11.2.1.3.1 to 11.2.2.2.0 (patch 10356485). Before that we had done the following:
1. Implemented Exadata Database Machine Bundle Patch 8
2. Implemented 10110978 on top of BP8
3. Implemented 11661824 on top of BP8
Normally Exadata software patches need to be applied on both the cell servers and the DB machines. For the supported patches for Exadata machines, please monitor Oracle MOS Doc ID 888828.1.
According to the readme and other related companion documents, we prepared the action plan below:
---------------
PATCHING CELLS:
---------------

- Find the model of the cell or database host using the command:
dmidecode -s system-product-name
DB Machines : SUN FIRE X4170 SERVER
Cell Servers: SUN FIRE X4275 SERVER
PREPARING FOR PATCHING:
-----------------------

- Check for clean file systems on cells at version less than 11.2.2.1.0
Check only one cell at a time (for all 7 cells) when using rolling updates.
- Monitor the cells on their LO web console during this process.
- If you are doing this check in rolling fashion with no deployment downtime, then ensure the grid disks on the cell are all offlined.
- From ASM, execute:
  ALTER DISKGROUP DATA1 OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
  ALTER DISKGROUP DATA2 OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
  ALTER DISKGROUP SYSTEMDG OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
  ALTER DISKGROUP RECO OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
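To confirm that only the targeted cell's disks went offline, a quick check from SQL*Plus on an ASM instance can be used (a sketch, not part of the readme; the failgroup name corresponds to the cell being patched):

SQL> select g.name dg_name, d.failgroup, d.mode_status
       from v$asm_diskgroup g, v$asm_disk d
      where g.group_number = d.group_number
        and d.mode_status <> 'ONLINE';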
- Shut down cell services on the cell to avoid automatic online of grid disks:
cellcli -e 'alter cell shutdown services all'
- Log in as the root user and reboot the cell using the following commands. You can do this on all cells together if you are using a non-rolling update. This forces a file system check at boot.
sync
sync
shutdown -F -r now

- If any cell encounters a corrupt file system journal, it will not complete the reboot. If you encounter such a cell, contact Oracle Support for guidance or refer to bug 9752748 to recover the cell.
- If you are doing this check in rolling fashion with no deployment downtime, then ensure the grid disks are all online before starting this procedure on another cell.
The grid disk status check should show that all grid disks that were online before step 02 are back online.
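One way to check this on the cell itself (standard CellCLI usage; the attribute list here is our choice, not prescribed by the readme):

cellcli -e 'list griddisk attributes name, status, asmmodestatus'

All grid disks should report asmmodestatus as ONLINE before moving to the next cell.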
- Ensure the various cell network configurations are consistent.
- One way to help verify the values are consistent is to run ipconf with the -dry option.
- CAUTION: ipconf -dry does not flag all inconsistencies. Inspection to compare with what the system is really using is needed.
- Any values marked as (custom) during such a run indicate that the cell configuration was manually modified instead of using ipconf.
- All other values should be carefully inspected and compared to the actual values in use to ensure they do not differ from each other.
- Setting up for using patchmgr
- patchmgr must be launched as the root user from a system that has root user ssh equivalence set up to the root user on all the cells to be patched.
  *** We are going to use a DB machine for this.
- To set up root user ssh equivalence from an Exadata DB machine, follow the steps below:
- Prepare a file listing the cells, say cell_group, with one cell host name or IP address per line for each cell to be patched.
- Check for existing root ssh equivalence. The following command should require no password prompts and no interaction. It should return the list of host names in the cell_group file.
dcli -g cell_group -l root 'hostname -i'
- Set up root ssh equivalence from the launch host if not already done.
  NOTE: Do not do this step if you already have root ssh equivalence.
- Generate root ssh keys.
  a. ssh-keygen -t dsa
  b. ssh-keygen -t rsa
  Accept the defaults so the ssh keys are created for the root user.
- Push the ssh keys to set up ssh equivalence. Enter the root password when prompted.
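A minimal sketch of the key push using dcli, assuming the cell_group file prepared above (the -k option pushes the local root ssh key to the cells):

dcli -g cell_group -l root -k

After this, the equivalence check in the earlier bullet should run without any password prompts.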
- Adjust the disk_repair_time for ASM (the value should be 3.6h):
  sqlplus> select dg.name, a.value
           from gv$asm_diskgroup dg, gv$asm_attribute a
           where dg.group_number = a.group_number and a.name = 'disk_repair_time';

---------------
PATCHING CELLS:
---------------
- Use the LO web-based console to monitor the cell during the patch.
- Use a fresh login session for each patch or rollback execution.
- Do not reboot or power cycle cells in the middle of applying the patch. You may be left with an unbootable cell.
- Do not edit any log file or open them in writable mode.
- Monitoring patch activity
- Use 'less -rf patchmgr.stdout' from another terminal session or window to see raw log details from patchmgr.
- In addition, on each cell being patched, you can tail the file /root/_patch_hctap_/_p_/wait_out
- Use tail -200f to monitor the ASM alert logs on all ASM instances to keep a watch on ASM activity as the patch or rollback progresses.
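For example, a typical 11.2 ADR location for the ASM alert log looks like the following (the /u01/app/oracle base path and the +ASM1 instance name are assumptions for illustration; adjust per node):

tail -200f /u01/app/oracle/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log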
- Patching steps using patchmgr
- Do not run more than one instance of patchmgr in the deployment. It can cause a serious irrecoverable breakdown.
- Login to a system that has ssh equivalence set up for root to all cells that are to be patched.
- Unzip the patch; it will extract into the patch_11.2.2.2.0.101206.2 directory.
- Change to the patch_11.2.2.2.0.101206.2 directory.
- [According to DOC ID 1270634.1] Download any workaround helpers attached to MOS note 1270634.1, and replace dostep.sh in the patch_11.2.2.2.0.101206.2 directory.
- Prepare a file listing the cells with one cell host name or ip address per line. Say the file name is cell_group.
- Verify that the cells meet prerequisite checks. Use -rolling option if you plan to use rolling updates.
  ./patchmgr -cells cell_group -patch_check_prereq -rolling
- If the prerequisite checks pass, then start the patch application. Use -rolling option if you plan to use rolling updates.
./patchmgr -cells cell_group -patch -rolling
- Monitor the log files and cells being patched
See the "Monitoring patch activity" section earlier in this file.
- Verify the patch status
A. Check image status and history
Assuming the patching is successful, check the imageinfo output and imagehistory output on each cell. A successful patch application will show output similar to the following.
Active image version: 11.2.2.2.0.101206.2
Active image activated:
Active image status: success
Active system partition on device: < / file system device after successful patch e.g. /dev/md5>
Active software partition on device:
In partition rollback: Impossible
Cell boot usb partition:
Cell boot usb version: 11.2.2.2.0.101206.2
Inactive image version:
Inactive image activated:
Inactive image status: success
Inactive system partition on device:
Inactive software partition on device:
Boot area has rollback archive for the version:
Rollback to the inactive partitions: Possible
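To collect the key imageinfo fields from every cell in one pass, a dcli one-liner such as the following can be used (assuming the cell_group file and root ssh equivalence set up earlier):

dcli -g cell_group -l root "imageinfo | grep -E 'Active image version|Active image status'"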
B. Only if you used -rolling option to patchmgr:
Ensure all grid disks that were active before patching started are active and their asmmodestatus is ONLINE. You will find the list of grid disks inactivated by the patch or rollback in the following file on the respective storage cell: /root/attempted_deactivated_by_patch_griddisks.txt
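A quick way to spot any grid disk that has not come back ONLINE across all cells (a sketch using the same dcli/cellcli tooling as above; the final grep runs locally on the combined output):

dcli -g cell_group -l root "cellcli -e 'list griddisk attributes name, asmmodestatus'" | grep -v ONLINE

Any line that is printed points to a grid disk that is still offline or still syncing.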
- If imagehistory shows failure and the misceachboot validation shows FAILED in /var/log/cellos/validations.log [the most recent run of validations] on the target cell, then do the following:
01. Log in to the cell as the root user.
02. Stop cell services: cellcli -e 'alter cell shutdown services all'
03. Set the LSI card to factory defaults: /opt/MegaRAID/MegaCli/MegaCli64 -AdpFacDefSet -a0
04. Reboot the cell. [shutdown -F -r now]
05. Ensure that there are no failed validations in /var/log/cellos/validations.log for the most recent run of validations; specifically, that there is no misceachboot validation failure.
06. /opt/oracle.cellos/imagestatus -set success
07. Verify that imageinfo now shows success.
- Clean up
Use -cleanup to clean up all the temporary patch or rollback files on the cells. It will free up close to 900 MB disk space. Also use this before retrying an aborted or failed run of patchmgr.
./patchmgr -cells cell_group -cleanup
- Optionally remove the root ssh equivalence:
This is an optional step.
dcli -g cell_group -l root --unkey
- Rolling back successfully patched cells using patchmgr
- Do not run more than one instance of patchmgr at a time in the deployment. It can cause a serious irrecoverable breakdown.
- Check the prerequisites: ./patchmgr -cells cell_group -rollback_check_prereq -rolling
- Execute the rollback:
./patchmgr -cells cell_group -rollback -rolling
---------------------
PATCHING DB MACHINES:
---------------------

- The database host convenience pack is distributed as db_patch_11.2.2.2.0.101206.2.zip inside the cell patch.
- [According to DOC ID 1270634.1] Before copying the db_patch_11.2.2.2.0.101206.2.zip file from the cell patch, do the following:
- Completely "shut down Oracle" components on the database host and disable clusterware (CRS). For example, /u01/app/11.2.0/grid/bin/crsctl stop crs -f /u01/app/11.2.0/grid/bin/crsctl disable crs - Do a clean reboot of the database host: As root user execute # reboot - Shut down oswatcher: As root user: a. cd /opt/oracle.oswatcher/osw b. ./stopOSW.sh - Turn on the controller cache /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -Lall -a0 - Add the following lines before the last exit 0 line in the file /opt/oracle.cellos/validations/init.d/misceachboot: /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -Lall -a0 /opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -Lall -a0
- Apply the LSI disk controller firmware manually if it is not already at 12.9.0 [currently 12.0.1-0081].
  As the root user:
  a. Unzip the database convenience pack and extract the disk controller firmware:
     unzip db_patch_11.2.2.2.0.101206.2.zip
     mkdir /tmp/tmpfw
     tar -pjxvf db_patch_11.2.2.2.0.101206.2/11.2.2.2.0.101206.2.tbz -C /tmp/tmpfw \
         opt/oracle.cellos/iso/cellbits/dbfw.tbz
     tar -pjxvf /tmp/tmpfw/opt/oracle.cellos/iso/cellbits/dbfw.tbz -C /tmp/tmpfw \
         ActualFirmwareFiles/12.9.0.0049_26Oct2010_2108_FW_Image.rom
  b. Flush file system data to disks:
     sync
     /opt/MegaRAID/MegaCli/MegaCli64 -AdpCacheFlush -a0
  c. Execute the firmware update:
     /opt/MegaRAID/MegaCli/MegaCli64 -AdpFwFlash -f \
         /tmp/tmpfw/ActualFirmwareFiles/12.9.0.0049_26Oct2010_2108_FW_Image.rom -a0
  e. Flush file system data to disks:
     sync
     /opt/MegaRAID/MegaCli/MegaCli64 -AdpCacheFlush -a0
  f. Reboot the system:
     reboot

- Now proceed to applying the rest of the database host convenience pack.
PREPARING FOR DATABASE HOST PATCHING:
-------------------------------------
* Run all steps as root user.
* Obtain Lights Out (LO) and serial console access for the database hosts to be patched. It is useful if an issue needs fixing during the patch or if some firmware update does not go through properly.
* For Exadata V1 (HP hardware) the serial console can be accessed by telnet or ssh to the LO host name or IP address as the administrative user, such as admin. To start the serial console:
VSP
To stop it, press the escape key (ESC) followed by (:
ESC ( stop
* For Exadata Oracle-Sun hardware, ssh to the ILOM host name or IP address as the root user. To start the serial console:
start /SP/console

To stop it, press the escape key (ESC) followed by (:
ESC ( stop /SP/console

* Check the output of /usr/local/bin/imagehistory.
The DB host must be at release 11.2.1.2.x, 11.2.1.3.1, or 11.2.2.1.x for Exadata V2.
The DB host must be at release 11.2.1.2.x for Exadata machine V1.
* Unmount all NFS and DBFS file systems
* All Oracle software must be shut down before patching.
/u01/app/11.2.0/grid/bin/crsctl stop crs -f
Verify no cluster, ASM or Database processes are running:
ps -ef | grep -i 'grid\|ASM\|ora'
Shut down oswatcher:
cd /opt/oracle.oswatcher/osw
./stopOSW.sh
Oracle recommends that you do not interrupt the procedure with control-C.
-------------------------------------------------------------------
PATCHING DATABASE HOSTS:
------------------------
00. Repeat steps 01 onwards for EACH database host. If you are taking deployment-wide downtime for the patching, then these steps may be performed in parallel on all database hosts.

01. Update the resource limits for the database and the grid users:
NOTE: This step does NOT apply if you have customized the values for your specific deployment and database requirements.
WARNING: Do NOT execute this step if you have specific customized values in use for your deployment.
01.a Calculate 75% of the physical memory on the machine.
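A minimal sketch of that calculation (our assumption about how to derive the memlock value in KB for /etc/security/limits.conf; /proc/meminfo reports MemTotal in kB):

awk '/MemTotal/ {printf "%d\n", $2 * 0.75}' /proc/meminfo

The resulting number is used as the soft/hard memlock value in step 01.b.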
01.b Edit the /etc/security/limits.conf file to update or add the following limits for the database owner (orauser) and the grid infrastructure user (griduser). Your deployment may use the same operating system user for both, and it may be named oracle. Adjust the following appropriately.
########### BEGIN DO NOT REMOVE Added by Oracle ###########
orauser soft core unlimited
orauser hard core unlimited
orauser soft nproc 131072
orauser hard nproc 131072
orauser soft nofile 131072
orauser hard nofile 131072
orauser soft memlock <value from step 01.a>
orauser hard memlock <value from step 01.a>
griduser soft core unlimited
griduser hard core unlimited
griduser soft nproc 131072
griduser hard nproc 131072
griduser soft nofile 131072
griduser hard nofile 131072
griduser soft memlock <value from step 01.a>
griduser hard memlock <value from step 01.a>
########### END DO NOT REMOVE Added by Oracle ###########
02. Log in as the root user to the database host and copy over the db_11.2.2.2.0.101206.2_patch.zip file from the unzipped cell patch.

03. Unzip the db_11.2.2.2.0.101206.2_patch.zip file. It will create the directory:
db_patch_11.2.2.2.0.101206.2
04. Change to the db_patch_11.2.2.2.0.101206.2 directory.
05. Run ./install.sh [this should be run locally from the host which is being patched]
It will return to the prompt immediately after submitting the patch in the background. Returning to the prompt does NOT mean the patch is complete.
NOTE: install.sh submits the patch process in the background to prevent interruption of the patch in case the login session is terminated by a network connection break. The database host will reboot as part of the patch process after a while.
NOTE: The database host will reboot as part of this update. You will lose your connections, including the ssh connection, and the database host may appear to hang until the ssh connection eventually times out. If the LO gets updated, then the same connection loss or freeze will be experienced. Retry the connection after 10 minutes.
06. Verify the patch status:
After the system is rebooted and up:

06.1. /usr/local/bin/imageinfo should show the Image version as 11.2.2.2.0. For example,
Image version: 11.2.2.2.0.101206.2
Image activated: 2010-12-06 11:35:59 -0700
Image status: success

06.2 Verify the ofa rpm version
Find the kernel value: uname -r
Find the ofa rpm version: rpm -qa | grep ofa

06.2.a The ofa version should match the following table:
kernel                  ofa
------                  ---
2.6.18-194.3.1.0.3      1.5.1-4.0.28
2.6.18-194.3.1.0.2      None, the ofa is part of the kernel
2.6.18-128.1.16.0.1     1.4.2-14
2.6.18-53.1.21.2.1      1.4.2-14

06.3 Only for database host models X4170, X4170M2:
Verify the LSI disk controller firmware is now 12.9.0-0049:
/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aAll | grep 'FW Package Build'
06.4. Only for the model X4170 database hosts:
06.4.a Verify the version of ILOM firmware:
ipmitool sunoem version
Version: 3.0.9.19.a r55943
06.4.b Verify the thermal profile to ensure appropriate fan speeds.
ipmitool sunoem cli "show /SP/policy"
Should display: FLASH_ACCELERATOR_CARD_INSTALLED=enabled
Post Installation
-----------------

- Now proceed to applying the rest of the database host convenience pack.
- After the database host reboots at the end of the convenience pack, enable the clusterware and bring up the rest of the Oracle stack:
  /u01/app/11.2.0/grid/bin/crsctl enable crs

- Verify that the controller cache is on by running:
/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -a0
The output for each Logical Drive should show following lines:
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU

- Reboot and mount the unmounted ASM disk groups [alter diskgroup ... mount;]
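For example, assuming the same disk group names used earlier in this plan (DATA1, DATA2, SYSTEMDG, RECO), the mount commands from SQL*Plus on an ASM instance might look like this (a sketch; adjust the names as needed):

SQL> alter diskgroup DATA1 mount;
SQL> alter diskgroup DATA2 mount;
SQL> alter diskgroup SYSTEMDG mount;
SQL> alter diskgroup RECO mount;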