Tuesday, March 29, 2011

Implementing DB patch

Below are the steps for implementation of a DB patch for our case:


-- ------------------------------------------------------------------------
-- Database Machine (DBM) Overlay Patch for Bug 11661824 for Oracle Exadata
---------------------------------------------------------------------------
-- -------------
-- Prerequisites
-- -------------

1. BP8 (patch 10389035) is already applied, and all DBFS file systems are dismounted (a minimal dismount sketch follows this item).
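
A minimal sketch of the DBFS dismount (the mount point /dbfs_direct is only an example; ours may differ), run as the OS user that mounted DBFS:

$ mount | grep -i dbfs          [confirm the DBFS mount point]
$ fusermount -u /dbfs_direct    [unmount the FUSE-based DBFS file system]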

2. The latest OPatch is installed [already in place from BP8].

3. Verify the OUI Inventory:

** Log on to each of the 4 DB machines and follow the steps below:

For GI HOME:
------------
$ echo $ORACLE_HOME
/u01/app/11.2.0/grid

$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory

For DB HOME:
------------
$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_1

$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory

[If the command errors out, contact Oracle Support]

4. Create a location PATCH_TOP

** Log on to all 4 DB machines and follow the steps below:

$ mkdir /u01/app/patch/p11661824_112010_Linux-x86-64
$ export PATCH_TOP=/u01/app/patch/p11661824_112010_Linux-x86-64
$ unzip -d $PATCH_TOP p11661824_112010_Linux-x86-64.zip

-- ------------
-- Installation
-- ------------

** Repeat the steps below [steps 5 to 7] on all 4 nodes:

5. Install Patch on GI_HOME
- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop instance -d db_name -n node_name

- As root run below:
# /u01/app/11.2.0/grid/bin/crsctl stop crs
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -unlock

- As oracle run below:
$ export ORACLE_HOME=/u01/app/11.2.0/grid
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/11.2.0/grid/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p11661824_112010_Linux-x86-64
$ cd $PATCH_TOP/11661824
$ opatch apply -local

- As root run below:
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -patch
# /u01/app/11.2.0/grid/bin/crsctl check crs

- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl start instance -d db_name -n node_name

6. Install Patch on DB_HOME [start a new session so the environment variables set in step 5 are cleared]
- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s stat_file_location -n node_name
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p11661824_112010_Linux-x86-64
$ cd $PATCH_TOP/11661824
$ opatch apply -local
$ $ORACLE_HOME/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s stat_file_location -n node_name
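
A quick optional check before moving on to the next node is to confirm the overlay patch now shows up in both homes (the grep string is simply the patch number):

$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -oh /u01/app/11.2.0/grid | grep 11661824
$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch lsinventory -oh /u01/app/oracle/product/11.2.0/dbhome_1 | grep 11661824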

-- --------------
-- Deinstallation
-- --------------

7. ** For all 4 nodes, repeat steps 5 to 6, except use "opatch rollback -id 11661824 -local" instead of "opatch apply -local".

Implement Grid Infrastructure Patch (10110978)

We have prepared the following action plan for our implementation:

-- -------------------------------------------------------------------------------------------
-- BUNDLE Patch for Base Bug 10110978, Grid Infrastructure 11.2.0.1 Bundle Patch 4 for Exadata
-- -------------------------------------------------------------------------------------------
-- -------------
-- Prerequisites
-- -------------
1. The latest OPatch is installed [already in place from BP8], and all DBFS file systems are dismounted.

2. Verify the OUI Inventory:

** Log on to each of the 4 DB machines and follow the steps below:

For GI HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/11.2.0/grid

For DB HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/oracle/product/11.2.0/dbhome_1

This should list the components and the list of nodes.
[If the command errors out, contact Oracle Support]

3. Create a location PATCH_TOP

** Log on to all 4 DB machines and follow the steps below:

$ mkdir /u01/app/patch/p10110978_112010_LINUX.X64
$ export PATCH_TOP=/u01/app/patch/p10110978_112010_LINUX.X64
$ unzip -d $PATCH_TOP p10110978_112010_LINUX.X64.zip
[** ensure the $PATCH_TOP directory contains nothing other than the two patch folders]
4. When you run the "opatch auto" command, it prompts for the path of an OCM (Oracle Configuration Manager) response file.
If an OCM (NOT OCR) response file already exists in your environment, enter its absolute path.
Otherwise, you can create the OCM response file with the command:

$ /u01/app/11.2.0/grid/OPatch/ocm/bin/emocmrsp
--- created at /home/oracle/BP8/ocm.rsp
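
If the interactive prompts are inconvenient, the response file can usually be generated non-interactively as well; a sketch, assuming the -no_banner and -output options are available in the bundled emocmrsp:

$ /u01/app/11.2.0/grid/OPatch/ocm/bin/emocmrsp -no_banner -output /home/oracle/BP8/ocm.rsp
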
As root, run the following [ID 1128528.1]:
# mkdir /u01/app/11.2.0/grid/crs/admin
# cd /u01/app/11.2.0/grid/crs/
# chown oracle:oinstall admin
# touch /u01/app/11.2.0/grid/install/params.crs
# cd /u01/app/11.2.0/grid/install/
# chown oracle:oinstall params.crs
# chmod 755 params.crs
-- ------------
-- Installation
-- ------------
5. Apply the patch [from GRID_HOME] to the Grid home and all applicable RDBMS homes.
** As both the DB and GI homes are owned by the same user in our case, we will use "opatch auto".
** For each DB node, run the commands below as the *root* user:

# export ORACLE_HOME=/u01/app/11.2.0/grid
# export PATH=$PATH:$ORACLE_HOME/OPatch
# which opatch
/u01/app/11.2.0/grid/OPatch/opatch
# export PATCH_TOP=/u01/app/patch/p10110978_112010_LINUX.X64
# cd $PATCH_TOP
# opatch auto

6. On success, you can determine whether the patch has been installed by using the following commands:
** On each of the 4 DB machines, follow the steps below:
For GI HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/11.2.0/grid

For DB HOME:
------------
$ /u01/app/11.2.0/grid/OPatch/opatch lsinventory -detail -oh /u01/app/oracle/product/11.2.0/dbhome_1

This should list the components and the list of nodes.

-- --------------
-- Deinstallation
-- --------------
7. ** On each DB node, follow all the steps above, except for "opatch auto" in step 5.
** Instead of "opatch auto" in step 5, use the command below:
# opatch auto -rollback $PATCH_TOP

8. Check deinstallation using step 6.


Exadata: Implementing Database Machine Bundle Patch (8)

According to the readme and companion documents, we have prepared the action plan below:
(We need to monitor Oracle MOS Note 888828.1 for Exadata supported patch)

** This patch is RAC rolling installable in our case, as BP5 (9870547) is currently installed.

-- -------------
-- Prerequisites
-- -------------

1. Install the latest version of OPatch and dismount all DBFS file systems:

** Log on to each of the 4 DB machines and follow the steps below:

For GI HOME:
------------
$ echo $ORACLE_HOME
/u01/app/11.2.0/grid

$ mv /u01/app/11.2.0/grid/OPatch /u01/app/11.2.0/grid/OPatch_old
$ unzip p6880880_112000_Linux-x86-64.zip -d /u01/app/11.2.0/grid/
$ cd $ORACLE_HOME/OPatch
$ ./opatch version [to check that the version is 11.2.0.1.4]

For DB HOME:
------------
$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_1

$ mv /u01/app/oracle/product/11.2.0/dbhome_1/OPatch /u01/app/oracle/product/11.2.0/dbhome_1/OPatch_old
$ unzip p6880880_112000_Linux-x86-64.zip -d /u01/app/oracle/product/11.2.0/dbhome_1/
$ cd $ORACLE_HOME/OPatch
$ ./opatch version [to check that the version is 11.2.0.1.4]


2. Verify the OUI Inventory:

** Log on to each of the 4 DB machines and follow the steps below:

For GI HOME:
------------
$ echo $ORACLE_HOME
/u01/app/11.2.0/grid

$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory

For DB HOME:
------------
$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_1

$ cd $ORACLE_HOME/OPatch
$ ./opatch lsinventory

[If the command errors out, contact Oracle Support]

3. Create a location PATCH_TOP

** Log on to all 4 DB machines and follow the steps below:

$ mkdir /u01/app/patch/p10389035_112010_Linux-x86-64
$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ unzip -d $PATCH_TOP p10389035_112010_Linux-x86-64.zip

4. Determine whether any currently installed one-off patches conflict with DBM bundle patch 8 (10389035):
** For both the RDBMS and GI homes on all 4 nodes, check the following:

$ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $PATCH_TOP/10389035

- Conflict with a patch already applied to the ORACLE_HOME: stop the patch installation and contact Oracle Support Services.
- Conflict with a subset patch already applied to the ORACLE_HOME: continue with the patch installation, because the new patch contains the fixes of the subset patch already applied.

-- ------------
-- Installation
-- ------------
** Repeat the steps below [steps 5 to 7] on all 4 nodes:

5. Install Patch on GI_HOME
- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop instance -d db_unique_name -n node_name

- As root run below:
# /u01/app/11.2.0/grid/bin/crsctl stop crs
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -unlock

- As oracle run below:
$ export ORACLE_HOME=/u01/app/11.2.0/grid
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/11.2.0/grid/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ cd $PATCH_TOP/10389035
$ opatch apply -local

- As root run below:
# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -patch
# /u01/app/11.2.0/grid/bin/crsctl start crs

- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl start instance -d db_unique_name -n node_name

6. Install Patch on DB_HOME [start a new session so the environment variables set in step 5 are cleared]
- As oracle run below:
$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
$ $ORACLE_HOME/bin/srvctl stop home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s STAT_FILE_LOCATION -n NODE_NAME
$ export PATH=$PATH:$ORACLE_HOME/OPatch
$ which opatch
/u01/app/oracle/product/11.2.0/dbhome_1/OPatch/opatch
$ export PATCH_TOP=/u01/app/patch/p10389035_112010_Linux-x86-64
$ cd $PATCH_TOP/10389035
$ opatch apply -local
$ $ORACLE_HOME/bin/srvctl start home -o /u01/app/oracle/product/11.2.0/dbhome_1 -s STAT_FILE_LOCATION -n NODE_NAME

-- ---------------------------------------------
-- Postinstallation only on a ***single instance
-- --------------------------------------------

7. Reload the packages into the database:

$ sqlplus /nolog
SQL> connect / as sysdba
SQL> @?/rdbms/admin/catbundle.sql exa apply
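
As an extra sanity check (not part of the readme steps), the bundle should also show up in the database registry history; a minimal query:

SQL> select action_time, action, version, id, bundle_series, comments
     from dba_registry_history
     order by action_time;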

8. Navigate to the $ORACLE_HOME/cfgtoollogs/catbundle directory
(if $ORACLE_BASE is defined, the logs are created under $ORACLE_BASE/cfgtoollogs/catbundle)
and check the following log files for any errors, e.g. with "grep ^ORA | sort -u".
If there are errors, refer to "Known Issues". Here, the format of the <TIMESTAMP> is YYYYMMMDD_HH_MM_SS.

catbundle_EXA_<database SID>_APPLY_<TIMESTAMP>.log
catbundle_EXA_<database SID>_GENERATE_<TIMESTAMP>.log
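
For example, a quick scan of the apply logs could look like this (assuming $ORACLE_BASE is defined, as it is in our case):

$ cd $ORACLE_BASE/cfgtoollogs/catbundle
$ grep '^ORA-' catbundle_EXA_*_APPLY_*.log | sort -u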

-- --------------
-- Deinstallation
-- --------------

9. ** For all 4 nodes, repeat steps 5 to 7, except use "opatch rollback -id 10389035 -local" instead of "opatch apply -local".

Exadata Storage Software Version upgrade from 11.2.1.3.1 to 11.2.2.2.0

We have just upgraded our Exadata Storage Software version from 11.2.1.3.1 to 11.2.2.2.0 (patch 10356485).
Before that we did the following:
1. Implemented Exadata Database Machine Bundle Patch 8
2. Implemented 10110978 on top of BP8
3. Implemented 11661824 on top of BP8

Normally, Exadata software patches need to be applied on both the cell servers and the DB machines.
For the supported patches for Exadata machines, please monitor Oracle MOS Doc ID 888828.1.

According to the readme and other related companion documents, we prepared the action plan below:

---------------
PATCHING CELLS:
---------------

- Find the model of the cell or database host using the command:

dmidecode -s system-product-name

DB Machines : SUN FIRE X4170 SERVER
Cell Servers: SUN FIRE X4275 SERVER


PREPARING FOR PATCHING:
----------------------

- Check for clean file systems on cells at a version less than 11.2.2.1.0.

Check only one cell at a time (across all 7 cells) when using rolling updates.

- Monitor the cells on their LO web console during this process.

- If you are doing this check in rolling fashion with no deployment downtime, then
ensure the grid disks on the cell are all offlined.

- From ASM, execute:

ALTER DISKGROUP DATA1 OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
ALTER DISKGROUP DATA2 OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
ALTER DISKGROUP SYSTEMDG OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
ALTER DISKGROUP RECO OFFLINE DISKS IN FAILGROUP [FAIL_GROUP/CELL NAME];
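
Before shutting down the cell services, it can help to confirm from the ASM instance that the disks for this cell really are offline; a minimal check (the failgroup name EXACEL01 is only a placeholder for the failgroup/cell name used above):

select name, mode_status
from v$asm_disk
where failgroup = 'EXACEL01';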

- Shut down cell services on the cell to avoid automatic online of grid disks:

cellcli -e 'alter cell shutdown services all'

- Log in as the root user and reboot the cell using the following commands.
You can do this on all cells together if you are using a non-rolling
update. This forces a file system check at boot.

sync
sync
shutdown -F -r now

- If any cell encounters a corrupt file system journal, it will not
complete the reboot. If you encounter such a cell, contact Oracle
Support for guidance or refer to bug 9752748 to recover the cell.

- If you are doing this check in rolling fashion with no deployment downtime, then
ensure the grid disks are all online before starting this procedure on another cell.

cellcli -e 'list griddisk attributes name,asmmodestatus'

This should show that all the grid disks that were online before being offlined are back online.

- Ensure the various cell network configurations are consistent.


- One way to help verify the values are consistent is to run ipconf with the -dry option.

- CAUTION: ipconf -dry does not flag all inconsistencies; you still need to compare its output with what the system is actually using.

- Any values marked as (custom) during such a run indicate that the cell configuration was manually modified instead of using ipconf.

- All other values should be carefully inspected and compared to the actual values in use to ensure they do not differ from each other.

- Setting up for using patchmgr

- patchmgr must be launched as the root user from a system that has root user
ssh equivalence set up to the root user on all the cells to be patched.
*** We are going to use a DB machine for this.

- To set up root user ssh equivalence from an Exadata DB machine, follow the steps below:

- Prepare a file listing the cells, say cell_group, with one cell host name or IP address per line for each cell to be patched.

- Check for existing root ssh equivalence. The following command should
require no password prompts and no interaction. It should return
the list of hostnames in the cell_group file.

dcli -g cell_group -l root 'hostname -i'

- Set up ssh root equivalence if not already done from the launch host
NOTE: Do not do this step if you already have root ssh equivalence.

- Generate root ssh keys.
a. ssh-keygen -t dsa
b. ssh-keygen -t rsa

Accept the defaults so the ssh keys are created for the root user.

- Push the ssh keys to set up ssh equivalence. Enter the root password
when prompted:

dcli -g cell_group -l root -k

- Check prerequisites


* If you are using rolling patch, then

./patchmgr -cells cell_group -patch_check_prereq -rolling

* If you are using rolling rollback, then

./patchmgr -cells cell_group -rollback_check_prereq -rolling


- For rolling patching

- Ensure the database is up and running

- Check and, if needed, adjust the disk_repair_time attribute for ASM (the value should be 3.6h):

sqlplus> select dg.name, a.value
         from gv$asm_diskgroup dg, gv$asm_attribute a
         where dg.group_number = a.group_number and a.name = 'disk_repair_time';
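
If the attribute needs to be changed, it can be set per disk group; a minimal sketch (repeat for each disk group, DATA1 is just one of ours):

sqlplus> alter diskgroup DATA1 set attribute 'disk_repair_time' = '3.6h';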


PATCHING CELLS:
---------------

- Use the LO web based console to monitor the cell during patch

- Use a fresh login session for each patch or rollback execution

- Do not reboot or power cycle cells in the middle of applying the patch.
You may be left with an unbootable cell.

- Do not edit any log file or open them in writable mode.

- Monitoring patch activity

- Use 'less -rf patchmgr.stdout' from another terminal session or window
to see raw log details from patchmgr.

- In addition on each cell being patched, you can tail the file,

/root/_patch_hctap_/_p_/wait_out

- Use tail -200f to monitor the ASM alert logs on
all ASM instances to keep a watch on ASM activity as the patch or rollback
progresses.

- Patching steps using patchmgr

- Do not run more than one instance of patchmgr in the deployment. It
can cause a serious irrecoverable breakdown.

- Login to a system that has ssh equivalence set up for root to all cells that are to be patched.

- Unzip the patch; it will extract into the patch_11.2.2.2.0.101206.2 directory.

- Change to the patch_11.2.2.2.0.101206.2 directory.

- [According to Doc ID 1270634.1] Download any workaround helpers attached to MOS note 1270634.1, and replace dostep.sh in the patch_11.2.2.2.0.101206.2 directory.

- Prepare a file listing the cells with one cell host name or ip address per
line. Say the file name is cell_group.

- Verify that the cells meet prerequisite checks. Use -rolling option if you plan to use rolling updates.

./patchmgr -cells cell_group -patch_check_prereq -rolling

- If the prerequisite checks pass, then start the patch application. Use -rolling option if you plan to use rolling updates.

./patchmgr -cells cell_group -patch -rolling

- Monitor the log files and cells being patched

See the "Monitoring patch activity" section earlier in this file.

- Verify the patch status

A. Check image status and history

Assuming the patching is successful, check the imageinfo output and
imagehistory output on each cell. A successful patch application will
show output similar to the following.

Kernel version: 2.6.18-194.3.1.0.3.el5 #1 SMP Tue Aug 31 22:41:13 EDT 2010 x86_64
Cell version: OSS_MAIN_LINUX.X64_101206.2
Cell rpm version: cell-11.2.2.2.0_LINUX.X64_101206.2-1

Active image version: 11.2.2.2.0.101206.2
Active image activated: <activation time stamp>
Active image status: success
Active system partition on device: </ file system device after successful patch, e.g. /dev/md5>
Active software partition on device: <software partition device>

In partition rollback: Impossible

Cell boot usb partition: <USB boot partition device>
Cell boot usb version: 11.2.2.2.0.101206.2

Inactive image version: <previous image version>
Inactive image activated: <previous activation time stamp>
Inactive image status: success
Inactive system partition on device: <previous system partition device>
Inactive software partition on device: <previous software partition device>

Boot area has rollback archive for the version: <previous image version>
Rollback to the inactive partitions: Possible

B. Only if you used -rolling option to patchmgr:

Ensure all grid disks that were active before patching started are
active and their asmmodestatus is ONLINE. You will find the list of
grid disks inactivated by the patch or rollback in the following file
on the respective storage cell

/root/attempted_deactivated_by_patch_griddisks.txt

cellcli -e 'list griddisk attributes name,status,asmmodestatus'
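
To run the same check on every cell from the launch host in one pass, dcli can wrap the cellcli call (reusing the cell_group file prepared earlier):

dcli -g cell_group -l root "cellcli -e list griddisk attributes name,status,asmmodestatus"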

- If imagehistory shows failure and misceachboot shows FAILED in the file "/var/log/cellos/validations.log" [the most recent run of validations]
on the target cell, then do the following:

01. Log in to the cell as root user
02. Stop cell services: cellcli -e 'alter cell shutdown services all'
03. Set the LSI card to factory defaults: /opt/MegaRAID/MegaCli/MegaCli64 -AdpFacDefSet -a0
04. Reboot the cell.[shutdown -F -r now]
05. Ensure that there are no failed validations in /var/log/cellos/validations.log for
the most recent run of validations. Specifically there is no misceachboot validation failure.
06. /opt/oracle.cellos/imagestatus -set success
07. Verify imageinfo now shows success


- Clean up

Use -cleanup to clean up all the temporary patch or rollback files on the
cells. It will free up close to 900 MB disk space.

Also use this before retrying an aborted or failed run of patchmgr.

./patchmgr -cells cell_group -cleanup

- Optionally remove the root ssh equivalence:

This is an optional step.

dcli -g cell_group -l root --unkey


- Rolling back successfully patched cells using patchmgr

- Do not run more than one instance of patchmgr at a time in the deployment. It
can cause a serious irrecoverable breakdown.

- Check the prerequisites:

./patchmgr -cells cell_group -rollback_check_prereq -rolling

- Execute the rollback:

./patchmgr -cells cell_group -rollback -rolling


---------------------
PATCHING DB MACHINES:
---------------------

- The database host convenience pack is distributed as db_patch_11.2.2.2.0.101206.2.zip inside the cell patch.

- [according to DOC ID 1270634.1] Before copying the db_patch_11.2.2.2.0.101206.2.zip file from cell patch, do below:

- Completely "shut down Oracle" components on the database host and disable clusterware (CRS). For example,
/u01/app/11.2.0/grid/bin/crsctl stop crs -f
/u01/app/11.2.0/grid/bin/crsctl disable crs
- Do a clean reboot of the database host: As root user execute
# reboot

- Shut down oswatcher:
As root user:
a. cd /opt/oracle.oswatcher/osw
b. ./stopOSW.sh
- Turn on the controller cache
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -Lall -a0

- Add the following lines before the last exit 0 line in the file /opt/oracle.cellos/validations/init.d/misceachboot:
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp WB -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NoCachedBadBBU -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp NORA -Lall -a0
/opt/MegaRAID/MegaCli/MegaCli64 -LDSetProp Direct -Lall -a0

- Apply the LSI disk controller firmware manually if it is not already at 12.9.0 [Currently 12.0.1-0081]
As root user:
a. Unzip the database convenience pack and extract the disk controller firmware:
unzip db_patch_11.2.2.2.0.101206.2.zip
mkdir /tmp/tmpfw
tar -pjxvf db_patch_11.2.2.2.0.101206.2/11.2.2.2.0.101206.2.tbz -C /tmp/tmpfw \
opt/oracle.cellos/iso/cellbits/dbfw.tbz
tar -pjxvf /tmp/tmpfw/opt/oracle.cellos/iso/cellbits/dbfw.tbz -C /tmp/tmpfw \
ActualFirmwareFiles/12.9.0.0049_26Oct2010_2108_FW_Image.rom
b. Flush file system data to disks:
sync
/opt/MegaRAID/MegaCli/MegaCli64 -AdpCacheFlush -a0
c. Execute the firmware update:
/opt/MegaRAID/MegaCli/MegaCli64 -AdpFwFlash -f \
/tmp/tmpfw/ActualFirmwareFiles/12.9.0.0049_26Oct2010_2108_FW_Image.rom -a0
d. Flush file system data to disks:
sync
/opt/MegaRAID/MegaCli/MegaCli64 -AdpCacheFlush -a0
e. Reboot the system: reboot

- Now proceed to applying the rest of the database host convenience pack.


PREPARING FOR DATABASE HOST PATCHING:
-------------------------------------

* Run all steps as root user.

* Obtain Lights Out (LO) and serial console access for the database hosts to
be patched. It is useful if an issue needs fixing during the patch or if
some firmware update does not go through properly.

* For Exadata V1 (HP hardware) the serial console can be accessed by telnet
or ssh to the LO host name or IP address as an administrative user
such as admin. To start the serial console:

VSP

To stop it press escape key (ESC) followed by (

ESC (
stop

* For Exadata Oracle-Sun hardware ssh to the ILOM host name or IP
address as root user. To start the serial console,

start /SP/console

To stop it press escape key (ESC) followed by (

ESC (
stop /SP/console

* Check output of /usr/local/bin/imagehistory.

DB host must be at release 11.2.1.2.x or 11.2.1.3.1 or 11.2.2.1.x for Exadata V2.
DB host must be at release 11.2.1.2.x for Exadata machine V1.

* Unmount all NFS and DBFS file systems

* All Oracle software must be shut down before patching.

/u01/app/11.2.0/grid/bin/crsctl stop crs -f

Verify no cluster, ASM or Database processes are running:

ps -ef | grep -i 'grid\|ASM\|ora'

Shut down oswatcher:

cd /opt/oracle.oswatcher/osw
./stopOSW.sh

Oracle recommends that you do not interrupt the procedure with control-C.
-------------------------------------------------------------------

PATCHING DATABASE HOSTS:
-----------------------

00. Repeat steps 01 onwards for EACH database host. If you are taking
deployment-wide downtime for the patching, then these steps
may be performed in parallel on all database hosts.

01. Update the resource limits for the database and the grid users:

NOTE: This step does NOT apply if you have customized the values
for your specific deployment and database requirements.

WARNING: Do NOT execute this step, if you do have specific
customized values in use for your deployment.

01.a Calculate 75% of the physical memory on the machine.

x=$(( $(cat /proc/meminfo | grep 'MemTotal:' | awk '{print $2}') * 3 / 4 ))
echo $x

01.b Edit the /etc/security/limits.conf file to update or add the following
limits for the database owner (orauser) and the grid infrastructure
user (griduser). Your deployment may use the same operating system
user for both, and it may simply be named oracle. Adjust the following
appropriately.

########### BEGIN DO NOT REMOVE Added by Oracle ###########

orauser soft core unlimited
orauser hard core unlimited
orauser soft nproc 131072
orauser hard nproc 131072
orauser soft nofile 131072
orauser hard nofile 131072
orauser soft memlock <value from step 01.a>
orauser hard memlock <value from step 01.a>

griduser soft core unlimited
griduser hard core unlimited
griduser soft nproc 131072
griduser hard nproc 131072
griduser soft nofile 131072
griduser hard nofile 131072
griduser soft memlock <value from step 01.a>
griduser hard memlock <value from step 01.a>

########### END DO NOT REMOVE Added by Oracle ###########
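
For illustration only (always use the number produced in step 01.a on each host): a host with 96 GB of physical memory has 96 * 1024 * 1024 = 100663296 KB, and 75% of that is 75497472, so the memlock lines would read:

orauser soft memlock 75497472
orauser hard memlock 75497472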

02. Log in as the root user to the database host and copy over the
db_patch_11.2.2.2.0.101206.2.zip file from the unzipped cell patch.

03. Unzip the db_patch_11.2.2.2.0.101206.2.zip file. It will create the directory:

db_patch_11.2.2.2.0.101206.2

04. Change to the db_patch_11.2.2.2.0.101206.2 directory.

05. Run ./install.sh [this should be run locally from the host which is being patched]

It will return to the prompt immediately after submitting the patch
in the background. Returning to the prompt does NOT mean the patch is
complete.

NOTE: The install.sh will submit the patch process in the background
to prevent interruption of the patch in case the login session
gets terminated due to network connection break. The database host
will reboot as part of the patch process after a while.

NOTE: The database host will reboot as part of this update.
You will lose your connections, including the ssh connection,
and the database host may appear to hang while the ssh connection
eventually times out. If the LO gets updated, then the same
connection loss or freeze will be experienced.
Retry the connection after 10 minutes.

06. Verify the patch status:

After the system is rebooted and up,

06.1. /usr/local/bin/imageinfo should show Image version as 11.2.2.2.0.
For example,

Image version: 11.2.2.2.0.101206.2
Image activated: 2010-12-06 11:35:59 -0700
Image status: success

06.2 Verify the ofa rpm version

Find the kernel value: uname -r
Find the ofa rpm version: rpm -qa | grep ofa

06.2.a The ofa version should match the following table:

kernel ofa

2.6.18-194.3.1.0.3 1.5.1-4.0.28
2.6.18-194.3.1.0.2 None, the ofa is part of the kernel
2.6.18-128.1.16.0.1 1.4.2-14
2.6.18-53.1.21.2.1 1.4.2-14

06.3 Only for database host models X4170, X4170M2:

Verify the LSI disk controller firmware is now 12.9.0-0049

/opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aAll | grep 'FW Package Build'

06.4. Only for the model X4170 database hosts:

06.4.a Verify the version of ILOM firmware: ipmitool sunoem version

Version: 3.0.9.19.a r55943

06.4.b Verify the thermal profile to ensure appropriate fan speeds.

ipmitool sunoem cli "show /SP/policy"

Should display: FLASH_ACCELERATOR_CARD_INSTALLED=enabled

Post Installation
-----------------
- Now proceed to applying the rest of the database host convenience pack.

- After the database host reboots at the end of the convenience pack, enable the
clusterware and bring up the rest of the Oracle stack:
/u01/app/11.2.0/grid/bin/crsctl enable crs
/u01/app/11.2.0/grid/bin/crsctl start crs
- Verify that the controller cache is on by running,

/opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -a0

The output for each logical drive should show the following lines:

Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
- Reboot and mount unmounted ASM Diskgroups [alter diskgroup ... mount;]



ORA-29339 and Configure db_nk_cache_size

While importing a transportable tablespace (TTS) we hit the error below:
ORA-29339: tablespace block size 32768 does not match configured block sizes

Then, on the target DB, we executed:

SQL> show parameter db_32k_cache_size;

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
db_32k_cache_size                    big integer 0

Here the value is 0, so the (32768/1024 =) 32K cache is not configured. In order to import this TTS, we first need to configure it.

According to the Oracle documentation, before configuring this we need to consider the following:
- We cannot configure db_nk_cache_size for the standard block size; for example, we cannot set db_32k_cache_size when db_block_size=32768 [ORA-00380].
- This parameter cannot be set to zero if there are any online tablespaces with an nK block size.
- We need to consider operating-system-specific block size restrictions [ORA-00382]:
we cannot set DB_32K_CACHE_SIZE if the operating system's maximum block size is less than 32 KB,
and we cannot set DB_2K_CACHE_SIZE if the minimum block size is greater than 2 KB.
- For 10g and later, nK-cache values are not automatically managed by Oracle and settings for them are subtracted from the total SGA_TARGET before the remainder is distributed amongst the other, automatically managed, SGA components (like the default cache, shared, large and java pools).
So, if SGA_TARGET=8GB and DB_32K_CACHE_SIZE=1GB, then 7GB of memory is available for distribution amongst the automatically-managed SGA components.
- For 9i and earlier, we need to readjust the other memory component sizes within sga_max_size [ORA-00384].

Our DB was a 9i DB, so we first shrank db_cache_size by 50M and then set db_32k_cache_size=50M:

SQL> alter system set db_cache_size=158M scope=both;

SQL> alter system set db_32k_cache_size=50M scope=both;

SQL> create pfile from spfile;
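
To double-check the result, the buffer cache components can also be inspected directly; a minimal query (assuming the component names appear as 'DEFAULT buffer cache' and 'DEFAULT 32K buffer cache' in v$sga_dynamic_components):

SQL> select component, current_size/1024/1024 as size_mb
     from v$sga_dynamic_components
     where component in ('DEFAULT buffer cache', 'DEFAULT 32K buffer cache');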

Hidden Parameter Value

Today we needed to view the value of some hidden ('_') parameters. We did this using the query below:


select x.indx + 1 NUM,
ksppinm NAME,
ksppity TYPE,
ksppstvl display_Value,
ksppstdf ISDEFAULT,
decode(bitand(ksppiflg / 256, 1), 1, 'TRUE', 'FALSE') ISSES_MODIFIABLE,
decode(bitand(ksppiflg / 65536, 3) ,1,'IMMEDIATE',2,'DEFERRED',3,'IMMEDIATE','FALSE') ISSYS_MODIFIABLE,
decode(bitand(ksppstvf, 7), 1, 'MODIFIED', 4, 'SYSTEM_MOD', 'FALSE') ISMODIFIED,
decode(bitand(ksppstvf, 2), 2, 'TRUE', 'FALSE') ISADJUSTED,
ksppdesc DESCRIPTION,
ksppstcmnt UPDATE_COMMENT
from x$ksppi x, x$ksppcv y
where (x.indx = y.indx);
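
To look at a single hidden parameter rather than the full list, the same query can simply be restricted with an extra predicate, for example:

and x.ksppinm = '_tts_allow_nchar_mismatch';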

These parameters can be altered by enclosing the name in double quotes (") as below:

ALTER SYSTEM SET "_tts_allow_nchar_mismatch"=false;

Tuesday, March 8, 2011

Tracing a specific ORA error

To trace a specific ORA error, I am using the scenario below:

One of our users complained about the error below while running a third-party application:
"ORA-01775: looping chain of synonyms"

We have a 4-node RAC, and we went through all 4 alert logs but could not find it there.
Then we did the following on "node 1":
Start trace:
SQL> alter system set events '1775 trace name ERRORSTACK level 3';

Then we told the user to connect to "node 1" [using the SID and IP in the TNS entry] and repeat the action that caused the error.

Now, in the alert log on "node 1", we found the entry below:

Sat Mar 05 11:59:31 2011
Errors in file /u01/app/oracle/diag/rdbms/dw/dw1/trace/dw1_ora_30788.trc:
ORA-01775: looping chain of synonyms

From the trace file referenced in the alert log, we pinpointed the query and object responsible for the error.
In this case we found a synonym pointing to a table, but the table was not in dba_objects.

Now,on "node 1" stop trace:
SQL> alter system set events '1775 trace name errorstack off';
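
As a side note, in 11g the failing session can also report its own trace file directly, which saves digging through the alert log (this must be run from the session that hits the error):

SQL> select value from v$diag_info where name = 'Default Trace File';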

More about trace:
http://toddlerdba.blogspot.com/2011/01/session-trace.html
https://netfiles.uiuc.edu/jstrode/www/oraelmt/trace_using_parameters.html