Saturday, January 12, 2013

Shutting Down Exadata Storage Cell


I normally follow the steps below when shutting down an Exadata Storage Cell for maintenance. (We were running Exadata storage software version 11.2.2.4.0.)


1. Check the disk repair time from the ASM instance and adjust it if required.
select dg.name, a.value
  from v$asm_diskgroup dg, v$asm_attribute a
 where dg.group_number = a.group_number
   and a.name = 'disk_repair_time';
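
If the planned outage is likely to run longer than the current value (the default is 3.6h), the attribute can be raised per disk group before the cell goes down. Below is a minimal sketch, assuming a disk group called DATA and a target of 8.5h, both of which are placeholders. The helper name repair_time_sql is hypothetical; it only prints the statement so it can be reviewed before being run against the ASM instance.

```shell
# Hypothetical helper: print the ALTER DISKGROUP statement for review.
# $1 = disk group name, $2 = new disk_repair_time (for example 8.5h).
repair_time_sql() {
  printf "ALTER DISKGROUP %s SET ATTRIBUTE 'disk_repair_time' = '%s';\n" "$1" "$2"
}

# Example call (DATA and 8.5h are placeholders, not values from this system):
repair_time_sql DATA 8.5h

# To apply it, pipe the output into SQL*Plus connected to ASM:
#   repair_time_sql DATA 8.5h | sqlplus -s / as sysasm
```

Remember to set the attribute back to its previous value once the maintenance is finished.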

2. From both the ASM and DB instances, make sure that all disks are online:

select header_status,mode_status,state,failgroup,count(*) from gv$asm_disk group by header_status,mode_status,state,failgroup;

3. From the ASM instance, wait until the query below returns zero rows:
select * from gv$asm_operation;

4. Next you will need to check if ASM will be OK if the grid disks go OFFLINE. The following command should return 'Yes' for the grid disks being listed:
   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome 
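
With many grid disks it is easy to miss a single row that is not 'Yes'. The check can be scripted; below is a minimal sketch, assuming the three attributes are printed as whitespace-separated columns in the order requested. The function name unsafe_griddisks is hypothetical.

```shell
# Hypothetical helper: read "name asmmodestatus asmdeactivationoutcome"
# rows on stdin and print every grid disk whose outcome is not "Yes".
# Returns non-zero if at least one such grid disk is found.
unsafe_griddisks() {
  awk 'NF && $3 != "Yes" { print $1; bad = 1 } END { exit bad + 0 }'
}

# Intended use on the cell:
#   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome \
#     | unsafe_griddisks && echo "safe to inactivate"
```

Any name it prints is a grid disk that ASM cannot yet afford to lose, so step 5 should wait until the list is empty.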


5. Inactivate all grid disks:
cellcli -e ALTER GRIDDISK ALL INACTIVE

6. Confirm that the griddisks are now offline by performing the following actions:
   a. Run the command below. Once the disks are offline in ASM, the output should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, and asmdeactivationoutcome=Yes, for all grid disks. Only then is it safe to proceed with shutting down or restarting the cell:
      cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
   b. List the grid disks to confirm that they all show inactive:
      cellcli -e LIST GRIDDISK    

7. From both the ASM and DB instances, confirm that the disks in this cell's failgroup are now offline and that all other disks are still online:

select header_status,mode_status,state,failgroup,count(*) from gv$asm_disk group by header_status,mode_status,state,failgroup;     

8. The following commands shut down (power off) the Oracle Exadata Storage Server immediately:
   (When powering off Oracle Exadata Storage Servers, all storage services are automatically stopped.)
        # shutdown -y now   
        sync 
        sync
        init 0   

9. Once the cell is back up, check the last 20 lines of the validation log; every entry should be PASSED, NOHUP RUN, or BACKGROUND RUN:
tail -20 /var/log/cellos/validations.log
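
Rather than reading the 20 lines by eye, the accepted statuses can be filtered out so that only unexpected entries remain. Below is a minimal sketch, assuming the status text appears somewhere on each log line (the exact log layout is not shown here, so treat that as an assumption). The function name unexpected_validations is hypothetical.

```shell
# Hypothetical helper: print the lines on stdin that carry none of the
# accepted statuses. Blank lines are ignored. Assumes the status text
# (PASSED, NOHUP RUN, BACKGROUND RUN) appears literally on each line.
unexpected_validations() {
  grep -Ev 'PASSED|NOHUP RUN|BACKGROUND RUN|^[[:space:]]*$' || true
}

# Intended use on the cell:
#   tail -20 /var/log/cellos/validations.log | unexpected_validations
```

No output means every one of the last 20 entries had an accepted status.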

10. Once the cell comes back online, reactivate the grid disks:

        cellcli -e alter griddisk all active

11. Run the command below; all grid disks should show 'active':

        cellcli -e list griddisk

12. Verify that all grid disks have been successfully put online using the following command.
    (Wait until asmmodestatus is ONLINE for all grid disks. Oracle ASM synchronization is complete only when every grid disk shows asmmodestatus=ONLINE.)
     cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
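
Resynchronization can take a while (grid disks typically report SYNCING in the meantime), so the wait can be scripted as a polling loop. Below is a minimal sketch, assuming asmmodestatus is the second whitespace-separated column of the output. The function names all_online and wait_all_online are hypothetical.

```shell
# Hypothetical helper: succeed only if stdin has at least one row and
# every row reports ONLINE in column 2 (asmmodestatus).
all_online() {
  awk 'NF { rows++; if ($2 != "ONLINE") pending++ }
       END { exit !(rows > 0 && pending == 0) }'
}

# Hypothetical helper: re-run a status command until all_online succeeds.
# $1 = seconds to sleep between polls, remaining arguments = the command.
wait_all_online() {
  interval=$1; shift
  until "$@" | all_online; do
    sleep "$interval"
  done
}

# Intended use on the cell (poll once a minute):
#   wait_all_online 60 cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
```

Treating empty output as "not yet online" is deliberate: if the status command fails and prints nothing, the loop keeps waiting instead of reporting success.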

13. From both the ASM and DB instances, make sure that all disks are online:

select header_status,mode_status,state,failgroup,count(*) from gv$asm_disk group by header_status,mode_status,state,failgroup;

14. Check the grid disk status once more. All grid disks should be online (asmmodestatus=ONLINE), and asmdeactivationoutcome should again return 'Yes' for every grid disk listed:
   cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

15. From the ASM instance, wait until the query below returns zero rows:
select * from gv$asm_operation;

16. Check the cell alert history for any uncleared critical alerts:
cellcli -e list alerthistory

17. Check instance status from both the DB and ASM instances:
select * from gv$instance;
