Oracle RAC 11g Maintenance

Digest: Oracle 11g RAC common operations (maintenance and administration)
http://space.itpub.net/35489/viewspace-683852

1. Checking the status of each resource (nodeapps, ASM instances, database instances, etc.):

[root@rac01 u01]# su - grid
[grid@rac01 ~]$ crs_stat -t (command kept for 10g compatibility)
Name           Type           Target    State     Host
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac01
ora....VOTE.dg ora....up.type ONLINE    ONLINE    rac01
ora.ORADATA.dg ora....up.type ONLINE    ONLINE    rac01
ora....LASH.dg ora....up.type ONLINE    ONLINE    rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    rac01
ora.eons       ora.eons.type  ONLINE    ONLINE    rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    rac01
ora.oc4j       ora.oc4j.type  OFFLINE   OFFLINE
ora.ons        ora.ons.type   ONLINE    ONLINE    rac01
ora....SM1.asm application    ONLINE    ONLINE    rac01
ora....01.lsnr application    ONLINE    ONLINE    rac01
ora.rac01.gsd  application    OFFLINE   OFFLINE
ora.rac01.ons  application    ONLINE    ONLINE    rac01
ora.rac01.vip  ora....t1.type ONLINE    ONLINE    rac01
ora....SM2.asm application    ONLINE    ONLINE    rac02
ora....02.lsnr application    ONLINE    ONLINE    rac02
ora.rac02.gsd  application    OFFLINE   OFFLINE
ora.rac02.ons  application    ONLINE    ONLINE    rac02
ora.rac02.vip  ora....t1.type ONLINE    ONLINE    rac02
ora.racdb.db   ora....se.type ONLINE    ONLINE    rac01
ora....ry.acfs ora....fs.type ONLINE    ONLINE    rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac01

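In 11g R2 the oc4j and gsd resources are disabled by default. oc4j is a
resource used by WLM, which only becomes available in 11.2.0.2. gsd is the
module that CRS uses to communicate with 9i RAC; it is kept only for backward
compatibility and has no effect on performance. Do not delete these resources
and do not try to enable them; simply ignore them.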

On 11g RAC the command normally used is: crsctl stat resource -t

[root@rac01 u01]# su - grid
[grid@rac01 ~]$ crsctl stat resource -t

Without the -t option, the command prints more detailed information for each resource.
[grid@rac01 ~]$ crsctl stat resource
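
The detailed output is easier to scan with a small script. The following is a minimal sketch, assuming the output consists of blank-line-separated blocks of KEY=VALUE lines (NAME, TYPE, TARGET, STATE). The function names and the SAMPLE text are hypothetical illustrations, not captured output; only the resource names ora.asm and ora.gsd come from the listing above.

```python
# Hypothetical sample shaped like the assumed KEY=VALUE block layout.
SAMPLE = """\
NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE, ONLINE
STATE=ONLINE on rac01, ONLINE on rac02

NAME=ora.gsd
TYPE=ora.gsd.type
TARGET=OFFLINE, OFFLINE
STATE=OFFLINE, OFFLINE
"""


def parse_resources(text):
    """Turn blank-line-separated KEY=VALUE blocks into one dict per resource."""
    resources = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current block, if any.
            if current:
                resources.append(current)
                current = {}
            continue
        key, sep, value = line.partition("=")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        resources.append(current)
    return resources


def not_online(resources):
    """Names of resources with no ONLINE entry in their STATE field."""
    names = []
    for res in resources:
        states = res.get("STATE", "").split(",")
        if not any(s.strip().startswith("ONLINE") for s in states):
            names.append(res.get("NAME", "?"))
    return names


if __name__ == "__main__":
    print(not_online(parse_resources(SAMPLE)))
```

On a real cluster the text would come from running crsctl stat resource as the grid user. Resources such as ora.gsd and ora.oc4j are expected to appear in the result, since they are offline by design.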

2. Common startup and shutdown commands

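Note: resources start relatively slowly on 11g RAC. Even when the command
reports "start succeeded" for every resource, crs_stat -t may not show them
ONLINE yet. Watch the CRS log after issuing a command, so that a slow start is
not mistaken for a failure and the operation repeated unnecessarily.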
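
Because resources can take a while to come online, a patient polling loop is safer than re-issuing the start command. The sketch below is one possible approach, not part of the original procedure. The probe argument is a hypothetical callable that returns True once the resources of interest are ONLINE (for example, by inspecting crsctl stat resource -t output); the timeout and interval defaults are arbitrary assumptions.

```python
import time


def wait_until(probe, timeout=600.0, interval=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() repeatedly until it returns True or timeout seconds pass.

    Returns True if the probe succeeded, False if the deadline was reached.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Stand-in probe that succeeds on the third call.
    attempts = {"n": 0}

    def fake_probe():
        attempts["n"] += 1
        return attempts["n"] >= 3

    print(wait_until(fake_probe, timeout=5, interval=0))
```

The loop only observes and never issues a start command, so it cannot trigger the duplicate start the note warns about.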

The following commands are for reference:


Stop the Oracle Clusterware stack on the local server:

[root@rac01 ~]# /u01/grid/11.2.0/bin/crsctl stop cluster
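Note: if any resource managed by Oracle Clusterware is still running after
"crsctl stop cluster" has been issued, the whole command fails. Use the -f
option to stop all resources unconditionally and stop the Oracle Clusterware
stack.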

[root@rac02 ~]# /u01/grid/11.2.0/bin/crsctl stop cluster -all
This stops the Clusterware stack on all nodes.

Start the Oracle Clusterware stack on the local server:
[root@rac01 ~]# /u01/grid/11.2.0/bin/crsctl start cluster

Note: specify the -all option to start the Oracle Clusterware stack on every server in the cluster.
[root@rac02 ~]# /u01/grid/11.2.0/bin/crsctl start cluster -all

You can also start the Oracle Clusterware stack on one or more specific servers in the cluster by listing them, separated by spaces:
[root@rac01 ~]# /u01/grid/11.2.0/bin/crsctl start cluster -n rac01 rac02
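
The option combinations above (-all, -n with a server list, and -f for stop) can be assembled by a small helper so that scripts do not build command strings by hand. This is a minimal sketch: cluster_command is a hypothetical helper name, the crsctl path is the Grid home used in this document, and the restriction of -f to "stop" reflects only what the text above describes.

```python
CRSCTL = "/u01/grid/11.2.0/bin/crsctl"  # Grid home path as used in this document


def cluster_command(action, all_nodes=False, nodes=(), force=False):
    """Build the argument list for 'crsctl start|stop cluster'."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    if all_nodes and nodes:
        raise ValueError("use either all_nodes or nodes, not both")
    if force and action != "stop":
        raise ValueError("-f is only described for 'stop' in this document")
    argv = [CRSCTL, action, "cluster"]
    if all_nodes:
        argv.append("-all")
    elif nodes:
        argv.append("-n")
        argv.extend(nodes)  # servers are passed as separate, space-separated words
    if force:
        argv.append("-f")
    return argv


if __name__ == "__main__":
    print(" ".join(cluster_command("start", nodes=["rac01", "rac02"])))
    print(" ".join(cluster_command("stop", all_nodes=True, force=True)))
```

The returned list can be passed to subprocess.run as root. Returning a list rather than a string avoids shell quoting problems with server names.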

Use SRVCTL to stop/start all instances:

[oracle@rac01 ~]$ srvctl stop database -d racdb
[oracle@rac01 ~]$ srvctl start database -d racdb

Recommended order

Shutdown order: first stop the Oracle instances (or the database), then the
ASM instances, and finally the node applications (virtual IP, GSD, TNS
listener, and ONS).

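Manual startup order: first start the node applications (virtual IP, GSD, TNS
listener, and ONS). Once they have started successfully, start the ASM
instances. Finally, start the Oracle instances (and related services) and the
Enterprise Manager Database Control.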

Example:

Shutdown:

From node 1, stop Clusterware on all nodes (add -f if some resource cannot be stopped).
[root@rac01 bin]# /u01/grid/11.2.0/bin/crsctl stop cluster -all

After both node 1 and node 2 have been shut down, check the status:
[grid@rac02 rac02]$ crsctl stat resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Startup:
[root@rac01 bin]# /u01/grid/11.2.0/bin/crsctl start cluster -all

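The command above is normally sufficient. If something looks wrong, for
example the database has not come online after a long wait, the database can
also be started manually (from any node):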
[root@rac02 ~]# /u01/grid/11.2.0/bin/srvctl start database -d racdb

Remark:
If necessary, the instances can also be started one at a time.
[root@rac02 ~]# /u01/grid/11.2.0/bin/srvctl start instance -d racdb -i racdb1
[root@rac02 ~]# /u01/grid/11.2.0/bin/srvctl start instance -d racdb -i racdb2

3. Creating a tablespace and adding datafiles (ASM)

Reference: http://space.itpub.net/?uid-7607759-action-viewspace-itemid-670722

[root@rac01 bin]# su - oracle
[oracle@rac01 ~]$ sqlplus / as sysdba
SQL> create tablespace test datafile '+oradata/racdb/datafile/test01.dbf' size 50m ;
Tablespace created.

[root@rac02 ~]# su - grid
[grid@rac02 ~]$ asmcmd
ASMCMD>
ASMCMD> pwd
+oradata/racdb/datafile
ASMCMD>
ASMCMD> ls
SYSAUX.261.739387301
SYSTEM.260.739387283
TEST.340.740166807
UNDOTBS1.262.739387315
UNDOTBS2.264.739387351
USERS.265.739387361
test01.dbf

SQL> alter tablespace test add datafile '+oradata/racdb/datafile/test02.dbf' size 50m ;
Tablespace altered.

SQL>

ASMCMD> ls
SYSAUX.261.739387301
SYSTEM.260.739387283
TEST.340.740166807
TEST.341.740166937
UNDOTBS1.262.739387315
UNDOTBS2.264.739387351
USERS.265.739387361
test01.dbf
test02.dbf
ASMCMD>

ASMCMD> ls -al
Type      Redund  Striped  Time             Sys  Name
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    none => SYSAUX.261.739387301
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    none => SYSTEM.260.739387283
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    +ORADATA/RACDB/DATAFILE/test01.dbf => TEST.340.740166807
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    +ORADATA/RACDB/DATAFILE/test02.dbf => TEST.341.740166937
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    none => UNDOTBS1.262.739387315
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    none => UNDOTBS2.264.739387351
DATAFILE  UNPROT  COARSE   JAN 11 17:00:00  Y    none => USERS.265.739387361
                                            N    test01.dbf => +ORADATA/RACDB/DATAFILE/TEST.340.740166807
                                            N    test02.dbf => +ORADATA/RACDB/DATAFILE/TEST.341.740166937
ASMCMD>

4. Inspecting the ASM instance and the user database instance (note: as the grid and oracle users respectively):

Inspect the ASM instance (log in as the grid user; the initialization parameters show instance_name=+ASM1):

[grid@rac01 ~]$ id
uid=501(grid) gid=501(oinstall) groups=501(oinstall),504(asmadmin),506(asmdba),507(asmoper)
[grid@rac01 ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 11.2.0.1.0 Production on Tue Jan 4 00:58:52 2011
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      ORADATA, ORAFLASH
asm_diskstring                       string
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
audit_file_dest                      string      /u01/grid/11.2.0/rdbms/audit
…..

Inspect the user database instance (log in as the oracle user; instance_name=racdb2, which is clearly a user instance):

[root@rac02 u01]# su - oracle
[oracle@rac02 ~]$
[oracle@rac02 ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 11.2.0.1.0 Production on Tue Jan 4 01:01:04 2011

Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> show parameter

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
O7_DICTIONARY_ACCESSIBILITY          boolean     FALSE
active_instance_count                integer
aq_tm_processes                      integer     0
archive_lag_target                   integer     0
asm_diskgroups                       string
asm_diskstring                       string
asm_power_limit                      integer     1
…….

5. Viewing the alert log and trace files of an 11g database instance:

[oracle@rac01 trace]$ pwd
/u01/product/oracle/diag/rdbms/racdb/racdb1/trace
[oracle@rac01 trace]$
[oracle@rac01 trace]$ vi alert_racdb1.log

6. Viewing the 11g RAC Clusterware logs:

[root@rac01 sbin]# su - grid
[grid@rac01 ~]$
[grid@rac01 rac01]$ pwd
/u01/grid/11.2.0/log/rac01

[grid@rac01 trace]$ pwd
/u01/grid/11.2.0/log/diag/tnslsnr/rac01/listener_scan1/trace
[grid@rac01 trace]$ ls
listener_scan1.log

[grid@rac01 rac01]$ pwd
/u01/grid/11.2.0/log/rac01
[grid@rac01 rac01]$ ls
admin/ alertrac01.log crsd/ ctssd/ evmd/ gnsd/ mdnsd/ racg/
agent/ client/ cssd/ diskmon/ gipcd/ gpnpd/ ohasd/ srvm/

7. Common cluster commands

[grid@rac02 ~]$ crs_stat -t

Check whether Oracle Clusterware is online

[grid@rac02 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
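
A monitoring script can reduce this output to a single healthy/unhealthy answer. The following is a minimal sketch under a simple assumption: the stack is treated as healthy only when every non-empty line ends with "is online", so any other message (such as the CRS-4535 error shown in section 2) counts as a problem. check_crs is a hypothetical name, and both sample strings are copied from the transcripts in this document.

```python
# Output of "crsctl check crs" on a healthy node, as shown in this document.
HEALTHY = """\
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
"""

# Messages seen in section 2 after the cluster stack was stopped.
STOPPED = """\
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
"""


def check_crs(output):
    """Return (healthy, problems) for the text printed by a crsctl check."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    problems = [ln for ln in lines if not ln.endswith("is online")]
    # Empty output is treated as unhealthy: nothing confirmed it is online.
    healthy = bool(lines) and not problems
    return healthy, problems


if __name__ == "__main__":
    print(check_crs(HEALTHY))
    print(check_crs(STOPPED))
```

Matching on the trailing "is online" rather than on specific CRS message numbers keeps the check independent of which daemons a given release reports.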

Check whether cssd (Cluster Synchronization Services) is online

[grid@rac02 ~]$ crsctl check cssd
CRS-272: This command remains for backward compatibility only
Cluster Synchronization Services is online

Check whether crsd (Cluster Ready Services) is online

[grid@rac02 ~]$ crsctl check crsd
CRS-272: This command remains for backward compatibility only
Cluster Ready Services is online

Check whether evmd (Event Manager) is online

[grid@rac02 ~]$ crsctl check evmd
CRS-272: This command remains for backward compatibility only
Event Manager is online

Check from one node whether CSS is alive on a given node

[grid@rac02 ~]$ crsctl check cluster -n rac01

rac01:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[grid@rac02 ~]$ crsctl check cluster -n rac02

rac02:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

Start the database

[grid@rac02 ~]$ srvctl start database -d racdb

Start oc4j:

[grid@rac02 ~]$ ./srvctl enable oc4j
[grid@rac02 ~]$ ./srvctl start oc4j
[grid@rac02 ~]$ ./crs_stat -t

8. Voting disk management

[grid@rac01 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2720
Available space (kbytes) : 259400
ID : 132900461
Device/File Name : +OCR_VOTE
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check bypassed due to non-privileged user

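In Oracle 11g R2 there is no need to back up the voting disk. Whenever the
configuration changes, the voting disk data is backed up automatically in the
OCR and restored automatically to any voting disk that is added. The output
below shows that the OCR and the voting disk are stored in the same ASM disk
group (OCR_VOTE).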

[grid@rac01 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   095112005ec24f57bf98f6148818cc53 (ORCL:OCR_VOTE01) [OCR_VOTE]
Located 1 voting disk(s).
[grid@rac01 ~]$

[grid@rac01 ~]$ asmcmd
ASMCMD> ls
OCR_VOTE/
ORADATA/
ORAFLASH/
ASMCMD> cd ocr_vote
ASMCMD> cd rac
ASMCMD> ls
ASMPARAMETERFILE/
OCRFILE/
ASMCMD> cd ocrfile
ASMCMD> ls
REGISTRY.255.739337635

OCR management

[grid@rac01 ~]$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.1.0]

[grid@rac01 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2720
Available space (kbytes) : 259400
ID : 132900461
Device/File Name : +OCR_VOTE
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check bypassed due to non-privileged user

Run the following command (logged in as root) to replace the current OCR location with destination_file or +ASM_disk_group:

  1. ocrconfig -replace current_OCR_location -replacement new_OCR_location

If there is only one OCR location, use the following commands instead:

  1. ocrconfig -add +new_storage_disk_group
  2. ocrconfig -delete +current_disk_group

Run the following command to list the backups:

[grid@rac01 ~]$ ocrconfig -showbackup
rac01 2011/01/08 17:54:51 /u01/grid/11.2.0/cdata/rac/backup00.ocr
rac01 2011/01/08 13:54:49 /u01/grid/11.2.0/cdata/rac/backup01.ocr
rac02 2011/01/08 06:34:46 /u01/grid/11.2.0/cdata/rac/backup02.ocr
rac01 2011/01/07 02:15:37 /u01/grid/11.2.0/cdata/rac/day.ocr
rac01 2011/01/02 07:51:43 /u01/grid/11.2.0/cdata/rac/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
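
To find the most recent automatic backup from a script, the listing can be parsed line by line. This is a minimal sketch, assuming each backup line has exactly four whitespace-separated fields (node, date, time, path) as in the output above; parse_backups is a hypothetical name, and lines that do not fit that shape, such as the PROT-25 message, are skipped.

```python
from datetime import datetime

# Listing copied from the ocrconfig -showbackup output in this document.
SHOWBACKUP = """\
rac01 2011/01/08 17:54:51 /u01/grid/11.2.0/cdata/rac/backup00.ocr
rac01 2011/01/08 13:54:49 /u01/grid/11.2.0/cdata/rac/backup01.ocr
rac02 2011/01/08 06:34:46 /u01/grid/11.2.0/cdata/rac/backup02.ocr
rac01 2011/01/07 02:15:37 /u01/grid/11.2.0/cdata/rac/day.ocr
rac01 2011/01/02 07:51:43 /u01/grid/11.2.0/cdata/rac/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
"""


def parse_backups(output):
    """Return (timestamp, node, path) tuples, newest first."""
    backups = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # not a backup entry (blank line or a PROT- message)
        node, day, clock, path = parts
        try:
            taken = datetime.strptime(day + " " + clock, "%Y/%m/%d %H:%M:%S")
        except ValueError:
            continue  # four fields, but not in the expected date/time format
        backups.append((taken, node, path))
    return sorted(backups, reverse=True)


if __name__ == "__main__":
    newest = parse_backups(SHOWBACKUP)[0]
    print(newest[1], newest[2])
```

Each backup is stored on the node named in the first column, so the node is kept next to the path: the file has to be looked up on that server.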

While Oracle Clusterware is up, run the ocrconfig -manualbackup command on one node:

[grid@rac01 ~]$ ocrconfig -manualbackup
This creates the backup file backup_20100112_141900.ocr under /u01/grid/11.2.0/cdata/rac/ (the directory that holds day.ocr).

Afterwards, $ ocrconfig -showbackup lists the backup information.

Run the following command to check the contents and integrity of a backup file:
$ ocrdump -backupfile backup_file_name