Oracle RAC: Common Maintenance Tools and Commands

2011-07-12 20:23:12

Original author: liyf0371

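Oracle can be managed either through OEM or from the command line. The Oracle Clusterware commands fall into four layers:

Node layer: olsnodes

Network layer: oifcfg

Cluster layer: crsctl, ocrcheck, ocrdump, ocrconfig

Application layer: srvctl, onsctl, crs_stat

Each of these commands is described below.

1. Node layer

This layer has a single command, olsnodes, which lists the nodes in the cluster. Its options are shown below and can be combined.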

[root@raw1 bin]# ./olsnodes --help

usage: olsnodes [-n] [-p] [-i] [ | -l] [-g] [-v]

where

-n print node number with the node name

-p print private interconnect name with the node name

-i print virtual ip name with the node name

print information for the specified node

-l print information for the local node

-g turn on logging

-v run in verbose mode

[root@raw1 bin]# ./olsnodes -n -p -i

raw1 1 raw1-priv raw1-vip

raw2 2 raw2-priv raw2-vip
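
The listing above is easy to consume from a script. As a rough sketch, assuming the four-column layout printed by olsnodes -n -p -i (node name, node number, private name, VIP name), a lookup of a node's VIP name could look like this. The function name node_vip is made up for the example and is not part of Clusterware.

```shell
# node_vip NODE: read `olsnodes -n -p -i` output on stdin and print the
# virtual IP name of NODE. Returns non-zero if NODE is not listed.
# Assumed columns: node name, node number, private name, VIP name.
node_vip() {
    awk -v node="$1" '$1 == node { print $4; found = 1 }
                      END { exit !found }'
}

# Example (on a cluster node): ./olsnodes -n -p -i | node_vip raw2
```

Reading from standard input keeps the helper independent of where olsnodes is installed.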

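2. Network layer

The network layer consists of the network components on each node: two physical NICs and three IP addresses. It too has a single command: oifcfg.

oifcfg defines and modifies the NIC attributes that the Oracle cluster needs, including the subnet address, netmask, and interface type. To use it correctly you first need to know how Oracle describes a network interface: each interface has three attributes, namely a name, a subnet address, and an interface type.

oifcfg takes an interface specification in the form: interface_name/subnet:interface_type

None of these attributes is an IP address. There are two interface types, public and private. A public interface carries external traffic (Oracle Net and the VIP addresses); a private interface carries the interconnect and is written cluster_interconnect on the oifcfg command line.

An interface can be configured in one of two ways: global or node-specific. Global means the setting is identical on every node of the cluster, that is, the nodes are configured symmetrically. Node-specific means this node's setting differs from the other nodes, that is, the configuration is asymmetric.

oifcfg has four subcommands:

iflist: list the network interfaces

getif: show the configuration of an interface

setif: configure an interface

delif: delete an interface configuration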

[root@raw1 bin]# ./oifcfg --help

prif-9: incorrect usage

name:

oifcfg - oracle interface configuration tool.

usage: oifcfg iflist [-p [-n]]

oifcfg setif {-node | -global} {/:}...

oifcfg getif [-node | -global] [ -if [/] [-type ] ]

oifcfg delif [-node | -global] [[/]]

oifcfg [-help]

- name of the host, as known to a communications network

- name by which the interface is configured in the system

- subnet address of the interface

- type of the interface { cluster_interconnect | public | storage }

[root@raw1 bin]# ./oifcfg iflist

eth0 10.85.10.0

eth1 192.168.1.0

[root@raw1 bin]# ./oifcfg getif

eth0 10.85.10.119 global public

eth0 10.85.10.121 global public

eth0 10.85.10.0 global public

eth1 192.168.1.119 global cluster_interconnect

eth1 192.168.1.121 global cluster_interconnect

eth1 192.168.1.0 global cluster_interconnect
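
The getif listing can be filtered by interface type in a script. The sketch below assumes the four-column layout shown above (interface, subnet, scope, type). The helper name ifs_of_type is invented for this example. It prints each match in the interface_name/subnet form that oifcfg itself uses.

```shell
# ifs_of_type TYPE: read `oifcfg getif` output on stdin and print every
# interface of the given type as interface_name/subnet.
# Assumed columns: interface, subnet, scope (global or node), type.
ifs_of_type() {
    awk -v want="$1" 'NF >= 4 && $NF == want { print $1 "/" $2 }'
}

# Example (on a cluster node): ./oifcfg getif | ifs_of_type cluster_interconnect
```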

-- Show the interfaces of type public

[root@raw1 bin]# ./oifcfg getif -type public

eth0 10.85.10.119 global public

eth0 10.85.10.121 global public

eth0 10.85.10.0 global public

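-- Delete the interface configuration

[root@raw1 bin]# ./oifcfg delif -global

-- Add the interface configuration, giving each interface's subnet address as reported by iflist

[root@raw1 bin]# ./oifcfg setif -global eth0/10.85.10.0:public

[root@raw1 bin]# ./oifcfg setif -global eth1/192.168.1.0:cluster_interconnect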

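3. Cluster layer

The cluster layer is the core cluster formed by Clusterware. It maintains the shared devices in the cluster and gives the application cluster a complete view of cluster state, which the application cluster uses to adjust itself. This layer has four commands: crsctl, ocrcheck, ocrdump, and ocrconfig. The last three operate on the OCR disk.

3.1 crsctl

crsctl can check the CRS process stack and the state of each CRS daemon, manage the voting disks (votedisk), and trace the CRS daemons.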

[root@raw1 bin]# ./crsctl

usage: crsctl check crs - checks the viability of the crs stack

crsctl check cssd - checks the viability of css

crsctl check crsd - checks the viability of crs

crsctl check evmd - checks the viability of evm

crsctl set css - sets a parameter override

crsctl get css - gets the value of a css parameter

crsctl unset css - sets css parameter to its default

crsctl query css votedisk - lists the voting disks used by css

crsctl add css votedisk - adds a new voting disk

crsctl delete css votedisk - removes a voting disk

crsctl enable crs - enables startup for all crs daemons

crsctl disable crs - disables startup for all crs daemons

crsctl start crs - starts all crs daemons.

crsctl stop crs - stops all crs daemons. stops crs resources in case of cluster.

crsctl start resources - starts crs resources.

crsctl stop resources - stops crs resources.

crsctl debug statedump evm - dumps state info for evm objects

crsctl debug statedump crs - dumps state info for crs objects

crsctl debug statedump css - dumps state info for css objects

crsctl debug log css [module:level]{,module:level} ...

- turns on debugging for css

crsctl debug trace css - dumps css in-memory tracing cache

crsctl debug log crs [module:level]{,module:level} ...

- turns on debugging for crs

crsctl debug trace crs - dumps crs in-memory tracing cache

crsctl debug log evm [module:level]{,module:level} ...

- turns on debugging for evm

crsctl debug trace evm - dumps evm in-memory tracing cache

crsctl debug log res turns on debugging for resources

crsctl query crs softwareversion [] - lists the version of crs software installed

crsctl query crs activeversion - lists the crs software operating version

crsctl lsmodules css - lists the css modules that can be used for debugging

crsctl lsmodules crs - lists the crs modules that can be used for debugging

crsctl lsmodules evm - lists the evm modules that can be used for debugging

if necesary any of these commands can be run with additional tracing by

adding a "trace" argument at the very front.

example: crsctl trace check css

3.1.1 Checking CRS status

[root@raw1 bin]# ./crsctl check crs

css appears healthy

crs appears healthy

evm appears healthy

-- Check a single daemon

[root@raw1 bin]# ./crsctl check cssd

css appears healthy

[root@raw1 bin]# ./crsctl check crsd

crs appears healthy

[root@raw1 bin]# ./crsctl check evmd

evm appears healthy
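
For scripted health checks, the three status lines can be reduced to a single exit code. This is only a sketch. It assumes the "appears healthy" wording shown above, which other Clusterware releases may not use, and crs_healthy is a name chosen for this example.

```shell
# crs_healthy: read `crsctl check crs` output on stdin and succeed only if
# css, crs and evm each report "appears healthy" (matched case-insensitively).
crs_healthy() {
    awk '{ line = tolower($0) }
         line ~ /^(css|crs|evm) appears healthy/ { ok[substr(line, 1, 3)] = 1 }
         END { exit !(ok["css"] && ok["crs"] && ok["evm"]) }'
}

# Example (on a cluster node):
#   ./crsctl check crs | crs_healthy || echo "CRS stack needs attention"
```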

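3.1.2 Enabling and disabling CRS autostart

By default the CRS stack starts automatically when the operating system boots. For maintenance it is sometimes necessary to turn this off, which root can do with the commands below (disable turns autostart off, enable turns it back on).

[root@raw1 bin]# ./crsctl disable crs

[root@raw1 bin]# ./crsctl enable crs

These commands actually rewrite the contents of the file /etc/oracle/scls_scr/raw/root/crsstart.

3.1.3 Starting and stopping the CRS stack

In Oracle 10.1, Clusterware could only be restarted by rebooting the system. Starting with Oracle 10.2, CRS can be started and stopped with commands.

-- Start CRS: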

[root@raw1 bin]# ./crsctl start crs

attempting to start crs stack

the crs stack will be started shortly

-- Stop CRS:

[root@raw1 bin]# ./crsctl stop crs

stopping resources.

successfully stopped crs resources

stopping cssd.

shutting down css daemon.

shutdown request successfully issued.

3.1.4 Locating the voting disks

[root@raw1 bin]# ./crsctl query css votedisk

0. 0 /dev/raw/raw2

located 1 votedisk(s).

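3.1.5 Viewing and changing CRS parameters

-- View a parameter with get

[root@raw1 bin]# ./crsctl get css misscount

-- Change a parameter with set; use this with caution

[root@raw1 bin]# ./crsctl set css misscount 60

3.1.6 Tracing CRS modules

CRS consists of three services: CRS, CSS, and EVM. Each service is made up of a series of modules, and crsctl can trace each module and write the trace to a log.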

[root@raw1 bin]# ./crsctl lsmodules css

the following are the css modules ::

cssd

commcrs

commns

[root@raw1 bin]# ./crsctl lsmodules crs

the following are the crs modules ::

crsui

crscomm

crsrti

crsmain

crsplace

crsapp

crsres

crscomm

crsocr

crstimer

crsevt

crsd

clucls

cssclnt

commcrs

commns

[root@raw1 bin]# ./crsctl lsmodules evm

the following are the evm modules ::

evmd

evmdmain

evmcomm

evmevt

evmapp

evmagent

crsocr

clucls

cssclnt

commcrs

commns

-- Trace the cssd module (must be run as root):

[root@raw1 bin]# ./crsctl debug log css "cssd:1"

configuration parameter trace is now set to 1.

set crsd debug module: cssd level: 1

-- View the trace log

[root@raw1 cssd]# pwd

/u01/app/oracle/product/crs/log/raw1/cssd

[root@raw1 cssd]# more ocssd.log

...

[ cssd]2010-03-08 00:19:27.160 [36244384] >trace: clssscsetdebuglevel: the logging level is set to 1 ,the cache level is set to 2

[ cssd]2010-03-08 00:19:52.139 [119085984] >trace: clssgmclientconnectmsg: connect from con(0x834fd18) proc(0x8354c70) pid() proto(10:2:1:1)

...

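3.1.7 Maintaining the voting disks

When Clusterware is installed through the graphical installer and external redundancy is chosen for the voting disk, only one voting disk can be entered. Even with external redundancy, however, more voting disks can be added afterwards, but only with crsctl. Once several voting disks exist they mirror each other, which removes the voting disk as a single point of failure.

Note that the voting disks use a majority algorithm: with multiple voting disks, more than half of them must be available at the same time for Clusterware to work. For example, with four voting disks the cluster keeps working if one fails (three of four remain), but if two fail the survivors are no longer more than half, so the cluster goes down immediately and every node reboots. When adding voting disks, therefore, avoid adding just one; add two. OCR is different in this respect: a single OCR is enough.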
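
The majority rule is simple arithmetic, and a short sketch makes the four-disk example concrete. The function below is purely illustrative and has a made-up name, votedisk_quorum_ok. It models only the "more than half" condition described above and does not query Clusterware.

```shell
# votedisk_quorum_ok TOTAL FAILED: succeed if the surviving voting disks
# are still a strict majority (more than half) of TOTAL.
votedisk_quorum_ok() {
    total=$1
    failed=$2
    surviving=$((total - failed))
    [ $((surviving * 2)) -gt "$total" ]
}

for failed in 1 2; do
    if votedisk_quorum_ok 4 "$failed"; then
        echo "4 voting disks, $failed failed: cluster stays up"
    else
        echo "4 voting disks, $failed failed: quorum lost"
    fi
done
```

With four disks the loop reports that the cluster stays up after one failure and loses quorum after two. The same check shows why going from one disk to two adds no protection: votedisk_quorum_ok 2 1 fails, while votedisk_quorum_ok 3 1 succeeds.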

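Adding and removing voting disks is a risky operation. Stop the database, ASM, and the CRS stack first, and use the -force option.

1) Check the current configuration

[root@raw1 bin]# ./crsctl query css votedisk

2) Stop CRS on all nodes:

[root@raw1 bin]# ./crsctl stop crs

3) Add the voting disk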

[root@raw1 bin]# ./crsctl add css votedisk /dev/raw/raw1 -force

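Note: even with CRS shut down, voting disks must be added and removed with -force, and -force is only safe when CRS is down. Without -force the command fails with: cluster is not in a ready state for online disk addition.

4) Confirm the result:

[root@raw1 bin]# ./crsctl query css votedisk

5) Start CRS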

[root@raw1 bin]# ./crsctl start crs

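3.2 The OCR commands

Oracle Clusterware keeps the configuration of the whole cluster on shared storage, and that storage is the OCR disk. Only one node in the cluster, called the master node, reads and writes the OCR disk. Every node keeps a copy of the OCR in memory, and an OCR process on each node reads from that in-memory copy. When the OCR content changes, the OCR process on the master node synchronizes the change to the OCR processes on the other nodes.

Because the OCR content is so important, Oracle backs it up every four hours and keeps the last three backups, plus the last backup of the previous day and of the previous week. The backup is taken by the crsd process on the master node, and the default location is the $CRS_HOME/crs/cdata/ directory. After each backup the file names are rotated to reflect the order in time, so the most recent backup is always backup00.ocr. Besides the local copies, the DBA should keep a copy on other storage to guard against storage failure.

3.2.1 ocrdump

ocrdump prints the OCR content as ASCII text. It cannot be used for backup and recovery: the file it produces is for reading only and cannot be used to restore the OCR.

Syntax: ocrdump [-stdout] [filename] [-keyname name] [-xml]

Options:

-stdout: print the content to the screen

filename: write the content to this file

-keyname: print only the given key and its subkeys

-xml: print in XML format

Example: print the content of the system.css key to the screen in XML format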

[root@raw1 bin]# ./ocrdump -stdout -keyname system.css -xml|more

03/08/2010 04:28:41

/dev/raw/raw1

./ocrdump.bin -stdout -keyname system.css -xml

......

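While it runs, ocrdump writes a log file named ocrdump_<pid>.log in the $CRS_HOME/log/<nodename>/client directory. If the command fails, check this log for the cause.

3.2.2 ocrcheck

ocrcheck verifies the consistency of the OCR content. While it runs it writes a log file named ocrcheck_<pid>.log in the $CRS_HOME/log/<nodename>/client directory. The command takes no arguments.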

[root@raw1 bin]# ./ocrcheck

status of oracle cluster registry is as follows :

version : 2

total space (kbytes) : 147352

used space (kbytes) : 4360

available space (kbytes) : 142992

id : 1599790592

device/file name : /dev/raw/raw1

device/file integrity check succeeded

device/file not configured

cluster registry integrity check succeeded
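
A monitoring script might read the space figures from this report. The following is a minimal sketch that assumes the "total space" and "used space" labels shown above. The name ocr_used_pct is chosen for this example and is not an Oracle tool.

```shell
# ocr_used_pct: read `ocrcheck` output on stdin and print the used space as
# a whole-number percentage of the total space (fraction truncated).
# Returns non-zero if no total space line was found.
ocr_used_pct() {
    awk -F: '{ key = tolower($1) }
             key ~ /total space/ { total = $2 + 0 }
             key ~ /used space/  { used = $2 + 0 }
             END { if (total > 0) printf "%d\n", used * 100 / total; else exit 1 }'
}

# Example (on a cluster node): ./ocrcheck | ocr_used_pct
```

For the report above (4360 KB used of 147352 KB) this prints 2.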

3.2.3 ocrconfig

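ocrconfig maintains the OCR disk. If external redundancy is chosen when installing Clusterware, only one OCR disk location can be entered. Oracle nevertheless allows two OCR disks that mirror each other, which removes the OCR disk as a single point of failure. Unlike voting disks, there can be at most two OCR disks: one primary OCR and one mirror OCR.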

[root@raw1 bin]# ./ocrconfig --help

name:

ocrconfig - configuration tool for oracle cluster registry.

synopsis:

ocrconfig [option]

option:

-export [-s online]

- export cluster register contents to a file

-import - import cluster registry contents from a file

-upgrade [ []]

- upgrade cluster registry from previous version

-downgrade [-version ]

- downgrade cluster registry to the specified version

-backuploc - configure periodic backup location

-showbackup - show backup information

-restore - restore from physical backup

-replace ocr|ocrmirror [] - add/replace/remove a ocr device/file

-overwrite - overwrite ocr configuration on disk

-repair ocr|ocrmirror - repair local ocr configuration

-help - print out this help information

note:

a log file will be created in

$oracle_home/log//client/ocrconfig_.log. please ensure

you have file creation privileges in the above directory before

running this tool.

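-- Show the automatic backups

[root@raw1 bin]# ./ocrconfig -showbackup

By default the automatic OCR backups are kept in the $CRS_HOME/crs/cdata/cluster_name directory. The location can be changed with ocrconfig -backuploc.

3.2.4 Backup and recovery with export and import

Oracle recommends backing up the OCR before changing the cluster, for example before adding or removing a node. The export option writes a backup to a file of your choice. After a replace or restore operation, Oracle recommends running cluvfy comp ocr -n all for a full check. cluvfy ships with the Clusterware installation media.

1) First stop CRS on all nodes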

[root@raw1 bin]# ./crsctl stop crs

stopping resources.

successfully stopped crs resources

stopping cssd.

shutting down css daemon.

shutdown request successfully issued.

2) Export the OCR content as root

[root@raw1 bin]# ./ocrconfig -export /u01/ocr.exp

3) Restart CRS

[root@raw1 bin]# ./crsctl start crs

attempting to start crs stack

the crs stack will be started shortly

4) Check CRS status

[root@raw1 bin]# ./crsctl check crs

css appears healthy

crs appears healthy

evm appears healthy

5) Corrupt the OCR content

[root@raw1 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=102400

102400+0 records in

102400+0 records out

6) Check OCR consistency

[root@raw1 bin]# ./ocrcheck

prot-601: failed to initialize ocrcheck

7) Check consistency with cluvfy

[root@raw1 cluvfy]# ./runcluvfy.sh comp ocr -n all

verifying ocr integrity

unable to retrieve nodelist from oracle clusterware.

verification cannot proceed.

8) Restore the OCR content with import

[root@raw1 bin]# ./ocrconfig -import /u01/ocr.exp

9) Check the OCR again

[root@raw1 bin]# ./ocrcheck

status of oracle cluster registry is as follows :

version : 2

total space (kbytes) : 147352

used space (kbytes) : 4364

available space (kbytes) : 142988

id : 610419116

device/file name : /dev/raw/raw1

device/file integrity check succeeded

device/file not configured

cluster registry integrity check succeeded

10) Check with cluvfy

[root@raw1 cluvfy]# ./runcluvfy.sh comp ocr -n all

verifying ocr integrity

warning:

these nodes cannot be reached:

raw2

verification will proceed with nodes:

raw1

error:

user equivalence unavailable on all the nodes.

verification cannot proceed.

verification of ocr integrity was unsuccessful on all the nodes.

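Note: the check fails here only because my machine was too slow, so node raw2 was not started.

3.2.5 Moving the OCR

This example moves the OCR from /dev/raw/raw1 to /dev/raw/raw3.

1) Check whether an OCR backup exists

[root@raw1 bin]# ./ocrconfig -showbackup

If there is no backup, take an export right away to serve as one:

[root@raw1 bin]# ./ocrconfig -export /u01/ocrbackup -s online

2) Check the current OCR configuration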

[root@raw1 bin]# ./ocrcheck

status of oracle cluster registry is as follows :

version : 2

total space (kbytes) : 147352

used space (kbytes) : 4364

available space (kbytes) : 142988

id : 610419116

device/file name : /dev/raw/raw1

device/file integrity check succeeded

device/file not configured

cluster registry integrity check succeeded

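The output shows a single primary OCR on /dev/raw/raw1 and no mirror OCR. Because there is only one OCR file, its location cannot be changed directly: a mirror must be added first and the primary changed afterwards. Otherwise the command fails with: failed to initialize ocrconfig.

3) Add a mirror OCR

[root@raw1 bin]# ./ocrconfig -replace ocrmirror /dev/raw/raw4

4) Confirm that the mirror was added

[root@raw1 bin]# ./ocrcheck

5) Change the primary OCR location

[root@raw1 bin]# ./ocrconfig -replace ocr /dev/raw/raw3

Confirm the change:

[root@raw1 bin]# ./ocrcheck

6) After ocrconfig makes the change, the /etc/oracle/ocr.loc file on every RAC node should be updated automatically. If it is not, edit it by hand so that it reads as follows.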

[root@raw1 bin]# more /etc/oracle/ocr.loc

ocrconfig_loc=/dev/raw/raw3

ocrmirrorconfig_loc=/dev/raw/raw4

local_only=false

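4. Application layer

The application layer is the RAC database itself. It is made up of a number of resources, each of which is a process or group of processes that forms a complete service, and all management and maintenance at this layer revolves around these resources. Three commands apply here: srvctl, onsctl, and crs_stat.

4.1 crs_stat

crs_stat shows the running state of every resource that CRS maintains. Without arguments it prints a summary of all resources, showing for each one its name, type, target, and running state.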

[root@raw1 bin]# ./crs_stat

name=ora.raw.db

type=application

target=online

state=offline

......
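
Because each resource is printed as a block of key=value lines, the summary is awkward to scan by eye. As a sketch, assuming the name/target/state keys shown above, the following helper lists the resources that are meant to be online but are not. The name crs_down is invented for this example.

```shell
# crs_down: read `crs_stat` output on stdin and print the name of every
# resource whose target is online but whose state is not online.
# Keys are matched case-insensitively; a state such as "online on raw1"
# counts as online.
crs_down() {
    awk -F= '{ key = tolower($1); val = $2 }
             key == "name"   { name = val }
             key == "target" { target = tolower(val) }
             key == "state"  { if (target == "online" && tolower(val) !~ /^online/) print name }'
}

# Example (on a cluster node): ./crs_stat | crs_down
```

The helper relies on state being printed after name and target within each block, as in the output above.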

A resource name can also be given to show just that resource, and the -v and -p options show more detail; -p prints more than -v.

1) Show the state of a given resource

[root@raw1 bin]# ./crs_stat ora.raw2.vip

name=ora.raw2.vip

type=application

target=online

state=offline

2) With -v the output gains four fields: restart attempts allowed, restarts performed, failure threshold, and failure count.

[root@raw1 bin]# ./crs_stat -v ora.raw2.vip

name=ora.raw2.vip

type=application

restart_attempts=0

restart_count=0

failure_threshold=0

failure_count=0

target=online

state=offline

3) With -p the output is more detailed still

[root@raw1 bin]# ./crs_stat -p ora.raw2.vip

name=ora.raw2.vip

type=application

action_script=/u01/app/oracle/product/crs/bin/racgwrap

active_placement=1

auto_start=1

check_interval=60

description=crs application for vip on a node

failover_delay=0

failure_interval=0

failure_threshold=0

hosting_members=raw2

optional_resources=

placement=favored

required_resources=

restart_attempts=0

script_timeout=60

start_timeout=0

stop_timeout=0

uptime_threshold=7d

usr_ora_alert_name=

usr_ora_check_timeout=0

usr_ora_connect_str=/ as sysdba

usr_ora_debug=0

usr_ora_disconnect=false

usr_ora_flags=

usr_ora_if=eth0

usr_ora_inst_not_shutdown=

usr_ora_lang=

usr_ora_netmask=255.255.255.0

usr_ora_open_mode=

usr_ora_opi=false

usr_ora_pfile=

usr_ora_preconnect=none

usr_ora_srv=

usr_ora_start_timeout=0

usr_ora_stop_mode=immediate

usr_ora_stop_timeout=0

usr_ora_vip=10.85.10.123

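These fields are common to all resources, but some may be empty depending on the resource type.

4) The -ls option shows the permissions defined for each resource, in the same notation as Linux file permissions.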

[root@raw1 bin]# ./crs_stat -ls

name owner primary privgrp permission

-----------------------------------------------------------------

ora.raw.db oracle oinstall rwxrwxr--

ora.raw.dmm.cs oracle oinstall rwxrwxr--

ora....aw2.srv oracle oinstall rwxrwxr--

ora....w1.inst oracle oinstall rwxrwxr--

ora....w2.inst oracle oinstall rwxrwxr--

ora....sm1.asm oracle oinstall rwxrwxr--

ora....w1.lsnr oracle oinstall rwxrwxr--

ora.raw1.gsd oracle oinstall rwxr-xr--

ora.raw1.ons oracle oinstall rwxr-xr--

ora.raw1.vip root oinstall rwxr-xr--

ora....sm2.asm oracle oinstall rwxrwxr--

ora....w2.lsnr oracle oinstall rwxrwxr--

ora.raw2.gsd oracle oinstall rwxr-xr--

ora.raw2.ons oracle oinstall rwxr-xr--

ora.raw2.vip root oinstall rwxr-xr--

4.2 onsctl

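onsctl manages and configures ONS (Oracle Notification Service). ONS is the foundation on which Oracle Clusterware implements the FAN event push model.

In the traditional model the client has to poll the server periodically to learn its state, which is essentially a pull model. Oracle 10g introduced a new push mechanism, FAN (Fast Application Notification): when certain events occur on the server, the server actively notifies the client, so the client learns of the change as early as possible. This mechanism relies on ONS, and the ONS service must be configured before onsctl is used.

4.2.1 ONS configuration

Note that in a RAC environment the ONS under $CRS_HOME must be used, not the one under $ORACLE_HOME. The configuration file is $CRS_HOME/opmn/conf/ons.config.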

[root@raw1 conf]# pwd

/u01/app/oracle/product/crs/opmn/conf

[root@raw1 conf]# more ons.config

localport=6100

remoteport=6200

loglevel=3

useocr=on

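Parameters:

localport: the local listening port. "Local" here means specifically the loopback address 127.0.0.1; the port is used to talk to clients running on the same host.

remoteport: the remote listening port, that is, the port on all local IP addresses other than 127.0.0.1. It is used to talk to remote clients.

loglevel: Oracle can trace the ONS process and write the log to a local file. This parameter sets the level of detail to log, from 1 to 9; the default is 3.

logfile: used together with loglevel, this sets the location of the ONS log file. The default is $CRS_HOME/opmn/logs/opmn.log.

nodes and useocr: together these two parameters determine which remote ONS daemons the local ONS daemon communicates with.

The nodes value has the form: hostname/ip:port[,hostname/ip:port]

For example: useocr=off

nodes=rac1:6200,rac2:6200

useocr takes the value on or off. If it is on, the node list is stored in the OCR; if it is off, the list is taken from the nodes parameter. For a single-instance setup, set useocr to off.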
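
When useocr is off, it can be handy to see exactly which peers a given ons.config names. A minimal sketch, assuming the comma-separated nodes= format described above; ons_nodes is a hypothetical helper name.

```shell
# ons_nodes: read an ons.config file on stdin and print one host:port
# entry per line from its nodes= setting. Prints nothing if nodes= is absent.
ons_nodes() {
    sed -n 's/^nodes=//p' | tr ',' '\n'
}

# Example: ons_nodes < "$CRS_HOME/opmn/conf/ons.config"
```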

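4.2.2 Configuring ONS

The configuration can be changed by editing the ONS configuration file directly. If the OCR is used, it can instead be changed with the racgons command, which must be run as root: run as the oracle user it reports no error but changes nothing.

To add configuration entries:

racgons add_config rac1:6200 rac2:6200

To remove configuration entries:

racgons remove_config rac1:6200 rac2:6200

4.2.3 The onsctl command

onsctl can start, stop, and debug ONS and reload its configuration file. Its usage is as follows: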

[root@raw1 bin]# ./onsctl

usage: ./onsctl start|stop|ping|reconfig|debug

start - start opmn only.

stop - stop ons daemon

ping - test to see if ons daemon is running

debug - display debug information for the ons daemon

reconfig - reload the ons configuration

help - print a short syntax description (this).

detailed - print a verbose syntax description.

A running ONS process does not by itself mean that ONS is working properly; use the ping command to confirm.

1) Check the process at the OS level.

[root@raw1 bin]# ps -aef|grep ons

root 1924 6953 0 03:17 pts/1 00:00:00 grep ons

oracle 30723 1 0 mar08 ? 00:00:00 /u01/app/oracle/product/crs/opmn/bin/ons -d

oracle 30724 30723 0 mar08 ? 00:00:04 /u01/app/oracle/product/crs/opmn/bin/ons -d

2) Confirm the state of the ONS service

[root@raw1 bin]# ./onsctl ping

number of onsconfiguration retrieved, numcfg = 2

onscfg[0]

{node = raw1, port = 6200}

adding remote host raw1:6200

onscfg[1]

{node = raw2, port = 6200}

adding remote host raw2:6200

ons is running ...

3) Start the ONS service

[root@raw1 bin]# ./onsctl start

4) The debug option shows detailed information; the most useful part is the list of all connections.

[root@raw1 bin]# ./onsctl debug

number of onsconfiguration retrieved, numcfg = 2

onscfg[0]

{node = raw1, port = 6200}

adding remote host raw1:6200

onscfg[1]

{node = raw2, port = 6200}

adding remote host raw2:6200

http/1.1 200 ok

content-length: 1357

content-type: text/html

response:

======== ons ========

listeners:

name bind address port flags socket

------- --------------- ----- -------- ------

local 127.000.000.001 6100 00000142 7

remote 010.085.010.119 6200 00000101 8

request no listener

server connections:

id ip port flags sendq worker busy subs

---------- --------------- ----- -------- ---------- -------- ------ -----

1 010.085.010.121 6200 00104205 0 1 0

client connections:

id ip port flags sendq worker busy subs

---------- --------------- ----- -------- ---------- -------- ------ -----

3 127.000.000.001 6100 0001001a 0 1 0

4 127.000.000.001 6100 0001001a 0 1 1

pending connections:

id ip port flags sendq worker busy subs

---------- --------------- ----- -------- ---------- -------- ------ -----

0 127.000.000.001 6100 00020812 0 1 0

worker ticket: 3/3, idle: 180

thread flags

-------- --------

17faba0 00000012

67f6ba0 00000012

32d6ba0 00000012

resources:

notifications:

received: 1, in receive q: 0, processed: 1, in process q: 0

pools:

message: 24/25 (1), link: 25/25 (1), subscription: 24/25 (1)

[root@raw1 bin]#

4.3 srvctl

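srvctl is the most frequently used command in RAC maintenance, and also the most complex. It can operate on the following resource types: database, instance, ASM, service, listener, and node application, where node applications include GSD, ONS, and VIP. Besides being managed uniformly through srvctl, some of these resources have their own management tools: ONS can be managed with onsctl, and the listener with lsnrctl.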

[root@raw1 bin]# ./srvctl --help

usage: srvctl