Oracle 10.2.0.4 RAC on AIX 7.1: after a host reboot, one node failed to start normally. ocssd.log reported the following:
[    CSSD]2020-11-22 02:50:21.750 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2758) LATS(7664721) Disk lastSeqNo(2758)
[    CSSD]2020-11-22 02:50:22.752 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:22.752 [1287] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:22.753 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:23.527 [4628] >TRACE: clssnmRcfgMgrThread: Local Join
[    CSSD]2020-11-22 02:50:23.527 [4628] >WARNING: clssnmLocalJoinEvent: takeover aborted due to ALIVE node on disk
[    CSSD]2020-11-22 02:50:23.755 [1287] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666726) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:23.755 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666727) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:23.755 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666727) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:24.757 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2761) LATS(7667729) Disk lastSeqNo(2761)
Only the host had been rebooted; no configuration was changed. At the time, however, the private network had a problem and was unreachable. The remote node was still writing its disk heartbeat to the voting disk (wrtcnt kept increasing), so the restarting node saw an "ALIVE node on disk" and aborted the takeover instead of joining.
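A quick way to confirm this pattern is to pull the wrtcnt values out of ocssd.log: if the count keeps increasing while the node is reported "down", the peer is alive on disk but unreachable over the interconnect. A minimal sketch (the log path and the exact message casing may differ by release, hence the case-insensitive match):

```shell
#!/bin/sh
# Sketch: extract disk-heartbeat write counts (wrtcnt) from ocssd.log.
# A steadily increasing wrtcnt for a node reported "down" means that node
# is still writing to the voting disk -- alive on disk, but not reachable
# over the private network.
extract_wrtcnt() {
  logfile="$1"   # e.g. $CRS_HOME/log/<hostname>/cssd/ocssd.log (assumed path)
  grep -i 'clssnmreaddskheartbeat' "$logfile" |
    sed -n 's/.*wrtcnt(\([0-9]*\)).*/\1/p'
}
```

Feeding the log lines above through `extract_wrtcnt` yields 2758, 2759, 2759, ... steadily climbing, which matches the "takeover aborted due to ALIVE node on disk" warning.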
Reference:
CRS Can Not Start After Node Reboot (Doc ID 733260.1)
Changes
This can happen in an environment where a node is shut down for various reasons and then restarted.
Cause
1. During the reboot, CRS is started automatically before the network interface is ready.
2. /etc/hosts mismatch: a wrong definition for the problem node.
3. The private network IP has been changed, but /etc/hosts reflects the change incorrectly.
4. The private network is not pingable, the ping response is slow, or the ping command shows packet loss.
5. Different clusterware versions are used on different nodes.
6. If /etc/init.d/init.cssd startcheck does not complete, the /tmp/crsctl.xxx file should usually give the clue as to why it does not complete (in some cases no /tmp/crsctl.xxx file is generated).
7. OCR is pointing to a wrong device.
8. localconfig has been run on a cluster node accidentally.
9. If CRS does not start automatically after a node reboot, check whether auto-start is disabled by:
cat /etc/oracle/scls_scr/<hostname>/root/crsstart
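Cause 9 can be scripted as a small check. This is a sketch that just reads the flag file; the helper name is mine, and the scls_scr path (with the node's hostname) is as given in Doc 733260.1:

```shell
#!/bin/sh
# Sketch: report whether CRS auto-start is enabled on this node.
# The flag file normally contains "enable" or "disable".
crs_autostart_state() {
  flag_file="$1"   # e.g. /etc/oracle/scls_scr/<hostname>/root/crsstart
  if [ -r "$flag_file" ]; then
    cat "$flag_file"
  else
    echo "cannot read $flag_file" >&2
    return 1
  fi
}
```

If the file contains "disable", CRS will not come up on its own after a reboot and must be started manually (or auto-start re-enabled as root).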
Once the private network was back to normal, restarting CRS resolved the issue:
$CRS_HOME/bin/crsctl stop crs
$CRS_HOME/bin/crsctl start crs
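After `crsctl start crs` returns, the stack still takes a while to come up, so it helps to poll until the health check passes. A generic wait-loop sketch; the check command, retry count, and interval are assumptions (on this release the check would be `$CRS_HOME/bin/crsctl check crs`):

```shell
#!/bin/sh
# Sketch: poll a health-check command until it succeeds or we give up.
# check_cmd is run via word splitting, so keep it a simple command line.
wait_for_crs() {
  check_cmd="$1"           # e.g. "$CRS_HOME/bin/crsctl check crs" (assumed)
  max_tries="${2:-60}"     # assumed default: 60 attempts
  interval="${3:-5}"       # assumed default: 5 seconds between attempts
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    if $check_cmd >/dev/null 2>&1; then
      echo "CRS is up"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "CRS did not start within the timeout" >&2
  return 1
}
```

Usage: `wait_for_crs "$CRS_HOME/bin/crsctl check crs"` after issuing the start command.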