Oracle 10.2.0.4 RAC on AIX 7.1: after a host reboot, one node failed to start normally. ocssd.log reported the following:
[    CSSD]2020-11-22 02:50:21.750 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2758) LATS(7664721) Disk lastSeqNo(2758)
[    CSSD]2020-11-22 02:50:22.752 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:22.752 [1287] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:22.753 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2759) LATS(7665724) Disk lastSeqNo(2759)
[    CSSD]2020-11-22 02:50:23.527 [4628] >TRACE: clssnmRcfgMgrThread: Local Join
[    CSSD]2020-11-22 02:50:23.527 [4628] >WARNING: clssnmLocalJoinEvent: takeover aborted due to ALIVE node on disk
[    CSSD]2020-11-22 02:50:23.755 [1287] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666726) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:23.755 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666727) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:23.755 [1544] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2760) LATS(7666727) Disk lastSeqNo(2760)
[    CSSD]2020-11-22 02:50:24.757 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2761) LATS(7667729) Disk lastSeqNo(2761)
Only the host had been rebooted; no configuration was changed. At the time, however, the private network had a problem and was unreachable. The remote node was still writing its disk heartbeat to the voting disk (wrtcnt kept increasing), so the restarting node saw an "ALIVE node on disk" and aborted the takeover instead of joining.
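A quick way to confirm this pattern is to pull the wrtcnt values out of ocssd.log: if the count keeps increasing while the node is reported "down", the peer is alive on disk but unreachable over the interconnect. A minimal sketch (the log path and the exact message casing may differ by release, hence the case-insensitive match):

```shell
#!/bin/sh
# Sketch: extract disk-heartbeat write counts (wrtcnt) from ocssd.log.
# A steadily increasing wrtcnt for a node reported "down" means that node
# is still writing to the voting disk -- alive on disk, but not reachable
# over the private network.
extract_wrtcnt() {
  logfile="$1"   # e.g. $CRS_HOME/log/<hostname>/cssd/ocssd.log (assumed path)
  grep -i 'clssnmreaddskheartbeat' "$logfile" |
    sed -n 's/.*wrtcnt(\([0-9]*\)).*/\1/p'
}
```

Feeding the log lines above through `extract_wrtcnt` yields 2758, 2759, 2759, ... steadily climbing, which matches the "takeover aborted due to ALIVE node on disk" warning.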
Reference:
CRS Can Not Start After Node Reboot (Doc ID 733260.1)
Changes
This can happen in an environment where a node is shut down for various reasons and then restarted.
Cause
1. During the reboot, CRS is started automatically before the network interface is ready.
2. /etc/hosts mismatch: a wrong definition for the problem node.
3. The private network IP has been changed, but /etc/hosts reflects the change incorrectly.
4. The private network is not pingable, the ping response is slow, or the ping command shows packet loss.
5. Different clusterware versions are used on different nodes.
6. If /etc/init.d/init.cssd startcheck does not complete, the /tmp/crsctl.xxx file should usually give the clue as to why it does not complete (in some cases no /tmp/crsctl.xxx file is generated).
7. OCR is pointing to a wrong device.
8. localconfig has been run on a cluster node accidentally.
9. If CRS does not start automatically after a node reboot, check whether auto-start is disabled by:
cat /etc/oracle/scls_scr/<hostname>/root/crsstart
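Cause 9 can be scripted as a small check. This is a sketch that just reads the flag file; the helper name is mine, and the scls_scr path (with the node's hostname) is as given in Doc 733260.1:

```shell
#!/bin/sh
# Sketch: report whether CRS auto-start is enabled on this node.
# The flag file normally contains "enable" or "disable".
crs_autostart_state() {
  flag_file="$1"   # e.g. /etc/oracle/scls_scr/<hostname>/root/crsstart
  if [ -r "$flag_file" ]; then
    cat "$flag_file"
  else
    echo "cannot read $flag_file" >&2
    return 1
  fi
}
```

If the file contains "disable", CRS will not come up on its own after a reboot and must be started manually (or auto-start re-enabled as root).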
Once the private network was back to normal, restarting CRS resolved the issue:
$CRS_HOME/bin/crsctl stop crs
$CRS_HOME/bin/crsctl start crs
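After `crsctl start crs` returns, the stack still takes a while to come up, so it helps to poll until the health check passes. A generic wait-loop sketch; the check command, retry count, and interval are assumptions (on this release the check would be `$CRS_HOME/bin/crsctl check crs`):

```shell
#!/bin/sh
# Sketch: poll a health-check command until it succeeds or we give up.
# check_cmd is run via word splitting, so keep it a simple command line.
wait_for_crs() {
  check_cmd="$1"           # e.g. "$CRS_HOME/bin/crsctl check crs" (assumed)
  max_tries="${2:-60}"     # assumed default: 60 attempts
  interval="${3:-5}"       # assumed default: 5 seconds between attempts
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    if $check_cmd >/dev/null 2>&1; then
      echo "CRS is up"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "CRS did not start within the timeout" >&2
  return 1
}
```

Usage: `wait_for_crs "$CRS_HOME/bin/crsctl check crs"` after issuing the start command.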