注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

常在心

淡泊明志,人生自在

 
 
 

日志

 
 

When NFS Server Is Down, Oracle Server Freezes With No Errors In Alert Log File [ID 1316251.1]  

2011-12-19 14:09:16|  分类: ORACLE BUG |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

In this Document
  Symptoms
  Changes
  Cause
  Solution


Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.4 and later   [Release: 10.2 and later ]
IBM AIX on POWER Systems (64-bit)

Symptoms

Each of the Oracle instances on AIX has a NFS mount point for backup purposes. It is mounted with following options:

bg,hard,intr,rsize=32768,wsize=32768,sec=sys,noac,rw

When the NFS server is down, the Oracle RDBMS freezes with no errors in alert log file. When the NFS server is up again, the database is working without problems.

Changes

No changes on the environment, just lost NAS connectivity (to NFS server), so the remote directories are not  available.

Cause

From the uploaded truss output of sqlplus and df command, we can see the statx command is hanging at /backup , i.e. the NFS mounted drive:

462940: statx("./../../../../backup", 0x0FFFFFFFFFFF5980, 176, 021) (sleeping...)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) Err#82 ERESTART
561338: Received signal #2, SIGINT [caught]
561338: sigprocmask(0, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: sigprocmask(1, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: ksetcontext_sigreturn(0x0FFFFFFFFFFF37A0, 0x0000000000000000, 0x00000001100F04F0,
0x800000000000D032, 0x3000000000000000, 0x0000000000000360, 0x0000000000000000, 0x0000000000000000)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) Err#82 ERESTART
561338: Received signal #2, SIGINT [caught]
561338: sigprocmask(0, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: sigprocmask(1, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: ksetcontext_sigreturn(0x0FFFFFFFFFFF37A0, 0x0000000000000000, 0x00000001100F04F0,
0x800000000000D032, 0x3000000000000000, 0x0000000000000320, 0x0000000000000000, 0x0000000000000000)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) Err#82 ERESTART
561338: Received signal #2, SIGINT [caught]
561338: sigprocmask(0, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: sigprocmask(1, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: ksetcontext_sigreturn(0x0FFFFFFFFFFF37A0, 0x0000000000000000, 0x00000001100F04F0,
0x800000000000D032, 0x3000000000000000, 0x0000000000000310, 0x0000000000000000, 0x0000000000000000)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) Err#82 ERESTART
561338: Received signal #2, SIGINT [caught]
561338: sigprocmask(0, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: sigprocmask(1, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: ksetcontext_sigreturn(0x0FFFFFFFFFFF37A0, 0x0000000000000000, 0x00000001100F04F0,
0x800000000000D032, 0x3000000000000000, 0x0000000000000310, 0x0000000000000000, 0x0000000000000000)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) Err#82 ERESTART
561338: Received signal #2, SIGINT [caught]
561338: sigprocmask(0, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: sigprocmask(1, 0x0FFFFFFFFFFF3620, 0x0000000000000000) = 0
561338: ksetcontext_sigreturn(0x0FFFFFFFFFFF37A0, 0x0000000000000000, 0x00000001100F04F0,
0x800000000000D032, 0x3000000000000000, 0x0000000000000320, 0x0000000000000000, 0x0000000000000000)
561338: kread(14, " ? ? J ?\0\0\0\0\0\0\010".., 64) (sleeping...)
462940: statx("./../../../../backup", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../usr", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../lib", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../audit", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../dev", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../etc", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../u", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../lpp", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../mnt", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../proc", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../sbin", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../bin", 0x0FFFFFFFFFFF5980, 176, 021) = 0
462940: statx("./../../../../oracle", 0x0FFFFFFFFFFF5980, 176, 021) = 0


The problem comes from:

statx("./../../../../backup", 0x0FFFFFFFFFFF5980, 176, 021) (sleeping...)

Oracle code calls a Unix system call, 'getcwd' to get the current working directory. Then, after that, all the control reverts over to the operating system.  From what we can see, the function 'getcwd' calls 'getwd' which in turn calls 'stat'. Once 'stat' is entered it starts processing directory entries in the order shown below by performing a 'statx' call for each entry:

./
./..
./../..
./../../.. (this goes on until the root directory is reached)

Once the root directory is reached then 'lstat' calls 'statx' for each entry in the directory. Oracle has no control over this processing and there is nothing we can do to prevent it (it is all at the OS level at this point).

Solution

From one similar issue, IBM has suggested the following action plan to avoid this issue. The answer from IBM is:

Here's a solution to avoid the problem described by Oracle:
DO NOT have the NFS mounts directly under /, but put them one level lower. Then, we can use symbolic links to them.

NFS mount point on node  /nfs/backup (/nfs is a directory we'll create, it can have any name) and create a softlink /backup -> /nfs/backup.

$ ln -s /nfs/backup /backup


This will avoid the statx problem without having to make changes in the setup (because /backup is still there).

Additionally you can ask IBM about APAR # IZ85027, IZ85029, IZ85032, IZ86102, IZ87374, IZ90533.

Check with IBM which one applies to your configuration.

  评论这张
 
阅读(615)| 评论(0)
推荐 转载

历史上的今天

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2017