Donnerstag, 26. Oktober 2017

prelink makes instance unusable

I spent a recent night hunting an issue which was non trivial to detect.
On a 4 node cluster 2 nodes got Linux patches and were rebooted. This was done after 10pm for minimal business impact. (The application is written by Oracle - so it's not fully TAF aware).
The DBs started fine, all services were available.
At 3:50 in the morning I got a call some processes show strange errors.
As I connected to the Server, the Instance shows running proceses. But I could not connect to the instance:
With proper environment variables, the errors were:
sqlplus system/xxx

SQL*Plus: Release Production on Thu Oct 19 03:58:52 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Enter password:
ORA-01034: ORACLE not available
ORA-27123: unable to attach to shared memory segment
Linux-x86_64 Error: 22: Invalid argument
Additional information: 25
Additional information: 323780623
Additional information: -1610612736
Process ID: 0
Session ID: 0 Serial number: 0

Enter user-name:
ORA-01017: invalid username/password; logon denied


sqlplus system/xxx

SQL*Plus: Release Production on Thu Oct 19 03:59:33 2017

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.


So it seems it's related to shared memory.

First checks showed there are enough hugepages configured and available, ipcs lists them correct.

in alert.log I found entries like
Thu Oct 19 03:34:41 2017
Process q003 died, see its trace file
Thu Oct 19 03:35:40 2017
Process W000 died, see its trace file
Process W000 died, see its trace file
Process W000 died, see its trace file
Thu Oct 19 03:35:52 2017
Process W000 died, see its trace file
Process W000 died, see its trace file
Thu Oct 19 03:35:57 2017
Process m000 died, see its trace file
Process W000 died, see its trace file
Thu Oct 19 03:36:04 2017
Thu Oct 19 04:00:02 2017
Process J000 died, see its trace file
kkjcre1p: unable to spawn jobq slave process
Errors in file /...diag.../trace/INST_NAME_cjq0_61788.trc:
Process W000 died, see its trace file
Thu Oct 19 04:00:04 2017

this contains
*** 2017-10-19 04:00:02.536
*** SESSION ID:(631.21) 2017-10-19 04:00:02.536
*** CLIENT ID:() 2017-10-19 04:00:02.536
*** SERVICE NAME:(SYS$BACKGROUND) 2017-10-19 04:00:02.536
*** MODULE NAME:() 2017-10-19 04:00:02.536
*** ACTION NAME:() 2017-10-19 04:00:02.536

Process J000 is dead (pid=119122 req_ver=453 cur_ver=453 state=KSOSP_SPAWNED).

*** 2017-10-19 04:00:09.545
Process J000 is dead (pid=119264 req_ver=456 cur_ver=456 state=KSOSP_SPAWNED).

*** 2017-10-19 04:00:12.539
Process J000 is dead (pid=119278 req_ver=25 cur_ver=25 state=KSOSP_SPAWNED).

With all this information (and some search effort) the best match I found is Connected to an idle instance, while database instance is running (Process J000 died, see its trace file kkjcre1p: unable to spawn jobq slave process ) (Doc ID 1578491.1).

But why is prelink doing such bad things?

Some information about the memory structure of the running instance and new started process, together with a good old strace will shed some light:

first of all the /proc/<pid>/maps content of the running smon (any process attached to the instance would do):
00400000-0bdbb000 r-xp 00000000 fd:ac 4730278                            /O_H/bin/oracle
0bfba000-0c1b4000 rw-p 0b9ba000 fd:ac 4730278                            /O_H/bin/oracle
0c1b4000-0c20b000 rw-p 00000000 00:00 0
0c89b000-0c8fe000 rw-p 00000000 00:00 0                                  [heap]
60000000-a0000000 rw-s 00000000 00:0c 323747854                          /SYSV00000000 (deleted)
a0000000-3260000000 rw-s 00000000 00:0c 323780623                        /SYSV00000000 (deleted)
3260000000-3260200000 rw-s 00000000 00:0c 323813392                      /SYSV6a4ae364 (deleted)
3c2ee00000-3c2ee01000 r-xp 00000000 fd:01 1625                           /lib64/
3c2ee01000-3c2f000000 ---p 00001000 fd:01 1625                           /lib64/
3c2f000000-3c2f001000 rw-p 00000000 fd:01 1625                           /lib64/
7feb42051000-7feb42061000 rw-p 00000000 00:05 4603                       /dev/zero
7feb42061000-7feb42071000 rw-p 00000000 00:05 4603                       /dev/zero
7feb42071000-7feb42081000 rw-p 00000000 00:05 4603                       /dev/zero
7feb42081000-7feb42091000 rw-p 00000000 00:05 4603                       /dev/zero
7feb42091000-7feb420a1000 rw-p 00000000 00:05 4603                       /dev/zero
7feb42750000-7feb42751000 rw-p 0007f000 00:05 4603                       /dev/zero
7feb42751000-7feb42891000 rw-p 00000000 00:00 0
7feb42891000-7feb4289d000 r-xp 00000000 fd:01 3650                       /lib64/
7feb4289d000-7feb42a9c000 ---p 0000c000 fd:01 3650                       /lib64/
7feb42a9c000-7feb42a9d000 r--p 0000b000 fd:01 3650                       /lib64/
7feb42a9d000-7feb42a9e000 rw-p 0000c000 fd:01 3650                       /lib64/
7feb42a9e000-7feb42b9e000 rw-p 00000000 00:00 0
7feb42b9e000-7feb42ba8000 r-xp 00000000 fd:a4 2880                       /usr/lib64/ (deleted)
7feb42ba8000-7feb42da8000 ---p 0000a000 fd:a4 2880                       /usr/lib64/ (deleted)
7feb42da8000-7feb42da9000 rw-p 0000a000 fd:a4 2880                       /usr/lib64/ (deleted)
7feb42da9000-7feb42f33000 r-xp 00000000 fd:01 118                        /lib64/ (deleted)
7feb42f33000-7feb43133000 ---p 0018a000 fd:01 118                        /lib64/ (deleted)
7feb43133000-7feb43137000 r--p 0018a000 fd:01 118                        /lib64/ (deleted)
7feb43137000-7feb43139000 rw-p 0018e000 fd:01 118                        /lib64/ (deleted)
7feb43139000-7feb4313d000 rw-p 00000000 00:00 0
7feb4313d000-7feb43153000 r-xp 00000000 fd:01 3646                       /lib64/ (deleted)
7feb43153000-7feb43352000 ---p 00016000 fd:01 3646                       /lib64/ (deleted)
7feb43352000-7feb43353000 r--p 00015000 fd:01 3646                       /lib64/ (deleted)
7feb43353000-7feb43354000 rw-p 00016000 fd:01 3646                       /lib64/ (deleted)
7feb43354000-7feb43356000 rw-p 00000000 00:00 0
7feb43356000-7feb4336d000 r-xp 00000000 fd:01 142                        /lib64/ (deleted)
7feb4336d000-7feb4356d000 ---p 00017000 fd:01 142                        /lib64/ (deleted)
7feb4356d000-7feb4356e000 r--p 00017000 fd:01 142                        /lib64/ (deleted)
7feb4356e000-7feb4356f000 rw-p 00018000 fd:01 142                        /lib64/ (deleted)
7feb4356f000-7feb43573000 rw-p 00000000 00:00 0
7feb43573000-7feb435f6000 r-xp 00000000 fd:01 3644                       /lib64/ (deleted)
7feb435f6000-7feb437f5000 ---p 00083000 fd:01 3644                       /lib64/ (deleted)
7feb437f5000-7feb437f6000 r--p 00082000 fd:01 3644                       /lib64/ (deleted)
7feb437f6000-7feb437f7000 rw-p 00083000 fd:01 3644                       /lib64/ (deleted)
7feb437f7000-7feb437f9000 r-xp 00000000 fd:01 3642                       /lib64/ (deleted)
7feb437f9000-7feb439f9000 ---p 00002000 fd:01 3642                       /lib64/ (deleted)
7feb439f9000-7feb439fa000 r--p 00002000 fd:01 3642                       /lib64/ (deleted)
7feb439fa000-7feb439fb000 rw-p 00003000 fd:01 3642                       /lib64/ (deleted)
7feb439fb000-7feb43a07000 r-xp 00000000 fd:ac 4593158                    /O_H/lib/
7feb43a07000-7feb43b06000 ---p 0000c000 fd:ac 4593158                    /O_H/lib/
Here we see at the top 3 shared memory segments (bold) - the one with id 323780623 will be of special interest.

As I try to attach a new process (sqlplus) to the instance, strace shows me
9134  shmat(323780623, 0xa0000000, 0)   = -1 EINVAL (Invalid argument)
9134  shmctl(323780623, IPC_STAT, 0x7fffc48703c8) = 0
9134  shmat(323780623, 0, 0)            = 0x2aaaaac00000
9134  shmdt(0x2aaaaac00000)             = 0
9134  uname({sys="Linux", node="av3l291p", ...}) = 0
9134  times({tms_utime=0, tms_stime=1, tms_cutime=0, tms_cstime=0}) = 432291212
9134  times({tms_utime=0, tms_stime=1, tms_cutime=0, tms_cstime=0}) = 432291212
9134  stat("/O_H/rdbms/log", {st_mode=S_IFDIR|0755, st_size=503808, ...}) = 0
9134  stat("/O_H/rdbms/log/ebssip021_ora_9134.trc", 0x7fffc486c810) = -1 ENOENT (No such file or directory)
9134  open("/O_H/rdbms/log/ebssip021_ora_9134.trc", O_WRONLY|O_CREAT|O_TRUNC, 0660) = 4
9134  lstat("/O_H/rdbms/log/ebssip021_ora_9134.trc", {st_mode=S_IFREG|0640, st_size=0, ...}) = 0
Its a nice habit to write a trace file in such a situation.  Oracle Trace Files are often great. In this case, it contains (beside many other things) information about the shared memory segments:
*********************** Dumping ipcs output ********************
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 260440066  oracle     600        4096       0
0x00000000 260472835  oracle     600        4096       0
0x2bac4398 260505604  oracle     600        24576      95
0x00000000 320405509  oracle     640        67108864   85
0x00000000 320438278  oracle     640        10670309376 85
0xa6f47cf8 320471047  oracle     640        2097152    85
0x00000000 323485704  oracle     640        33554432   94
0x00000000 323551241  oracle     640        16777216   146
0x00000000 323584010  oracle     640        1056964608 146
0x8015f33c 323616779  oracle     640        2097152    146
0x00000000 323649548  oracle     640        2113929216 94
0xb047bb08 323682317  oracle     640        2097152    94
0x00000000 323747854  oracle     640        1073741824 76
0x00000000 323780623  oracle     640        213674622976 75
0x6a4ae364 323813392  oracle     640        2097152    76
0x00000000 326828049  oracle     640        50331648   128
0x00000000 326860818  oracle     640        5318377472 128
0x3df7a980 326893587  oracle     640        2097152    128

and the process map:
******************* Dumping process map ****************
00400000-0bdbb000 r-xp 00000000 fd:ac 4730278                            /O_H/bin/oracle
0bfba000-0c1b4000 rw-p 0b9ba000 fd:ac 4730278                            /O_H/bin/oracle
0c1b4000-0c20b000 rw-p 00000000 00:00 0
0d2c4000-0d309000 rw-p 00000000 00:00 0                                  [heap]
60000000-a0000000 rw-s 00000000 00:0c 323747854                          /SYSV00000000 (deleted)
324e000000-324e020000 r-xp 00000000 fd:01 119                            /lib64/
324e220000-324e221000 r--p 00020000 fd:01 119                            /lib64/
324e221000-324e222000 rw-p 00021000 fd:01 119                            /lib64/
324e222000-324e223000 rw-p 00000000 00:00 0
324e400000-324e58a000 r-xp 00000000 fd:01 120                            /lib64/
324e58a000-324e78a000 ---p 0018a000 fd:01 120                            /lib64/
324e78a000-324e78e000 r--p 0018a000 fd:01 120                            /lib64/
324e78e000-324e790000 rw-p 0018e000 fd:01 120                            /lib64/

My sqlplus process tried to attach the shared memory segment 323780623 at address 0xa0000000 ( shmat(323780623, 0xa0000000, 0) ) but it failed (-1 EINVAL).
The reason for the failure is the alredy linked library /lib64/ at address 324e000000 (and some others as well).

But why is it so important to have this shared memory at the specific address? As strace shows, it was able to map it on some other address without any issue ( shmat(323780623, 0, 0  = 0x2aaaaac00000 ).

But Oracle requires the shared memory segments mapped at the same address for all processes of an instance. It uses the direct memory address to access all it's memory structures. E.g. when querying saddr or paddr from v$session, these are the real addresses.
So if you map a shared memory process with your own binary at the same address, you can access the memory structures direct using these addresses.

So what bad did prelink make to the system?

The Linux server was patched at about midnight. A lot of new RPMs were installed and the server rebooted. Clusterware started and fine and int started all instances.
But sometime close to 03:34 prelink was started, it did a lot of work with all the new libraries and created a new /etc/prelink.cache file, so all new started binaries loads their libraries according to this file. In my case the shared memory segment could not be loaded at the specific address anymore - so the Instance was still healthy, but no new process could attach anymore.

My solution was to kill the instance and restart it.
Generally I recommend to disable prelink on Linux hosts running Oracle database.

Oracle also recommends doing so.

Freitag, 20. Oktober 2017

Oracle Bundle Patches - release timing

I have to invest much of my time (at work) into Oracle and managing of patches, merge patches, patch requests and similar things.
At this task I identified an interesting (but probably worthless) information:
It seems a Bundle Patch (for Version 12.1) is defined approximately (at least) 40 days before it's released.
At least the BP follows this rule.

I have requested a lot of one-off and merge patches on top of over the last days. In the picture you can see 2 requests with date of 18th and 20th Oct. Others with Requested "Not Specified" are delivered already.
But there are some in the list with 7th and 8th of September. As my "Associated Request" ID is increasingit's shown I requested them in order (and without time machine) - all of them were requests based on Database Proactive Bundle Patch Patch 26635880 which was nit GA before 10th of October.
So a valid conclusion is: The content of Patch 26635880 was defined before 7th of September.
At about 7th and 8th September many merge request were raised internally - by whatever mechanism they were chosen.
Many of them were available when Patch 26635880 was available - others still pending. (Maybe they wait for more customers to request - like a kind of vote - but that's pure guessing).
Is this information of any value?
At a first sign no. As a customer still we don't know which patches will be fixed in an upcoming BundlePatch/RCU.
But if we are really demanding, we can start approx. 40 days early to force Support / Presales / Account Manager to make sure a patch for a given bug is available within next PSU/RCU.

So, if it's really urgent, don't wait until mid of January 2018 - start in early December to request your merge patches on top of (or 12.2 ...)