Dienstag, 11. April 2017

search for the missing ORA-04068

Oracle Active Dataguard is a great software for offloading read only activities from the primary nodes to nodes which is not utilized anyhow (at least during non-disaster times).
A very good example for doing so is regular extraction of data for processing in a data warehouse. That's also the case which led to this post.

The setting is just slightly more complex: both, primary and standby site are RACs - of course when you plan to switchover/failover, they should be somehow similar. (And in case of a disaster, it's planned to failover an disable all not-so-important activities; warehouse extract would be one of those).

Also the offloaded activities are slightly more complex than a simple query. In this case PL/SQL code is included. According to the documentation, that's not an issue at all - it's still read only. But in this DB (11.2.0.4), it was an issue: Sometimes the result was not as expected.

Now I first need to define the expectation: in this special case, the ERROR is the expectation, and no error the wrong result. Whenever a package (with global variables) is changed in a session, all other sessions which initialized the package before the change, but called it afterwards, must get

04068, 00000, "existing state of packages%s%s%s has been discarded"
// *Cause:  One of errors 4060 - 4067 when attempt to execute a stored
//          procedure.
// *Action: Try again after proper re-initialization of any application's
//          state.

Then the application can catch this error, re-initiate the package and continue.
that's how it should be.

But we sometimes had strange results in the test environment. After some investigation, we found it and I simplified it to this Testcase:

Testcase: 
(results are only shown when of any interest) 
INST1_ ... Instance1
INST2_ ... Instance2
PRI ... Primary DB 
ADG ... Active DataGuard 
it's important the sessions are not closed within the test!

prepare user:
SYS@INST1_PRI:
=============
-- create the user 
create user SR_TEST1 identified by "xxx" default tablespace users temporary tablespace temp; 
grant connect, create session, create procedure, create table to SR_TEST1; 
alter user SR_TEST1 quota unlimited on users; 

SESSION1@INST1_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

create table tab1 (id number, ver varchar2(30)); 
insert into tab1 values (1, '_'); 
commit; 

create or replace PACKAGE PACK AS 

first_load date := to_date('3333-01-01 01:01:01', 'YYYY-MM-DD HH24:MI:SS'); 

PROCEDURE proc; 

END PACK; 
/ 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('1: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='1' where id=1; 
commit; 

create or replace procedure proc1 
is 
begin 
SR_TEST1.pack.proc; 
end; 
/ 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:52:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION2@INST2_PRI:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 1: loaded first on 2017-03-15 09:53:00 
> 
> PL/SQL procedure successfully completed. 
> 
select * from tab1; 
> 
> ID VER 
>---------- ------------------------------ 
> 1 1 

SESSION3@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- here the redo apply is running
select sid from v$session where program like '%MRP0%';

>       SID
>----------
>         2


-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:53:21 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION4@INST1_ADG:
==================
-- connect & create objects 
connect SR_TEST1/"xxx" 
set serveroutput on 
-- NO redo apply is running hereselect inst_id, sid from gv$session where program like '%MRP0%';

> no rows selected

-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

> 1: loaded first on 2017-03-15 09:54:00 
> 
> PL/SQL procedure successfully completed. 
> 
> select * from tab1; 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 1 

SESSION1@INST1_PRI:
==================
-- change package body 

CREATE OR REPLACE 
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 
NULL; 
END proc; 

END PACK; 
/ 
update tab1 set ver='2' where id=1; 
commit; 


-- test 
set serveroutput on 
exec SR_TEST1.proc1 
select * from tab1; 

> 2: loaded first on 2017-03-15 09:55:51 
> 
> PL/SQL procedure successfully completed. 
> 
> 
> ID VER 
> ---------- ------------------------------ 
> 1 2 

SESSION2@INST2_PRI:
==================
-- test 
set serveroutput on 
exec SR_TEST1.proc1 
SR_TEST1@EBSSID051 > BEGIN SR_TEST1.proc1; END; 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 


> select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

-- THIS Is the expected result 

SESSION3@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

* 
ERROR at line 1: 
ORA-04068: existing state of packages has been discarded 
ORA-04061: existing state of package body "SR_TEST1.PACK" has been invalidated 
ORA-04065: not executed, altered or dropped package body "SR_TEST1.PACK" 
ORA-06508: PL/SQL: could not find program unit being called: "SR_TEST1.PACK" 
ORA-06512: at "SR_TEST1.PROC1", line 4 
ORA-06512: at line 1 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 

SESSION4@INST1_ADG:
==================
-- test & initiate 
set serveroutput on 
exec SR_TEST1.proc1 

1: loaded first on 2017-03-15 09:54:00 

PL/SQL procedure successfully completed. 

select * from tab1; 

ID VER 
---------- ------------------------------ 
1 2 
-- HERE you see the update on the table is applied on ADG already, but the procedure output is still prefixed with 1: - not 2: as it should be.

SESSION4@INST2_ADG:
==================
-- to check: the package is there! 
set pages 99 
select text from all_source where name = 'PACK' 
and type = 'PACKAGE BODY' 
3 order by line; 

TEXT 
------------------------
PACKAGE BODY PACK AS 

PROCEDURE proc AS 
BEGIN 
if first_load > sysdate then 
first_load := sysdate; 
end if; 
DBMS_OUTPUT.PUT_LINE('2: loaded first on ' || to_char(first_load,'YYYY-MM-DD HH24:MI:SS') ); 

NULL; 
END proc; 

END PACK; 

12 rows selected. 

Even the testcase is clear and simple, it was not that easy to identify the root cause in a more complex life environment. Special thanks to Andy Sayer who helped me to sort and refine my ideas over twitter.

With this testcase I was able to open a SR at Oracle, and after some some Support-ping-pong I got a useful information:
It is known as Bug: 18357555: ADD SUPPORT FOR KGLIV_BRNULPALL TO KGLCLUSTERMESSAGE() / KQLMBIVG()

I'm only aware of one Patch:25808607 (for PSU:11.2.0.4.160419 - together with some other patches) which is available right now. But you can ask for your own, if you have an Active Dataguard and RAC, and Packages with global variables.
Kommentar veröffentlichen