Posts mit dem Label performance werden angezeigt. Alle Posts anzeigen
Posts mit dem Label performance werden angezeigt. Alle Posts anzeigen

2020-06-22

IDLE wait events can indicate work done

Oracle wait events are categorized into different wait classes. One of these classes is called Idle
As far as I know, wait events were defined by oracle kernel developers. 
From their perspective, this class makes sense: There is no work to be done by this specific session / process, so the session indicates it's waiting for new requests. 

This classification does not help when application performance or user experience is the focus. 

Even when the server process is idle - e.g. in event SQL*Net message from client, the application might do some work. 

Of course, from the server process perspective it can not be decided if somewhere some work is done, or all involved processes are idle and the user falled asleep.
A slightly differenciation could be done: from server process, there is a difference if for any open cursor (statement) more data can be provided, or there is nothing the server process can deliver to the client. Unfortunately this is not indicated in the event. 

From application perspective, idle events are misleading. That's why I prefer to call them timed events. 
This naming is not invented by me, I got some great help: 
 


2019-02-17

shared resources

Some days ago I had one of these calls from an application colleague:
The database was slow yesterday, can you please check what's the problem?
Of course, I had some short discussion if he really means the DB or should it be called the application is slow. Also some other questions needed to be asked first, e.g. if it's a response time or throughput issue, when it was "good" last time, what "bad" and "good" means in numbers (seconds/ experience, requests / second), if it's affecting all the experiences on this DB or only a subset. Can it be reproduced or at least does it occur on a known pattern. Also the DB name and users affected I had to ask.
Some time I should create a comprehensive checklist for these questions as I'm missing some of them regularly. But that not today's topic.

As the answers are very vague (at least I got a DB & username with some rough timestamp when it might has started) I checked the DB. I'm it would be more precise to focus on user experience. But I know the developers quite well; it would cost a lot of effort & management pressure to convince them to implement proper instrumentation.
So I was going for low hanging fruits first. ASH/AWR showed this picture:
There is time of more activities in the Database, and it's quite visible where it started and ended. With some more analysis it was obvious the number of calls didn't change dramatically. Getting data from the application (yes, there was at least an activity log) showed in fact the number of requests followed a daily pattern, but this didn't look like the picture above. The pattern of demand was similar to other days; but on any other day AWR showed no such picture, only a "breathe" similar to the applications demand.
By this conversation we implicit generated a (still very blurry) "good" pattern to compare the "bad" one.


These bits of information were not perfect, but at least something I could use for further investigations.
AWR shows much higher IO times than normal. I crosschecked the most active SQL_IDs from a "good" pattern against the "bad" one and could confirm they are IO-bound in "good" as well, so they might be affected by this unusual IO.
On the DB-cluster TFA is installed, and so is oswatcher. (Again, aggregates only. But still much better information than nothing). A small awk script analysed the data there (I just don't know why oswatcher doesn't has a SQL interface similar to osquery ) and there were huge response time, but no queueing on the Linux hosts.
This would now require digging deeper through the storage layers (FC SAN and several big central storage boxes). Unfortunately I does not have access to them and their performance data is not externalized right now and the SAN admin was to busy with other things, like compiling reports for senior managers 🤦.
But at least I'm somehow lucky: At this company a nice schema is implemented which gives ASM-disks much more information than only /dev/mapper/anything. As the name of the storage box is part of the ASM-disk path, by some knowledge about the possible servers involved (e.g. only servers in this datacenter, and with some other attributes could be involved at all) and a centralized repository I got a list of all DB-servers and DBs potentially involved on that layer.
With this list and another part of our repository (in that case periodic v$sysstat snapshots [aggregates of aggregates - guess the accuracy there]) a list of other DBs with high IO numbers during that time could be compiled.
Checking these databases there was a top producer of IO:

Some additional analysis showed this load on this particular DB created sufficient IO to saturate one subsystem in the storage box. By doing so all other system which use the same subsystem also suffered. To crosscheck that, the Storage admin (who completed the report in the meantime) provided other DBs utilizing that subsystem - and they showed similar patterns at exactly that time.

Even with all the data collected, this accumulation of coincidents is still no prove what was going on. But it was sufficient likely for the management to think about remediation activities.


Here the question might show up why only the admins of this one application showed up and complained about their system? During the investigation it was visible that other DBs showed similar patterns and so other applications were likely to suffer also.
I invested some time to  get answers to these questions. This is not related to the initial goal of the activities, only to satisfy my curiosity and improve my knowledge about the whole landscape.
It showed these other applications also suffered, but the impact was much smaller for 2 reasons:

  1. The portion of DB IO time to their user experience was relatively small. Even these high response time didn't cause a lot of harm. 
  2. The requirements of the most suffering application were more strict. It can be explained by the time a person accepts to wait after a click: If this click is on a webpage, most readers of this blog might understand what's going on behind the scene and some tenth of a second up to some seconds are acceptable for an answer. But if you flip a switch to turn on the light, they expect the light to shine immediately (aka. within less time than the human senses can grasp)


After all the analysis was done and roughly explained to the manager of the most suffering application, he invited for a meeting.
A manager, some people with different roles in development and ops, a representative for the "culprit application which caused the high load" and I were there. He opened the meeting with a summary (condensed by me to):
We and they share the same storage, they did some heavy query and we suffered. 
Technically this is right, but the way it's provided to those not experienced in IT leads to the managers request:
We must ensure they do not cause such IO load anymore. 

This might seem legit, but it might not lead to the managers real goal: Make his application reasonable fast in as many situations as possible. So I had to kill this branch of the discussion for a moment and explained, why they are not the real problem:
As the whole IT is a shared system on various layers (different hosts share the same storage system, different virtual hosts share the same VM layer, different instances share the same DB-cluster, different sessions share the same instance, ...) it is an unreasonable big effort to define resource limits, quotas and similar for all components involved. Especially in a highly dynamic environment with hundreds of DBs, thousands of servers and zillions of processes.
A majority in this room showed some shock & horror and claimed they were not aware of the high level of complexity in basic infrastructure.

With some effort we can limit the IO consumption of this particular DB which caused the issue this time. But next time (and there will be a next time) we only can be sure this particular DB is not involved.
As that's not sufficient, the next request was:
Let's then monitor the storage box and alarm every time a component is saturated.
A nice idea, isn't it?
But whom to alarm? -
The SAN admin of course. 
What should he do then? He can't reduce the load and can't increase the storage capacity either (at least not short term without budget, vendor involved, ...). 
So let's inform those which are connected to this resource! 
In our example it would have been two dozen applications - most of them didn't suffer or care.
Then only the top 5.
In our case you wouldn't have been alarmed.

This also doesn't lead to anything useful.

About at that time I raised a simple question:
Who in this room is responsible for the (suffering part of the) application?
---
<silence>
(the managers face went pale for a second).
After some moments, an external (not employed) developer raised his hand and muttered something like "If no one wants to, I can take that". 
I was slightly bored of this meeting, so to speed up things my next question was:

Is every IO your application doing [during the critical activity] necessary?

The argument in short goes like that: If you are less dependent on IO timing, even bad IOs can't harm that much. I call this

make it robust against impacts out of your control

 Beside the obvious effect to improve the application, a nice side effect is a less busy total environment. This again pays back as better service for the whole company.

The meeting stopped close to that point. The new found architect & developers promised to see if they can make their application more robust, the manager wants to raise awareness in all IT departments what a shared environment is and how it can affect business processes within the company.
I have a mixed feeling: the analysis seems valid, but it was full of educated guess & shortcuts - this doesn't feel right as it contains "experience" and "luck"; there is a method which describes a more structured way, I just felt unable to follow these steps . I hope I could transfer the initial reaction (finger-pointing to another team) to something more valuable for the company: use less resources, make your application robust.
But I doubt my effort was lasting.

2017-07-07

The Slow Tires

Once there was a man with a car. On this car he had 4 tires.
As the car is a modern one it has a nice board computer which collects many measurements. One is the rotation per minute of the tires.
One day it took more time for the man to get home than usually. So he decided to check his car's computer for any values which could lead to that delay.
After some minutes searching he made an alarming observation: This days average rotations per minute of the tires were much lower than the values of the day before. So he did a quick check if he found some errors in the computers log, but as everything seemed ok he was puzzled.
The shop where he bought the tires was still open and so he went there and ask a technician to check if the tires were ok, as they were slower today than yesterday. A technician checked correct pressure and even made sure the tire balancing was ok.
But there still was the lower rotation per minute for this day, and no explanation.
Next to the tire retailer there was a garage, which owner was friends with the technician of the tire shop. He agreed to check the car, if there was anything he could find about this days slow tires. But there wasn't anything. No errors, motor control all ok, the shock absorbers were even less utilised this day than the previous day.
During the investigation our man tried to remember if anything has been different to the previous day. He remembered some things: the day before he had lunch, which he skipped on this day to get home early. And even the coke in the cup holder was empty. So in theory the car should even be lighter than the day before. He told the two engineers about his memories, but they didn't care about some decagram changes in weight. At least the mechanic confirmed the average rotation was slower on this day. It matched the slower rotation in the gearbox, something which was not available for a customer in the cars memory. So the measurement was correct.
All three men were puzzled. But the mechanic had a friend in highway administration who had access to the highways measurements. Maybe there wa a technical issue there?
But the highway was in good shape, quite similar as the day before. Nothing was broken, all signs working ok. No accident reported. Asphalt temperature was slightly higher than the day before, but perfect within the tires comfort zone. The mechanic got one more information from his friend: Highway administration was more pleased about this day, as the utilisation was higher when the man went home on this day compared to his yesterdays commute, but still within acceptable boundaries to avoid heavy complains.
Now the men had checked virtually everything which came to their mind: tires, many components of the car, highway. Everything fine, no errors or accidents, totally within acceptable boundaries.
Big data, big confusion.
The tire retailer grabbed his smartphone in frustration to check new gossip in the app of his preferred local radio station. By accident he hit the traffic tab with his calloused fingers and saw a nice histogram of the traffic in highway segments around the city. At the travel time of our man on this day there were much more cars in his highway segment than at the day before. He showed this nice graph to the others and slowly it all made sense:
They all know from own experience, more cars on the highway will lead to slower average speed at some filling level. Traffic is still smooth but the individual just can't do high speed anymore, regardless of legal limitations or police.
Of course at this point the utilization increases, as from the highway administration perspective, there are more tires on the road now. Even when each one rotates slightly slower, the total sum of all of them is higher than a single car at highest speed ever could create.
Our man just could not drive as fast as yesterday as many other cars were in his way.
And the tires could not rotate faster for exactly that reason.
With all this insight our mechanic once again called his friend and asked him if he know a way to improve the situation. The answer was simple:
Build a dedicated highway for you. Never share it with anyone. Not even with your best friend. IT will go perfect most of the time. But the one day you want to do the surprise birthday party, exactly this friend will be on the highway ahead of you and block all your best intended efforts.

I'm sorry for all who love the english language - if you want to translate this to proper english, please do so and inform me, I'll remove this content and link to yours!


2016-11-20

cost vs. gets

Last week I hit an interesting performance issue:
A Table (T) has 2 index (of interest). One (IX_1) is only on column S, the other (IX_2) on (C, S).
The optimizer calculates the plan with IX_1 more expensive than the plan with IX_2 - as this should be more selective.
But the gets for the plan with IX_1 were less than those the optimizer preferred.

Here the information about the statement, index and plans.
As it's a real life example (just the object names are obfuscated) please do not wonder about the Partition operators. This statement only worked in the same subset of partitions - based on record_timestamp.


SELECT <columns>
FROM T WHERE S = :BIND0
         AND C = :BIND1
   AND record_timestamp > :BIND2
--       AND some more filters 

INDEX:

IX_1 C
IX_2 S, C


Plan1:
Plan hash value: 1749230273
------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name            | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |                 |   218 |   352K|   517   (1)|
|   1 |  SORT ORDER BY                                |                 |   218 |   352K|   517   (1)|
|*  2 |   FILTER                                      |                 |       |       |            |
|   3 |    PARTITION RANGE ITERATOR                   |                 |   218 |   352K|   516   (0)|
|   4 |     PARTITION LIST SINGLE                     |                 |   218 |   352K|   516   (0)|
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T               |   218 |   352K|   516   (0)|
|*  6 |       INDEX RANGE SCAN                        | IX_1            |   473 |       |    63   (0)|
------------------------------------------------------------------------------------------------------
consistent gets 27647

Plan2:
Plan hash value: 4278735161
------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name      | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |           |   218 |   352K|   505   (1)|
|   1 |  SORT ORDER BY                                |           |   218 |   352K|   505   (1)|
|*  2 |   FILTER                                      |           |       |       |            |
|   3 |    PARTITION RANGE ITERATOR                   |           |   218 |   352K|   504   (0)|
|   4 |     PARTITION LIST SINGLE                     |           |   218 |   352K|   504   (0)|
|*  5 |      TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T         |   218 |   352K|   504   (0)|
|*  6 |       INDEX RANGE SCAN                        | IX_2      |   461 |       |    63   (0)|
------------------------------------------------------------------------------------------------
consistent gets 33807


How could the optimizer be wrong, even all table and column statistics are correct?

It's just simple the relation between columns C and S: There can be many S for a given C, but a S always have a dedicated C. Both Index will provide the same ROWIDs for any given (C, S) - but as IX_1 is smaller than IX_2 - less gets are required.

But before we simply suggest just to hint (or outline, profile, patch, ...) the statement to use IX_1 some more information about the ongoing discussion:
As the application loops through all Cs, and for every C through all S, it might be more efficient to have all S per C (as in IX_2) in buffercache for the time C is of any interest - and afterwards do not care about those blocks at all. IX_1 would hit the same blocks quite often as the S are by no way grouped/ordered for any C. I'm not sure if we ever can make a reasonable testcase, as it's hard to bring the environment into a state without disturbing effects to measure such effects.

This is shows how easily the optimizer can be misguided, and even with reasonable knowledge about the data and application a decision for the "best" solution can be hard to impossible.

2016-10-25

no peek in PL/SQL

Connor McDonald wrote a blog about differences between bind peeking and SYS_CONTEXT in SQL queries. This even led to an proposal in Database Ideas: CBO should peek at SYS_CONTEXT values just like bind values (feel free to vote and comment, if you like it)
As I have a friend who really loves SYS_CONTEXT; I showed him the blog.

In the following discussion he stated (more/less
)
I should not have this problem at all, as I'm not using SYS_CONTEXT directly, but wrapped in a package like select * from xyz where a = pkg.get_ctxt('BLAH');

As a proud member of BAAG, I want clarify this should.

The preparation of my environment is nothing specific and quite close to Connors:

create table t ( x varchar2(10), y char(100));

insert into t
select 'a', rownum
from dual
/

insert into t
select 'b', rownum
from dual
connect by level <= 100000
/

commit;

create index ix on t ( x ) ;

exec dbms_stats.gather_table_stats('','T',method_Opt=>'for all columns size 5');

drop context blah;
create context blah using my_package;

create or replace PACKAGE MY_PACKAGE AS 

  procedure my_proc(p_val varchar2); 
  function get_sys_context return varchar2;
  function get_variable return varchar2;

END MY_PACKAGE;

create or replace package body MY_PACKAGE AS 
  my_var varchar2(10);
  procedure my_proc(p_val varchar2)  is
    begin
        my_var := p_val;
        sys.dbms_session.set_context('BLAH','ATTRIB',p_val);
    end my_proc;
    
  function get_sys_context return varchar2 is
    begin
      return sys_context('BLAH','ATTRIB');
    end get_sys_context;

  function get_variable return varchar2 is
    begin
      return my_var;
    end get_variable;

end MY_PACKAGE;




Now the real testcase starts:

exec my_package.my_proc('a');

select my_package.get_sys_context from dual;
select my_package.get_variable from dual;
Just shows everything works as expected:
PL/SQL procedure successfully completed.

GET_SYS_CONTEXT
---------------
a

GET_VARIABLE
---------------
a


So let's check the optimizers opinnion:
select /*+ gather_plan_statistics */ count(y)
from t
where x = my_package.get_sys_context;

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
COUNT(Y)
----------
1


SQL_ID  2mgq2mrzd26j6, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = 
my_package.get_sys_context

Plan hash value: 2966233522

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.38 |    1507 |   1505 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.38 |    1507 |   1505 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |      1 |00:00:00.38 |    1507 |   1505 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("X"="MY_PACKAGE"."GET_SYS_CONTEXT"())

It seems optimizer doesn't like SYS_CONTEXT wrapped in a package.

select /*+ gather_plan_statistics */ count(y)
from t
where x = my_package.get_variable;

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

COUNT(Y)
----------
1


SQL_ID  4ytd1pqx1yr09, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = 
my_package.get_variable

Plan hash value: 2966233522

----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.16 |    1507 |   1504 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.16 |    1507 |   1504 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |      1 |00:00:00.16 |    1507 |   1504 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("X"="MY_PACKAGE"."GET_VARIABLE"())

The optimizer doesn't like the any pl/sql package function, regardless where the value comes from.

To bring it down to an even simpler construction:
CREATE OR REPLACE FUNCTION RETURN_BIND 
(
  BIND_IN IN VARCHAR2 
) RETURN VARCHAR2 AS 
BEGIN
  RETURN BIND_IN;
END RETURN_BIND;

variable b1 varchar2(10)
exec :b1 := 'a';

select RETURN_BIND(:b1) from dual;

select /*+ gather_plan_statistics */ count(y)
from t
where x = RETURN_BIND(:b1);

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
COUNT(Y)
----------
         1


SQL_ID  84y4gdn5tmsb0, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = 
RETURN_BIND(:b1)
 
Plan hash value: 2966233522
 
----------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.18 |    1507 |   1505 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.18 |    1507 |   1505 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |      1 |00:00:00.18 |    1507 |   1505 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("X"="RETURN_BIND"(:B1))

I hope this is enough evidence to claim, there is no bind peeking at the result of any PL/SQL function.

With this in mind, if you care for performance and your data might not be even distributed, take special care if you filter by the return value of any PL/SQL function (or SYS_CONTEXT).

This all was done in an unpatched 12.1 sandbox.

2013-12-16

Instrumentation still needed

Recently I received an email with an simple content:
"It" is slow again :-)
The email explained a package to be slow.

Ok, I started with my "not enough information" reply and an email conversation began.

We are still not ready to identify the problem, but as there is no code instrumentation in place I tried to identify what's possible with v$active_session_history (ASH).
To do so, I created 2 Packages: P1 and P2. P1 has a procedure DO_IT and a function SLEEP, which burns CPU for approximately x seconds. P2 only has a function SLEEP, which also burns CPU for approximately x seconds and then calls P1.SLEEP again. P1.DO_IT has a simple sql which calls P1.SLEEP and P2.SLEEP. My code is properly instrumented using DBMS_APPLICATION_INFO.set_module in the code and dbms_session.set_identifier to identify one specific execution - to compare it's value to the information already available in v$active_session_history.
Here the packages:
CREATE OR REPLACE
PACKAGE P1
IS
  PROCEDURE DO_IT(SEC IN NUMBER);
  FUNCTION SLEEP(SEC IN NUMBER)
    RETURN NUMBER;
END; 
/

CREATE OR REPLACE
PACKAGE P2
IS
  FUNCTION SLEEP(SEC IN NUMBER)
    RETURN NUMBER;
END; 
/
 
CREATE OR REPLACE PACKAGE BODY P1
IS
  PROCEDURE DO_IT(SEC IN NUMBER)
  IS
    r NUMBER;
  BEGIN
    DBMS_APPLICATION_INFO.set_module(module_name => 'P1', 
                                     action_name => 'DO_IT');
    dbms_output.put_line('++P1.DO_IT++');
    SELECT P2.SLEEP(SEC) into r 
    FROM DUAL
    WHERE P1.SLEEP(SEC) > 1;
    dbms_output.put_line('--P1.DO_IT--');
    DBMS_APPLICATION_INFO.set_module(module_name => 'blaaa',
                                     action_name => 'blaaa');
  END DO_IT;
  
  FUNCTION SLEEP(SEC IN NUMBER)
    RETURN NUMBER
  IS
    C NUMBER := 0;
    now DATE := SYSDATE;
  BEGIN
    DBMS_APPLICATION_INFO.set_module(module_name => 'P1',
                                     action_name => 'SLEEP');
    dbms_output.put_line('++P1.SLEEP++');
    LOOP
      C := C + 1;
      EXIT WHEN now + (SEC * (1/86400)) <= SYSDATE;
    END LOOP;
    
    dbms_output.put_line('--P1.SLEEP--');
    DBMS_APPLICATION_INFO.set_module(module_name => 'blaaa',
                                    action_name => 'blaaa');
    RETURN C;
    END SLEEP;
END;
/

CREATE OR REPLACE
PACKAGE BODY P2
IS
  FUNCTION SLEEP(SEC IN NUMBER)
    RETURN NUMBER
  IS 
    C NUMBER := 0; 
    D NUMBER := 0; 
    now DATE := SYSDATE;
  BEGIN
    DBMS_APPLICATION_INFO.set_module(module_name => 'P2',
                                    action_name => 'SLEEP');
        dbms_output.put_line('++P2.SLEEP++');                                
    LOOP
      C := C + 1;
      EXIT WHEN now + (SEC * (1/86400)) <= SYSDATE;
    END LOOP;  
    SELECT P1.SLEEP(SEC) into D
    FROM DUAL;
    C := C + D; 
    dbms_output.put_line('--P2.SLEEP--');    
    DBMS_APPLICATION_INFO.set_module(module_name => 'blaaa',
                                     action_name => 'blaaa');
    RETURN C;
  END SLEEP;
END;
/

I started the testcase with a simple
set serveroutput on
exec dbms_session.set_identifier(client_id => 'berx testcase ' || to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') );
exec p1.do_it(5);
exec dbms_session.set_identifier(client_id => 'berx testcase ' || to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') );
exec p1.do_it(5);
exec dbms_session.set_identifier(client_id => 'blaaa');

And immediately afterwards I checked for sql_id, top_level_sql_id, action, module, client_id, plsql_entry_object_id, plsql_entry_subprogram_id, plsql_object and plsql_subprogram_id

select SESSION_ID,
       sql_id,
       ( SELECT max( substr( sql_text , 1, 40 )) FROM v$sql sq WHERE sq.sql_id = ash.sql_id ) AS sql_text,
       TOP_LEVEL_SQL_ID,
       ( SELECT max( substr( sql_text , 1, 40 )) FROM v$sql sq WHERE sq.sql_id = ash.top_level_sql_id ) AS top_level_sql_text,
       action,
       module,
       client_id
,      ( SELECT object_name    FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = 0) AS plsql_entry_object
,      ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_entry_object_id AND subprogram_id = plsql_entry_subprogram_id) AS plsql_entry_subprogram
,      ( SELECT object_name    FROM dba_procedures WHERE object_id = plsql_object_id       AND subprogram_id = 0) AS plsql_object
,      ( SELECT procedure_name FROM dba_procedures WHERE object_id = plsql_object_id       AND subprogram_id = PLSQL_SUBPROGRAM_ID) AS plsql_subprogram 
from v$active_session_history ash 
where SAMPLE_TIME > sysdate - 1/(24*60)
      and session_id = SYS_CONTEXT('USERENV','SID')
order by SAMPLE_TIME asc;

Here is the result (please excuse the wide list).

ST                  SESSION_ID  SQL_ID        SQL_TEXT                                 TOP_LEVEL_SQL_ID TOP_LEVEL_SQL_TEXT                       ACTION     MODULE     CLIENT_ID                           PLSQL_ENTRY_OBJECT             PLSQL_ENTRY_SUBPROGRAM         PLSQL_OBJECT    PLSQL_SUBPROGRAM
------------------- ----------  ------------- ---------------------------------------- ---------------- ---------------------------------------- ---------- ---------- ----------------------------------- ------------------------------ ------------------------------ --------------- ----------------
2013-12-14 22:24:00        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:01        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:02        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:03        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:04        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:05        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:06        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:07        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:08        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:09        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:10        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:11        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:12        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:13        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:14        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:00   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:15        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:16        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:17        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:18        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:19        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:20        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:21        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:22        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          STANDARD        SYSDATE          
2013-12-14 22:24:23        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:24        160  cqd77zbnpa79n SELECT P2.SLEEP(:B1 ) FROM DUAL WHERE P1 1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P2         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P2              SLEEP            
2013-12-14 22:24:25        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:26        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:27        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:28        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            
2013-12-14 22:24:29        160  cc9hp65m5aq6s SELECT P1.SLEEP(:B1 ) FROM DUAL          1ahw2g5n38js5    BEGIN p1.do_it(5); END;                  SLEEP      P1         berx testcase 2013-12-14 22:24:15   P1                             DO_IT                          P1              SLEEP            

 30 rows selected 

So my findings here:

  • TOP_LEVEL_SQL_ID is the same all the time. That's good, as I have a valid anchor on the top most layer. 
  • there is nothing similar to SQL_EXEC_ID for the TOP_LEVEL_SQL_ID. So I can not distinguish the 2 runs of P1.DO_IT.
  • PLSQL_OBJECT.PLSQL_SUBPROGRAM does not always reflect the current application module, as also internal code (as SYSDATE) is shown there. ACTION and MODULE are more reliable. 
  • Only with ASH I don't see any general way to identify the runtime of sub-TOP_LEVEL SQLs or PLSQL_SUBPROGRAMs.
  • The most complete picture from ASH is only given with proper instrumentation
  • If you need the full insight about the stack of SQL and PL/SQL calls, you need to trace the session. This again works like a charm with ACTION and MODULE set and DBMS_MONITOR.SERV_MOD_ACT_STAT_ENABLE 

2011-06-20

ORA-02393 on purpose

Today I got an email by a developer about an error on a test database.
(I shortened the conversation, but tried not to change the content):


Dev:
We get an ORA-02393: exceeded call limit on CPU usage almost immediatedly after executing the following stmt: ...Could you pls adjust the limits?



After some hours of meetings I replied:
Me:

These limits where introduced in test to find exactly those statements which are running too slow for an OLTP application. Can we help you in tuning the statement?

Dev:
I already exchanged details on the issue with [another DBA] and found the cause of the 'longer than usual' execution time (index missing ;-)
Nonetheless 100ms max exec time is a bit too strict for a dev platform imho - but as the ... is going to be replaced soon we won't request any changes to a system/db in the last chapter of its life cycle.


I did not reply yet. Just try to explain my world.

Is a CPU_PER_CALL limit really needed on a test system?


In theory: no.

If the developers implement a perfect code instrumentation, they would know the runtime of every statement. In test it would be evaluated for a proper per-statement performance as well as for an over all performance.

But if they would do so, I would never have received this email. So the world is not perfect. The code isn't, also.

Are 100ms enough time? seems really tough!


Yes, they are.

Even on an 8 years old UltraSparc III+ you can do a lot of CPU cycles within 100ms. And only CPU is count there, no time for IO or other WAITs.

But the biggest argument for this limit: I never got any complaint about it by a good statement. Only by those which needed urgent suport, anyhow.

2008-12-18

Doing it better ... (Part 2)

Based on my Doing it better ... (Part 1), several days after the first approach and a lot of testcases (for other parts of the application) the performance of our so well tuned statement decreased dramatically.
What happened? The index on RESOURCE_POOL.FREED_TIME became soomething unbalanced. Not the kind you might argue 'B* trees can not be unbalanced per definition', but more of the kind 'based on the code there often remain some blocks nearly unused, which makes the selectivity of the index apear less and less usable. In fact, the statment is not really nice to the CBO, as an AND ROWNUM=1 is hard to estimate.

This brings me to one of the highlights in this project: Based on good arguments and a test-case we could argue to make a slightly change in the code and the logic. To understand the change some words must be said to the logic: the part v_time>FREED_TIME was used to introduce a kind of 'cold down period' where a resource was not used but neverthleless NOT reused again. This period was ever planned to be several minutes (about 5 was the asumption for the project, but it could also be 2 or 15, but it was agreed NEVER to be less than 1 or more than 20 minutes).

This led us to these changes:
we could change the table to make FREED_TIME of type NUMBER and changed the statement to:

UPDATE RESOURCE_POOL SET
STATUS='INUSE',
FREED_TIME=NULL,
SESSION_START_TIME=:B7
WHERE
FREED_TIME NOT IN
(select (to_char(sysdate, 'MI')-(rownum - 1)) - 60 * FLOOR((to_char(sysdate, 'MI')-(rownum - 1))/60) safe_minutes
from dual
connect by level <= :MINUTES_GRACE)
AND STATUS='FREE'
AND ADDRESS_POOL=:B1
AND ROWNUM=1
RETURNING IP_ADDR INTO :O0

and the statement which 'frees' the resource:


UPDATE RESOURCE_POOL SET
...
STATUS='FREE' FREED_TIME=to_char(sysdate, 'MI')
...
This leads to a logic, where every time the status is freed the current minute is stored in FREED_TIME. There are only 60 different values which come again and again and again. (so The selectivity is quite stable.
The select ... from dual ... connect by ... is a simple way to have a eaily changable list of 'the last x minutes' which is aware the break every hour; the more or less strange formula is a kind of MOD().

2008-12-17

Doing it better ... (Part 1)

I never thought something like this could be worth a blog entry, but as Tom Kyte can do this, I can also! ;-)

I had the pleasure to assist in a little project as a DBA.
The pleasure was mine, as the target (in response-time) was outstanding (in my current working circumstances) and the setup was done to achieve this in a more or less professional manner (which much more outstanding ...), so we had testcases, a kind of change-management, really load-tests, totally cooperative designers/developers. Great!

The goal of the whole thing was to 'lock' and 'free' available resources out of a pool (according to some additional 'quality'). The example I'm writing is about the process 'provide the next free ressource out of the given pool'. It should come back in about 10ms.

Here the Iterations we came through:

1) primary code:


SELECT resource into v_resource
FROM RESOURCE_POOL
WHERE v_time>FREED_TIME
AND STATUS='FREE'
AND POOL=v_pool
AND ROWNUM=1
FOR UPDATE;
UPDATE RESOURCE_POOL set
STATUS='INUSE',
FREED_TIME=null,
START_TIME=current_timestamp,
WHERE RESOURCE=v_resource;

with this comment beside these statements:
-- FOR UPDATE statement ensures that the selected row is locked until the transition is completed --
-- open 2 different sqlplus commands, issue the command and watch the last command. it waits until update is completed (or commit) on the first session
-- FOR UPDATE NOWAIT would report an error if the row cant be locked immediately --
-- FOR UPDATE SKIP LOCKED pauses other db queries - this could be the perf bottleneck ...


BRRR! Even with an index in RESOURCE_POOL.RESOURCE it's not really optimal. And, the hard fact is, it's to slow.

2) after some iterations we came to:

UPDATE RESOURCE_POOL set
STATUS='INUSE',
FREED_TIME=null,
SESSION_START_TIME=current_timestamp,
WHERE v_time>FREED_TIME
AND STATUS='FREE'
AND RESOURCE_POOL=v_pool
AND ROWNUM=1
AND v_time>FREED_TIME
RETURNING RESOURCE INTO v_resource;


As we are counting miliseconds, this WAS an advantage. And for me personally, it's more beautiful, even this is not really a tuning-goal.

Beside some tests on the physical layout (we came to the conclusion to use partitions by resource pools and local indexes on them) this saved us enough time so we could stop tuning.
Maybe I also have to tell all the data is small enough to fit into the buffer_pool.
And for the Records, it's 10.2.0.4 EE on HP-UX (Itanium), Failover ServiceGuard.

2008-11-26

dynamic SQL with DMLs

Once again, Christian Antognini is the reason for this entry (beside my curiosity of course).
In Chapter 8 - Parsing of his book Troubleshooting Oracle Performance he showed the advantage of explicit cursor handling. The package dbms_sql shows the most flexibility. Unfortunately he only showed SELECT statements, no DML (which I'm interrested for some reasons). So I had to do the tests myselve:

CREATE TABLE dyn_sql (id number, val varchar2(32));

DECLARE   v_cur_hdl         INTEGER;
  v_stmt_str        VARCHAR2(200);
  v_rows_processed  BINARY_INTEGER;
  gid               INTEGER;
  gval              VARCHAR2(32);
BEGIN   v_stmt_str := 'INSERT INTO dyn_sql VALUES (:gid, :gval)';
  v_cur_hdl := DBMS_SQL.OPEN_CURSOR(2); -- to avoid ORA-29471 in 11g  
  DBMS_SQL.PARSE(v_cur_hdl, v_stmt_str, DBMS_SQL.NATIVE);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gid',  1);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gval', 'V01C01');
  
  v_rows_processed := DBMS_SQL.EXECUTE(v_cur_hdl);
  
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gid',  2);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gval', 'V02C01');
  v_rows_processed := DBMS_SQL.EXECUTE(v_cur_hdl);
  COMMIT;

  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gid',  3);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gval', 'V03C02');

  v_rows_processed := DBMS_SQL.EXECUTE(v_cur_hdl);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gid',  4);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gval', 'V04C02');
  
  v_rows_processed := DBMS_SQL.EXECUTE(v_cur_hdl);
  ROLLBACK;
  
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gid',  5);
  DBMS_SQL.BIND_VARIABLE (v_cur_hdl, ':gval', 'V05C03');
  
  v_rows_processed := DBMS_SQL.EXECUTE(v_cur_hdl);
  
  DBMS_SQL.CLOSE_CURSOR(v_cur_hdl);
END;
 /

SQL> select * from dyn_sql;
        ID VAL
---------- ------
         1 V01C01
         2 V02C01
         5 V05C03
blubb
what are my findings?
  1. DBMS_SQL works fine also for DMLs.
  2. commit/rollback within an open cursor works fine.
  3. closing a cursor implicite commits (direct sql-trace sequence: EXEC(insert) - CLOSE(insert) - EXEC(pl/sql) - CLOSE (pl/sql)Deleted after Chris commentar
There is an explicite warning in the documentation: Native Dynamic SQL is faster than DBMS_SQL. I did not test this. But be warned. (and test it yourselve, If you fear performance!)

2008-11-25

sysstat_sample.sql

Based on Tanel Poders sample.sql and ses.sql I was interrested to get some informations about the System at all.
There are various discussions wether or not it is useful to have a look on the system when hunting a problem within a session. Nevertheless I decided I need such a tool. On the one hand to generate a 'good' baseline, on the other to have a quick, zentralised view. It's also of some use to compare the system-statistics to the values of one dedicated sessions to see if this session is the major reason for any changes or submit only a small amount, but might suffer from others in the system.

So here is the code:

select name,
min(value) "MIN",
max(value) "MAX",
max(value) - min(value) "overall_delta",
max(delta) "max_single_delta",
sum(decode(delta,0,0,1)) "#changes" -- count distinct
from
(with c as
(select &1 counter from dual)
select rn,
name,
value,
decode(rn,1,0,value-prev) delta
from ( select /*+ ordered use_nl(t) */
r.rn,
NAME,
Value,
LAG(VALUE) OVER (Partition BY NAME order by r.rn) prev
from
(select /*+ no_unnest */
rownum rn,
1 sample_dummy
from dual
connect by level <= (select counter from c) ) r, v$sysstat t order by name, r.rn ) ) syss
where value > 0
and delta > 0
group by name
order by name
/

This piece of code samples through v$sysstat and generates for every entry (where the value > 0, to reduce the lines) a line with the minimal value, the maximal value, the delta between them (this could also be done by DBMS_LOCK.SLEEP) but also the greatest single step between 2 sequent samples.
There might also be other statistical functions of some interrest, but I have too little knowledge on statistics and how to use them (comments/suggestions welcome!).

The script can also be found here, where I will keep the most recent version all the time.

2008-11-07

Composite Bitmap Indexes

In hist Book Troubleshoot Oracle Performance Christian Antognini wrote also about Composite Bitmap Indexes. (p400)
Unfortunately he only proclaims
Composite bitmap indexes are rarely created. This is because several indexes can be combined efficiently in order to apply a restriction. To see how powerful bitmap indexes are, let’s look at several queries. (p400)
without givin gany evidence. So I grabbed his scripts (thank you for providing them) and run some testcases (on 11.1.0.6):
I just created this additional bitmap index:
CREATE BITMAP INDEX bx_i_n456 on t (n4, n5, n6);

bitmap AND:

index_combine:

SELECT /*+ index_combine(t i_n4 i_n5 i_n6) */ *
FROM t
WHERE n4 = 6 AND n5 = 42 AND n6 = 11

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T    |      1 |      1 |      1 |00:00:00.01 |       7 |
|   2 |   BITMAP CONVERSION TO ROWIDS|      |      1 |        |      1 |00:00:00.01 |       6 |
|   3 |    BITMAP AND                |      |      1 |        |      1 |00:00:00.01 |       6 |
|*  4 |     BITMAP INDEX SINGLE VALUE| I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
|*  5 |     BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
|*  6 |     BITMAP INDEX SINGLE VALUE| I_N4 |      1 |        |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

   4 - access("N5"=42)
   5 - access("N6"=11)
   6 - access("N4"=6)
Cost: 3

composite bitmap index(CIB):

SELECT *
FROM t
WHERE n4 = 6 AND n5 = 42 AND n6 = 11

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T         |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |      1 |        |      1 |00:00:00.01 |       3 |
|*  3 |    BITMAP INDEX SINGLE VALUE | BX_I_N456 |      1 |        |      1 |00:00:00.01 |       3 |
----------------------------------------------------------------------------------------------------

   3 - access("N4"=6 AND "N5"=42 AND "N6"=11)
Cost:1

In this case the CBI wins.

bitmap OR:

index_combine:

SELECT /*+ index_combine(t i_n4 i_n5 i_n6) */ *
FROM t
WHERE n4 = 6 OR n5 = 42 OR n6 = 11

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T    |      1 |    797 |    767 |00:00:00.01 |     419 |
|   2 |   BITMAP CONVERSION TO ROWIDS|      |      1 |        |    767 |00:00:00.01 |       7 |
|   3 |    BITMAP OR                 |      |      1 |        |      1 |00:00:00.01 |       7 |
|*  4 |     BITMAP INDEX SINGLE VALUE| I_N4 |      1 |        |      1 |00:00:00.01 |       3 |
|*  5 |     BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
|*  6 |     BITMAP INDEX SINGLE VALUE| I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

   4 - access("N4"=6)
   5 - access("N6"=11)
   6 - access("N5"=42)
 Cost: 135

no hints:

SELECT *
FROM t
WHERE n4 = 6 OR n5 = 42 OR n6 = 11

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T    |      1 |    797 |    767 |00:00:00.01 |     419 |
|   2 |   BITMAP CONVERSION TO ROWIDS|      |      1 |        |    767 |00:00:00.01 |       7 |
|   3 |    BITMAP OR                 |      |      1 |        |      1 |00:00:00.01 |       7 |
|*  4 |     BITMAP INDEX SINGLE VALUE| I_N4 |      1 |        |      1 |00:00:00.01 |       3 |
|*  5 |     BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
|*  6 |     BITMAP INDEX SINGLE VALUE| I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

   4 - access("N4"=6)
   5 - access("N6"=11)
   6 - access("N5"=42)
Cost:135

index_combine with BX_I_N456

SELECT /*+ index_combine(t BX_I_N456 i_n5 i_n6) */ *
FROM t
WHERE n4 = 6 OR n5 = 42 OR n6 = 11   

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T         |      1 |    797 |    767 |00:00:00.04 |     420 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |      1 |        |    767 |00:00:00.03 |       8 |
|   3 |    BITMAP OR                 |           |      1 |        |      1 |00:00:00.03 |       8 |
|   4 |     BITMAP MERGE             |           |      1 |        |      1 |00:00:00.03 |       4 |
|*  5 |      BITMAP INDEX RANGE SCAN | BX_I_N456 |      1 |        |    527 |00:00:00.01 |       4 |
|*  6 |     BITMAP INDEX SINGLE VALUE| I_N6      |      1 |        |      1 |00:00:00.01 |       2 |
|*  7 |     BITMAP INDEX SINGLE VALUE| I_N5      |      1 |        |      1 |00:00:00.01 |       2 |
----------------------------------------------------------------------------------------------------

   5 - access("N4"=6)
       filter("N4"=6)
   6 - access("N6"=11)
   7 - access("N5"=42)
Cost: 138

index_combine with BX_I_N456 on 2nd place

SELECT /*+ index_combine(t i_n4 BX_I_N456 i_n6) */ *
FROM t
WHERE n4 = 6 OR n5 = 42 OR n6 = 11   

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T         |      1 |    797 |    767 |00:00:00.01 |     420 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |      1 |        |    767 |00:00:00.01 |       8 |
|   3 |    BITMAP OR                 |           |      1 |        |      1 |00:00:00.01 |       8 |
|   4 |     BITMAP MERGE             |           |      1 |        |      1 |00:00:00.01 |       4 |
|*  5 |      BITMAP INDEX RANGE SCAN | BX_I_N456 |      1 |        |    527 |00:00:00.01 |       4 |
|*  6 |     BITMAP INDEX SINGLE VALUE| I_N6      |      1 |        |      1 |00:00:00.01 |       2 |
|*  7 |     BITMAP INDEX SINGLE VALUE| I_N5      |      1 |        |      1 |00:00:00.01 |       2 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("N4"=6)
       filter("N4"=6)
   6 - access("N6"=11)
   7 - access("N5"=42)
Costs: 138

in this case, the 3 seperated bitmap indices wins.

NOT EQUAL AND:

index_combine:

SELECT /*+ index_combine(t i_n4 i_n5 i_n6) */ *
FROM t
WHERE n4 != 6 AND n5 = 42 AND n6 = 11   

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS BY INDEX ROWID | T    |      1 |      1 |      1 |00:00:00.01 |       6 |
|   2 |   BITMAP CONVERSION TO ROWIDS|      |      1 |        |      2 |00:00:00.01 |       4 |
|   3 |    BITMAP AND                |      |      1 |        |      1 |00:00:00.01 |       4 |
|*  4 |     BITMAP INDEX SINGLE VALUE| I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
|*  5 |     BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

   1 - filter("N4"<>6)
   4 - access("N5"=42)
   5 - access("N6"=11)
Costs: 2

a different execution plan from Chris' Book, there a BITMAP MINUS was shown in the execution plan.
Maybe a question for a seperated session, where the BITMAP MINUS disappeared.

for some reason, this hint generated the BITMAP MINUS
SELECT /*+ index(t  BX_I_N456) */ *
FROM t
WHERE n4 != 6 AND n5 = 42 AND n6 = 11   

------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID   | T         |      1 |      1 |      1 |00:00:00.01 |      13 |
|   2 |   BITMAP CONVERSION TO ROWIDS  |           |      1 |        |      1 |00:00:00.01 |      12 |
|   3 |    BITMAP MINUS                |           |      1 |        |      1 |00:00:00.01 |      12 |
|   4 |     BITMAP MINUS               |           |      1 |        |      1 |00:00:00.01 |       8 |
|   5 |      BITMAP AND                |           |      1 |        |      1 |00:00:00.01 |       4 |
|*  6 |       BITMAP INDEX SINGLE VALUE| I_N5      |      1 |        |      1 |00:00:00.01 |       2 |
|*  7 |       BITMAP INDEX SINGLE VALUE| I_N6      |      1 |        |      1 |00:00:00.01 |       2 |
|   8 |      BITMAP MERGE              |           |      1 |        |      1 |00:00:00.01 |       4 |
|*  9 |       BITMAP INDEX RANGE SCAN  | BX_I_N456 |      1 |        |    526 |00:00:00.01 |       4 |
|  10 |     BITMAP MERGE               |           |      1 |        |      1 |00:00:00.01 |       4 |
|* 11 |      BITMAP INDEX RANGE SCAN   | BX_I_N456 |      1 |        |    526 |00:00:00.01 |       4 |
------------------------------------------------------------------------------------------------------

   6 - access("N5"=42)
   7 - access("N6"=11)
   9 - access("N4"=6)
  11 - access("N4" IS NULL)
Cost: 4

So I decided to do a slightly different testcase:

index_combine:

SELECT /*+ index_combine(t i_n4 i_n5 i_n6) */ * 
FROM t 
WHERE n4 = 6 and n5 != 42 and n6 = 11

-------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID   | T    |      1 |      8 |      7 |00:00:00.01 |      15 |
|   2 |   BITMAP CONVERSION TO ROWIDS  |      |      1 |        |      7 |00:00:00.01 |       8 |
|   3 |    BITMAP MINUS                |      |      1 |        |      1 |00:00:00.01 |       8 |
|   4 |     BITMAP MINUS               |      |      1 |        |      1 |00:00:00.01 |       6 |
|   5 |      BITMAP AND                |      |      1 |        |      1 |00:00:00.01 |       4 |
|*  6 |       BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
|*  7 |       BITMAP INDEX SINGLE VALUE| I_N4 |      1 |        |      1 |00:00:00.01 |       2 |
|*  8 |      BITMAP INDEX SINGLE VALUE | I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
|*  9 |     BITMAP INDEX SINGLE VALUE  | I_N5 |      1 |        |      1 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------------

   6 - access("N6"=11)
   7 - access("N4"=6)
   8 - access("N5"=42)
   9 - access("N5" IS NULL)
Costs: 6

CIB:

SELECT /*+ index(t BX_I_N456) */ * 
FROM t 
WHERE n4 =6 and n5 != 42 and n6 = 11

----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   1 |  TABLE ACCESS BY INDEX ROWID | T         |      1 |      8 |      7 |00:00:00.01 |      12 |
|   2 |   BITMAP CONVERSION TO ROWIDS|           |      1 |        |      7 |00:00:00.01 |       5 |
|*  3 |    BITMAP INDEX RANGE SCAN   | BX_I_N456 |      1 |        |      7 |00:00:00.01 |       5 |
----------------------------------------------------------------------------------------------------

   3 - access("N4"=6)
       filter(("N6"=11 AND "N5"<>42 AND "N4"=6))
Costs: 107

no hints:

SELECT  * 
FROM t 
WHERE n4 =6 and n5 != 42 and n6 = 11    
    
-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------
|*  1 |  TABLE ACCESS BY INDEX ROWID | T    |      1 |      8 |      7 |00:00:00.01 |      12 |
|   2 |   BITMAP CONVERSION TO ROWIDS|      |      1 |        |      8 |00:00:00.01 |       4 |
|   3 |    BITMAP AND                |      |      1 |        |      1 |00:00:00.01 |       4 |
|*  4 |     BITMAP INDEX SINGLE VALUE| I_N6 |      1 |        |      1 |00:00:00.01 |       2 |
|*  5 |     BITMAP INDEX SINGLE VALUE| I_N4 |      1 |        |      1 |00:00:00.01 |       2 |
-----------------------------------------------------------------------------------------------

   1 - filter("N5"<>42)
   4 - access("N6"=11)
   5 - access("N4"=6) 
Costs: 4

My a little bit more explicit view of composite bitmap indexes is:
  • They can be useful in AND statements:
  • even not optimal for OR statements, they can replace the bitmap index which is created only on the first column without high cost increasement
  • in NOT EQUAL AND statements they really kill the performance if enfoced.

2008-09-19

simple tuning using decode

Currently I have to assist in tuning of a really time critic application (in comparison to other applications/projects I have to support). There I found 2 statements called within an PL/SQL procedure:
  1. select count(id)
    from berx1
    where val=1 and type is null
  2. select count(id)
    from berx1
    where val=1 and type is null and dim= 2
(The statements are reduced to test-cases, my test-table is
create table berx1 (id number, val number, type number, dim number);

insert into berx1 values (1, 1, NULL, 1);
insert into berx1 values (2, 1, 1, 2);
insert into berx1 values (3, 1, NULL, 3);
insert into berx1 values (4, 1, 1, 1);
insert into berx1 values (5, 2, NULL, 2);
insert into berx1 values (6, 2, 1, 3);
insert into berx1 values (7, 2, NULL, 1);
insert into berx1 values (8, 2, 1, 2);
insert into berx1 values (9, 3, NULL, 3);
insert into berx1 values (10, 3, 1, 1);
insert into berx1 values (11, 3, NULL, 2);
insert into berx1 values (12, 3, 1, 3);
insert into berx1 values (13, 4, NULL, 1);
insert into berx1 values (14, 4, 1, 2);
insert into berx1 values (15, 4, NULL, 3);
insert into berx1 values (16, 4, 1, 1);
insert into berx1 values (17, 5, NULL, 2);
insert into berx1 values (18, 5, 1, 3);
)
Even both of them are very fast (with an index on val), As we have to save every slice of time we can, I merged the statements into one:
SELECT count(id) count
, NVL( SUM( DECODE(dim, 2 -- if 'dim' is 2
, 1 -- add 1 to sum
, 0) -- otherwise add nothing
), 0
) count_dim
FROM berx1 where val=1 and type is null;
What did I expect so far? (not tested into detail!)
  • less context-switches between PL/SQL and SQL
  • less soft parses / library cache lookups
  • less activity in the buffer cache
In fact, at the end of the optimization I hope I can modify the program logic, maybe we can save a lot more time on other places and also get rid of this statement. But this is still matter of investigation with the application-designers.

2008-07-18

Oracle Performance books

I got 2 new books:
Cost-Based Oracle Fundamentals by Jonathan Lewis (ISBN:1-59059-636-6)
and
Troubleshooting Oracle Performance by Christian Antognini (ISBN:1-59059-917-9)
Both are worth reading.
I will try to read and understand both of them.
Maybe I will reproduce some of their tests and will try to do my own, and maybe maybe I will post the results here ;-)