Sunday, 28 June 2020

the size of a thumb

The size of a thumb was used by humans as a measurement for a very long time. Used by craftsmen, the sizes of body parts like a thumb, hand, forearm, or foot were good enough to be used within a team, and even the ratios between those measurements were acceptably accurate and comparable between individuals. 
But such a system reaches its limits. Different regions had different commonly agreed definitions for the same unit. This makes it hard to properly compare and convert, especially when it comes to written contracts between different regions. 
The metric system was designed to overcome problems in converting units between regions and types of units. 
Still it took some time and effort to show the advantage of this system. As an example, around 1999 NASA invested approx. $125 million to teach some engineers in Colorado the difference between pound-force and newton.

It seems to me this problem also exists in rules of thumb. Some of them are similar, yet different. 
My current example is a rule of thumb to decide whether an access path of a SQL statement is acceptable.

The first rule I found is by Cary Millsap and Jeff Holt in Optimizing Oracle Performance (2003):
High LIO count (more than about 10 LIO calls per nonaggregate row returned per table in the FROM clause)  

It was explained in more detail by Cary in Mastering Oracle Trace Data (2019): 
The quantity 10 is a nice round number that represents the most touches on the buffer cache you should need for plunging an index from root to leaf, plus accessing a data block (even if you have row migration or row chaining). On average, you shouldn’t need more than 10 LIOs to find a row from a single-table query. You shouldn’t need more than 20 LIOs to find a row from a two-table join, more than 30 LIOs to find a row from a three-table join, and so on. You shouldn’t need more than 420 LIOs to retrieve 7 rows from a 6-table join.

A similar, but different rule by Chris Antognini in Troubleshooting Oracle Performance (2019): 

  • Access paths that lead to less than about 5 logical reads per returned row are probably good.
  • Access paths that lead to up to about 10–15 logical reads per returned row are probably acceptable.
  • Access paths that lead to more than about 20 logical reads per returned row are probably inefficient. In other words, there’s probably room for improvement.

These different rules might be based on different assumptions. 
The example of 420 LIOs for 7 rows from a 6-table join is acceptable for Cary, maybe because he accepts the schema which enforces this 6-table join as a given, whereas Chris might suggest a schema redesign or a physical design optimization. 

In the end, all rules of thumb must be applied with reasonable care. They are never a law, but they can serve as a starting point. 
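To put such a rule to use, the ratio can be checked directly in V$SQL. A minimal sketch (BUFFER_GETS, ROWS_PROCESSED and EXECUTIONS are standard columns there; the SQL_ID is a placeholder to fill in):

SELECT sql_id
     , executions
     , buffer_gets
     , rows_processed
     , ROUND(buffer_gets / NULLIF(rows_processed, 0), 1) AS lio_per_row
FROM   v$sql
WHERE  sql_id = '&sql_id'  -- placeholder: the statement to check
;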

Update 2020-06-30: Chris Antognini informed me about a very important error I made:
He is referring to access paths, not complete execution plans.
Access paths are defined as:
An access path is a technique used by a query to retrieve rows from a row source.

This means there is no big difference between Chris's and Cary's rules - which makes this post kind of useless. Still, it will stay to document the importance of accurate reading!
I have to apologize to Chris and Cary for artificially generating a disagreement.

Thursday, 25 June 2020

Draw a Gantt chart of how your workload fits together


In this post I try to provide a different view on ASH.
Again I'm focused more on a specific user experience than on a general system overview.
The idea again is from Cary Millsap - for details see this YouTube video.

This time I rely on a script from my post Oceans, Islands, and Rivers in ASH and will show how it can provide a Gantt diagram of a specific user experience.

I created 2 dummy experiences - both by user A. With a small extension of yesterday's script and a filter on only this USER_ID and only current samples, 
...
WHERE 1=1 
  AND USER_ID=111
  AND SAMPLE_ID > 293910
...
the result is

It shows 17 different sessions which were active at some time. With some experience and good imagination, the 2 different experiences can be separated. 
But as my test cases are properly instrumented, with an additional filter it's easy to show only the one experience I'm interested in.
...
WHERE 1=1 
  AND USER_ID=111
  AND SAMPLE_ID > 293910
  AND CLIENT_ID='R1'
...

This shows only those 7 lines which match my CLIENT_ID='R1' user experience. 
It also shows 4 parallel processes in lines 3-6; 2 of them completed slightly faster. 
Because of these parallel processes, the user experience is 816 seconds, but DB time is 1195 seconds! (Another reason why simple aggregates on ASH are often misleading.) 
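The difference is easy to reproduce with a simple aggregate on the same filters. A minimal sketch, assuming the default in-memory sample interval of 1 second (so COUNT(*) approximates DB seconds and the SAMPLE_ID span approximates wall-clock seconds):

SELECT COUNT(*)                            AS db_time_seconds
     , MAX(sample_id) - MIN(sample_id) + 1 AS wall_clock_seconds
FROM   gv$active_session_history
WHERE  user_id   = 111
  AND  sample_id > 293910
  AND  client_id = 'R1'
;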

This result can be used as a starting point for further investigation - often together with the people responsible for the application. 

The full statement is below; GANTT_LENGTH defines how many characters wide the Gantt diagram should be for better visibility.
WITH CONST as (
SELECT /*+ qb_name(CONST) */
    100 as GANTT_LENGTH -- unit: characters
  , 1   as RIVER_WIDTH  -- unit: sample_periode of ASH. --everything else IDLE is an OCEAN
FROM DUAL  
) ,FILTER_FIRST as (
SELECT /*+ qb_name(FILTER_FIRST) */  ash.* 
FROM gv$active_session_history ash
WHERE 1=1 
--  AND USER_ID=123
--  AND ....
--  AND SAMPLE_ID < 487871
--  AND CLIENT_ID='R1'
), ISLANDS as (
SELECT /*+ qb_name(ISLANDS) */
      min(BEGIN_SAMPLE_ID) OVER () total_min_sample_ID
   ,  max(END_SAMPLE_ID) OVER () total_max_sample_ID      
   ,   BEGIN_SAMPLE_ID
   ,  END_SAMPLE_ID
   ,  END_SAMPLE_ID - BEGIN_SAMPLE_ID +1  as ISLAND_LENGTH
   ,  ACTIVE_COUNT
   ,  inst_id
   ,  session_id
   ,  session_serial#
FROM FILTER_FIRST ff
       MATCH_RECOGNIZE(
         PARTITION BY inst_id, session_id, session_serial#
         ORDER BY SAMPLE_ID 
         MEASURES 
           first(SAMPLE_ID) as BEGIN_SAMPLE_ID,
           LAST(sample_id)  as END_SAMPLE_ID,
           COUNT(sample_id) as ACTIVE_COUNT
         ONE ROW PER MATCH
         PATTERN( frst cont*)
         DEFINE cont as SAMPLE_ID - prev(SAMPLE_ID) <= (SELECT RIVER_WIDTH FROM CONST)
       )  
), DERIVED as (
SELECT /*+ qb_name (DERIVED) */ 
 (select (total_max_sample_id - total_min_sample_id) / ( SELECT GANTT_LENGTH FROM CONST)
   FROM ISLANDS
   WHERE rownum =1) as divisor
FROM DUAL
)
SELECT i.begin_sample_id
     , i.end_sample_id
     , i.inst_id
     , i.session_id
     , i.session_serial#
     , rpad('>', trunc( (begin_sample_id - total_min_sample_ID)/ d.divisor ,1)+1, ' ') || 
         rpad('*',ceil(island_length/ d.divisor) ,'*') AS GANTT
from ISLANDS i
   , DERIVED d
WHERE     END_SAMPLE_ID - BEGIN_SAMPLE_ID +1 > 2   
ORDER BY BEGIN_SAMPLE_ID, ISLAND_LENGTH, INST_ID, SESSION_ID
/
and is also available on github.

Wednesday, 24 June 2020

Oceans, Islands, and Rivers in ASH

The idea of Oceans, Islands and Rivers in performance data from Oracle sessions comes from Cary Millsap. I recommend reading his presentation - especially page 33 onwards. 
To summarize it very briefly: in an ocean of idle states (where the application doesn't interact with the session at all) there are islands of activity; but within these islands, there can still be rivers of idle events - when there is communication between the server process and the application. 

Even though the algorithm was defined for trace data, I'm sure it can be implemented based on ASH data. Of course, the resolution isn't as fine as with trace data, but if only big islands are of interest, and these islands don't have too many rivers, sampled ASH should be acceptable.

This can be done quite easily with a statement like the one below.
I tried to keep it simple by having a CONST CTE at the beginning, where the RIVER_WIDTH can be set. It defines how many idle (non-visible) samples make up the widest river; everything greater is an ocean which separates islands. 
In the 2nd CTE, FILTER_FIRST, additional filters can be applied to ASH. Everything which narrows the focus can help, like USER_ID, SAMPLE_ID or timestamps. The idea matches proper scoping in SQL*Trace.

WITH CONST as (
SELECT /*+ qb_name(QB_CONST) */
    100 as GANTT_LENGTH -- unit: characters
  , 1   as RIVER_WIDTH  -- unit: sample_periode of ASH. --everything else IDLE is an OCEAN
FROM DUAL  
) ,FILTER_FIRST as (
SELECT /*+ qb_name(QB_FILTER_FIRST) */  ash.* 
FROM gv$active_session_history ash
WHERE 1=1 
--  AND USER_ID=123
--  AND ....
--  AND SAMPLE_ID < 487871
), ISLANDS as (
SELECT /*+ qb_Name(QB_ISLANDS) */
      min(BEGIN_SAMPLE_ID) OVER () total_min_sample_ID
   ,  max(END_SAMPLE_ID) OVER () total_max_sample_ID      
   ,   BEGIN_SAMPLE_ID
   ,  END_SAMPLE_ID
   ,  END_SAMPLE_ID - BEGIN_SAMPLE_ID +1  as ISLAND_LENGTH
   ,  ACTIVE_COUNT
   ,  inst_id
   ,  session_id
   ,  session_serial#
FROM FILTER_FIRST ff
       MATCH_RECOGNIZE(
         PARTITION BY inst_id, session_id, session_serial#
         ORDER BY SAMPLE_ID 
         MEASURES 
           first(SAMPLE_ID) as BEGIN_SAMPLE_ID,
           LAST(sample_id)  as END_SAMPLE_ID,
           COUNT(sample_id) as ACTIVE_COUNT
         ONE ROW PER MATCH
         PATTERN( frst cont*)
         DEFINE cont as SAMPLE_ID - prev(SAMPLE_ID) <= (SELECT RIVER_WIDTH FROM CONST) 
       )  
)
SELECT /*+ qb_name(QB_MAIN)*/ isl.begin_sample_id
     , isl.end_sample_id
     , isl.island_length
     , isl.active_count
     , isl.inst_id
     , isl.session_id
     , isl.session_serial#
FROM ISLANDS isl
Order by isl.begin_sample_id
        , isl.inst_id
        , isl.session_id
/
The result on my sandbox is like this:

BEGIN_SAMPLE_ID END_SAMPLE_ID ISLAND_LENGTH ACTIVE_COUNT INST_ID SESSION_ID SESSION_SERIAL#
         487760        487776            17           17       1         21            6605
         487777        487780             4            4       1        274            7693
         487781        487782             2            2       1        264           65268
         487785        487785             1            1       1        278           22792
         487791        487792             2            2       1         22           17104
         487794        487794             1            1       1         22           37292
         487795        487795             1            1       1        284           36814
         487796        487797             2            2       1        278           22792
         487798        487798             1            1       1         22           37292
         487800        487800             1            1       1         31           44098
         487800        487803             4            4       1        278           22792
         487802        487875            74           71       1         22           37292
         487807        487807             1            1       1        282            7351


BEGIN_SAMPLE_ID, END_SAMPLE_ID, INST_ID, SESSION_ID and SESSION_SERIAL# together describe every single island. They can be used later on to do specific analysis on this island. 
ISLAND_LENGTH shows the total length of the island, and ACTIVE_COUNT is the amount of "land" on this island. The lower this number, the more rivers were found. 
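The number of river samples per island falls out directly as the difference of these two columns. A minimal sketch - replace the final SELECT of the statement above with this one:

SELECT begin_sample_id
     , end_sample_id
     , inst_id
     , session_id
     , session_serial#
     , island_length - active_count AS river_samples
FROM   ISLANDS
ORDER BY river_samples DESC
/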

This can be a totally different approach to analyzing ASH data.

Tuesday, 23 June 2020

sampling everything

Oracle ASH is a great tool! 
It's on by default, and (when properly licensed) can give a lot of information about a system.
Unfortunately there are some drawbacks. One is the limited information about the user's experience when part of the work is spent outside of the instance (and its subsystems like disk I/O). Normally this part is spent on the application side. 
Oracle indicates this with the wait event SQL*Net message from client. As this wait event belongs to the wait class Idle, it's not visible in ASH for a simple reason: ASH stands for Active Session History, and idle just doesn't count as active. 
In my previous post I briefly explained why idle states can be of some interest from the user experience perspective. 

There is a parameter which transforms ASH into ALL Session History:

_ash_sample_all = TRUE

With this undocumented (and unsupported) setting, all sessions (not only the non-idle ones) are sampled, and the samples are written into the ASH memory ring buffer. 
(There are more ASH-related undocumented parameters.)
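A minimal sketch of how to enable it - underscore parameters must be double-quoted; this assumes the parameter can be changed dynamically on your version, and of course: test systems only.

ALTER SYSTEM SET "_ash_sample_all" = TRUE SCOPE = MEMORY;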

On my sandbox system a small test created these entries:

SAMPLE_ID SQL_ID        EVENT                       SEQ# WAIT_CLASS
   485161               ON CPU                         8
   485162 30kanbhk5wg4m SQL*Net message from client   32 Idle
   485163 30kanbhk5wg4m SQL*Net message from client   32 Idle
   485164 817dp83nj72zp ON CPU                        32
   485165               SQL*Net message from client   36 Idle
   485166               SQL*Net message from client   36 Idle


This corresponds quite well with the matching tracefile (filtered for SQL*Net message from client):

WAIT #139691636527808: nam='SQL*Net message from client' ela= 22382 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=740940045
WAIT #139691636521672: nam='SQL*Net message from client' ela= 217489 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=741185838
WAIT #139691636521672: nam='SQL*Net message from client' ela= 11484 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=741200014
WAIT #139691637134592: nam='SQL*Net message from client' ela= 2241071 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=743459220
WAIT #139691636521672: nam='SQL*Net message from client' ela= 11994 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=743475271
WAIT #0: nam='SQL*Net message from client' ela= 2223365 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=745699337
  
In this specific case, the event of sample_ids 485162 and 485163 (it is the same event, as SEQ# doesn't change) matches the 4th line of the trace excerpt (the one with ela= 2241071). 
The event of samples 485165 and 485166 matches the last line in the trace excerpt - ela= 2223365.



Of course, putting additional load onto the system comes with some risk. Let's consider how risky it is - especially if enabled for a short time for a specific analysis.

First, a busy system where most sessions are active doing some work. 
In this instance most of the sessions are sampled anyhow - because they are active. The additional idle samples only add a relatively small number of rows - little additional load. 

Next, a system where most sessions are idle. 
In this instance, many more rows will appear in ASH. Initially this can be seen as dangerous load. 
But all these idle sessions can become active anyhow - in the worst case all together. 
If the instance is sized for the load of most sessions being active, writing ALL sessions' information to ASH will not hurt while they are idle. If the instance is NOT sized for the sessions becoming active, the system is in high structural danger anyhow. 
 
Another detail should be taken care of: ASH is written to AWR at some point - which will lead to more storage used when the parameter is enabled for a longer time. 


When compared with DBMS_MONITOR.DATABASE_TRACE_ENABLE, both methods have a slightly different impact. When the system is busy, it will have a lot of quite fast wait events: a lot will be written to tracefiles, and many lines will appear in ASH. 
But when the system has events which last quite long (like the 2 events in my example above), they will create only one entry in the tracefile, but more lines in ASH. 

Monday, 22 June 2020

IDLE wait events can indicate work done

Oracle wait events are categorized into different wait classes. One of these classes is called Idle. 
As far as I know, wait events were defined by Oracle kernel developers. 
From their perspective, this class makes sense: there is no work to be done by this specific session / process, so the session indicates it's waiting for new requests. 
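The classification can be inspected directly; a small sketch against v$event_name:

SELECT wait_class
     , COUNT(*) AS events
FROM   v$event_name
GROUP BY wait_class
ORDER BY wait_class
/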

This classification does not help when application performance or user experience is the focus. 

Even when the server process is idle - e.g. in the event SQL*Net message from client - the application might do some work. 

Of course, from the server process perspective it cannot be decided whether somewhere some work is being done, or whether all involved processes are idle and the user fell asleep.
A slight differentiation could be made: from the server process's view, there is a difference between an open cursor (statement) for which more data can be provided, and nothing left the server process could deliver to the client. Unfortunately this is not indicated in the event. 

From the application perspective, idle events are misleading. That's why I prefer to call them timed events. 
This naming was not invented by me - I got some great help: 
 


Saturday, 20 June 2020

comparison between SQL*TRACE and ASH - get the right information

I'm quite interested in analyzing and improving performance in Oracle databases. 
There are different approaches, and a lot has been discussed already about whether tracing or sampling is appropriate - namely SQL*Trace or ASH/AWR. 
I will not favor one side here, just pick one aspect and compare both methods. 

Today's aspect is to compare how a specific user experience can be identified. As samples (ASH) are always enabled, this means filtering the right data out of a heap of data; in SQL*Trace it means scoping what or where SQL*Trace should be enabled. 

As I'm a lazy guy, I refer to Cary Millsap's Mastering Oracle Trace Data. In my copy of the book, in Chapter 3 (Tracing Oracle), he shows an ordered list of tracing preferences. I will follow this list:

  1. If the application sets a distinct client ID for each business task execution ...
    SQL*Trace can be enabled by DBMS_MONITOR.CLIENT_ID_TRACE_ENABLE
    ASH can be filtered on column CLIENT_ID (see the sketch after this list)
  2. If the application sets distinct module and action names ... 
    SQL*Trace can be enabled by DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE
    ASH can be filtered on columns MODULE and ACTION
  3. If the program runs long enough to be identified ... and still runs long enough ... 
    SQL*Trace can be enabled for this session by DBMS_MONITOR.SESSION_TRACE_ENABLE 
    ASH can be filtered on columns SESSION_ID and SESSION_SERIAL# 
  4. If there is any x such that SYS_CONTEXT('USERENV', x) ...
    SQL*Trace can be enabled by an AFTER LOGON trigger
    For ASH, use this AFTER LOGON trigger to set CLIENT_ID, MODULE or ACTION and then filter on any of those
  5. If a new service can be created for the problematic experience ... 
    SQL*Trace can be enabled by DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE
    ASH can be filtered on column SERVICE_HASH (with a little join to DBA_SERVICES)
  6. If nothing else helps ... 
    SQL*Trace can be enabled for everything for a short period of time by DBMS_MONITOR.DATABASE_TRACE_ENABLE and the user experience can be identified later
    ASH is on all the time; still, the user experience must be identified by the same investigation effort 
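As a sketch for item 1 (the client ID 'R1' is just an illustrative value):

-- scope SQL*Trace on one client ID
EXEC DBMS_MONITOR.CLIENT_ID_TRACE_ENABLE(client_id => 'R1', waits => TRUE, binds => FALSE);

-- the matching filter on ASH
SELECT *
FROM   gv$active_session_history
WHERE  client_id = 'R1';

-- disable the trace again when done
EXEC DBMS_MONITOR.CLIENT_ID_TRACE_DISABLE(client_id => 'R1');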

This comparison seems to come out equal between SQL*Trace and ASH sampling. But I know of a small but specific difference: wherever the user experience cannot be accurately scoped by any setting, it must be filtered by other means. Here SQL*Trace has a big advantage, especially when investigated by a toolset like mrskew in Method R Workbench: some activities can be identified by specific start and stop statements, e.g. getting some parameters from a configuration table with a SELECT at the beginning, and writing a trace line at the end. mrskew can be used to generate an artificial experience ID for every experience between these 2 markers. In ASH, the sampling might not catch any of these statements for a specific execution. This needs additional triggers on the statements to set a CLIENT_ID, MODULE or ACTION as in point 4 above - for sure more work in this case, especially for SELECTs.

My high-level summary on identifying the user's experience: SQL*Trace and ASH are comparable. The big advantage of ASH is its always-on implementation, whereas SQL*Trace has advantages when it's about specific SQLs as markers. 

Friday, 19 June 2020

unauditable select

 
Source: Natalie Berger
Oracle Unified Audit is a great and flexible tool. 
Unfortunately I found a little gap where a SELECT is not audited. 
It is an edge case: a common user in a CDB selects the object via the CONTAINERS() syntax.

Please follow this example to see the details. 
As I'm lazy, I used a lot of code from Tim's example.



First the preparations:
SYSTEM@CDB
CREATE USER c##common_user IDENTIFIED BY Common1 QUOTA UNLIMITED ON users;

GRANT CREATE SESSION, CREATE TABLE, CREATE VIEW, CREATE SYNONYM TO c##common_user CONTAINER=ALL;

SYSTEM@PDB
CREATE USER local_user IDENTIFIED BY Local1 QUOTA UNLIMITED ON users;

GRANT CREATE SESSION, CREATE TABLE TO local_user;

CREATE TABLE local_user.local_user_tab AS
    SELECT level AS ID
    FROM   dual
    CONNECT BY level <= 2; 

GRANT SELECT, INSERT, UPDATE, DELETE ON local_user.local_user_tab TO c##common_user;

Now let's enable & test an audit policy:
SYSTEM@PDB1
Create Audit Policy local_user_select
        ACTIONS
          ALL on local_user.local_user_tab
    ;
Audit Policy local_user_select;

select * from local_user.local_user_tab;
  ID
   1
   2
   
select event_timestamp, dbusername, action_name, object_name, sql_text, system_privilege_used
    from unified_audit_trail u
    where UNIFIED_AUDIT_POLICIES='LOCAL_USER_SELECT'
    order by u.event_timestamp desc;   

EVENT_TIMESTAMP                   DBUSERNAME       ACTION_NAME   OBJECT_NAME      SQL_TEXT                                    SYSTEM_PRIVILEGE_USED
19-JUN-20 09.36.35.912206000 AM   SYSTEM           SELECT        LOCAL_USER_TAB   select * from LOCAL_USER.LOCAL_USER_TAB     SELECT ANY TABLE
Let's continue the setup:
C##COMMON_USER@CDB
CREATE TABLE c##common_user.local_user_tab_v (id NUMBER);

C##COMMON_USER@PDB1
CREATE VIEW c##common_user.local_user_tab_v AS
    SELECT * FROM local_user.local_user_tab;
    
select * from local_user_tab_v;
  ID
   1
   2    
SYSTEM@PDB1
/

EVENT_TIMESTAMP                   DBUSERNAME       ACTION_NAME   OBJECT_NAME      SQL_TEXT                                    SYSTEM_PRIVILEGE_USED
19-JUN-20 09.42.44.379274000 AM   C##COMMON_USER   SELECT        LOCAL_USER_TAB   select * from local_user_tab_v                                                   
19-JUN-20 09.42.33.786949000 AM   C##COMMON_USER   SELECT        LOCAL_USER_TAB   CREATE VIEW c##common_user.local_use  ...   CREATE VIEW
19-JUN-20 09.36.35.912206000 AM   SYSTEM           SELECT        LOCAL_USER_TAB   select * from LOCAL_USER.LOCAL_USER_TAB     SELECT ANY TABLE

Everything fine so far.
But the following statement is not audited:
C##COMMON_USER@CDB
SELECT con_id, id
    FROM   CONTAINERS(local_user_tab_v)
    ORDER BY con_id, id;

CON_ID   ID
     3    1
     3    2

And indeed, there is nothing new in the audit trail on PDB1:
SYSTEM@PDB1
/

EVENT_TIMESTAMP                   DBUSERNAME       ACTION_NAME   OBJECT_NAME      SQL_TEXT                                    SYSTEM_PRIVILEGE_USED
19-JUN-20 09.42.44.379274000 AM   C##COMMON_USER   SELECT        LOCAL_USER_TAB   select * from local_user_tab_v                                                   
19-JUN-20 09.42.33.786949000 AM   C##COMMON_USER   SELECT        LOCAL_USER_TAB   CREATE VIEW c##common_user.local_use  ...   CREATE VIEW
19-JUN-20 09.36.35.912206000 AM   SYSTEM           SELECT        LOCAL_USER_TAB   select * from LOCAL_USER.LOCAL_USER_TAB     SELECT ANY TABLE

I can't enable a proper audit in the CDB, as the object does not exist in ALL containers:
SYSTEM@CDB
Create Audit Policy local_user_select
            ACTIONS
              ALL on local_user.local_user_tab
    Container=ALL;

          ALL on local_user.local_user_tab
                            *
ERROR at line 3:
ORA-00942: table or view does not exist
Here it's visible that there is a SELECT which is not audited properly.



But it's not totally invisible to Oracle. Performing privilege analysis in the PDB shows the activity.
After a quick sequence of
 DBMS_PRIVILEGE_CAPTURE.ENABLE_CAPTURE (on the PDB) - the SELECT (as C##COMMON_USER@CDB) - DBMS_PRIVILEGE_CAPTURE.DISABLE_CAPTURE & DBMS_PRIVILEGE_CAPTURE.GENERATE_RESULT (on the PDB), 
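As a hedged sketch of that sequence (the capture name db_pol is taken from the result below; a capture has to be created before it can be enabled):

-- on the PDB: create and enable the capture
BEGIN
  DBMS_PRIVILEGE_CAPTURE.CREATE_CAPTURE(
    name => 'db_pol',
    type => DBMS_PRIVILEGE_CAPTURE.G_DATABASE);
  DBMS_PRIVILEGE_CAPTURE.ENABLE_CAPTURE(name => 'db_pol');
END;
/
-- now run the CONTAINERS() SELECT as C##COMMON_USER@CDB, then, again on the PDB:
BEGIN
  DBMS_PRIVILEGE_CAPTURE.DISABLE_CAPTURE(name => 'db_pol');
  DBMS_PRIVILEGE_CAPTURE.GENERATE_RESULT(name => 'db_pol');
END;
/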
at least these results are available:
select * from DBA_USED_OBJPRIVS       where capture not in ('ORA$DEPENDENCY') ;
CAPTURE   SEQUENCE   OS_USER   USERHOST   MODULE                      USERNAME         USED_ROLE        OBJ_PRIV   OBJECT_OWNER   OBJECT_NAME     OBJECT_TYPE  
db_pol          1    oracle    localhost  java@localhost (TNS V1-V3)  C##COMMON_USER   C##COMMON_USER   SELECT     LOCAL_USER     LOCAL_USER_TAB  TABLE            


If you have any idea how to audit this SELECT properly, please leave a comment.