Showing posts with the label SQL_TRACE. Show all posts

2024-02-08

reading a LOB in another session


Reading a LOB in Oracle is often quite easy: it's just another column in the query, and the client libraries available today do all the work necessary to make the LOB's content appear in the desired variable. What a nice world.

Unfortunately, under the hood it is still a LOB and requires special methods to fetch the data. And precisely because the libraries hide this work, LOBs are often perceived as slow.

If fast access to many rows with many LOBs is required, leaving all the work to a single session might not meet the requirements. But LOBs can also be fetched in other sessions - kind of in parallel.
(This must NOT be confused with parallel queries - instead, separate sessions do the LOB interactions while the main session processes the table queries.)

As the specific implementation will be driven by the requirements, my example here only shows what's possible.

I have a very small table with only 1 row:


create table T (id number, B clob);

insert into T values (1, 'aäoöuüsß');

commit;

A small Perl program selects the row in one session and reads its content in another one.
(Please do NOT use this script as a blueprint!)

The relevant parts are:

my $sql = "SELECT id, B LOB_LOCATOR FROM T WHERE ROWNUM = 1";
my $sth = $dbh->prepare($sql, { ora_auto_lob => 0 })
    or die "Couldn't prepare statement: $DBI::errstr";
$sth->execute() or die "Couldn't execute statement: $DBI::errstr";

# Get the first row
my ($id, $lob_locator) = $sth->fetchrow_array();
print "id: $id \n";

# Close the first connection
$sth->finish;
$dbh->disconnect;

print "ll: $lob_locator \n";
print " --- \n";

Here the row - including the LOB locator - was fetched in the first session and printed to stdout ...
 and
$dbh2 = DBI->connect("dbi:Oracle:host=$hostname;port=$port;service_name=$service_name", $username, $password)
    or die "Couldn't connect to database: $DBI::errstr";

   my $chunk_size = 1034;   # Arbitrary chunk size, for example
   my $offset = 1;   # Offsets start at 1, not 0
   my $data = $dbh2->ora_lob_read( $lob_locator, $offset, $chunk_size ) ;
   print "lob: $data \n";
where the LOB's content is read and printed. The execution looks like:
[oracle@localhost ~]$ perl lob.pl
id: 1
ll: OCILobLocatorPtr=SCALAR(0x2b388a8)
 ---
lob: aaoouus?
[oracle@localhost ~]$
All fine here.

To be sure no fancy session cache ruined my testcase, I enabled SQL trace beforehand with

 exec dbms_monitor.database_trace_enable();
and so I got 2 tracefiles: FREE_ora_70047.trc with the SELECT

=====================
PARSING IN CURSOR #140276220704640 len=48 dep=0 uid=131 oct=3 lid=131 tim=22284540551 hv=1723986746 ad='61d7dbe0' sqlid='c2zwqytmc3wtu'
SELECT id, B LOB_LOCATOR FROM T WHERE ROWNUM = 1
END OF STMT
and FREE_ora_70050.trc with the LOBREAD

LOBREAD: type=PERSISTENT LOB,bytes=8,c=1934,e=1693,p=0,cr=1,cu=0,tim=22284648543
WAIT #0: nam='SQL*Net message from client' ela=3182 driver id=1952673792 #bytes=1 p3=0 obj#=4294967295 tim=22284651786
XCTEND rlbk=0, rd_only=1, tim=22284651919
All the files can be found here.
This shows how we can read a LOB locator's content in another session. That somehow conflicts with the documentation, which states:

LOB locator points to the actual LOB contents. The locator is returned when you retrieve the LOB, not the LOB's contents. LOB locators cannot be saved in one transaction or session and used again in a later transaction or session.

but I assume this restriction is about the session context on the server, not about which client connection does the read.

This little example shows that a LOB's content CAN be read in other sessions: the LOB locator is fetched in a main query, and the LOB handling is offloaded to other connections.

2023-12-09

The cost of BIND information in SQL Trace files

 Oracle SQL Trace files are a very valuable source of information - especially when it comes to performance analysis. 

But people are often afraid of the performance impact of tracing.

Luckily, Oracle chose quite useful defaults - at least when using DBMS_MONITOR:

WAITS=>TRUE

BINDS=>FALSE

PLAN_STAT=>'FIRST_EXECUTION'


This still leaves the question about the impact of writing BINDS to the tracefile. 

The best information I could find is by Cary Millsap in his book Tracing Oracle - chapter 25: Why We Recommend Not Tracing BINDS:

Here’s why that matters: every time an application binds a value to a placeholder, a BINDS=>TRUE trace causes Oracle to write 5 lines (about 180 bytes) into the trace file, using one write call per line. So, an EXEC that might consume only a few hundred microseconds when BINDS=>FALSE has to write an extra 36KB to the trace file with 1,000 extra write calls when BINDS=>TRUE

200 placeholders × 180 bytes/placeholder = 36,000 bytes

200 placeholders × 5 lines/placeholder × 1 write/line = 1,000 writes
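A quick check of this arithmetic (numbers taken from the quote above):

```python
# Figures from the quote: a statement with 200 placeholders,
# ~180 bytes and 5 lines written per bound placeholder.
placeholders = 200
bytes_per_placeholder = 180
lines_per_placeholder = 5

extra_bytes = placeholders * bytes_per_placeholder   # 36,000 bytes, i.e. ~36 KB
extra_writes = placeholders * lines_per_placeholder  # 1,000 extra write calls

print(extra_bytes, extra_writes)
```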

Even though this is a great technical explanation, it does not tell us how much time it takes.

While analyzing some other questions, a tracefile with a lot of BIND entries was generated - and it significantly slowed down the application. Not a problem in this case, but something worth checking.

The tracefile I generated covers ~117 seconds - 1 minute and 57 seconds.

mrskew just doesn't know about some of this time:


CALL-NAME                      DURATION       %   CALLS      MEAN       MIN       MAX
---------------------------  ----------  ------  ------  --------  --------  --------
EXEC                          91.022874   89.9%   4,398  0.020696  0.000000  0.093324
SQL*Net message from client    6.423181    6.3%   4,398  0.001460  0.000334  0.518238
log file sync                  3.633550    3.6%   4,397  0.000826  0.000372  0.013874
enq: SQ - contention           0.070351    0.1%      63  0.001117  0.000007  0.003597
library cache: mutex X         0.040133    0.0%       6  0.006689  0.000022  0.015928
cursor: pin S wait on X        0.027345    0.0%       3  0.009115  0.003581  0.019512
SQL*Net message to client      0.015619    0.0%   4,398  0.000004  0.000002  0.000079
CLOSE                          0.014905    0.0%   4,405  0.000003  0.000000  0.000079
enq: TX - index contention     0.012613    0.0%      11  0.001147  0.000036  0.005059
buffer busy waits              0.008619    0.0%     184  0.000047  0.000003  0.002146
14 others                      0.019216    0.0%   4,486  0.000004  0.000000  0.007130
---------------------------  ----------  ------  ------  --------  --------  --------
TOTAL (24)                   101.288406  100.0%  26,749  0.003787  0.000000  0.518238


mrskew only aggregates waits and dbcalls but doesn't account for the time "in between". We could assume the missing 16 seconds are the time spent writing the binds - and call it a day.
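The "missing" time follows directly from the two totals (117.344957 s covered by the tracefile versus mrskew's aggregated 101.288406 s):

```python
# Totals taken from the outputs above
file_duration = 117.344957   # seconds covered by the tracefile
mrskew_total  = 101.288406   # sum of all waits and dbcalls mrskew sees

missing = file_duration - mrskew_total
print(round(missing, 6))     # roughly 16 seconds unaccounted for
```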

mrprof shows some more information in the profile:

 #  SUBROUTINE                      DURATION (s)       R    CALLS       MEAN        MIN       MAX  SKEW
 1  CPU: EXEC dbcalls                  92.562280   78.9%   17,710   0.005227          ·  0.062486  ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆█▁▁▁▁▁
 2  unaccounted-for within dbcalls      9.062352    7.7%   39,840   0.000227  −0.025781  0.026413  ▁▁▁▁▁▂▆▁▁▁▁▁▁▁▂█▄▁▁▁▁▁
 3  SQL*Net message from client         6.423181    5.5%    4,398   0.001460   0.000334  0.518238  ▁▁▁▄█▁▂▁▁▁▁
 4  log file sync                       3.633550    3.1%    4,397   0.000826   0.000372  0.013874  ▁▁▁█▄▁▁▁▁▁▁
 5  25 other subroutines                5.663594    4.8%   51,276   0.000110          ·  0.019512  ▁▁▁▁▁▁▁▁▁▁▁▁▁▄█▂▁▁▁▁▁▁
 6  Total (29)                        117.344957  100.0%  117,621   0.000998  −0.025781  0.518238

(MEAN, MIN and MAX are the duration per call in seconds.)


Here, 9 seconds are called unaccounted-for within dbcalls - the explanation of how this artificial component is calculated is:

mrprof synthesizes an “unaccounted-for between calls” call when there is missing time between dep = 0 database and system calls. Every adjacent pair of such calls will define some non-zero “unaccounted-for between calls” duration


That is somewhat useful, but a different approach is also possible.
In the tracefile, I don't have timing information for the BINDS, but at least I have some activity before and after, e.g.:


PARSE #1795421110760:c=32,e=32,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=0,tim=968215712176
BINDS #1795421110760:

 Bind#0
  oacdty=02 mxl=22(06) mxlc=00 mal=00 scl=00 pre=00
  oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=881f7430  bln=22  avl=06  flg=09
  value=6607304129
  ...
 Bind#2
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=00 fl2=1000001 frm=00 csi=00 siz=24 off=0
  kxsbbbfp=06881698  bln=22  avl=02  flg=05
  value=1
EXEC #1795421110760:c=605,e=604,p=0,cr=0,cu=4,mis=0,r=1,dep=1,og=4,plh=0,tim=968215713003  

I see the EXEC completed at tim=968215713003 with a duration of 604 microseconds - so it started at 968215712399. The last call before was a PARSE at tim=968215712176. This leaves a gap of 223 μs. Not a lot, but worth caring about. To make this time somehow visible, I have a little Perl script:


#!/usr/bin/perl -w
#bind_synt_wait.pl
# creating a synthetic bind "WAIT" for otherwise UFBCs 
use strict;

my $prev_cursor     = 0 ;
my $cursor          = 0 ;
my $prev_timestamp  = 0 ;
my $timestamp       = 0 ;
my $calc_timestamp  = 0 ;
my $ela             = 0 ;
my $calc_ela        = 0 ;
my $to_be_processed = 0 ; 
# 0 => nothing to do, 
# 1 => BIND seen, but no WAIT, 
# 2 => BIND & WAIT seen ==> process!

#my $tablespace;
#my $block;
#my $object_id;

while (<>) {
  my $line = $_;
  if ($line =~ m/.*#(\d+)[: ].*/) {
    ($cursor = $line) =~ s/.*#(\d+)[: ].*/$1/ ;
  }
  $cursor =~ tr/\n//d;

  if ($line =~ m/.*tim=(\d+).*/) {
    ($timestamp = $line) =~ s/.*tim=(\d+).*/$1/ ;
	if ($to_be_processed == 1) {
	  $to_be_processed = 2;	
	}
  }
  $timestamp =~ tr/\n//d;
#  print " ---- timestamp = $timestamp \n";

  if ($line =~ m/.*ela= (\d+) .*/) {
    ($ela = $line) =~ s/.*ela= (\d+) .*/$1/ ;
  }
  $ela =~ tr/\n//d;
  
  if ($line =~ m/.*,e=(\d+),.*/) {
    ($ela = $line) =~ s/.*,e=(\d+),.*/$1/ ;
  }
  $ela =~ tr/\n//d;
#  print " ---- ela = $ela \n";
  
  if ($line =~ m/^BINDS/) {
    $to_be_processed = 1;
  }
  
  if ($to_be_processed == 2) {
# WAIT #391723720: nam='db file scattered read' ela= 2876 file#=5 block#=129 blocks=127 obj#=71123 tim=90982740602
    $calc_ela = ($timestamp - $ela) - $prev_timestamp;
	if ( $calc_ela < 0) {
		$calc_ela = 0 ; #$timestamp - $prev_timestamp;
	}
	$calc_timestamp = $timestamp-$ela;
    print "WAIT #"; # as the hash sign is something special ...
    print "$cursor: nam='BIND to tracefile write' ela= $calc_ela tim=$calc_timestamp\n";
    $to_be_processed = 0;
  }
  print $line;
  
  # will need these when generating tim & ela for BIND
  $prev_cursor = $cursor;
  $prev_timestamp = $timestamp;
}

which does exactly this calculation and inserts an artificial line into the tracefile:

WAIT #1795421110760: nam='BIND to tracefile write' ela= 223 tim=968215712399
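The numbers in this artificial line can be reproduced from the PARSE/EXEC excerpt shown earlier:

```python
# tim= and e= values taken from the PARSE and EXEC trace lines above
exec_tim  = 968215713003   # EXEC completed at this tim (microseconds)
exec_e    = 604            # EXEC elapsed time (e=)
parse_tim = 968215712176   # the preceding PARSE completed here

exec_start = exec_tim - exec_e        # when the EXEC really started
bind_ela   = exec_start - parse_tim   # the gap attributed to writing BINDS

print(f"WAIT: nam='BIND to tracefile write' ela= {bind_ela} tim={exec_start}")
# → WAIT: nam='BIND to tracefile write' ela= 223 tim=968215712399
```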
Now mrskew knows about this time:

    
CALL-NAME                      DURATION       %   CALLS      MEAN       MIN       MAX
---------------------------  ----------  ------  ------  --------  --------  --------
EXEC                          91.022874   66.3%   4,398  0.020696  0.000000  0.093324
BIND to tracefile write       36.088810   26.3%  13,487  0.002676  0.000000  0.031517
SQL*Net message from client    6.423181    4.7%   4,398  0.001460  0.000334  0.518238
log file sync                  3.633550    2.6%   4,397  0.000826  0.000372  0.013874
enq: SQ - contention           0.070351    0.1%      63  0.001117  0.000007  0.003597
library cache: mutex X         0.040133    0.0%       6  0.006689  0.000022  0.015928
cursor: pin S wait on X        0.027345    0.0%       3  0.009115  0.003581  0.019512
SQL*Net message to client      0.015619    0.0%   4,398  0.000004  0.000002  0.000079
CLOSE                          0.014905    0.0%   4,405  0.000003  0.000000  0.000079
enq: TX - index contention     0.012613    0.0%      11  0.001147  0.000036  0.005059
15 others                      0.027835    0.0%   4,670  0.000006  0.000000  0.007130
---------------------------  ----------  ------  ------  --------  --------  --------
TOTAL (25)                   137.377216  100.0%  40,236  0.003414  0.000000  0.518238

and also the profile shows my artificial wait event: 

 #  SUBROUTINE                      DURATION (s)       R    CALLS       MEAN        MIN       MAX  SKEW
 1  CPU: EXEC dbcalls                  92.562280   78.9%   17,710   0.005227          ·  0.062486  ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆█▁▁▁▁▁
 2  BIND to tracefile write            36.088810   30.8%   13,487   0.002676          ·  0.031517  ▁▁▁▂▃█▁▁▁▁▁
 3  SQL*Net message from client         6.423181    5.5%    4,398   0.001460   0.000334  0.518238  ▁▁▁▄█▁▂▁▁▁▁
 4  log file sync                       3.633550    3.1%    4,397   0.000826   0.000372  0.013874  ▁▁▁█▄▁▁▁▁▁▁
 5  unaccounted-for within dbcalls    −25.605864  −21.8%   39,840  −0.000643  −0.038324  0.026413  ▁▁▁▁▁█▇▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁
 6  25 other subroutines                4.243000    3.6%   64,522   0.000066  −0.016385  0.019512  ▁▁▁▁▁▁▁▁▁▁▁▁▁▄█▂▁▁▁▁▁▁
 7  Total (30)                        117.344957  100.0%  144,354   0.000813  −0.038324  0.518238

(MEAN, MIN and MAX are the duration per call in seconds.)


But unfortunately, mrskew's TOTAL is now bigger than 117 sec, and mrprof needs to introduce a negative unaccounted-for within dbcalls component to compensate for that.

The problem in this situation is the different dep levels and how their EXEC dbcalls are aggregated - the artificial BIND to tracefile write doesn't fit in well. It would require further modifications, which would lead to further problems - and are probably not worth it.

In the end, it would be nice if Oracle could also instrument the code that writes BIND information to the tracefile - this would eliminate all this manual work. Until then, it might sometimes help to slightly enrich a tracefile with data already hidden inside it.



2022-02-22

sqltrace - stitch together one EXPERIENCE ID from different tracefiles

 

Today I stumbled across an interesting question:
(slightly edited)

How can I cut one specific EXPERIENCE ID from several SQL tracefiles and merge the pieces together into one (artificial) tracefile for further processing?

For those who are familiar with SQL Trace files, the term EXPERIENCE ID might be dubious. That's for a good reason: the EXPERIENCE ID was developed by Cary Millsap and Jeff Holt to identify one specific user experience. It can be implemented in several ways - here is an example from The Method R Guide to Mastering Oracle Trace Data:


begin 
    dbms_session.session_trace_enable(true, false, 'first_execution');
	
    dbms_log.ksdwrt(1, '*** EXPERIENCE ID:('||sys_guid()||')'); 
    dbms_session.set_identifier(sys_context('userenv','session_user')); 
    dbms_application_info.set_module('OE', 'BOOK');
end;
/

select 'your application goes here' from dual
/
begin
    dbms_application_info.set_module('', '');
    dbms_session.set_identifier('');
    dbms_log.ksdwrt(1, '*** EXPERIENCE ID:()'); dbms_session.session_trace_disable();
end;
/

As you can imagine, all the tools in the Method R Workbench can deal with this artificial trace entry.

Now the initial request is still somewhat special: normally, one EXPERIENCE ID will be visible in only one tracefile. But in this particular implementation, one experience can be spread over several tracefiles due to the application's special handling of its connection pool.

There is no single tool to grab all the different pieces and glue them together:

  • Oracle's trcsess does not know about the EXPERIENCE ID, so it is of no help here in identifying one experience.
  • Method R's mrcrop can create single files for each EXPERIENCE ID, but it doesn't glue them together as required.
But both can be combined in a quite nice way:
mrcrop can filter for a specific experience with the parameters experience --id=$experience,
and trcsess can stitch together several tracefiles when they have one of its filter criteria in common.
This leads to a simple loop - sketched here in shell (the directory names in the globs are assumptions based on mrcrop's per-tracefile output directories) - which does the trick:

for experience in $experience_ids
do
  rm -rf XE_ora_*/                        # purge previous mrcrop directories
  for f in XE_ora_*.trc
  do
    mrcrop experience --id=$experience $f
  done
  trcsess output=$experience.trc service=xepdb1 XE_ora_*/XE_ora*.trc
done

By this we have one tracefile for each experience - goal achieved.

2021-01-05

convert SQL Trace to a list of database calls in chronological order with human readable timestamps

It all started so easy!
A customer found one statement in one database that stands out in Statspack reports. Unfortunately, as the application is not instrumented, it's hard to understand where the SQL comes from and what its purpose is.
To put it into context, a simple SQL Trace can do miracles.

But as SQL Trace is hard to read and has a lot of confusing content, I used mrskew and a slightly modified calls.rc to generate a chronological list of all DB calls. 

The result looks similar to this output:


          END-TIM       LINE       SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
  12568091.341788         36 5kpbj024phrws EXEC     /*Begin.Work*/ SELECT ...
  12568091.343769         42               XCTEND    
For everyone used to calculating seconds and milliseconds just by looking at them, and with the little additional information that the tracefile starts at
*** 2020-11-26 10:06:01.867
it can be no issue at all to see when exactly a call was executed, and maybe to correlate it to other logfiles of the application.
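The conversion itself is simple arithmetic - here as a small Python sketch, assuming the first tim value in the file corresponds to the *** timestamp line:

```python
from datetime import datetime, timedelta

# the "*** 2020-11-26 10:06:01.867" line, assumed to match the first tim
start_wall = datetime(2020, 11, 26, 10, 6, 1, 867000)
start_tim  = 12568091341788      # tim of the first call (microseconds)

call_tim = 12568091343769        # tim of the XCTEND in the output above
wall = start_wall + timedelta(microseconds=call_tim - start_tim)
print(wall.isoformat())          # → 2020-11-26T10:06:01.868981
```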

As I want to provide more value than relative timestamps only, my idea was to let a program do the calculations. As I struggled to do so in pure mrskew commands, I asked for some help on Stack Overflow.
There Cary confirmed my goal isn't possible with pure mrskew, but mrwhen can be of some help.

The first approach was to extend my calls.rc file to create this output:



          END-TIM       LINE       SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
tim=12568091341788         36 5kpbj024phrws EXEC     /*Begin.Work*/ SELECT ...
tim=12568091343769         42               XCTEND    
but mrwhen wasn't happy (yet):

mrwhen: no START value in file 'bx1.trc'
mrwhen: no DEP value in file 'bx1.trc'
mrwhen: file 'bx1.trc' contains no timestamp line
The START value is not documented right now, but it is easy to guess that mrwhen needs a real timestamp to correlate the tim values. The next try is:

*** 2020-11-26 10:06:01.867

          END-TIM       LINE       SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
tim=12568091341788         36 5kpbj024phrws EXEC     /*Begin.Work*/ SELECT ...
tim=12568091343769         42               XCTEND    
and now mrwhen changes its mind slightly:

mrwhen: no DEP value in file 'bx1.trc'
mrwhen: file 'bx1.trc' contains no tim-containing line  
No complaint about START anymore, but it doesn't see any tim-containing line? I'm very sure I see tim= there.

After playing around by adding dep= (as it complained about it) and e= (to provide some duration) - all without any success - it seems mrwhen wants the line to look more like a real tracefile line.

My final attempt is a file bx1.trc:


*** 2020-11-26 10:06:01.867

          END-TIM       LINE       SQL_ID CALL-NAME STATEMENT-TEXT
-----------------------------------------------------------------------------
XCTEND tim=12568091341788 e=2 dep=0          36 5kpbj024phrws EXEC     /*Begin.Work*/ SELECT ..
XCTEND tim=12568091343769 e=1 dep=0          42               XCTEND    
Now mrwhen still complains:

mrwhen: no DEP value in file 'bx1.trc'

But the result is useful now:

2020-11-26 10:06:01.867000  2020-11-26 10:06:01.867000  *** 2020-11-26 10:06:01.867

                                                                  END-TIM       LINE       SQL_ID CALL-NAME STATEMENT-TEXT
                                                        -----------------------------------------------------------------------------
2020-11-26T10:06:01.867000  2020-11-26T10:06:01.867000  XCTEND tim=12568091341788 e=2 dep=0          36 5kpbj024phrws EXEC     /*Begin.Work*/ SELECT ...
2020-11-26T10:06:01.868981  2020-11-26T10:06:01.868981  XCTEND tim=12568091343769 e=1 dep=0          42               XCTEND

For post-processing I only need to remove the prefix starting with XCTEND t, which is 45 characters long.

sed -E 's/XCTEND t.{37}//'
The remaining post-processing can be done manually. 
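To double-check the 45-character count (XCTEND t plus 37 more characters), here is the same substitution in Python, applied to one of the lines above (slightly shortened):

```python
import re

# one of the mrwhen-ready lines from above; its prefix is exactly 45 characters
line = "XCTEND tim=12568091341788 e=2 dep=0          36 5kpbj024phrws EXEC"
cleaned = re.sub(r"XCTEND t.{37}", "", line)
print(cleaned)   # → 36 5kpbj024phrws EXEC
```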

With a little additional help, I can generate the desired output.
I still have some wishes for mrskew extensions, but they are not part of this post.

2020-06-20

comparison between SQL*TRACE and ASH - get the right information

I'm quite interested in analyzing and improving performance in Oracle databases.
There are different approaches, and whether tracing or sampling is appropriate has already been discussed a lot - namely SQL*Trace versus ASH/AWR.
I will not favor one side here, just pick one aspect and compare both methods.

Today's aspect is comparing how a specific user experience can be identified. As samples (ASH) are always enabled, this means filtering the right data out of this heap of data; with SQL*Trace, it means scoping what or where SQL*Trace should be enabled.

As I'm a lazy guy, I refer to Cary Millsap's Mastering Oracle Trace Data. In my copy of the book, chapter 3 (Tracing Oracle), he shows an ordered list of tracing preferences. I will follow this list:

  1. If the application sets a distinct client ID for each business task execution ...
    SQL*Trace can be enabled by DBMS_MONITOR.CLIENT_ID_TRACE_ENABLE
    ASH can be filtered on column CLIENT_ID
  2. If the application sets a distinct module and action name ... 
    SQL*Trace can be enabled by DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE
    ASH can be filtered on columns MODULE and ACTION
  3. If the program runs long enough to be identified ... and still runs long enough ... 
    SQL*Trace can be enabled for this session by DBMS_MONITOR.SESSION_TRACE_ENABLE 
    ASH can be filtered on columns SESSION_ID and SESSION_SERIAL# 
  4. If there is any x such as SYS_CONTEXT('USERENV', x)  ...
    SQL*Trace can be enabled by an AFTER LOGON trigger
    ASH use this AFTER LOGON trigger to set CLIENT_ID, MODULE or ACTION and then filter on any of those
  5. If a new service can be created for the problematic experience ... 
    SQL*Trace can be enabled by DBMS_MONITOR.SERV_MOD_ACT_TRACE_ENABLE
    ASH can be filtered on column SERVICE_HASH (with a little join of DBA_SERVICES)
  6. If nothing else helps ... 
    SQL*Trace can be enabled for everything for a short period of time by DBMS_MONITOR.DATABASE_TRACE_ENABLE, and the user experience can be identified later
    ASH is on all the time, but the user experience must still be identified with the same investigation effort 

This comparison seems to be a tie between SQL*Trace and ASH sampling. But I know of a small yet specific difference: wherever the user experience cannot be accurately scoped by any of these settings, it must be filtered by other means. Here SQL*Trace has a big advantage, especially when investigated with a toolset like mrskew in the Method R Workbench: some activities can be identified by specific start and stop statements, e.g. getting some parameters from a configuration table with a SELECT at the beginning, and writing a trace line at the end. mrskew can be used to generate an artificial experience ID for everything between these two markers. In ASH, the sampling might not catch any of these statements for a specific execution; this requires additional triggers on the statements to set a CLIENT_ID, MODULE or ACTION as in point 4 above - for sure more work in this case, especially for SELECTs.

My high-level summary on identifying the user's experience: SQL*Trace and ASH are comparable. The big advantage of ASH is its always-on implementation, whereas SQL*Trace has advantages when it comes to using specific SQLs as markers.