Oracle SQL Trace files are a very valuable source of information - especially when it comes to performance analysis.
But people often are afraid of the impact of tracing to performance.
Luckily, Oracle chose quite useful defaults - at least when using DBMS_MONITOR:
WAITS=>TRUE
BINDS=>FALSE
PLAN_STAT=>'FIRST_EXECUTION'
This still leaves the question about the impact of writing BINDS to the tracefile.
The best information I could find is by Cary Millsap in his book Tracing Oracle - chapter 25: Why We Recommend Not Tracing BINDS:
Here’s why that matters: every time an application binds a value to a placeholder, a BINDS=>TRUE trace causes Oracle to write 5 lines (about 180 bytes) into the trace file, using one write call per line. So, an EXEC that might consume only a few hundred microseconds when BINDS=>FALSE has to write an extra 36KB to the trace file with 2,000 extra write calls when BINDS=>TRUE:
200 placeholders × 180 bytes/placeholder = 36,000 bytes
200 placeholders × 5 lines/placeholder × 1 write/line = 1,000 writes
Even this is a great technical explanation, it does not tell us how much time it takes.
While analyzing some other questions, a tracefile with a lot of BIND entries was generated - and it significantly slowed down the application. Not a problem in this case, but something worth to check.
The tracefile I generated covers ~117 seconds - 1 minutes and 57 seconds.
mrskew just doesn't know about some of this time:
CALL-NAME DURATION % CALLS MEAN MIN MAX
--------------------------- ---------- ------ ------ -------- -------- --------
EXEC 91.022874 89.9% 4,398 0.020696 0.000000 0.093324
SQL*Net message from client 6.423181 6.3% 4,398 0.001460 0.000334 0.518238
log file sync 3.633550 3.6% 4,397 0.000826 0.000372 0.013874
enq: SQ - contention 0.070351 0.1% 63 0.001117 0.000007 0.003597
library cache: mutex X 0.040133 0.0% 6 0.006689 0.000022 0.015928
cursor: pin S wait on X 0.027345 0.0% 3 0.009115 0.003581 0.019512
SQL*Net message to client 0.015619 0.0% 4,398 0.000004 0.000002 0.000079
CLOSE 0.014905 0.0% 4,405 0.000003 0.000000 0.000079
enq: TX - index contention 0.012613 0.0% 11 0.001147 0.000036 0.005059
buffer busy waits 0.008619 0.0% 184 0.000047 0.000003 0.002146
14 others 0.019216 0.0% 4,486 0.000004 0.000000 0.007130
--------------------------- ---------- ------ ------ -------- -------- --------
TOTAL (24) 101.288406 100.0% 26,749 0.003787 0.000000 0.518238
mrskew only aggregates waits and dbcalls but doesn't care about the time "in between". We can assume the missing 16 seconds are the amount of time used to write the binds - and call it a day.
mrprof shows some more information in the profile:
Subroutine | Duration | Calls | Duration per call (seconds) | |||||
---|---|---|---|---|---|---|---|---|
seconds | % R | mean | min | max | skew | |||
1 | CPU: EXEC dbcalls | 92.562280 | 78.9% | 17,710 | 0.005227 | · | 0.062486 | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆█▁▁▁▁▁ |
2 | unaccounted-for within dbcalls | 9.062352 | 7.7% | 39,840 | 0.000227 | −0.025781 | 0.026413 | ▁▁▁▁▁▂▆▁▁▁▁▁▁▁▂█▄▁▁▁▁▁ |
3 | SQL*Net message from client | 6.423181 | 5.5% | 4,398 | 0.001460 | 0.000334 | 0.518238 | ▁▁▁▄█▁▂▁▁▁▁ |
4 | log file sync | 3.633550 | 3.1% | 4,397 | 0.000826 | 0.000372 | 0.013874 | ▁▁▁█▄▁▁▁▁▁▁ |
5 | 25 other subroutines | 5.663594 | 4.8% | 51,276 | 0.000110 | · | 0.019512 | ▁▁▁▁▁▁▁▁▁▁▁▁▁▄█▂▁▁▁▁▁▁ |
6 | Total (29) | 117.344957 | 100.0% | 117,621 | 0.000998 | −0.025781 | 0.518238 |
There 9 seconds are called unaccounted-for within dbcalls - the explanation how this artificial component is calculated is
mrprof synthesizes an “unaccounted-for between calls” call when there is missing time between dep = 0 database and system calls. Every adjacent pair of such calls will define some non-zero “unaccounted-for between calls” duration
That is somehow useful, but there is also a different approach possible.
In the tracefile, I don't have timing information for the BINDS, but at least I have some activity before and afterwards. e.g.
PARSE #1795421110760:c=32,e=32,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=0,tim=968215712176
BINDS #1795421110760:
Bind#0
oacdty=02 mxl=22(06) mxlc=00 mal=00 scl=00 pre=00
oacflg=10 fl2=0001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=881f7430 bln=22 avl=06 flg=09
value=6607304129
...
Bind#2
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=00 fl2=1000001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=06881698 bln=22 avl=02 flg=05
value=1
EXEC #1795421110760:c=605,e=604,p=0,cr=0,cu=4,mis=0,r=1,dep=1,og=4,plh=0,tim=968215713003
I see the EXEC completed at 968215713003 and had a duration of 604 microseconds - so it started at 968215712399. The last call before was a PARSE at 968215712176. This leaves a gap of 223 μs. Not a lot, but worth to care. To make this time somehow visible, I have a little perl script :
#!/usr/bin/perl -w
#bind_synt_wait.pl
# creating a synthetic bind "WAIT" for otherwise UFBCs
use strict;
my $prev_cursor = 0 ;
my $cursor = 0 ;
my $prev_timestamp = 0 ;
my $timestamp = 0 ;
my $calc_timestamp = 0 ;
my $ela = 0 ;
my $calc_ela = 0 ;
my $to_be_processed = 0 ;
# 0 => nothing to do,
# 1 => BIND seen, but no WAIT,
# 2 => BIND & WAIT seen ==> process!
#my $tablespace;
#my $block;
#my $object_id;
while (<>) {
my $line = $_;
if ($line =~ m/.*#(\d+)[: ].*/) {
($cursor = $line) =~ s/.*#(\d+)[: ].*/$1/ ;
}
$cursor =~ tr/\n//d;
if ($line =~ m/.*tim=(\d+).*/) {
($timestamp = $line) =~ s/.*tim=(\d+).*/$1/ ;
if ($to_be_processed == 1) {
$to_be_processed = 2;
}
}
$timestamp =~ tr/\n//d;
# print " ---- timestamp = $timestamp \n";
if ($line =~ m/.*ela= (\d+) .*/) {
($ela = $line) =~ s/.*ela= (\d+) .*/$1/ ;
}
$ela =~ tr/\n//d;
if ($line =~ m/.*,e=(\d+),.*/) {
($ela = $line) =~ s/.*,e=(\d+),.*/$1/ ;
}
$ela =~ tr/\n//d;
# print " ---- ela = $ela \n";
if ($line =~ m/^BINDS/) {
$to_be_processed = 1;
}
if ($to_be_processed == 2) {
# WAIT #391723720: nam='db file scattered read' ela= 2876 file#=5 block#=129 blocks=127 obj#=71123 tim=90982740602
$calc_ela = ($timestamp - $ela) - $prev_timestamp;
if ( $calc_ela < 0) {
$calc_ela = 0 ; #$timestamp - $prev_timestamp;
}
$calc_timestamp = $timestamp-$ela;
print "WAIT #"; # as the hash is somethign special ...
print "$cursor: nam='BIND to tracefile write' ela= $calc_ela tim=$calc_timestamp\n";
$to_be_processed = 0;
}
print $line;
# will need these when generating tim & ela for BIND
$prev_cursor = $cursor;
$prev_timestamp = $timestamp;
}
which does exactly this calculation and inserts an artificial line into the tracefile:
WAIT #1795421110760: nam='BIND to tracefile write' ela= 223 tim=968215712399
Now mrskew knows about this time:
CALL-NAME DURATION % CALLS MEAN MIN MAX
--------------------------- ---------- ------ ------ -------- -------- --------
EXEC 91.022874 66.3% 4,398 0.020696 0.000000 0.093324
BIND to tracefile write 36.088810 26.3% 13,487 0.002676 0.000000 0.031517
SQL*Net message from client 6.423181 4.7% 4,398 0.001460 0.000334 0.518238
log file sync 3.633550 2.6% 4,397 0.000826 0.000372 0.013874
enq: SQ - contention 0.070351 0.1% 63 0.001117 0.000007 0.003597
library cache: mutex X 0.040133 0.0% 6 0.006689 0.000022 0.015928
cursor: pin S wait on X 0.027345 0.0% 3 0.009115 0.003581 0.019512
SQL*Net message to client 0.015619 0.0% 4,398 0.000004 0.000002 0.000079
CLOSE 0.014905 0.0% 4,405 0.000003 0.000000 0.000079
enq: TX - index contention 0.012613 0.0% 11 0.001147 0.000036 0.005059
15 others 0.027835 0.0% 4,670 0.000006 0.000000 0.007130
--------------------------- ---------- ------ ------ -------- -------- --------
TOTAL (25) 137.377216 100.0% 40,236 0.003414 0.000000 0.518238
and also the profile shows my artificial wait event:
Subroutine | Duration | Calls | Duration per call (seconds) | |||||
---|---|---|---|---|---|---|---|---|
seconds | % R | mean | min | max | skew | |||
1 | CPU: EXEC dbcalls | 92.562280 | 78.9% | 17,710 | 0.005227 | · | 0.062486 | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆█▁▁▁▁▁ |
2 | BIND to tracefile write | 36.088810 | 30.8% | 13,487 | 0.002676 | · | 0.031517 | ▁▁▁▂▃█▁▁▁▁▁ |
3 | SQL*Net message from client | 6.423181 | 5.5% | 4,398 | 0.001460 | 0.000334 | 0.518238 | ▁▁▁▄█▁▂▁▁▁▁ |
4 | log file sync | 3.633550 | 3.1% | 4,397 | 0.000826 | 0.000372 | 0.013874 | ▁▁▁█▄▁▁▁▁▁▁ |
5 | unaccounted-for within dbcalls | −25.605864 | −21.8% | 39,840 | −0.000643 | −0.038324 | 0.026413 | ▁▁▁▁▁█▇▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁ |
6 | 25 other subroutines | 4.243000 | 3.6% | 64,522 | 0.000066 | −0.016385 | 0.019512 | ▁▁▁▁▁▁▁▁▁▁▁▁▁▄█▂▁▁▁▁▁▁ |
7 | Total (30) | 117.344957 | 100.0% | 144,354 | 0.000813 | −0.038324 | 0.518238 |
But unfortunately, mrskews TOTAL is now bigger than 117 sec, and mrprof needs to introduce negative unaccounted-for within db calls to compensate that.
The problem in this situation is the different dept and how their EXEC dbcalls are aggregated - and the artificial BIND to tracefile write doesn't fit well. It would required further modifications which would lead to further problems - and are probably not worth.
At the end, it would be fine if Oracle could also instrument the code for writing BIND information to the tracefile - this could eliminate all this manual work. Until then, it might sometimes help to slightly improve a tracefile with data already hidden inside.
Keine Kommentare:
Kommentar veröffentlichen