2008-12-17

Doing it better ... (Part 1)

I never thought something like this could be worth a blog entry, but as Tom Kyte can do this, I can also! ;-)

I had the pleasure to assist in a little project as a DBA.
The pleasure was mine, as the target (in response-time) was outstanding (in my current working circumstances) and the setup was done to achieve this in a more or less professional manner (which much more outstanding ...), so we had testcases, a kind of change-management, really load-tests, totally cooperative designers/developers. Great!

The goal of the whole thing was to 'lock' and 'free' available resources out of a pool (according to some additional 'quality'). The example I'm writing is about the process 'provide the next free ressource out of the given pool'. It should come back in about 10ms.

Here the Iterations we came through:

1) primary code:


SELECT resource into v_resource
FROM RESOURCE_POOL
WHERE v_time>FREED_TIME
AND STATUS='FREE'
AND POOL=v_pool
AND ROWNUM=1
FOR UPDATE;
UPDATE RESOURCE_POOL set
STATUS='INUSE',
FREED_TIME=null,
START_TIME=current_timestamp,
WHERE RESOURCE=v_resource;

with this comment beside these statements:
-- FOR UPDATE statement ensures that the selected row is locked until the transition is completed --
-- open 2 different sqlplus commands, issue the command and watch the last command. it waits until update is completed (or commit) on the first session
-- FOR UPDATE NOWAIT would report an error if the row cant be locked immediately --
-- FOR UPDATE SKIP LOCKED pauses other db queries - this could be the perf bottleneck ...


BRRR! Even with an index in RESOURCE_POOL.RESOURCE it's not really optimal. And, the hard fact is, it's to slow.

2) after some iterations we came to:

UPDATE RESOURCE_POOL set
STATUS='INUSE',
FREED_TIME=null,
SESSION_START_TIME=current_timestamp,
WHERE v_time>FREED_TIME
AND STATUS='FREE'
AND RESOURCE_POOL=v_pool
AND ROWNUM=1
AND v_time>FREED_TIME
RETURNING RESOURCE INTO v_resource;


As we are counting miliseconds, this WAS an advantage. And for me personally, it's more beautiful, even this is not really a tuning-goal.

Beside some tests on the physical layout (we came to the conclusion to use partitions by resource pools and local indexes on them) this saved us enough time so we could stop tuning.
Maybe I also have to tell all the data is small enough to fit into the buffer_pool.
And for the Records, it's 10.2.0.4 EE on HP-UX (Itanium), Failover ServiceGuard.

4 Kommentare:

Flado hat gesagt…

Aren't you in fact serializing at the POOL level instead of at individual RESOURCE level? I mean, all your sessions will try to reserve the same resource (they don't see each other's changes until committed, and they use the same plan) - the one that happens to be selected first (rownum=1).
If that's the case, the first one will come back in 10ms, but what about the 10th session?
That's where the SKIP LOCKED would prove handy, methinks. Have you checked that it does in fact become "the perf bottleneck"?
Failing that, you could design some kind of scheme that would allow you to select different eligible resources in each session (maybe sub-dividing pools into ranges and hashing the SID to determine a range).

Flado hat gesagt…
Dieser Kommentar wurde vom Autor entfernt.
Flado hat gesagt…

I made some tests and came up with this:
"Skip locked" does not work as intuitively expected with rownum and is undocumented.
The single-statement update serializes on the pool level, which means your transactions must be very short to decrease contention.

This serializes on the individual resource and can therefore work for longer transactions as well (sorry, no code tags allowed):

for x in (
SELECT "RESOURCE"
FROM RESOURCE_POOL
WHERE v_time>FREED_TIME
AND STATUS='FREE'
AND POOL=v_pool for update skip locked)
loop
v_resource:=x."RESOURCE";
exit;
end loop;
UPDATE RESOURCE_POOL set
STATUS='INUSE',
FREED_TIME=null,
START_TIME=sysdate
WHERE "RESOURCE"=v_resource;

Depending on how the app uses this "resource allocator", it may be feasible to just keep a resource's row locked while it is in use - and get rid of the STATUS column altogether...

Thanks for the post, I learned quite a bit from it :-) !

Schöne Feiertage!

Martin Berger hat gesagt…

flado,
we had no serializing problems, as the application had initial only one (!) connection, at some stress-tests, there where only 4 concurrent connections. So serialization problems came never to our mind an where not visible during our testcases.
My goal was to avoid any unnecessary switch between SQL and PL/SQL engine, and also to keep (soft)parsing short.
I never thought about SKIP LOCKED, and I'm glad I hadn't to do, as it looks complex in my environment.
Maybe you just want to summarise your learings somewhere, as I'm really searching for additional informations, anywhere ;-)