2022-09-12

indirect partition pruning

The nice and important thing to do when going to a conference is - beside consuming all the great presentations - talk to other attendees and exchange experiences, current issues and probable solutions. 
I am very happy I attended POUG2022! This conference is one of the best in my opinion! 


During one of the discussions I claimed Oracle can - under specific circumstances - do a partition pruning, even when the partition key isn't somewhere defined as a filter. 

It's probably easiest to show it by a simple example.

First some data to play with
CREATE TABLE RAW_DATA
   (	"ID"      NUMBER NOT NULL ENABLE, 
       	"SNAP_ID" NUMBER NOT NULL ENABLE, 
	"EVENT_TIME"  DATE   NOT NULL ENABLE, 
	"PAD1"        VARCHAR2(4000 BYTE)
   );
   
insert into RAW_DATA
select rownum
     , 1+trunc(rownum/99)
     , sysdate - (rownum/99)
     , rpad( to_char(rownum), 500, '*')
from dual
connect by level <= 10000;
commit;

CREATE TABLE SNAP 
   (	"SNAP_ID"       NUMBER NOT NULL ENABLE, 
	"BEGIN_INTERVAL_TIME" DATE NOT NULL ENABLE, 
	"END_INTERVAL_TIME"   DATE NOT NULL ENABLE, 
	 CONSTRAINT "SNAP__PK" PRIMARY KEY ("SNAP_ID") ENABLE, 
	 CONSTRAINT "SNAP__BEGIN_U" UNIQUE ("BEGIN_INTERVAL_TIME") ENABLE, 
	 CONSTRAINT "SNAP__END_U" UNIQUE ("END_INTERVAL_TIME") ENABLE, 
	 CONSTRAINT "SNAP__TIME_ASC" CHECK (begin_interval_time < end_interval_time) ENABLE --> -- for HTML parser
   )
; 

INSERT INTO SNAP
SELECT SNAP_ID, min(event_time), max(event_time) 
from RAW_DATA
group by SNAP_ID;

commit; -- 102 rows inserted

CREATE TABLE PART
   (	"ID"     NUMBER NOT NULL ENABLE, 
	"SNAP_ID"    NUMBER NOT NULL ENABLE, 
	"EVENT_TIME" DATE   NOT NULL ENABLE, 
	"PAD1" VARCHAR2(4000 BYTE) NOT NULL ENABLE, 
	 CONSTRAINT "PART_PK" PRIMARY KEY ("ID")  ENABLE
   ) 
  PARTITION BY LIST ("SNAP_ID") AUTOMATIC 
 (PARTITION "P1"  VALUES (1) SEGMENT CREATION DEFERRED )
 ;
 
select Partition_Name from user_tab_partitions where table_name='PART';

PARTITION_NAME                                                                                                                  
-------------- 
P1
SYS_P529
SYS_P530
...
SYS_P627
SYS_P628
SYS_P629

102 rows selected. 

The partitions are not that big, the data quite boring, but it's sufficient data to play around. 

Now I want to query a row from PART for a specific time-range. as the field EVENT_TIME is not the partition key, the simple query goes like this:


-- SQL1 
SELECT /*+ gather_plan_statistics */ p.* 
from PART p 
WHERE EVENT_TIME between sysdate-(1+1/98) and sysdate-1
;


Plan hash value: 2974113857

------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Pstart| Pstop | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |       |       |      1 |00:00:00.01 |    7007 |
|*  1 |  FILTER             |      |      1 |        |       |       |      1 |00:00:00.01 |    7007 |
|   2 |   PARTITION LIST ALL|      |      1 |      1 |     1 |   102 |      1 |00:00:00.01 |    7007 |
|*  3 |    TABLE ACCESS FULL| PART |    102 |      1 |     1 |   102 |      1 |00:00:00.01 |    7007 |
------------------------------------------------------------------------------------------------------


In this simple query all partitions are scanned (Pstart => 1 to Pstop => 102). 
Let's see if we can use SNAP to make the query more efficient: 

The simplest query makes it even more expensive: 


-- SQL2
select p.* 
from snap sn, part p
where sn.snap_id = p.snap_id
  and p.event_time between sysdate-(1+1/98) and sysdate-1 ;

Plan hash value: 2574539144

------------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |          |       |       | 27817 (100)|          |       |       |
|*  1 |  FILTER                   |          |       |       |            |          |       |       |
|   2 |   NESTED LOOPS            |          |     1 |  2050 | 27817   (1)| 00:00:02 |       |       |
|   3 |    INDEX FAST FULL SCAN   | SNAP__PK |   102 |  1326 |     2   (0)| 00:00:01 |       |       |
|   4 |    PARTITION LIST ITERATOR|          |     1 |  2037 |   273   (1)| 00:00:01 |   KEY |   KEY |
|*  5 |     TABLE ACCESS FULL     | PART     |     1 |  2037 |   273   (1)| 00:00:01 |   KEY |   KEY |
------------------------------------------------------------------------------------------------------


But we are missing something:

alter table PART
  add constraint PART__SNAP_FK foreign key("SNAP_ID") references "SNAP"("SNAP_ID")
;

-- rerun SQL2 

Plan hash value: 2974113857

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        | 27818 (100)|       |       |      1 |00:00:00.01 |    7007 |
|*  1 |  FILTER             |      |      1 |        |            |       |       |      1 |00:00:00.01 |    7007 |
|   2 |   PARTITION LIST ALL|      |      1 |      1 | 27818   (1)|     1 |   102 |      1 |00:00:00.01 |    7007 |
|*  3 |    TABLE ACCESS FULL| PART |    102 |      1 | 27818   (1)|     1 |   102 |      1 |00:00:00.01 |    7007 |
-------------------------------------------------------------------------------------------------------------------


Now we are back at the original plan - but no progress is made.
Next try: provide more information about the correlation between SNAP_IDs and dates in PART and SNAP:

select /*+ gather_plan_statistics */ p.* 
from snap sn, part p
where sn.snap_id = p.snap_id
  and p.event_time between sysdate-(1+1/98) and sysdate-1 
  and p.event_time between sn.begin_interval_time and sn.end_interval_time
;


Plan hash value: 3865632887

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | Cost (%CPU)| Pstart| Pstop | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        | 27819 (100)|       |       |      1 |00:00:00.01 |      74 |
|*  1 |  FILTER                       |          |      1 |        |            |       |       |      1 |00:00:00.01 |      74 |
|   2 |   NESTED LOOPS                |          |      1 |      1 | 27819   (1)|       |       |      1 |00:00:00.01 |      74 |
|   3 |    NESTED LOOPS               |          |      1 |      1 | 27819   (1)|       |       |      1 |00:00:00.01 |      73 |
|   4 |     PARTITION LIST SUBQUERY   |          |      1 |      1 | 27818   (1)|KEY(SQ)|KEY(SQ)|      1 |00:00:00.01 |      72 |
|*  5 |      TABLE ACCESS FULL        | PART     |      1 |      1 | 27818   (1)|KEY(SQ)|KEY(SQ)|      1 |00:00:00.01 |      70 |
|*  6 |     INDEX UNIQUE SCAN         | SNAP__PK |      1 |      1 |     0   (0)|       |       |      1 |00:00:00.01 |       1 |
|*  7 |    TABLE ACCESS BY INDEX ROWID| SNAP     |      1 |      1 |     1   (0)|       |       |      1 |00:00:00.01 |       1 |
---------------------------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(SYSDATE@!-1>=SYSDATE@!-1.01020408163265306122448979591836734694)
   5 - filter(("P"."EVENT_TIME">=SYSDATE@!-1.01020408163265306122448979591836734694 AND
              "P"."EVENT_TIME"<=SYSDATE@!-1))
   6 - access("SN"."SNAP_ID"="P"."SNAP_ID")
   7 - filter(("P"."EVENT_TIME">="SN"."BEGIN_INTERVAL_TIME" AND "P"."EVENT_TIME"<="SN"."END_INTERVAL_TIME"
              AND "SN"."END_INTERVAL_TIME">=SYSDATE@!-1.01020408163265306122448979591836734694 AND
              "SN"."BEGIN_INTERVAL_TIME"<=SYSDATE@!-1))
-- >

D'OH! Now the costs increased - by the additional effort shows less Buffers! And also Pstart and Pstop show thePartitions begin and end at KEY(SQ) (no 1 .. 102 anymore). 

That's what I tried to achieve: indirect partition pruning based on a key stored in another table. 

Unfortunately the nasty 
and p.event_time between sn.begin_interval_time and sn.end_interval_time
is required as it can't be defined as an ASSERTION - even it is defined in SQL-92 already which is 30 years old. 
If you would like to see ASSERTIONs in Oracle, please upvote this Idea!