My Works: Indexing Strategy and Column Order II

Having Discussed about the index skip scan, let us explore what other alternatives the optimizer has to choose and why skip scan was considered as best.Does that have an influence in the column order?...Lets Explore

The index skip scan can be disabled by using an undocumented parameter.Let us now explore what is the plan that the optimizer chooses if we disable the same.

The table and index details are the same as that was used earlier.

TEST>alter session set "_optimizer_skip_scan_enabled"= false;

Session altered.

TEST>select * from SKIP where b=60002;

Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=CHOOSE (Cost=22 Card=1 Bytes=6)

1 0
TABLE ACCESS (FULL) OF 'SKIP' (Cost=22 Card=1 Bytes=6)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
222 consistent gets
0 physical reads
0 redo size
250 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

The above shows that optimizer has choosen a full scan of the table, also note the consistent gets.

What if we force a INDEX FAST FULL SCAN

TEST>select /*+index_ffs(skip idx_skip)*/ * from SKIP where b=60001
;

Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=CHOOSE (Cost=30 Card=1 Bytes=6)

1 0
INDEX (FAST FULL SCAN) OF 'IDX_SKIP' (NON-UNIQUE) (Cost=30 Card=1 Bytes=6)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
302 consistent gets
0 physical reads
0 redo size
250 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

The full scan of the index is favoured only when the hint is specified.The optimizer did the right thing here as expected.This can be verfied by looking at the consistent gets.The blocks to be scanned for a full table scan is less compared to the blocks to be scanned for the INDEX Full scan.

Note that consistent gets is the total number of blocks scanned including the re-read of the blocks from the PGA and not the total number of table blocks or index leaf blocks.The consistent gets depends on the array size,number of rows to be fetched and the total number of blocks.If your array size is too low then it can have a huge impact on the performance.

Now enabling the SKIP SCAN

TEST>alter session set "_optimizer_skip_scan_enabled"= true;

Session altered.

TEST>select * from SKIP where b=60002;

Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=6)

1 0
INDEX (SKIP SCAN) OF 'IDX_SKIP' (NON-UNIQUE) (Cost=5 Card=1 Bytes=6)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
249 bytes sent via SQL*Net to client
276 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

The above one is with SKIP SCAN enabled, with 4-subindexes it is better to skip scan.This is the default behaviour and oracle has done the right thing here.This is because the index here acts as a thinner version of the table where it has to scan only lesser number of blocks.

But this may not be true in all cases let us see why.Based on the above observations we may agree on the following facts.

Suppose we have

x varchar2(1) = 'M', F'
y number = 10,20,30,40,50
z date = 1 million distinct values

we have a query which satisfies

x = 'M' and y = 50 and z > sysdate

we would want to create index in the following order X,Y,Z or Y,X,Z because we can scan on Z. It would be most efficient to goto the place in the index where X=M, Y=50, Z > sysdate and simply scan from there.We can gurantee that we start from here rather than reach here by creating the index in the order specified above.

If Z were first, we would read lots of X=F values or Y!= 50 values while scanning A>sysdate and then only we would reach X=M,Y=50.

This tells us that the number of distinct values in a column in a concatenated index is not relevant when considering the position in the index.But an understanding of the values in the columns and the query is very important when we decide on creating an index and decide on the column order.

This also tells us that Skip Scan Index reduce number of indexes that need to be created to satisfy all of the combinations that one could use in the where clause,but the above case that we discussed is a very specific case where the leading edge column value is of very low cardinality.

The benefit of skip scan is that you can avoid having several indexes on the same columns,saving you update time and a bit of disk space.

In general if we have a query with the following patterns

where x = :1 and y = :2 and z = :3
where z = :3
where y = :2 and z = :3

It would be best to prefer a single index (c,b,a) over any other combination as the same can easily be used for all of the above queries.

so in general, look at the questions you have, try to minimize the indexes you need based on that.Look at the predicates in general, factor out what you want and use that as a starting point.

Blog Archive

My Works

Monday 12 November 2007

Indexing Strategy and Column Order II

No comments:

About Me