Monday, October 11, 2010

Oracle Tuning Tip#14: Index Skip Scan


TOPIC:
Index Skip Scan

DEFINITION:
Index Skip Scan is one of the index access methods supported by Oracle.

Index Skip Scan can be imposed on composite index in the certain scenarios only. Composite index is otherwise called as concatenated index or multi-column index. Composite (Concatenated) index means creating an index on more than one column.

Prior to Oracle 9i version, if the leading column of an index wasn’t specified in a predicate, the index was effectively ignored by the CBO (Cost Based Optimizer). A composite index could only be used if the first column, the leading edge, of the index was referenced in the WHERE clause of a statement.

However, if the leading column isn’t referenced now, Oracle can use the index anyways via an Index Skip Scan access path (This is applicable from Oracle 9i version onwards). So, Index Skip Scan means, if a composite index is created on 2 columns where the number of distinct values in the first column is very less and the number of distinct values in the second column is very high, and if only the second column is mentioned in the WHERE clause, then there is a high possibility that Oracle will follow Index Skip Scan on this index for that respective SQL.

Index Skip Scan is applicable to both B*Tree Unique Index and B*Tree Non-Unique Index.

LITTLE-KNOWN FACTS TO BE REMEMBERED:
  • ·         You can force an index skip scan with the /*+ index_ss */ hint if oracle doesn’t impose on its own.
  • ·         Oracle can follow Index Skip Scan only in this scenario, a composite index should be created on column-A and column-B, number of distinct values of column-A should be very less (eg., Sex column because it can have either M or F irrespective of number of records in the table), number of distinct values of column-B should be very high (eg., any normal columns like, street, pincode…), only the column-B should be mentioned in the WHERE clause of the SQL, statistics should be gathered for both index and table. Only if all the above mentioned points are satisfied, then Oracle will follow index skip scan.
ADVANTAGE:
  • ·         Only minimal number of indexes are required to be created since Oracle supports Index Skip Scan. Reason is, if Index Skip Scan was not supported, take an example where a composite index was created on column-A and column-B, only if column-B was mentioned in WHERE clause, then this index couldn’t be used by Oracle. To fix this, another index should be created on column-b alone. Since Oracle supports Index skip scan, we don’t need to create a separate index on column-B alone.
  • ·         If index skip scan is followed, then we can eliminate full table scan or full index scan thereby we can reduced the execution time of query, number of data blocks to be referred and I/O.
DISADVANTAGE:
  • ·         Query performance would go for a toss when Oracle imposed Index Skip Scan on this scenario, a composite index created on column-A and column-B, number of distinct values of column-A was very high, number of distinct values of column-B was also very high, and if only the column-B mentioned in the WHERE clause of the SQL,  then query performance would be much worse than if Oracle opted for full table(or index) scan.
HOW TO VERIFY:
How to verify whether Oracle follows index skip scan or not while executing the sql query. If a query follows this, then you will find similar execution plan like this,
--------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                         |    Rows    |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                              |            |
|   1 |  TABLE ACCESS (BY INDEX ROWID)          | EMP                          |            |
|   2 |   INDEX (SKIP SCAN)                     | SEX_DEPT_INDX                |            |

In the explain plan, whenever it follows the index skip scan, it displays the keyword (SKIP SCAN) in the operation column against the index name.

EXAMPLE:
Create an employee table and inserts 10 records. The table will look like this,

DATA TABLE:
ROWID
empid
column2
column3
Sex
deptid
AAAAA1
1
..
..
M
HR
AAAAA2
2
..
..
M
DELIVERY
AAAAA3
3
..
..
F
ADMIN
AAAAA4
4
..
..
F
SALES
AAAAA5
5
..
..
F
FINANCE
AAAAA6
6
..
..
M
ADMIN
AAAAA7
7
..
..
M
SALES
AAAAA8
8
..
..
F
HR
AAAAA9
9
..
..
F
DELIVERY
AAAAA10
10
..
..
M
FINANCE

Fire this query against this table where the requirement is to display the all the attributes of the employees whose deptid is “DELIVERY”
Select * from emp where deptid = ‘DELIVERY’;

Here, we are yet to create index. Now, to execute this query, Oracle will take 10 seconds (assume oracle takes 1 second for single data table record search).
Why it is taking 10 seconds? Reason is, Oracle follows the sequential search when it looks for the data in the data table. Here, we are looking for the deptid, ‘DELIVERY’. Since oracle is searching in the data table, it can’t directly go to 2nd and 9th records. Oracle starts from first record and traverse thro all the records till it reaches the last one. So in b/w, whichever table record has this deptid would be displayed in the output. Explain plan will look like this,

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                              | Name                                             | Rows  |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                              |                                                         |      2       |
|   1 |  TABLE ACCESS FULL                            | EMP                                                |     10      |

Problem here is, oracle has unnecessarily referred the unwanted 8 table records which are not meant to be referred for this sql. In order to overcome this issue, we have to create a composite index on sex & deptid columns.

Why are sex and deptid columns the ideal candidate to create the composite index and to impose index skip scan? Because sex column has very less number of distinct values (M & F), deptid column has very high number of distinct values (HR,DELIVERY, ADMIN, SALES & FINANCE), only deptid column is mentioned in WHERE clause of the select query, so these 2 are suitable columns to create composite index in order to impose index skip scan.

Create a composite index on this table for sex and deptid columns. (create index sex_dept_indx on emp(sex,deptid)).
Index table will logically look like this,



First column (INDEX) : it stores all the unique combination values of sex & deptid columns in the ascending order.
Second column (ROWID) : it stores the ROWID of the corresponding record.

After creating this index table, fire the same query again,
Select * from emp where deptid = ‘DELIVERY’;

When oracle executes this sql, first it looks for any index which has already been created on this “deptid” column alone. It comes to know that none of the indexes has been created on this column alone but if finds the index, “sex_dept_indx” has included this column. If oracle has to refer this index, then it must know the values of SEX column to be referred. Since SEX column is not mentioned in the query, we can take it for granted that all the distinct values of SEX column should be referred. So, Oracle implicitly converts the query into like this (without any manual intervention),
Select * from emp where sex = ‘F’ and deptid = ‘DELIVERY’
UNION ALL
Select * from emp where sex = ‘M’ and deptid = ‘DELIVERY’;

Since this converted query contains both SEX and DEPTID columns in the WHERE clause, Oracle will refer this index, “sex_dept_indx” now. So, oracle refers the index table first before hitting the actual data table. Since we are looking for the deptid:’DELIVERY’, oracle hits index table first and get the corresponding ROWIDs (AAAAA9,AAAAA2). When oracle searches for the value in this index table, it follows random search (i.e. binary search) to go to the corresponding index records. Using this binary search, Oracle is intelligent enough to go to the 2nd and 7th index records directly in the index table. So, it won’t touch the remaining 8 index records.

After the getting the required ROWIDs (AAAAA9,AAAAA2), oracle directly refers both 9th and 2nd records of the data table since it knows the exact location (ROWID of 9th table record is AAAAA9 and ROWID of 2nd table record is AAAAA2). With this, it doesn’t refer the remaining 8 table records. Now, explain table will look like this,
--------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                         |    Rows    |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                              |    2       |
|   1 |  TABLE ACCESS (BY INDEX ROWID)          | EMP                          |    2       |
|   2 |   INDEX (SKIP SCAN)                     | SEX_DEPT_INDX                |    2       |

Since this index table is used, oracle will take only 2 seconds (assume 1 second to get the ROWID of 2nd and 7th index records since it follows binary search + 1 second to retrieve the actual table data records since it knows the exact ROWIDs those are retrieved in the previous step)

Reason for this scan to be called as “Index Skip Scan” is, this is the only type of scan where consecutive index records are not retrieved. In this scan, whenever oracle refers the next required index record, it jumps from the first required index record to the second required index record. In this example, it jumps from 2nd index record to 7th index record thereby it skips 3rd, 4th, 5th and 6th index records which come between 2nd and 7th index records (pls refer blue colored arrow mark in the index table). So, whenever oracle refers multiple index records from the index table through this scan, it jumps through the required index records alone thereby skipping the reference of unwanted index records. Due to this in-built mechanism, it eliminates both full table scan and full index scan thereby increasing the speed of the query. These are the reasons this index to be called as “Index Skip Scan”.


No comments:

Post a Comment