Friday, December 27, 2019

Bind Variable Peeking Prior to 11g

Oracle has a feature called “bind variable peeking” whereby an execution plan for an SQL is generated based on the bind variable values used in the SQL. This technique sometimes created suboptimal execution plan because the plan might be perfect for the SQL containing bind variables based on which this plan was generated, but if same SQL is executed later with different values in bind variables, this current plan may not be optimal for those values, and might cause performance degradation for the SQL. I have discusses a real scenario in this article.

An example to further explain this is a table with M (male) and F (female) values in a column that contains highly skewed values. Suppose we have a table my_table with a column “gender”, and there are 1000000 records in it. 1000 records contain value “M”, and rests of the records contain value “F”. For bind variable peeking, if we have a following query where we are using a bind variable v_gender by setting its value to “F”, the optimizer will most likely do a full table scan because most of the table data will be returned from the table.
SELECT * FROM my_table WHERE gender = :v_gender

The plan with full table scan will be stored in the buffer cache. Next time if we execute same query with bind variable value set to “M”, optimizer will still use the same plan with FULL TABLE SCAN, but ideally it should use an INDEX RANGE SCAN here.
It will be vice versa if first time query is executed having bind variable value set to “M”. Optimizer would most like generate a plan with INDEX range scan, but if subsequent query executions pass bind variable with value “F”, it will still use INDEX RANG SCAN which is not a good plan to choose this time.
Let’s look at an example to understand this.
SQL> create table test as select * from dba_objects;

Table created.

--Execute Following SQL 17 times. In the end you will have 65536 rows for object_id=100.
SQL> insert into test select * from test where object_id=100;

65536 rows created.

SQL> commit;

Commit complete.

SQL> select count(*) from test;

  COUNT(*)
----------
    222501

--Create index on object_id
SQL> create index test_idx on test (object_id);

Index created.

--Gather statistics on the table and index
SQL> exec dbms_stats.gather_table_stats('salman','test',cascade=>true)

PL/SQL procedure successfully completed.


Now I will execute a select statement using bind variable to select object name with object_id=102. Optimizer would select an execution plan based on INDEX SCAN because we are fetching only 1 row out of total 222501 rows. An index scan is the best way to fetch 1 row in this case.
SQL> variable v_objid number;
SQL> exec  :v_objid:=102;

PL/SQL procedure successfully completed.

SQL> select count(object_name) from test where object_id=:v_objid;

  COUNT(*)
----------
         1
--Check SQL ID and then current plan from the cursor cache.
SQL> select sql_id from v$sql where sql_text like 'select count(object_name) from test where object_id=:v_objid';

SQL_ID
-------------
hdgnntw7rb15n

SQL> select * from table(dbms_xplan.display_cursor('hdgnntw7rb15n',0));

PLAN_TABLE_OUTPUT
------------------------
SQL_ID  hdgnntw7rb15n, child number 0
-------------------------------------
select count(object_name) from test where object_id=:v_objid

Plan hash value: 939327728

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE                      |          |     1 |    18 |            |          |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TEST     |     1 |    18 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | TEST_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("OBJECT_ID"=:V_OBJID)


We can see that optimizer chose a plan with index scan. Now we open another session and try to execute same SQL by passing object_id=100. Ideally there should be a FULL TABLE SCAN plan for object_id=100 because this time SQL would be fetching most of the rows form the table for which index scan is an expensive operation. But, optimizer would choose the already available plan in the cache which will do an index scan.
--Open a new Session
==============
SQL> exec  :v_objid:=100

PL/SQL procedure successfully completed.

SQL> select count(object_name) from test where object_id=:v_objid;

COUNT(OBJECT_NAME)
------------------
            131072

--Value of second parameter here is "1" this time as we are checking the child cursor because this SQL was executed from a second session.
SQL>  select * from table(dbms_xplan.display_cursor('hdgnntw7rb15n',1));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  hdgnntw7rb15n, child number 1
-------------------------------------
select count(object_name) from test where object_id=:v_objid

Plan hash value: 939327728

-------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |          |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE                      |          |     1 |    18 |            |          |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| TEST     |     1 |    18 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | TEST_IDX |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("OBJECT_ID"=:V_OBJID)

As you see that for any value passed in bind variable v_objid, it will always select one and only one plan which may be good for some SQLs but may not be good for the others. If this SQL is aged out of shared pool and a hard parse happens again, optimizer will generate a plan based on the value passed to the bind variable for that new execution.
To test this let’s do a flush of shared pool so that SQL is hard parsed again, and then pass value a “100” to the bind variable and see which plan optimizer generates.
SQL> ALTER SYSTEM FLUSH SHARED_POOL;

System altered.

SQL> exec  :v_objid:=100

PL/SQL procedure successfully completed.

SQL> select count(object_name) from test where object_id=:v_objid;

COUNT(OBJECT_NAME)
------------------
            131072

--Check the current plan from the cursor cache.

select * from table(dbms_xplan.display_cursor('hdgnntw7rb15n',0));

PLAN_TABLE_OUTPUT
------------------------
SQL_ID  hdgnntw7rb15n, child number 0
-------------------------------------
select count(object_name) from test where object_id=:v_objid

Plan hash value: 1950795681

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   932 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    18 |            |          |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|*  2 |   TABLE ACCESS FULL| TEST |   134K|  2368K|   932   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("OBJECT_ID"=:V_OBJID)

Now you can see that this time optimizer chose a new plan with FULL TABLE SCAN which makes sense. But, the down side is that this plan will be used for every subsequent execution of this SQL regardless of the value of the bind variable. You may open a new session and execute same SQL with the value of 102 and you will see that same FULL TABLE SCAN will be used.
Check my this article about adaptive cursor sharing which was introduced in 11g, and how this feature tried to solve the bind variable peeking problem.

No comments:

Post a Comment

Popular Posts - All Times