oracle vs mysql 对组合索引包含in list执行计划研究(oracle部分)-凯发app官方网站

博客访问： 1156147
博文数量： 166
博客积分： 0
博客等级：民兵
技术积分： 3760
用户组：普通用户
注册时间： 2011-03-30 13:00

个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文（166）

cbo（54）
tuning&performan（32）
concepts&archite（4）
troubleshooting（5）
new feature（7）
sql（50）
pl/sql（8）
miscellaneous（3）
mysql（3）
未分配的博文（0）

文章存档

2024年（21）

2023年（28）

2022年（43）

2020年（62）

2014年（3）

2013年（9）

我的朋友

1.oracle组合索引包含in list的执行计划研究

先上建表语句：

点击(此处)折叠或打开

drop table t1;
drop table t2;
create table t1 as select * from dba_objects;
create table t2 as select * from dba_objects;
--建立组合索引
create index idx1_t1 on t1(object_name,owner);
create index idx1_t2 on t2(object_id,owner);
--收集统计信息
exec dbms_stats.gather_table_stats(ownname=>'dingjun123',tabname=>'t1',no_invalidate=>false);
exec dbms_stats.gather_table_stats(ownname=>'dingjun123',tabname=>'t2',no_invalidate=>false);

in (list)相当于多个or的组合，其实也是个范围条件，如果前导列是in条件，则排序键必须包含前导列才可能消除排序，排序列没有前导列，则必须前导列在where条件里是等值才可以消除排序。

但是in (list)又是个特殊的范围查询，因为可以转为or条件，这样与其他列组合，可以组合为多个等值条件，这样做索引扫描时，可以利用inlist iterator算子提高效率。

比如：
a=1 and b in (1,2) 可以转为 a=1 and (b=1 or b=2)则转为两个索引区间：
a=1 and b=1
a=1 and b=2
然后使用inlist iterator算子，两个列条件都可以参与index access。

a in (1,2) and b in (1,2)可以转为
(a=1 or a=2 ) and (b=1 or b=2)，转为4个区间：
a=1 and b=1
a=1 and b=2
a=2 and b=1
a=2 and b=2
然后利用inlist iterator算子，都可以利用a,b条件index access。

oracle处理组合索引含有in条件时，对于in条件列是索引非前导列，也会考虑选择性：
1）如果前导列选择性很好，后面的列条件是in,可能index access只有前导列，in的作为index filter，这时候不用inlist iterator算子。
2）如果前导列选择性不好，则会调用inlist iterator算子，转为多个索引区间扫描，这样in条件也会index access。

如果in条件是索引前导列，因为索引前导列要参与index access,所以基本都会采用inlist iterator算子，转为多个索引区间，这样都可以参与index access。

1)索引idx1_t1(object_name,owner),对应的非前导列owner是in条件有多个值
这时候是否走inlist iterator要看前导列的选择性是不是很好，如果很好，index access
只需要前导列即可，非前导列in的index filter
如果前导列选择性不好，则会将in list转为or,然后合并前导列，扫描多个索引区间，
使用inlist iterator算子，则多个列都可以参与index access

sql和执行计划如下：

点击(此处)折叠或打开

select * from t1 where t1.object_name='t1' and t1.owner in ('sys','dingjun123');
execution plan
----------------------------------------------------------
plan hash value: 1182251260
---------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
---------------------------------------------------------------------------------------
| 0 | select statement | | 2 | 196 | 4 (0)| 00:00:01 |
| 1 | table access by index rowid| t1 | 2 | 196 | 4 (0)| 00:00:01 |
|* 2 | index range scan | idx1_t1 | 2 | | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("t1"."object_name"='t1')
filter("t1"."owner"='dingjun123' or "t1"."owner"='sys')

可以从执行计划看到对于索引是(object_name,owner)的，owner非前导列用了in,没有参与index access,而是
index filter，原因是object_name选择性很好，等值查询选择性1/num_distinct=.000016423,不调用inlist iterator算子。

对应列基数和选择率如下：

点击(此处)折叠或打开

select column_name,num_distinct,density from dba_tab_col_statistics where table_name='t1' and column_name in('owner','object_name');
column_name num_distinct density
-------------------- ------------ ----------
owner 27 .037037037
object_name 60892 .000016423

如果前导列object_name也是in,因为选择性好，这里也不用第二列，前导列in转为or,调用inlist iterator：

点击(此处)折叠或打开

select * from t1 where t1.object_name in ('dba_objects','dba_tables') and t1.owner in ('sys','dingjun123');
execution plan
----------------------------------------------------------
plan hash value: 1236450337
------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------------------------
| 0 | select statement | | 4 | 528 | 5 (0)| 00:00:01 |
| 1 | inlist iterator | | | | | |
| 2 | table access by index rowid batched| t1 | 4 | 528 | 5 (0)| 00:00:01 |
|* 3 | index range scan | idx1_t1 | 1 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
3 - access("t1"."object_name"='dba_objects' or "t1"."object_name"='dba_tables')
filter("t1"."owner"='dingjun123' or "t1"."owner"='sys')
statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
2864 bytes sent via sql*net to client
473 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

如果都是等值的，则都参与index access:

点击(此处)折叠或打开

select * from t1 where t1.object_name ='dba_objects' and t1.owner ='sys';
1 row selected.
elapsed: 00:00:00.01
execution plan
----------------------------------------------------------
plan hash value: 2488322393
-----------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-----------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | 132 | 4 (0)| 00:00:01 |
| 1 | table access by index rowid batched| t1 | 1 | 132 | 4 (0)| 00:00:01 |
|* 2 | index range scan | idx1_t1 | 1 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("t1"."object_name"='dba_objects' and "t1"."owner"='sys')
statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
2692 bytes sent via sql*net to client
435 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

2)如果前导列的选择性不够好，组合的选择性才好，如果有in条件，则会转为索引扫描多个区间，
调用inlist iterator，这时候in条件可以参与index access

点击(此处)折叠或打开

drop index idx1_t1;
--创建以owner为前导列的索引，owner选择性不好：
create index idx2_t1 on t1(owner,object_name);

sql如下：

点击(此处)折叠或打开

select * from t1 where t1.object_name='dba_objects' and t1.owner in ('sys','dingjun123');

从执行计划可以看到，access里owner转为or然后 and object_name,两个列都可以index access,调用inlist iterator算子：

点击(此处)折叠或打开

elapsed: 00:00:00.00
execution plan
----------------------------------------------------------
plan hash value: 3655239226
------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------------------------
| 0 | select statement | | 2 | 264 | 5 (0)| 00:00:01 |
| 1 | inlist iterator | | | | | |
| 2 | table access by index rowid batched| t1 | 2 | 264 | 5 (0)| 00:00:01 |
|* 3 | index range scan | idx2_t1 | 2 | | 4 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
3 - access(("t1"."owner"='dingjun123' or "t1"."owner"='sys') and
"t1"."object_name"='dba_objects')
statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
2692 bytes sent via sql*net to client
451 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

如果两个列都是in list,因为前导列owner选择性不好，组合选择性好，则会转为多个索引区间，这里索引区间对应2*2=4个,如下所示：

点击(此处)折叠或打开

object_name='dba_objects' and owner='sys'
object_name='dba_objects' and owner='dingjun123'
object_name='dba_tables' and owner='sys'
object_name='dba_tables' and owner='dingjun123'

sql如下：

点击(此处)折叠或打开

select * from t1 where t1.object_name in ('dba_objects','dba_tables') and t1.owner in ('sys','dingjun123');

从执行计划可以看到，可以看到都能参与index access，走inlist iterator:

点击(此处)折叠或打开

elapsed: 00:00:00.00
execution plan
----------------------------------------------------------
plan hash value: 3655239226
------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------------------------
| 0 | select statement | | 4 | 528 | 8 (0)| 00:00:01 |
| 1 | inlist iterator | | | | | |
| 2 | table access by index rowid batched| t1 | 4 | 528 | 8 (0)| 00:00:01 |
|* 3 | index range scan | idx2_t1 | 4 | | 6 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
3 - access(("t1"."owner"='dingjun123' or "t1"."owner"='sys') and
("t1"."object_name"='dba_objects' or "t1"."object_name"='dba_tables'))
statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
16 consistent gets
0 physical reads
0 redo size
2864 bytes sent via sql*net to client
469 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed

3）多表join情况下的inlist iterator

如果sql是join，t1表是被驱动的走nested loops：
先创建索引：让object_name在前，选择性好：

点击(此处)折叠或打开

--第二个索引先删掉
drop index idx2_t1;
--创建object_name为前导列的索引：
create index idx1_t1 on t1(object_name,owner);

执行计划走nested loops，t1被驱动，t1表走idx1_t1索引，因为前导列object_name是等值条件，owner是in条件，相当于范围，
可以转为or。

在11g里这个id=6的访问谓词显示有问题，对应的owner in条件也是在index access中，单表查询时，11g里只有object_name,因为object_name
选择性好，join时都参与index access,这是算法上的问题，不过这个不影响性能。
和19c的谓词部分显示不一样，算法不一样，19c还是选择性选择性好的前导列访问：

11g都参与index access,有inlist iterator，object_name前导列是关联列，
sql和执行计划如下：

点击(此处)折叠或打开

select *
from t1,t2
where t1.object_name = t2.object_name
and t1.owner in ('sys','dingjun123')
and t2.object_id <10;
execution plan
----------------------------------------------------------
plan hash value: 3442654963
-----------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-----------------------------------------------------------------------------------------
| 0 | select statement | | 7 | 1372 | 38 (0)| 00:00:01 |
| 1 | nested loops | | | | | |
| 2 | nested loops | | 7 | 1372 | 38 (0)| 00:00:01 |
| 3 | table access by index rowid| t2 | 7 | 686 | 3 (0)| 00:00:01 |
|* 4 | index range scan | idx1_t2 | 7 | | 2 (0)| 00:00:01 |
| 5 | inlist iterator | | | | | |
|* 6 | index range scan | idx1_t1 | 2 | | 3 (0)| 00:00:01 |
| 7 | table access by index rowid | t1 | 1 | 98 | 5 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("t2"."object_id"<10)
6 - access("t1"."object_name"="t2"."object_name" and
("t1"."owner"='dingjun123' or "t1"."owner"='sys'))
statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
43 consistent gets
0 physical reads
0 redo size
3314 bytes sent via sql*net to client
520 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
8 rows processed

19c执行计划id=5显示的谓词对应owner in ('sys','dingjun123')变成filter("t1"."owner"='dingjun123' or "t1"."owner"='sys'),
19c没有inlist iterator
index access的列只有access("t1"."object_name"="t2"."object_name")

点击(此处)折叠或打开

19c是比较准确的：
execution plan
----------------------------------------------------------
plan hash value: 1792326996
-------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-------------------------------------------------------------------------------------------------
| 0 | select statement | | 8 | 2112 | 27 (0)| 00:00:01 |
| 1 | nested loops | | 8 | 2112 | 27 (0)| 00:00:01 |
| 2 | nested loops | | 8 | 2112 | 27 (0)| 00:00:01 |
| 3 | table access by index rowid batched| t2 | 8 | 1056 | 3 (0)| 00:00:01 |
|* 4 | index range scan | idx1_t2 | 8 | | 2 (0)| 00:00:01 |
|* 5 | index range scan | idx1_t1 | 1 | | 2 (0)| 00:00:01 |
| 6 | table access by index rowid | t1 | 1 | 132 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("t2"."object_id"<10)
5 - access("t1"."object_name"="t2"."object_name")
filter("t1"."owner"='dingjun123' or "t1"."owner"='sys')

如果19c里使用optimizer_features_enable('11.2.0.4')，则会显示和11g一样的计划，都走index access,有inlist iterator，
这么看对于in非前导列的join被驱动走nested loops的执行计划，有改进，感觉19c才是对的，因为使用inlist的cost更大，因为前导列选择性足够好：

点击(此处)折叠或打开

select/*optimizer_features_enable('11.2.0.4')*/ *
from t1,t2
where t1.object_name = t2.object_name
and t1.owner in ('sys','dingjun123')
and t2.object_id <10;
execution plan
----------------------------------------------------------
plan hash value: 3442654963
-----------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-----------------------------------------------------------------------------------------
| 0 | select statement | | 8 | 2112 | 43 (0)| 00:00:01 |
| 1 | nested loops | | | | | |
| 2 | nested loops | | 8 | 2112 | 43 (0)| 00:00:01 |
| 3 | table access by index rowid| t2 | 8 | 1056 | 3 (0)| 00:00:01 |
|* 4 | index range scan | idx1_t2 | 8 | | 2 (0)| 00:00:01 |
| 5 | inlist iterator | | | | | |
|* 6 | index range scan | idx1_t1 | 2 | | 3 (0)| 00:00:01 |
| 7 | table access by index rowid | t1 | 1 | 132 | 5 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("t2"."object_id"<10)
6 - access("t1"."object_name"="t2"."object_name" and
("t1"."owner"='dingjun123' or "t1"."owner"='sys'))
statistics
----------------------------------------------------------
2 recursive calls
0 db block gets
45 consistent gets
0 physical reads
0 redo size
5598 bytes sent via sql*net to client
519 bytes received via sql*net from client
2 sql*net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
8 rows processed

下面删除idx1_t1,创建owner索引：
drop index idx1_t1;
--创建以owner为前导列的索引，owner选择性不好,且不是关联列，看能否全部列走索引。
drop index idx2_t1;
###这里创建三个列索引，将选择性好的放后面：
create index idx2_t1 on t1(owner,object_type,object_name);

sql如下：

点击(此处)折叠或打开

select *
from t1,t2
where t1.object_name = t2.object_name
and t1.owner in ('sys','dingjun123')
and t1.object_type in('table'，'view')
and t2.object_id <10;

从执行计划看，使用inlist iterator，和单表测试一样：
这里相当于每一行t2传递给t1,然后扫描4个索引区间，如果将选择性好的object_name放前面则不需要很多区间，
比如索引(owner,object_name,object_type),则object_type in list可能不参与index access,而走index filter:

点击(此处)折叠或打开

execution plan
----------------------------------------------------------
plan hash value: 1219290223
-------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-------------------------------------------------------------------------------------------------
| 0 | select statement | | 8 | 2112 | 59 (0)| 00:00:01 |
| 1 | nested loops | | 8 | 2112 | 59 (0)| 00:00:01 |
| 2 | nested loops | | 32 | 2112 | 59 (0)| 00:00:01 |
| 3 | table access by index rowid batched| t2 | 8 | 1056 | 3 (0)| 00:00:01 |
|* 4 | index range scan | idx1_t2 | 8 | | 2 (0)| 00:00:01 |
| 5 | inlist iterator | | | | | |
|* 6 | index range scan | idx2_t1 | 4 | | 5 (0)| 00:00:01 |
| 7 | table access by index rowid | t1 | 1 | 132 | 7 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("t2"."object_id"<10)
6 - access(("t1"."owner"='dingjun123' or "t1"."owner"='sys') and
("t1"."object_type"='table' or "t1"."object_type"='view') and
"t1"."object_name"="t2"."object_name")

4)组合索引前导列是in多个值能否消除排序，取决于排序列是否有前导列

点击(此处)折叠或打开

drop index idx1_t1;
drop index idx2_t1;
create index idx2_t1 on t1(owner,object_name);
select * from t1 where t1.object_name in ('dba_objects','dba_tables') and t1.owner in ('sys','dingjun123')
order by owner desc,object_name desc;

因为排序列按照组合索引顺序，且access两个列都用到，排序方向一致，可以消除排序：

点击(此处)折叠或打开

execution plan
----------------------------------------------------------
plan hash value: 3833075694
-----------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-----------------------------------------------------------------------------------------
| 0 | select statement | | 4 | 528 | 8 (0)| 00:00:01 |
| 1 | inlist iterator | | | | | |
| 2 | table access by index rowid | t1 | 4 | 528 | 8 (0)| 00:00:01 |
|* 3 | index range scan descending| idx2_t1 | 4 | | 6 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
3 - access(("t1"."owner"='dingjun123' or "t1"."owner"='sys') and
("t1"."object_name"='dba_objects' or "t1"."object_name"='dba_tables'))

排序列无前导列，只有第二个列，则必须前导列是等值条件才可以消除排序，这里前导列是in有2个分支，可以看到没有消除排序，
执行计划有：sort order by
sql和执行计划如下：

点击(此处)折叠或打开

select * from t1 where t1.object_name in ('dba_objects','dba_tables') and t1.owner in ('sys','dingjun123')
order by object_name;
execution plan
----------------------------------------------------------
plan hash value: 4272791971
-------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-------------------------------------------------------------------------------------------------
| 0 | select statement | | 4 | 528 | 9 (12)| 00:00:01 |
| 1 | sort order by | | 4 | 528 | 9 (12)| 00:00:01 |
| 2 | inlist iterator | | | | | |
| 3 | table access by index rowid batched| t1 | 4 | 528 | 8 (0)| 00:00:01 |
|* 4 | index range scan | idx2_t1 | 4 | | 6 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access(("t1"."owner"='dingjun123' or "t1"."owner"='sys') and
("t1"."object_name"='dba_objects' or "t1"."object_name"='dba_tables'))

总结：在oracle里组合索引inlist,不管是单表还是关联，不管多表join的inlist列是否是关联列，都可以充分利用索引扫描，有时会将非前导列放到filter里，也是因为前导列选择性好，不会影响效率。
对于排序来说，因为in相当于范围，如果where是in条件的前导列，排序里没有前导列，则不能消除排序，如果排序列包含此前导列，且索引列方向一致，也可以消除排序。

在oracle里组合索引in在单表、多表join、排序里没有什么问题，遵循索引的leftmost prefix访问规则，inlist组合不管是否是前导列，是否单表，多表join，都能转为or，与其他索引列组合转为多个索引扫描区间，使用inlist iterator算子实现走各区间索引扫描提高效率。

下一篇：oracle vs mysql 对组合索引包含in list执行计划研究(mysql部分)_part2

阅读(2138) | 评论(0) | 转发(0) |

上一篇：oracle组合索引选择率计算及如何选择组合索引的列顺序_part2

下一篇：oracle vs mysql 对组合索引包含in list执行计划研究(mysql部分)_part2

给主人留下些什么吧！~~

| | | | |

感谢所有关心和支持过chinaunix的朋友们