oracle cbo 默认选择率及其问题凯发app官方网站的解决方案-凯发app官方网站

博客访问： 1156271
博文数量： 166
博客积分： 0
博客等级：民兵
技术积分： 3760
用户组：普通用户
注册时间： 2011-03-30 13:00

个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文（166）

cbo（54）
tuning&performan（32）
concepts&archite（4）
troubleshooting（5）
new feature（7）
sql（50）
pl/sql（8）
miscellaneous（3）
mysql（3）
未分配的博文（0）

文章存档

2024年（21）

2023年（28）

2022年（43）

2020年（62）

2014年（3）

2013年（9）

我的朋友

1.oracle cbo 默认选择率

本节主要分析什么时候cbo使用默认选择率，各种场景下的默认选择率。

建表语句如下：

点击(此处)折叠或打开

drop table sel_test;
create table sel_test as select * from dba_objects;
create index idx_sel_test on sel_test(object_name);

因为ctas会自动收集统计信息，这里不需要再收集（需要直方图的要收集）。

统计信息如下：

点击(此处)折叠或打开

select num_rows,sample_size,last_analyzed
from dba_tables where table_name='sel_test';
num_rows sample_size last_analyze
---------- ----------- ------------
72800 72800 15-may-24
select column_name,num_distinct,num_nulls,density,histogram
from dba_tab_col_statistics
where table_name='sel_test';
column_name num_distinct num_nulls density histogram
-------------------- ------------ ---------- ---------- ------------------------------
temporary 2 0 .5 none
generated 2 0 .5 none
secondary 1 0 1 none
namespace 21 2 .047619048 none
edition_name 0 72800 0 none
sharing 4 0 .25 none
editionable 2 47685 .5 none
oracle_maintained 2 0 .5 none
application 1 0 1 none
default_collation 1 58503 1 none
duplicated 1 0 1 none
sharded 1 0 1 none
created_appid 0 72800 0 none
created_vsnid 0 72800 0 none
modified_appid 0 72800 0 none
modified_vsnid 0 72800 0 none
owner 27 0 .037037037 none
object_name 60896 0 .000016421 none
subobject_name 564 71801 .00177305 none
object_id 72798 2 .000013737 none
data_object_id 6252 66511 .000159949 none
object_type 45 0 .022222222 none
created 1479 0 .000676133 none
last_ddl_time 1634 2 .000611995 none
timestamp 1640 2 .000609756 none
status 1 0 1 none

注意点：
对于计算null数目影响的：
adjust_sel=(num_rows-num_nulls)/num_rows

其中走索引选择率ix_sel,索引过滤选择率ix_sel_with_filters,
回表选择率如果没有额外过滤条件就是ix_sel_with_filters，有额外条件还要加上额外条件选择率*condition_filters

对于列有统计信息的，选择率=默认选择率*adjust_sel, adjust_sel=(num_rows-num_nulls)/num_rows，
如果绑定变量等无具体值的、使用函数等，可能使用默认选择率。

对于列无统计信息的，这个adjust_sel就没有了，所以就是默认选择率，如果对于num_nulls很多的列，更加不准确。

注意：以下示例使用object_name的因为null行数较少，忽略其影响，
使用subobject_name null行数多，考虑adjust_sel，选择率=default_sel * adjust_sel

1.列有统计信息情况下的选择率计算

可以用explain测试，explain不会探测绑定变量值，用的是默认选择率
注意null数目多的，还要考虑adjuest_sel。
选择率=默认选择率*adjust_sel, adjust_sel=(num_rows-num_nulls)/num_rows，

1)列等值条件选择率计算（索引列，非索引列都一样,前提是有统计信息）

如果列有统计信息，这个不管比较条件是否有具体值，其实都不是默认选择率。
如果列有统计信息情况下：

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name = :var;
select * from table(dbms_xplan.display);
plan_table_output
----------------------------------------------------------------------------------------------------
plan hash value: 310337252
----------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
----------------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | 132 | 4 (0)| 00:00:01 |
| 1 | table access by index rowid batched| sel_test | 1 | 132 | 4 (0)| 00:00:01 |
|* 2 | index range scan | idx_sel_test | 1 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("object_name"=:var)

列等值选择率，这个是正常选择率，直接按照max(1/num_distinct,density)计算,
无直方图情况下1/num_distinct和density一样（这里没有考虑null,直方图等），
object_name的num_nulls=0,
对应cardinality=round((1/60896)*72800)=1

不管这个列是否走索引，等值的选择率都是sel=1/num_distinct=density,如果有直方图要复杂点，
如果考虑null，sel=(1/num_distinct)*(num_rows-num_nulls)/num_rows

非索引列object_id等值也是(1/num_distinct)*(num_rows-num_nulls)/num_rows:

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_id = :var;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 1 | 132 | 396 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 1 | 132 | 396 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("object_id"=to_number(:var))

2)列有统计信息,<>的选择率计算不是默认选择率,而是1-sel(等值条件)

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name <> :var;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 72799 | 9384k| 398 (2)| 00:00:01 |
|* 1 | table access full| sel_test | 72799 | 9384k| 398 (2)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("object_name"<>:var)

<>选择率=1-sel(object_name=)=1-1/60896,结果行=round(72800-1)=82799

3)索引列like 绑定变量走索引默认选择率计算(绑定变量无peeking) ix_sel=0.009,回表ix_sel_with_filters=0.05

like实际上是范围查询，所以对于使用绑定变量的范围查询选择率是一样的，比如
like :var, > :var,< :var都是一样的选择率,当然如果>,<是有多个组合的，默认选择率有所不同，见3)

如果有具体值，会按照具体值计算选择率，不是默认选择率。

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name like :var;
select * from table(dbms_xplan.display);
plan_table_output
----------------------------------------------------------------------------------------------------
plan hash value: 310337252
----------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
----------------------------------------------------------------------------------------------------
| 0 | select statement | | 3640 | 469k| 335 (0)| 00:00:01 |
| 1 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
|* 2 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("object_name" like :var)
filter("object_name" like :var)

这里的rows=655,是怎么计算的？因为没有peeking,很显然是不准的，在实际应用中，
如果peeking关闭，则就是按照默认选择率，
这里的like :var,无peeking,不知道是前通配还是后通配，也就是走index range scan是
不准确的，如果采用这样的默认选择率，
可能传入的:var = '�%'也会走index range scan，这很有问题，所以生产上还是
建议打开peeking。

回头在看655是怎么产生的，因为这时候虽然object_name列有统计信息，但是因为like是
范围查询，范围查询需要知道具体的范围，
计算范围查询选择率sel=required_range/(high_value-low_value),
现在绑定变量没有具体值（和peeking关闭一样），所以不知道
required_range,因此采用默认选择率。
猜测下是多少？
sel= 655/num_rows=655/72800=.008997253,一般默认选择率到小数部分非0整数位结束，
也就是0.009
这是猜测，其实也可以通过10053查看：ix_sel: 0.009000

点击(此处)折叠或打开

access path: index (rangescan)
index: idx_sel_test
resc_io: 335.000000 resc_cpu: 2943282
ix_sel: 0.009000 ix_sel_with_filters: 0.009000
cost: 335.220618 resp: 335.220618 degree: 1

如果like绑定变量列没有索引的话就是0.05,也就是5%的选择率。可以通过10053查看：
sel = 0.050000 ，如下：

点击(此处)折叠或打开

cost of predicates:
io = nocost, cpu = 100.000000, sel = 0.050000 flag = 2048 ("sel_test"."object_name" like :b1)

也可以推导出来：sel=回表返回行/总行数=3640/72800=0.05,这就是为什么执行计划里走索引
估算行数是655，而回表行数=3640，
正常情况下索引行数是655，回表结果行数应该小于等于655，不会大这么多，主要原因就是
索引和回表的选择率不一样，索引是0.009,回表是0.05。

下面看下有索引的单个范围查询选择率：

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name > :var;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 310337252
----------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
----------------------------------------------------------------------------------------------------
| 0 | select statement | | 3640 | 469k| 335 (0)| 00:00:01 |
| 1 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
|* 2 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("object_name">:var)

通过范围的基数看，还是655/72800=.008997253 = 0.009

4）索引列带绑定变量至于索引范围扫描，多个范围查询的选择率 and条件ix_sel选择率

=0.0045,相当于单个范围的的0.009/2，回表ix_sel_with_filters= 0.05*0.05

多个范围查询：

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name > :var and object_name < :var1;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 2443607680
-----------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
-----------------------------------------------------------------------------------------------------
| 0 | select statement | | 182 | 24024 | 169 (0)| 00:00:01 |
|* 1 | filter | | | | | |
| 2 | table access by index rowid batched| sel_test | 182 | 24024 | 169 (0)| 00:00:01 |
|* 3 | index range scan | idx_sel_test | 328 | | 5 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter(:var1>:var)
3 - access("object_name">:var and "object_name"<:var1)

这个和一般的and选择率计算不一样，如果按照一般的选择率计算：
sel=0.009*0.009,结果行数=72800*0.009*0.009=6，很显然和328相差巨大，328对应单个
选择率计算的655的一般，
猜测选择率=0.0045，也可以通过10053看出 ix_sel: 0.004500 ：

点击(此处)折叠或打开

access path: index (rangescan)
index: idx_sel_test
resc_io: 169.000000 resc_cpu: 1482323
ix_sel: 0.004500 ix_sel_with_filters: 0.004500
cost: 169.111110 resp: 169.111110 degree: 1

这里的回表182行，还是按照回表选择率0.05*0.05计算，对应返回行数=
round(72800*0.05*0.5)=182.

5)索引列带绑定变量走索引范围扫描，or范围查询的选择率,单个分支ix_sel=0.009,

ix_sel_with_filters=0.05,

等值的单个分支ix_sel=1/num_distinct。
如果是走索引or，走索引单个条件还是按照0.009,回表按照0.05,计算返回行选择率

按照0.05计算，
sel(or)=sel(a) sel(b)-sel(a)*sel(b),回表选择绿按照0.05计算

点击(此处)折叠或打开

explain plan for
select/*or_expand*/ * from sel_test
where object_name > :var or object_name <= :var1;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 2487264663
---------------------------------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
---------------------------------------------------------------------------------------------------------
| 0 | select statement | | 7098 | 3334k| 670 (0)| 00:00:01 |
| 1 | view | vw_ore_67cae57e | 7098 | 3334k| 670 (0)| 00:00:01 |
| 2 | union-all | | | | | |
| 3 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
|* 4 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
| 5 | table access by index rowid batched| sel_test | 3458 | 445k| 335 (0)| 00:00:01 |
|* 6 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("object_name">:var)
6 - access("object_name"<=:var1)
filter(lnnvl("object_name">:var))

走索引返回655行，是按照0.009计算，回表按照0.05计算：
sel=sel(a) sel(b)-sel(a)*sel(b)=0.05 0.05-0.05*0.05,
返回行数=72800*(0.05 0.05-0.05*0.05)=7098

像第二个分支的返回行数=3458，因为要排除前面分支的条件，对应的选择率=0.05-0.05*0.05,
所以返回行数=72800*(0.05-0.05*0.05)=3458

如果不加or_expand hints:
根据不走索引的默认选择率0.05计算：
sel=sel(a) sel(b)-sel(a)*sel(b)=0.05 0.05-0.05*0.05,
返回行数=72800*(0.05 0.05-0.05*0.05)=7098

点击(此处)折叠或打开

explain plan for
select * from sel_test
where object_name > :var or object_name <= :var1;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 7098 | 914k| 395 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 7098 | 914k| 395 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("object_name">:var or "object_name"<=:var1)

6)非索引列范围默认选择率0.05
通过2）可以得知，不走索引的like绑定变量默认选择率=0.05，和走索引回表的选择率一样，

比如：

点击(此处)折叠或打开

explain plan for
select/*test*/ * from sel_test
where subobject_name like :var;
select * from table(dbms_xplan.display);
plan_table_output
----------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 50 | 6600 | 395 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 50 | 6600 | 395 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter("subobject_name" like :var)

这里的rows=50,不是3640，为什么呢，因为num_nulls:71801
所以选择率=0.05*adjust_sel=0.05*(72800-71801)/72800=.000686126
所以返回行=round(72800*.000686126)=50

7）函数范围查询默认选择率0.05（函数范围，无具体值，有具体值都是5%）

点击(此处)折叠或打开

create index idx1_sel_test on sel_test(object_id);
explain plan for
select * from sel_test
where object_id like :var;
select * from table(dbms_xplan.display);
plan_table_output
----------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 3640 | 469k| 396 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 3640 | 469k| 396 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter(to_char("object_id") like :var)

like也是范围查询，object_id因为是number类型，隐式类型转换使用to_char，
对应选择率=3640/72800=0.05,object_id的num_nulls=2,可以忽略。

传入具体值也是5%：

点击(此处)折叠或打开

explain plan for
select * from sel_test
where substr(object_id,1,2) > '123';
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 3640 | 469k| 396 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 3640 | 469k| 396 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter(substr(to_char("object_id"),1,2)>'123')

8）函数等值查询，选择率是0.01（函数等值，无具体值，有具体值都是1%）

点击(此处)折叠或打开

explain plan for
select * from sel_test
where substr(object_id,1,2) = :var;
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 728 | 96096 | 396 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 728 | 96096 | 396 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter(substr(to_char("object_id"),1,2)=:var)

函数等值查询对应选择率0.01,所以返回行=72800*0.01=728

传入实际值也一样：

点击(此处)折叠或打开

explain plan for
select * from sel_test
where substr(object_id,1,2) ='ta';
select * from table(dbms_xplan.display);
plan_table_output
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
plan hash value: 747827318
------------------------------------------------------------------------------
| id | operation | name | rows | bytes | cost (%cpu)| time |
------------------------------------------------------------------------------
| 0 | select statement | | 728 | 96096 | 396 (1)| 00:00:01 |
|* 1 | table access full| sel_test | 728 | 96096 | 396 (1)| 00:00:01 |
------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter(substr(to_char("object_id"),1,2)='ta')

下一篇：oracle cbo 默认选择率及其问题凯发app官方网站的解决方案_part2

阅读(25) | 评论(0) | 转发(0) |

上一篇：oracle vs mysql 对组合索引包含in list执行计划研究(mysql部分)_part2

下一篇：oracle cbo 默认选择率及其问题凯发app官方网站的解决方案_part2

给主人留下些什么吧！~~

| | | | |

感谢所有关心和支持过chinaunix的朋友们

oracle cbo 默认选择率及其问题凯发app官方网站的解决方案-凯发app官方网站

1.oracle cbo 默认选择率

1.列有统计信息情况下的选择率计算

1)列等值条件选择率计算（索引列，非索引列都一样,前提是有统计信息）

2)列有统计信息,<>的选择率计算不是默认选择率,而是1-sel(等值条件)

3)索引列like 绑定变量 走索引默认选择率计算(绑定变量无peeking) ix_sel=0.009,回表ix_sel_with_filters=0.05

4）索引列带绑定变量至于索引范围扫描，多个范围查询的选择率 and条件ix_sel选择率

=0.0045,相当于单个范围的的0.009/2，回表ix_sel_with_filters= 0.05*0.05

5)索引列带绑定变量走索引范围扫描，or范围查询的选择率,单个分支ix_sel=0.009,

ix_sel_with_filters=0.05,

等值的单个分支ix_sel=1/num_distinct。 如果是走索引or，走索引单个条件还是按照0.009,回表按照0.05,计算返回行选择率

按照0.05计算， sel(or)=sel(a) sel(b)-sel(a)*sel(b),回表选择绿按照0.05计算

6)非索引列范围默认选择率0.05 通过2）可以得知，不走索引的like绑定变量默认选择率=0.05，和走索引回表的选择率一样，

比如：

8）函数等值查询，选择率是0.01（函数等值，无具体值，有具体值都是1%）

3)索引列like 绑定变量走索引默认选择率计算(绑定变量无peeking) ix_sel=0.009,回表ix_sel_with_filters=0.05

等值的单个分支ix_sel=1/num_distinct。
如果是走索引or，走索引单个条件还是按照0.009,回表按照0.05,计算返回行选择率

按照0.05计算，
sel(or)=sel(a) sel(b)-sel(a)*sel(b),回表选择绿按照0.05计算

6)非索引列范围默认选择率0.05
通过2）可以得知，不走索引的like绑定变量默认选择率=0.05，和走索引回表的选择率一样，