oracle cbo 默认选择率及其问题凯发app官方网站的解决方案-凯发app官方网站

凯发app官方网站-凯发k8官网下载客户端中心 | | 凯发app官方网站-凯发k8官网下载客户端中心
  • 博客访问: 1156271
  • 博文数量: 166
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3760
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文(166)

文章存档

2024年(21)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

相关博文
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·

分类: oracle

2024-05-17 16:28:49

本文主要研究默认选择率,为什么要有默认选择率,因为没有存储计算对应条件选择率需要的数据,所以只能用默认的。


在oracle里,如果sql里的表无法获得需要的信息来计算选择率,可能就会使用默认选择率,常发生在以下几种情况下:
1)表有统计信息,列未收集统计信息,无法使用动态采样
2)表有统计信息,但是使用绑定变量,绑定变量无peeking
3) 表有统计信息,但是对列使用了函数等复杂条件,无法准确估算对应条件的选择率


针对1),就是更新统计信息,补全没有收集或不准的统计信息。

针对第二种情况,为了避免修改_optim_peek_user_binds参数造成的全局影响,可以针对单个sql开启_optim_peek_user_binds,使用sql profile绑定。

针对第三种情况,因为复杂条件无法计算准确选择率,利用使用扩展列统计信息、高级别动态采样、cardinality/opt_esimate hints、
sql profile等绑定执行计划。

1.oracle cbo 默认选择率

本节主要分析什么时候cbo使用默认选择率,各种场景下的默认选择率。


建表语句如下:

点击(此处)折叠或打开

  1. drop table sel_test;
  2. create table sel_test as select * from dba_objects;
  3. create index idx_sel_test on sel_test(object_name);
因为ctas会自动收集统计信息,这里不需要再收集(需要直方图的要收集)。

统计信息如下:

点击(此处)折叠或打开

  1. select num_rows,sample_size,last_analyzed
  2. from dba_tables where table_name='sel_test';
  3.   num_rows sample_size last_analyze
  4. ---------- ----------- ------------
  5.      72800 72800 15-may-24

  6. select column_name,num_distinct,num_nulls,density,histogram
  7. from dba_tab_col_statistics
  8. where table_name='sel_test';
  9. column_name num_distinct num_nulls density histogram
  10. -------------------- ------------ ---------- ---------- ------------------------------
  11. temporary 2 0 .5 none
  12. generated 2 0 .5 none
  13. secondary 1 0 1 none
  14. namespace 21 2 .047619048 none
  15. edition_name 0 72800 0 none
  16. sharing 4 0 .25 none
  17. editionable 2 47685 .5 none
  18. oracle_maintained 2 0 .5 none
  19. application 1 0 1 none
  20. default_collation 1 58503 1 none
  21. duplicated 1 0 1 none
  22. sharded 1 0 1 none
  23. created_appid 0 72800 0 none
  24. created_vsnid 0 72800 0 none
  25. modified_appid 0 72800 0 none
  26. modified_vsnid 0 72800 0 none
  27. owner 27 0 .037037037 none
  28. object_name 60896 0 .000016421 none
  29. subobject_name 564 71801 .00177305 none
  30. object_id 72798 2 .000013737 none
  31. data_object_id 6252 66511 .000159949 none
  32. object_type 45 0 .022222222 none
  33. created 1479 0 .000676133 none
  34. last_ddl_time 1634 2 .000611995 none
  35. timestamp 1640 2 .000609756 none
  36. status 1 0 1 none
注意点:
对于计算null数目影响的:
adjust_sel=(num_rows-num_nulls)/num_rows


其中走索引选择率ix_sel,索引过滤选择率ix_sel_with_filters,
回表选择率如果没有额外过滤条件就是ix_sel_with_filters,有额外条件还要加上额外条件选择率*condition_filters


对于列有统计信息的,选择率=默认选择率*adjust_sel, adjust_sel=(num_rows-num_nulls)/num_rows,
如果绑定变量等无具体值的、使用函数等,可能使用默认选择率。


对于列无统计信息的,这个adjust_sel就没有了,所以就是默认选择率,如果对于num_nulls很多的列,更加不准确。


注意:以下示例使用object_name的因为null行数较少,忽略其影响,
使用subobject_name null行数多,考虑adjust_sel,选择率=default_sel * adjust_sel

1.列有统计信息情况下的选择率计算


可以用explain测试,explain不会探测绑定变量值,用的是默认选择率
注意null数目多的,还要考虑adjuest_sel。
选择率=默认选择率*adjust_sel, adjust_sel=(num_rows-num_nulls)/num_rows,

1)列等值条件选择率计算(索引列,非索引列都一样,前提是有统计信息)

如果列有统计信息,这个不管比较条件是否有具体值,其实都不是默认选择率。
如果列有统计信息情况下:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name = :var;

  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. ----------------------------------------------------------------------------------------------------
  7. plan hash value: 310337252
  8. ----------------------------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ----------------------------------------------------------------------------------------------------
  11. | 0 | select statement | | 1 | 132 | 4 (0)| 00:00:01 |
  12. | 1 | table access by index rowid batched| sel_test | 1 | 132 | 4 (0)| 00:00:01 |
  13. |* 2 | index range scan | idx_sel_test | 1 | | 3 (0)| 00:00:01 |
  14. ----------------------------------------------------------------------------------------------------
  15. predicate information (identified by operation id):
  16. ---------------------------------------------------
  17.    2 - access("object_name"=:var)
列等值选择率,这个是正常选择率,直接按照max(1/num_distinct,density)计算,
无直方图情况下1/num_distinct和density一样(这里没有考虑null,直方图等),
object_name的num_nulls=0,
对应cardinality=round((1/60896)*72800)=1

不管这个列是否走索引,等值的选择率都是sel=1/num_distinct=density,如果有直方图要复杂点,
如果考虑null,sel=(1/num_distinct)*(num_rows-num_nulls)/num_rows

非索引列object_id等值也是(1/num_distinct)*(num_rows-num_nulls)/num_rows:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_id = :var;
  4. select * from table(dbms_xplan.display);

  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 1 | 132 | 396 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 1 | 132 | 396 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("object_id"=to_number(:var))

2)列有统计信息,<>的选择率计算不是默认选择率,而是1-sel(等值条件)


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name <> :var;

  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 72799 | 9384k| 398 (2)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 72799 | 9384k| 398 (2)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("object_name"<>:var)
<>选择率=1-sel(object_name=)=1-1/60896,结果行=round(72800-1)=82799

3)索引列like 绑定变量 走索引默认选择率计算(绑定变量无peeking) ix_sel=0.009,回表ix_sel_with_filters=0.05


like实际上是范围查询,所以对于使用绑定变量的范围查询选择率是一样的,比如
like :var, > :var,< :var都是一样的选择率,当然如果>,<是有多个组合的,默认选择率有所不同,见3)

如果有具体值,会按照具体值计算选择率,不是默认选择率。

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name like :var;

  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. ----------------------------------------------------------------------------------------------------
  7. plan hash value: 310337252

  8. ----------------------------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ----------------------------------------------------------------------------------------------------
  11. | 0 | select statement | | 3640 | 469k| 335 (0)| 00:00:01 |
  12. | 1 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
  13. |* 2 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
  14. ----------------------------------------------------------------------------------------------------
  15. predicate information (identified by operation id):
  16. ---------------------------------------------------
  17.    2 - access("object_name" like :var)
  18.        filter("object_name" like :var)
这里的rows=655,是怎么计算的?因为没有peeking,很显然是不准的,在实际应用中,
如果peeking关闭,则就是按照默认选择率,
这里的like :var,无peeking,不知道是前通配还是后通配,也就是走index range scan是
不准确的,如果采用这样的默认选择率,
可能传入的:var = '�%'也会走index range scan,这很有问题,所以生产上还是
建议打开peeking。


回头在看655是怎么产生的,因为这时候虽然object_name列有统计信息,但是因为like是
范围查询,范围查询需要知道具体的范围,
计算范围查询选择率sel=required_range/(high_value-low_value),
现在绑定变量没有具体值(和peeking关闭一样),所以不知道
required_range,因此采用默认选择率。
猜测下是多少?
sel= 655/num_rows=655/72800=.008997253,一般默认选择率到小数部分非0整数位结束,
也就是0.009
这是猜测,其实也可以通过10053查看:ix_sel: 0.009000

点击(此处)折叠或打开

  1. access path: index (rangescan)
  2.     index: idx_sel_test
  3.     resc_io: 335.000000 resc_cpu: 2943282
  4.     ix_sel: 0.009000 ix_sel_with_filters: 0.009000
  5.     cost: 335.220618 resp: 335.220618 degree: 1


如果like绑定变量列没有索引的话就是0.05,也就是5%的选择率。可以通过10053查看:
sel = 0.050000 ,如下:

点击(此处)折叠或打开

  1. cost of predicates:
  2.     io = nocost, cpu = 100.000000, sel = 0.050000 flag = 2048 ("sel_test"."object_name" like :b1)


也可以推导出来:sel=回表返回行/总行数=3640/72800=0.05,这就是为什么执行计划里走索引
估算行数是655,而回表行数=3640,
正常情况下索引行数是655,回表结果行数应该小于等于655,不会大这么多,主要原因就是
索引和回表的选择率不一样,索引是0.009,回表是0.05。


下面看下有索引的单个范围查询选择率:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name > :var;

  4. select * from table(dbms_xplan.display);

  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 310337252

  8. ----------------------------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ----------------------------------------------------------------------------------------------------
  11. | 0 | select statement | | 3640 | 469k| 335 (0)| 00:00:01 |
  12. | 1 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
  13. |* 2 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
  14. ----------------------------------------------------------------------------------------------------

  15. predicate information (identified by operation id):
  16. ---------------------------------------------------

  17.    2 - access("object_name">:var)

通过范围的基数看,还是655/72800=.008997253 = 0.009

4)索引列带绑定变量至于索引范围扫描,多个范围查询的选择率 and条件ix_sel选择率

=0.0045,相当于单个范围的的0.009/2,回表ix_sel_with_filters= 0.05*0.05


多个范围查询:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name > :var and object_name < :var1;

  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 2443607680

  8. -----------------------------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. -----------------------------------------------------------------------------------------------------
  11. | 0 | select statement | | 182 | 24024 | 169 (0)| 00:00:01 |
  12. |* 1 | filter | | | | | |
  13. | 2 | table access by index rowid batched| sel_test | 182 | 24024 | 169 (0)| 00:00:01 |
  14. |* 3 | index range scan | idx_sel_test | 328 | | 5 (0)| 00:00:01 |
  15. -----------------------------------------------------------------------------------------------------

  16. predicate information (identified by operation id):
  17. ---------------------------------------------------

  18.    1 - filter(:var1>:var)
  19.    3 - access("object_name">:var and "object_name"<:var1)
这个和一般的and选择率计算不一样,如果按照一般的选择率计算:
sel=0.009*0.009,结果行数=72800*0.009*0.009=6,很显然和328相差巨大,328对应单个
选择率计算的655的一般,
猜测选择率=0.0045,也可以通过10053看出    ix_sel: 0.004500 :

点击(此处)折叠或打开

  1. access path: index (rangescan)
  2.     index: idx_sel_test
  3.     resc_io: 169.000000 resc_cpu: 1482323
  4.     ix_sel: 0.004500 ix_sel_with_filters: 0.004500
  5.     cost: 169.111110 resp: 169.111110 degree: 1

这里的回表182行,还是按照回表选择率0.05*0.05计算,对应返回行数=
round(72800*0.05*0.5)=182.

5)索引列带绑定变量走索引范围扫描,or范围查询的选择率,单个分支ix_sel=0.009,

ix_sel_with_filters=0.05,

等值的单个分支ix_sel=1/num_distinct。
如果是走索引or,走索引单个条件还是按照0.009,回表按照0.05,计算返回行选择率

按照0.05计算,
sel(or)=sel(a) sel(b)-sel(a)*sel(b),回表选择绿按照0.05计算


点击(此处)折叠或打开

  1. explain plan for
  2. select/*or_expand*/ * from sel_test
  3. where object_name > :var or object_name <= :var1;

  4. select * from table(dbms_xplan.display);

  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 2487264663

  8. ---------------------------------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ---------------------------------------------------------------------------------------------------------
  11. | 0 | select statement | | 7098 | 3334k| 670 (0)| 00:00:01 |
  12. | 1 | view | vw_ore_67cae57e | 7098 | 3334k| 670 (0)| 00:00:01 |
  13. | 2 | union-all | | | | | |
  14. | 3 | table access by index rowid batched| sel_test | 3640 | 469k| 335 (0)| 00:00:01 |
  15. |* 4 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
  16. | 5 | table access by index rowid batched| sel_test | 3458 | 445k| 335 (0)| 00:00:01 |
  17. |* 6 | index range scan | idx_sel_test | 655 | | 7 (0)| 00:00:01 |
  18. ---------------------------------------------------------------------------------------------------------

  19. predicate information (identified by operation id):
  20. ---------------------------------------------------

  21.    4 - access("object_name">:var)
  22.    6 - access("object_name"<=:var1)
  23.        filter(lnnvl("object_name">:var))

走索引返回655行,是按照0.009计算,回表按照0.05计算:
sel=sel(a) sel(b)-sel(a)*sel(b)=0.05 0.05-0.05*0.05,
返回行数=72800*(0.05 0.05-0.05*0.05)=7098


像第二个分支的返回行数=3458,因为要排除前面分支的条件,对应的选择率=0.05-0.05*0.05,
所以返回行数=72800*(0.05-0.05*0.05)=3458

如果不加or_expand hints:
根据不走索引的默认选择率0.05计算:
sel=sel(a) sel(b)-sel(a)*sel(b)=0.05 0.05-0.05*0.05,
返回行数=72800*(0.05 0.05-0.05*0.05)=7098


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where object_name > :var or object_name <= :var1;

  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 7098 | 914k| 395 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 7098 | 914k| 395 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter("object_name">:var or "object_name"<=:var1)

6)非索引列范围默认选择率0.05 
通过2)可以得知,不走索引的like绑定变量默认选择率=0.05,和走索引回表的选择率一样,

比如:

点击(此处)折叠或打开

  1. explain plan for
  2. select/*test*/ * from sel_test
  3. where subobject_name like :var;
  4. select * from table(dbms_xplan.display);

  5. plan_table_output
  6. ----------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 50 | 6600 | 395 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 50 | 6600 | 395 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------
  16.    1 - filter("subobject_name" like :var)
这里的rows=50,不是3640,为什么呢,因为num_nulls:71801
所以选择率=0.05*adjust_sel=0.05*(72800-71801)/72800=.000686126
所以返回行=round(72800*.000686126)=50

7)函数范围查询默认选择率0.05(函数范围,无具体值,有具体值都是5%)  


点击(此处)折叠或打开

  1. create index idx1_sel_test on sel_test(object_id);
  2. explain plan for
  3. select * from sel_test
  4. where object_id like :var;
  5. select * from table(dbms_xplan.display);

  6. plan_table_output
  7. ----------------------------------------------------------------------------------------------------
  8. plan hash value: 747827318

  9. ------------------------------------------------------------------------------
  10. | id | operation | name | rows | bytes | cost (%cpu)| time |
  11. ------------------------------------------------------------------------------
  12. | 0 | select statement | | 3640 | 469k| 396 (1)| 00:00:01 |
  13. |* 1 | table access full| sel_test | 3640 | 469k| 396 (1)| 00:00:01 |
  14. ------------------------------------------------------------------------------

  15. predicate information (identified by operation id):
  16. ---------------------------------------------------

  17.    1 - filter(to_char("object_id") like :var)
like也是范围查询,object_id因为是number类型,隐式类型转换使用to_char,
对应选择率=3640/72800=0.05,object_id的num_nulls=2,可以忽略。

传入具体值也是5%:


点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where substr(object_id,1,2) > '123';
  4. select * from table(dbms_xplan.display);

  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 3640 | 469k| 396 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 3640 | 469k| 396 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter(substr(to_char("object_id"),1,2)>'123')

8)函数等值查询,选择率是0.01(函数等值,无具体值,有具体值都是1%)

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where substr(object_id,1,2) = :var;
  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 728 | 96096 | 396 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 728 | 96096 | 396 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter(substr(to_char("object_id"),1,2)=:var)

函数等值查询对应选择率0.01,所以返回行=72800*0.01=728

传入实际值也一样:

点击(此处)折叠或打开

  1. explain plan for
  2. select * from sel_test
  3. where substr(object_id,1,2) ='ta';
  4. select * from table(dbms_xplan.display);
  5. plan_table_output
  6. --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  7. plan hash value: 747827318

  8. ------------------------------------------------------------------------------
  9. | id | operation | name | rows | bytes | cost (%cpu)| time |
  10. ------------------------------------------------------------------------------
  11. | 0 | select statement | | 728 | 96096 | 396 (1)| 00:00:01 |
  12. |* 1 | table access full| sel_test | 728 | 96096 | 396 (1)| 00:00:01 |
  13. ------------------------------------------------------------------------------

  14. predicate information (identified by operation id):
  15. ---------------------------------------------------

  16.    1 - filter(substr(to_char("object_id"),1,2)='ta')

  下一篇:oracle cbo 默认选择率及其问题凯发app官方网站的解决方案_part2

阅读(25) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~
")); function link(t){ var href= $(t).attr('href'); href ="?url=" encodeuricomponent(location.href); $(t).attr('href',href); //setcookie("returnouturl", location.href, 60, "/"); }
网站地图