mysql subquery cache vs oracle subquery cache子查询缓存-凯发app官方网站

博客访问： 1156265
博文数量： 166
博客积分： 0
博客等级：民兵
技术积分： 3760
用户组：普通用户
注册时间： 2011-03-30 13:00

个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文（166）

cbo（54）
tuning&performan（32）
concepts&archite（4）
troubleshooting（5）
new feature（7）
sql（50）
pl/sql（8）
miscellaneous（3）
mysql（3）
未分配的博文（0）

文章存档

2024年（21）

2023年（28）

2022年（43）

2020年（62）

2014年（3）

2013年（9）

我的朋友

6.oracle具有全面的子查询cache优化功能

oracle同样的查询,不管是in还是exists,对于子查询不能unnest(这里使用no_unnest hints),都有subquery cache:

oracle里常见的unnest subquery的是标量子查询，update关联子查询等走filter操作的子查询，对应执行计划子查询部分
有系统绑定变量，比如：b1，有缓存优化，如果外层关联key基数小，则子查询执行次数少。

oracle建表语句:

点击(此处)折叠或打开

create table subq_t1(a int,b int);
insert into subq_t1(a,b)
select 1,1
from dual
connect by level<=100000;
insert into subq_t1(a,b)
select 2,1level
from dual
connect by level<=100000;
insert into subq_t1(a,b)
select 3,1level
from dual
connect by level<=100000;
create table subq_t2(a int,b int);
insert into subq_t2(a,b)
select level,level
from dual
connect by level<=100000;
create index idx_subq_t2 on subq_t2(a);
subq_t1表的a只有3个值：
select count(*),count(distinct a) from subq_t1;
count(*) count(distincta)
---------- ----------------
300000 3
subq_t2的a都是不重复的：
select count(*),count(distinct a) from subq_t2;
count(*) count(distincta)
---------- ----------------
100000 100000

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
select/*gather_plan_statistics*/ *
from subq_t1 t1
where t1.a in
(select/*no_unnest*/ t2.a from subq_t2 t2);
set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('0hrqfk46ss2ns',null,'allstats last'));

可以看到走filter，虽然subq_t1有30w行，但是子查询只执行3次（实际执行次数依赖subq_t1.a的相同值连续情况）

执行计划id=3对于谓词有系统绑定变量： 3 - access("t2"."a"=:b1)

点击(此处)折叠或打开

plan hash value: 2995694926
--------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
--------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 300k|00:00:00.17 | 20535 |
|* 1 | filter | | 1 | | 300k|00:00:00.17 | 20535 |
| 2 | table access full| subq_t1 | 1 | 327k| 300k|00:00:00.11 | 20529 |
|* 3 | index range scan | idx_subq_t2 | 3 | 1098 | 3 |00:00:00.01 | 6 |
--------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter( is not null)
3 - access("t2"."a"=:b1)

改为exists和in一样，因为在oracle里in/exists基本都可以相互转换：

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
select/*gather_plan_statistics*/ *
from subq_t1 t1
where exists
(select/*no_unnest*/ 1 from subq_t2 t2 where t1.a = t2.a);
set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('f8uyx3hynta6x',null,'allstats last'));
plan_table_output
----------------------------------------------------------------------------------------------------
sql_id f8uyx3hynta6x, child number 0
-------------------------------------
select/*gather_plan_statistics*/ * from subq_t1 t1 where exists
(select/*no_unnest*/ 1 from subq_t2 t2 where t1.a = t2.a)
plan hash value: 2995694926
--------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
--------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 300k|00:00:00.16 | 20535 |
|* 1 | filter | | 1 | | 300k|00:00:00.16 | 20535 |
| 2 | table access full| subq_t1 | 1 | 327k| 300k|00:00:00.09 | 20529 |
|* 3 | index range scan | idx_subq_t2 | 3 | 1098 | 3 |00:00:00.01 | 6 |
--------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
1 - filter( is not null)
3 - access("t2"."a"=:b1)

oracle标量子查询，也有subquery cache优化：

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
select/*gather_plan_statistics*/ t1.a,(select t2.b from subq_t2 t2 where t1.a = t2.a) b
from subq_t1 t1;
set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('cwvtbwcaytnz5',null,'allstats last'));

可以看到id=1和id=2只执行3次：

点击(此处)折叠或打开

plan hash value: 438047693
-------------------------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
-------------------------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 300k|00:00:00.09 | 20529 |
| 1 | table access by index rowid batched| subq_t2 | 3 | 1098 | 3 |00:00:00.01 | 9 |
|* 2 | index range scan | idx_subq_t2 | 3 | 439 | 3 |00:00:00.01 | 6 |
| 3 | table access full | subq_t1 | 1 | 327k| 300k|00:00:00.09 | 20529 |
-------------------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - access("t2"."a"=:b1)

甚至在oracle里semi join/anti join走nested loops的（普通nested loops没有缓存优化），都有缓存优化，可以减少被驱动表回表次数，比如：

1.semi join走nested loops有缓存优化

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
--去掉no_unnest
select/*gather_plan_statistics leading(t1) use_nl(t2@subq) */ *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2);

nested loops semi，子查询只执行3次：

点击(此处)折叠或打开

set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('cacaags9gcy2a',null,'allstats last'));
plan hash value: 3871264342
--------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
--------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 300k|00:00:00.18 | 20535 |
| 1 | nested loops semi | | 1 | 327k| 300k|00:00:00.18 | 20535 |
| 2 | table access full| subq_t1 | 1 | 327k| 300k|00:00:00.13 | 20529 |
|* 3 | index range scan | idx_subq_t2 | 3 | 109k| 3 |00:00:00.01 | 6 |
--------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
3 - access("t1"."a"="t2"."a")

2.anti join同样有subquery cache

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
--去掉no_unnest
select/*gather_plan_statistics leading(t1) use_nl(t2@subq) */ *
from subq_t1 t1
where t1.a not in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2 where t2.a is not null) and t1.a is not null;
set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('fs848pak6turs',null,'allstats last'));

因为有缓存，not in的执行效率非常高，没有找到数据：

点击(此处)折叠或打开

plan hash value: 3776200174
--------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
--------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 0 |00:00:00.05 | 558 |
| 1 | nested loops anti | | 1 | 327k| 0 |00:00:00.05 | 558 |
|* 2 | table access full| subq_t1 | 1 | 327k| 300k|00:00:00.06 | 553 |
|* 3 | index range scan | idx_subq_t2 | 3 | 1 | 3 |00:00:00.01 | 5 |
--------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
2 - filter("t1"."a" is not null)
3 - access("t1"."a"="t2"."a")
filter("t2"."a" is not null)

3.普通的nested loops join没有缓存优化,但是有批处理

点击(此处)折叠或打开

set serveroutput off
set autotrace traceonly --不输出结果，太多
select/*gather_plan_statistics leading(t1) use_nl(t2)*/ *
from subq_t1 t1,subq_t2 t2
where t1.a = t2.a;
set autotrace off
select sql_id,sql_text from v$sql where sql_text like '%gather_plan_statistics%subq_t2%';
select * from table(dbms_xplan.display_cursor('2qa1g6az4hmk8',null,'allstats last'));

两个nested loops就是bulk获取rowid的优化：

点击(此处)折叠或打开

plan hash value: 2089944506
------------------------------------------------------------------------------------------------------
| id | operation | name | starts | e-rows | a-rows | a-time | buffers |
------------------------------------------------------------------------------------------------------
| 0 | select statement | | 1 | | 300k|00:00:00.70 | 80537 |
| 1 | nested loops | | 1 | 327k| 300k|00:00:00.70 | 80537 |
| 2 | nested loops | | 1 | 327k| 300k|00:00:00.47 | 60536 |
| 3 | table access full | subq_t1 | 1 | 327k| 300k|00:00:00.06 | 20529 |
|* 4 | index range scan | idx_subq_t2 | 300k| 1 | 300k|00:00:00.25 | 40007 |
| 5 | table access by index rowid| subq_t2 | 300k| 1 | 300k|00:00:00.14 | 20001 |
------------------------------------------------------------------------------------------------------
predicate information (identified by operation id):
---------------------------------------------------
4 - access("t1"."a"="t2"."a")

在oracle里，subquery cache优化比较全面，不能unnest走filter的有cache优化，可以减少执行次数，semi join/anti join同样有cache优化，普通nested loops join没有cache优化但是有批量获取rowid回表的优化。

总结：
1.mysql的子查询subquery cache,仅限于无法unnest subquery,走select_type:subquery,也就是物化方式，如果外部表与子查询关联的key重复值特别多，
则可以有效减少子查询执行次数，具体减少多少，要看这个key存储的连续程度。

至于dependent subquery是mysql子查询兜底招数，虽然改写和执行计划里对外部传递的列有,但是不能减少子查询执行次数，效果比较差。

另外mysql的标量子查询只能走dependent subquery，无subquery cache优化，效率不好。

mariadb是做了subquery cache优化的：见

mysql对于不能unnest的，如果外层表有过滤条件，如果不能走索引，则条件写在子查询前还是后对驱动子查询次数有影响，
能走索引先走索引过滤无影响。

不走索引的，写在前先过滤后然后驱动子查询，写在后则是先驱动子查询后过滤外层条件，可能执行次数会多很多。

2.oracle对无法unnest的子查询、标量子查询、update关联子查询是有缓存结果优化的，一般most缓存数目在1024个key左右，
如果outer table关联key的基数(distinct数目）很小，则可以大幅度减少子查询执行次数，具体也要看数据物理分布连续性是否强，
连续性强，减少的子查询次数越少，如果很离散存储，则子查询次数会增多。详细见：
filter operation中子查询的执行次数问题分析（http://blog.chinaunix.net/uid-7655508-id-5875810.html）。

另外oracle对能够unnest的semi/anti join,走nested loops的也有缓存优化，普通join走nested loops没有缓存优化，但是有批处理优化。

oracle这方面优化做的比较全面。

3.mysql子查询优化还是不够强大，所以在mysql里让子查询能够subquery unnest较好,这样可以转为join,semi join,anti join优化，避免unnest subquery只有2种方法，
完全依赖于外层结果驱动子查询，缓存优化也只有materialization才有，标量子查询在mysql里要慎用，和o不同的是，它没有缓存优化。
在mysql里经常会使用materialization来优化子查询，因为它会按照关联key剔除重复行并且创建索引，一定程度上能够弥补不能缓存的差距。

本系列结束.

阅读(66) | 评论(0) | 转发(0) |

上一篇：mysql subquery cache vs oracle subquery cache子查询缓存_mysql部分_part2

下一篇：没有了

给主人留下些什么吧！~~

| | | | |

感谢所有关心和支持过chinaunix的朋友们