mysql subquery cache vs oracle subquery cache子查询缓存-凯发app官方网站

博客访问： 1156267
博文数量： 166
博客积分： 0
博客等级：民兵
技术积分： 3760
用户组：普通用户
注册时间： 2011-03-30 13:00

个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文（166）

cbo（54）
tuning&performan（32）
concepts&archite（4）
troubleshooting（5）
new feature（7）
sql（50）
pl/sql（8）
miscellaneous（3）
mysql（3）
未分配的博文（0）

文章存档

2024年（21）

2023年（28）

2022年（43）

2020年（62）

2014年（3）

2013年（9）

我的朋友

3.mysql标量子查询没有subquery cache优化，走 dependent subquery ，这个很差劲

从下列执行计划看，标量子查询只能走dependent subquery ，没有缓存优化。

点击(此处)折叠或打开

explain
select t1.a,(select t2.b from subq_t2 t2 where t1.a = t2.a) b
from subq_t1 t1;
-------------------------------------------------------------------------------------------------------------------------------
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | extra |
-------------------------------------------------------------------------------------------------------------------------------
| 1 | primary | t1 | null | all | null | null | null | null | 299886 | 100.00 | null |
| 2 | dependent subquery | t2 | null | ref | idx_subq_t2 | idx_subq_t2 | 5 | employees.t1.a | 1 | 100.00 | null |
-------------------------------------------------------------------------------------------------------------------------------
2 rows in set, 2 warnings (0.00 sec)
show warnings\g
*************************** 1. row ***************************
level: note
code: 1276
message: field or reference 'employees.t1.a' of select #2 was resolved in select #1
*************************** 2. row ***************************
level: note
code: 1003
message: /* select#1 */ select `employees`.`t1`.`a` as `a`,(/* select#2 */ select `employees`.`t2`.`b` from `employees`.`subq_t2` `t2` where (`employees`.`t1`.`a` = `employees`.`t2`.`a`)) as `b` from `employees`.`subq_t1` `t1`

使用hints也走dependent subquery：

点击(此处)折叠或打开

explain
select/*subquery(@subq materialization)*/ t1.a,(select/*qb_name(subq)*/ t2.b from subq_t2 t2 where t1.a = t2.a) b
from subq_t1 t1;
-------------------------------------------------------------------------------------------------------------------------------
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | extra |
-------------------------------------------------------------------------------------------------------------------------------
| 1 | primary | t1 | null | all | null | null | null | null | 299886 | 100.00 | null |
| 2 | dependent subquery | t2 | null | ref | idx_subq_t2 | idx_subq_t2 | 5 | employees.t1.a | 1 | 100.00 | null |
-------------------------------------------------------------------------------------------------------------------------------

树形计划，子查询走索引被驱动30w次，和预期的3,5次，相差甚远，没有缓存优化。

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.034..232.504 rows=300000 loops=1)
-> select #2 (subquery in projection; dependent)
-> index lookup on t2 using idx_subq_t2 (a=t1.a) (cost=0.35 rows=1) (actual time=0.003..0.004 rows=1 loops=300000)

总结：mysql标量子查询只能走dependent subquery，没有缓存优化，所以在mysql里，要少用标量子查询。

4.mysql对于不能unnest的，如果外层表有过滤条件，如果条件不能走索引，则条件写在子查询前还是后对驱动子查询次数有影响，能走索引则没有影响。

写在前先过滤后然后驱动子查询，写在后则是先驱动子查询后过滤外层条件，可能执行次数会
多很多。

如果能走索引，那么一般按照先走索引过滤外部表，顺序没有影响。

1）先过滤，虽然这里还是对t1全表扫描，返回30w行，但是filter条件t1.b>300000先执行，
则返回0行，然后子查询不用执行。

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq intoexists)*/ *
from subq_t1 t1
where t1.b>300000 and t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

注意filter: ((t1.b > 300000) and (t1.a,(select #2)))条件，在前的先执行

实际上这个是树形执行计划显示的问题，应该table scan on t1直接按照(t1.b > 300000)扫描，返回0行，通过执行时间可以看出是193ms,比全表扫描返回所有行小很多

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> filter: ((t1.b > 300000) and <in_optimizer>(t1.a,<exists>(select #2))) (cost=30156.85 rows=99952) (actual time=218.451..218.451 rows=0 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.030..193.624 rows=300000 loops=1)
-> select #2 (subquery in condition; dependent)
-> limit: 1 row(s) (never executed)
-> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)

2）先驱动子查询后过滤，t1返回30w行，子查询执行30w次，很拉跨

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq intoexists)*/ *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)
and t1.b>300000\g

条件：filter: ((t1.a,(select #2)) and (t1.b > 300000))
这里的table scan on t1执行时间是250ms,比前面的先过滤的193ms大。这里是真的返回30w行。

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> filter: (<in_optimizer>(t1.a,<exists>(select #2)) and (t1.b > 300000)) (cost=30156.85 rows=99952) (actual time=1046.440..1046.440 rows=0 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.030..250.191 rows=300000 loops=1)
-> select #2 (subquery in condition; dependent)
-> limit: 1 row(s) (actual time=0.002..0.002 rows=1 loops=300000)
-> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (actual time=0.002..0.002 rows=1 loops=300000)

3）同样的不能unnest的materialization和intoexists一样，只不过materialization有缓存优化

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq materialization)*/ *
from subq_t1 t1
where t1.b>300000 and t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

t1.b>300000写在子查询前，子查询执行0次：

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> filter: ((t1.b > 300000) and <in_optimizer>(t1.a,t1.a in (select #2))) (cost=30156.85 rows=99952) (actual time=204.768..204.768 rows=0 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.026..182.554 rows=300000 loops=1)
-> select #2 (subquery in condition; run only once)
-> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (never executed)
-> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (never executed)
-> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (never executed)
-> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (never executed)
-> index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (never executed)

t1.b>300000写在子查询后，因为有缓存，子查询执行4次驱动物化表：

sql：

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq materialization)*/ *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)
and t1.b>300000\g

子查询执行4次：

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> filter: (<in_optimizer>(t1.a,t1.a in (select #2)) and (t1.b > 300000)) (cost=30156.85 rows=99952) (actual time=371.866..371.866 rows=0 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..199.214 rows=300000 loops=1)
-> select #2 (subquery in condition; run only once)
-> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (actual time=27.640..27.640 rows=1 loops=4)
-> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (actual time=27.639..27.639 rows=1 loops=4)
-> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (actual time=27.638..27.638 rows=1 loops=4)
-> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=110.525..110.525 rows=100000 loops=1)
-> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..61.938 rows=100000 loops=1)

4）能够unnest的，过滤条件写在前后无影响

t1.b>300000写在子查询后，也是先过滤，子查询执行0次：

点击(此处)折叠或打开

explain analyze
select *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)
and t1.b>300000\g
*************************** 1. row ***************************
explain: -> nested loop inner join (cost=1000689631.09 rows=10006494790) (actual time=220.631..220.631 rows=0 loops=1)
-> filter: ((t1.b > 300000) and (t1.a is not null)) (cost=30156.85 rows=99952) (actual time=220.630..220.630 rows=0 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..198.257 rows=300000 loops=1)
-> single-row index lookup on <subquery2> using <auto_distinct_key> (a=t1.a) (never executed)
-> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (never executed)
-> filter: (t2.a is not null) (cost=10067.55 rows=100113) (never executed)
-> index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (never executed)

5)如果外部表条件有索引，则先走索引过滤，没有影响

建立索引：

点击(此处)折叠或打开

create index idx_sub1_t1 on subq_t1(b);

现在完全没有区别，外部表都是先index range scan on t1 using idx_sub1_t1，返回0行，子查询不需要执行。

t1.b>300000在子查询前，走索引，也是先过滤，子查询执行0次：

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq intoexists)*/ *
from subq_t1 t1
where t1.b>300000 and t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g
*************************** 1. row ***************************
explain: -> filter: <in_optimizer>(t1.a,<exists>(select #2)) (cost=0.71 rows=1) (actual time=0.057..0.057 rows=0 loops=1)
-> index range scan on t1 using idx_sub1_t1 over (300000 < b), with index condition: (t1.b > 300000) (cost=0.71 rows=1) (actual time=0.029..0.029 rows=0 loops=1)
-> select #2 (subquery in condition; dependent)
-> limit: 1 row(s) (never executed)
-> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)
1 row in set (0.00 sec)

t1.b>300000在子查询后，走索引，也是先过滤，子查询也是执行0次：

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq intoexists)*/ *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)
and t1.b>300000\g
*************************** 1. row ***************************
explain: -> filter: <in_optimizer>(t1.a,<exists>(select #2)) (cost=0.71 rows=1) (actual time=0.022..0.022 rows=0 loops=1)
-> index range scan on t1 using idx_sub1_t1 over (300000 < b), with index condition: (t1.b > 300000) (cost=0.71 rows=1) (actual time=0.021..0.021 rows=0 loops=1)
-> select #2 (subquery in condition; dependent)
-> limit: 1 row(s) (never executed)
-> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)
1 row in set (0.01 sec)

5.能够unnest的是利用join/semi join/anti join，能够有5种方式优化子查询，没有缓存优化

5种子查询策略：pullout,firstmatch,loosescan,materialization,dupsweedout

点击(此处)折叠或打开

explain analyze
select *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

执行计划可以看到，驱动物化表30w次。因为外层重复结果多，这时候效率还不如unnest subquery materialization

点击(此处)折叠或打开

*************************** 1. row ***************************
explain: -> nested loop inner join (cost=3002308857.25 rows=30022487118) (actual time=133.304..555.223 rows=300000 loops=1)
-> filter: (t1.a is not null) (cost=30156.85 rows=299886) (actual time=0.028..282.177 rows=300000 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..258.794 rows=300000 loops=1)
-> single-row index lookup on <subquery2> using <auto_distinct_key> (a=t1.a) (actual time=0.001..0.001 rows=1 loops=300000)
-> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=133.266..133.266 rows=100000 loops=1)
-> filter: (t2.a is not null) (cost=10067.55 rows=100113) (actual time=0.007..77.712 rows=100000 loops=1)
-> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..69.800 rows=100000 loops=1)

no unnest subquery materialization，因为子查询只需要执行4次，执行时间346ms,比前面的555ms高。

点击(此处)折叠或打开

explain analyze
select/*subquery(@subq materialization)*/ *
from subq_t1 t1
where t1.a in
(select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g
*************************** 1. row ***************************
explain: -> filter: <in_optimizer>(t1.a,t1.a in (select #2)) (cost=30156.85 rows=299886) (actual time=111.111..346.171 rows=300000 loops=1)
-> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..189.058 rows=300000 loops=1)
-> select #2 (subquery in condition; run only once)
-> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (actual time=27.775..27.775 rows=1 loops=4)
-> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (actual time=27.773..27.773 rows=1 loops=4)
-> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (actual time=27.773..27.773 rows=1 loops=4)
-> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=111.064..111.064 rows=100000 loops=1)
-> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..62.216 rows=100000 loops=1)

在oracle里semi join,anti join有缓存优化。

阅读(479) | 评论(0) | 转发(0) |

上一篇：mysql subquery cache vs oracle subquery cache子查询缓存_mysql部分_part1

下一篇：mysql subquery cache vs oracle subquery cache子查询缓存_oracle部分_part3

给主人留下些什么吧！~~

| | | | |

感谢所有关心和支持过chinaunix的朋友们