mysql subquery cache vs oracle subquery cache子查询缓存-凯发app官方网站

凯发app官方网站-凯发k8官网下载客户端中心 | | 凯发app官方网站-凯发k8官网下载客户端中心
  • 博客访问: 1156267
  • 博文数量: 166
  • 博客积分: 0
  • 博客等级: 民兵
  • 技术积分: 3760
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-30 13:00
个人简介

about me:oracle ace pro,optimistic,passionate and harmonious. focus on oracle,mysql and other database programming,peformance tuning,db design, j2ee,linux/aix,architecture tech,etc

文章分类

全部博文(166)

文章存档

2024年(21)

2023年(28)

2022年(43)

2020年(62)

2014年(3)

2013年(9)

相关博文
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·
  • ·

分类: mysql/postgresql

2024-05-23 16:11:26

接:mysql subquery cache vs oracle subquery cache子查询缓存_mysql部分_part1

3.mysql标量子查询没有subquery cache优化,走 dependent subquery ,这个很差劲


从下列执行计划看,标量子查询只能走dependent subquery ,没有缓存优化。

点击(此处)折叠或打开

  1. explain
  2. select t1.a,(select t2.b from subq_t2 t2 where t1.a = t2.a) b
  3. from subq_t1 t1;
  4. -------------------------------------------------------------------------------------------------------------------------------
  5. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | extra |
  6. -------------------------------------------------------------------------------------------------------------------------------
  7. | 1 | primary | t1 | null | all | null | null | null | null | 299886 | 100.00 | null |
  8. | 2 | dependent subquery | t2 | null | ref | idx_subq_t2 | idx_subq_t2 | 5 | employees.t1.a | 1 | 100.00 | null |
  9. -------------------------------------------------------------------------------------------------------------------------------
  10. 2 rows in set, 2 warnings (0.00 sec)

  11. show warnings\g
  12. *************************** 1. row ***************************
  13.   level: note
  14.    code: 1276
  15. message: field or reference 'employees.t1.a' of select #2 was resolved in select #1
  16. *************************** 2. row ***************************
  17.   level: note
  18.    code: 1003
  19. message: /* select#1 */ select `employees`.`t1`.`a` as `a`,(/* select#2 */ select `employees`.`t2`.`b` from `employees`.`subq_t2` `t2` where (`employees`.`t1`.`a` = `employees`.`t2`.`a`)) as `b` from `employees`.`subq_t1` `t1`

使用hints也走dependent subquery:


点击(此处)折叠或打开

  1. explain
  2. select/*subquery(@subq materialization)*/ t1.a,(select/*qb_name(subq)*/ t2.b from subq_t2 t2 where t1.a = t2.a) b
  3. from subq_t1 t1;

  4. -------------------------------------------------------------------------------------------------------------------------------
  5. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | extra |
  6. -------------------------------------------------------------------------------------------------------------------------------
  7. | 1 | primary | t1 | null | all | null | null | null | null | 299886 | 100.00 | null |
  8. | 2 | dependent subquery | t2 | null | ref | idx_subq_t2 | idx_subq_t2 | 5 | employees.t1.a | 1 | 100.00 | null |
  9. -------------------------------------------------------------------------------------------------------------------------------

树形计划,子查询走索引被驱动30w次,和预期的3,5次,相差甚远,没有缓存优化。

点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.034..232.504 rows=300000 loops=1)
  3. -> select #2 (subquery in projection; dependent)
  4.     -> index lookup on t2 using idx_subq_t2 (a=t1.a) (cost=0.35 rows=1) (actual time=0.003..0.004 rows=1 loops=300000)

总结:mysql标量子查询只能走dependent subquery,没有缓存优化,所以在mysql里,要少用标量子查询。


4.mysql对于不能unnest的,如果外层表有过滤条件,如果条件不能走索引,则条件写在子查询前还是后对驱动子查询次数有影响,能走索引则没有影响。


写在前先过滤后然后驱动子查询,写在后则是先驱动子查询后过滤外层条件,可能执行次数会
多很多。

如果能走索引,那么一般按照先走索引过滤外部表,顺序没有影响。

1)先过滤,虽然这里还是对t1全表扫描,返回30w行,但是filter条件t1.b>300000先执行,
则返回0行,然后子查询不用执行。


点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq intoexists)*/ *
  3. from subq_t1 t1
  4. where t1.b>300000 and t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

注意filter: ((t1.b > 300000) and (t1.a,(select #2)))条件,在前的先执行

实际上这个是树形执行计划显示的问题,应该table scan on t1直接按照(t1.b > 300000)扫描,返回0行,通过执行时间可以看出是193ms,比全表扫描返回所有行小很多


点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> filter: ((t1.b > 300000) and <in_optimizer>(t1.a,<exists>(select #2))) (cost=30156.85 rows=99952) (actual time=218.451..218.451 rows=0 loops=1)
  3.     -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.030..193.624 rows=300000 loops=1)
  4.     -> select #2 (subquery in condition; dependent)
  5.         -> limit: 1 row(s) (never executed)
  6.             -> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)

2)先驱动子查询后过滤,t1返回30w行,子查询执行30w次,很拉跨

点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq intoexists)*/ *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)
  6. and t1.b>300000\g

条件:filter: ((t1.a,(select #2)) and (t1.b > 300000)) 
这里的table scan on t1执行时间是250ms,比前面的先过滤的193ms大。这里是真的返回30w行。

点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> filter: (<in_optimizer>(t1.a,<exists>(select #2)) and (t1.b > 300000)) (cost=30156.85 rows=99952) (actual time=1046.440..1046.440 rows=0 loops=1)
  3.     -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.030..250.191 rows=300000 loops=1)
  4.     -> select #2 (subquery in condition; dependent)
  5.         -> limit: 1 row(s) (actual time=0.002..0.002 rows=1 loops=300000)
  6.             -> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (actual time=0.002..0.002 rows=1 loops=300000)

3)同样的不能unnest的materialization和intoexists一样,只不过materialization有缓存优化

点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq materialization)*/ *
  3. from subq_t1 t1
  4. where t1.b>300000 and t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

t1.b>300000写在子查询前,子查询执行0次:


点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> filter: ((t1.b > 300000) and <in_optimizer>(t1.a,t1.a in (select #2))) (cost=30156.85 rows=99952) (actual time=204.768..204.768 rows=0 loops=1)
  3.     -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.026..182.554 rows=300000 loops=1)
  4.     -> select #2 (subquery in condition; run only once)
  5.         -> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (never executed)
  6.             -> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (never executed)
  7.                 -> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (never executed)
  8.                     -> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (never executed)
  9.                         -> index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (never executed)

t1.b>300000写在子查询后,因为有缓存,子查询执行4次驱动物化表:

sql:

点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq materialization)*/ *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)
  6. and t1.b>300000\g

子查询执行4次:

点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> filter: (<in_optimizer>(t1.a,t1.a in (select #2)) and (t1.b > 300000)) (cost=30156.85 rows=99952) (actual time=371.866..371.866 rows=0 loops=1)
  3.     -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..199.214 rows=300000 loops=1)
  4.     -> select #2 (subquery in condition; run only once)
  5.         -> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (actual time=27.640..27.640 rows=1 loops=4)
  6.             -> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (actual time=27.639..27.639 rows=1 loops=4)
  7.                 -> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (actual time=27.638..27.638 rows=1 loops=4)
  8.                     -> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=110.525..110.525 rows=100000 loops=1)
  9.                         -> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..61.938 rows=100000 loops=1)

4)能够unnest的,过滤条件写在前后无影响

t1.b>300000写在子查询后,也是先过滤,子查询执行0次:


点击(此处)折叠或打开

  1. explain analyze
  2. select *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)
  6. and t1.b>300000\g
  7. *************************** 1. row ***************************
  8. explain: -> nested loop inner join (cost=1000689631.09 rows=10006494790) (actual time=220.631..220.631 rows=0 loops=1)
  9.     -> filter: ((t1.b > 300000) and (t1.a is not null)) (cost=30156.85 rows=99952) (actual time=220.630..220.630 rows=0 loops=1)
  10.         -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..198.257 rows=300000 loops=1)
  11.     -> single-row index lookup on <subquery2> using <auto_distinct_key> (a=t1.a) (never executed)
  12.         -> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (never executed)
  13.             -> filter: (t2.a is not null) (cost=10067.55 rows=100113) (never executed)
  14.                 -> index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (never executed)

5)如果外部表条件有索引,则先走索引过滤,没有影响

建立索引:

点击(此处)折叠或打开

  1. create index idx_sub1_t1 on subq_t1(b);

现在完全没有区别,外部表都是先index range scan on t1 using idx_sub1_t1,返回0行,子查询不需要执行。

t1.b>300000在子查询前,走索引,也是先过滤,子查询执行0次:

点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq intoexists)*/ *
  3. from subq_t1 t1
  4. where t1.b>300000 and t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g
  6. *************************** 1. row ***************************
  7. explain: -> filter: <in_optimizer>(t1.a,<exists>(select #2)) (cost=0.71 rows=1) (actual time=0.057..0.057 rows=0 loops=1)
  8.     -> index range scan on t1 using idx_sub1_t1 over (300000 < b), with index condition: (t1.b > 300000) (cost=0.71 rows=1) (actual time=0.029..0.029 rows=0 loops=1)
  9.     -> select #2 (subquery in condition; dependent)
  10.         -> limit: 1 row(s) (never executed)
  11.             -> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)

  12. 1 row in set (0.00 sec)

t1.b>300000在子查询后,走索引,也是先过滤,子查询也是执行0次:

点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq intoexists)*/ *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)
  6. and t1.b>300000\g
  7. *************************** 1. row ***************************
  8. explain: -> filter: <in_optimizer>(t1.a,<exists>(select #2)) (cost=0.71 rows=1) (actual time=0.022..0.022 rows=0 loops=1)
  9.     -> index range scan on t1 using idx_sub1_t1 over (300000 < b), with index condition: (t1.b > 300000) (cost=0.71 rows=1) (actual time=0.021..0.021 rows=0 loops=1)
  10.     -> select #2 (subquery in condition; dependent)
  11.         -> limit: 1 row(s) (never executed)
  12.             -> covering index lookup on t2 using idx_subq_t2 (a=<cache>(t1.a)) (never executed)

  13. 1 row in set (0.01 sec)

5.能够unnest的是利用join/semi join/anti join,能够有5种方式优化子查询,没有缓存优化


5种子查询策略:pullout,firstmatch,loosescan,materialization,dupsweedout

点击(此处)折叠或打开

  1. explain analyze
  2. select *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g

执行计划可以看到,驱动物化表30w次。因为外层重复结果多,这时候效率还不如unnest subquery materialization

点击(此处)折叠或打开

  1. *************************** 1. row ***************************
  2. explain: -> nested loop inner join (cost=3002308857.25 rows=30022487118) (actual time=133.304..555.223 rows=300000 loops=1)
  3.     -> filter: (t1.a is not null) (cost=30156.85 rows=299886) (actual time=0.028..282.177 rows=300000 loops=1)
  4.         -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..258.794 rows=300000 loops=1)
  5.     -> single-row index lookup on <subquery2> using <auto_distinct_key> (a=t1.a) (actual time=0.001..0.001 rows=1 loops=300000)
  6.         -> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=133.266..133.266 rows=100000 loops=1)
  7.             -> filter: (t2.a is not null) (cost=10067.55 rows=100113) (actual time=0.007..77.712 rows=100000 loops=1)
  8.                 -> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..69.800 rows=100000 loops=1)

no unnest subquery materialization,因为子查询只需要执行4次,执行时间346ms,比前面的555ms高。


点击(此处)折叠或打开

  1. explain analyze
  2. select/*subquery(@subq materialization)*/ *
  3. from subq_t1 t1
  4. where t1.a in
  5. (select/*qb_name(subq)*/ t2.a from subq_t2 t2)\g
  6. *************************** 1. row ***************************
  7. explain: -> filter: <in_optimizer>(t1.a,t1.a in (select #2)) (cost=30156.85 rows=299886) (actual time=111.111..346.171 rows=300000 loops=1)
  8.     -> table scan on t1 (cost=30156.85 rows=299886) (actual time=0.027..189.058 rows=300000 loops=1)
  9.     -> select #2 (subquery in condition; run only once)
  10.         -> filter: ((t1.a = `<materialized_subquery>`.a)) (cost=20078.95..20078.95 rows=1) (actual time=27.775..27.775 rows=1 loops=4)
  11.             -> limit: 1 row(s) (cost=20078.85..20078.85 rows=1) (actual time=27.773..27.773 rows=1 loops=4)
  12.                 -> index lookup on <materialized_subquery> using <auto_distinct_key> (a=t1.a) (actual time=27.773..27.773 rows=1 loops=4)
  13.                     -> materialize with deduplication (cost=20078.85..20078.85 rows=100113) (actual time=111.064..111.064 rows=100000 loops=1)
  14.                         -> covering index scan on t2 using idx_subq_t2 (cost=10067.55 rows=100113) (actual time=0.007..62.216 rows=100000 loops=1)

在oracle里semi join,anti join有缓存优化。


阅读(479) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~
")); function link(t){ var href= $(t).attr('href'); href ="?url=" encodeuricomponent(location.href); $(t).attr('href',href); //setcookie("returnouturl", location.href, 60, "/"); }
网站地图