
Some Optimizations MySQL Allows for Complex Joins

December 12, 2017 | 移动技术网 IT编程

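Yesterday I tuned a complex multi-table join. When optimizing this kind of SQL, I usually consider the following four points:

First, the size of the result set. If the query returns only a few rows, there is good reason to expect that it can be made fast.

Second, the choice of driving table is critical. The execution plan shows which driving table the optimizer picked, and the rows column of the plan gives a rough indication of where the problem lies.

Third, work out how the tables are joined to one another, and check whether the join columns have suitable indexes.

Fourth, use the straight_join keyword to force the join order, which makes it easy to test a hypothesis.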
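As a minimal sketch of the fourth point, the join-level STRAIGHT_JOIN operator makes MySQL read the left table before the right one. The tables and columns are the ones from the query examined in this article; the shortened select list is only for illustration.

```sql
-- Sketch: force the join order c -> b -> a and inspect the resulting plan.
-- STRAIGHT_JOIN behaves like an inner join, except that the left table
-- is always read before the right table.
EXPLAIN
SELECT c.yh_id, c.yh_dm, a.jg_id
FROM c
STRAIGHT_JOIN b ON b.yh_id = c.yh_id
STRAIGHT_JOIN a ON a.jg_id = b.jg_id
WHERE c.yh_dm = '006939748xx';
```

Comparing the forced plan with the optimizer's own choice shows whether a guess about the driving table actually holds.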

The SQL statement and its execution time:

mysql> select c.yh_id,
-> c.yh_dm,
-> c.yh_mc,
-> c.mm,
-> c.yh_lx,
-> a.jg_id,
-> a.jg_dm,
-> a.jg_mc,
-> a.jgxz_dm,
-> d.js_dm yh_js
-> from a, b, c
-> left join d on d.yh_id = c.yh_id
-> where a.jg_id = b.jg_id
-> and b.yh_id = c.yh_id
-> and a.yx_bj = 'y'
-> and c.sc_bj = 'n'
-> and c.yx_bj = 'y'
-> and c.sc_bj = 'n'
-> and c.yh_dm = '006939748xx' ;

1 row in set (0.75 sec)
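The statement mixes comma joins with a left join. For readability, the same query can be sketched with explicit JOIN syntax. This is an assumed equivalent rewrite, not the form the original application used; the repeated c.sc_bj condition is written once.

```sql
-- Assumed equivalent rewrite with explicit joins: same tables, columns, and filters.
SELECT c.yh_id, c.yh_dm, c.yh_mc, c.mm, c.yh_lx,
       a.jg_id, a.jg_dm, a.jg_mc, a.jgxz_dm,
       d.js_dm AS yh_js
FROM a
JOIN b ON a.jg_id = b.jg_id
JOIN c ON b.yh_id = c.yh_id
LEFT JOIN d ON d.yh_id = c.yh_id
WHERE a.yx_bj = 'y'
  AND c.yx_bj = 'y'
  AND c.sc_bj = 'n'
  AND c.yh_dm = '006939748xx';
```

The optimizer is free to reorder inner joins, so the order in which a, b, and c are written here does not by itself decide the driving table.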

This query actually returns only one row, yet it takes 750 ms to run. Here is the execution plan:

mysql> explain
-> select c.yh_id,
-> c.yh_dm,
-> c.yh_mc,
-> c.mm,
-> c.yh_lx,
-> a.jg_id,
-> a.jg_dm,
-> a.jg_mc,
-> a.jgxz_dm,
-> d.js_dm yh_js
-> from a, b, c
-> left join d on d.yh_id = c.yh_id
-> where a.jg_id = b.jg_id
-> and b.yh_id = c.yh_id
-> and a.yx_bj = 'y'
-> and c.sc_bj = 'n'
-> and c.yx_bj = 'y'
-> and c.sc_bj = 'n'
-> and c.yh_dm = '006939748xx' ;

+----+-------------+-------+--------+------------------+---------+---------+--------------+-------+-------------+
| id | select_type | table | type   | possible_keys    | key     | key_len | ref          | rows  | extra       |
+----+-------------+-------+--------+------------------+---------+---------+--------------+-------+-------------+
|  1 | simple      | a     | all    | primary,index_jg | null    | null    | null         | 52616 | using where |
|  1 | simple      | b     | ref    | primary          | primary | 98      | test.a.jg_id |     1 | using index |
|  1 | simple      | c     | eq_ref | primary          | primary | 98      | test.b.yh_id |     1 | using where |
|  1 | simple      | d     | index  | null             | primary | 196     | null         | 54584 | using index |
+----+-------------+-------+--------+------------------+---------+---------+--------------+-------+-------------+

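Two performance bottlenecks stand out in the plan. Table a is read with a full table scan (type all, about 52616 rows), and table d is read with a full index scan (type index, about 54584 rows):

| 1 | simple | a | all | primary,index_jg | null | null | null | 52616 | using where |

| 1 | simple | d | index | null | primary | 196 | null | 54584 | using index |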

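Because d is the table on the right side of the left join, the optimizer will not pick it as the driving table. Next, look at the sizes of a, b, and c: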

mysql> select count(*) from c;
+----------+
| count(*) |
+----------+
|    53731 |
+----------+

mysql> select count(*) from a;
+----------+
| count(*) |
+----------+
|    53335 |
+----------+

mysql> select count(*) from b;
+----------+
| count(*) |
+----------+
|   105809 |
+----------+
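When exact counts are not needed, approximate sizes for all four tables can be read in one query from information_schema. The schema name test is taken from the ref column of the execution plan; for InnoDB tables the table_rows value is only an estimate.

```sql
-- Approximate row counts from table statistics (no table scan needed).
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = 'test'
  AND table_name IN ('a', 'b', 'c', 'd');
```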

Table b holds more rows than the other two tables and has essentially no filter conditions on it, so b can be ruled out as the driving table.

The optimizer actually chose a as the driving table. Why not c? Let's analyze:

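Stage 1: a as the driving table
a-->b-->c-->d:
(1): a.jg_id = b.jg_id ---> (index on b: primary key (`jg_id`,`yh_id`))

(2): b.yh_id = c.yh_id ---> (index on c: primary key (`yh_id`))

(3): c.yh_id = d.yh_id ---> (index on d: primary key (`js_dm`,`yh_id`))
In d's primary key, yh_id is not the leading column, so d has no index that can serve a lookup on yh_id alone. We therefore add one on d: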

alter table d add index ind_yh_id(yh_id);
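To confirm which indexes exist on d and which column leads each one, SHOW INDEX can be used. A minimal sketch; the literal in the second statement is a placeholder value, not taken from the data.

```sql
-- One row per indexed column; Seq_in_index = 1 marks the leading column of an index.
SHOW INDEX FROM d;

-- Re-check how d is accessed for a lookup on yh_id alone.
-- 'some_yh_id' is a hypothetical placeholder value.
EXPLAIN SELECT js_dm FROM d WHERE yh_id = 'some_yh_id';
```

An index can serve an equality lookup only through its leftmost columns, which is why primary key (`js_dm`,`yh_id`) does not help a lookup on yh_id by itself.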

Execution plan:

+----+-------------+-------+--------+------------------+-----------+---------+--------------+-------+-------------+
| id | select_type | table | type   | possible_keys    | key       | key_len | ref          | rows  | extra       |
+----+-------------+-------+--------+------------------+-----------+---------+--------------+-------+-------------+
|  1 | simple      | a     | all    | primary,index_jg | null      | null    | null         | 52616 | using where |
|  1 | simple      | b     | ref    | primary          | primary   | 98      | test.a.jg_id |     1 | using index |
|  1 | simple      | c     | eq_ref | primary          | primary   | 98      | test.b.yh_id |     1 | using where |
|  1 | simple      | d     | ref    | ind_yh_id        | ind_yh_id | 98      | test.b.yh_id |   272 | using index |
+----+-------------+-------+--------+------------------+-----------+---------+--------------+-------+-------------+

Execution time:

1 row in set (0.77 sec)

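After adding the index on d, the estimated number of rows scanned in d dropped to 272 (from 54584 originally). The execution time is essentially unchanged, however, because the plan still starts with a full scan of a:

| 1 | simple | d | ref | ind_yh_id | ind_yh_id | 98 | test.b.yh_id | 272 | using index |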

Stage 2: c as the driving table

d
^
|
c-->b-->a
Table c has a highly selective filter condition on yh_dm, so we create an index on yh_dm:

mysql> select count(*) from c where yh_dm = '006939748xx';
+----------+
| count(*) |
+----------+
|        2 |
+----------+
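A rough way to quantify how selective a column is overall, rather than for a single value, is to compare its number of distinct values with the table's row count. A sketch, with alias names chosen here for illustration:

```sql
-- A ratio close to 1 means almost every row has its own yh_dm value,
-- so an index on yh_dm narrows a lookup to very few rows.
SELECT COUNT(DISTINCT yh_dm)            AS distinct_values,
       COUNT(*)                         AS total_rows,
       COUNT(DISTINCT yh_dm) / COUNT(*) AS selectivity
FROM c;
```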

Add the index:

alter table c add index ind_yh_dm(yh_dm);

Check the execution plan:

+----+-------------+-------+--------+-------------------+-----------+---------+--------------+-------+-------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref          | rows  | extra       |
+----+-------------+-------+--------+-------------------+-----------+---------+--------------+-------+-------------+
|  1 | simple      | a     | all    | primary,index_jg  | null      | null    | null         | 52616 | using where |
|  1 | simple      | b     | ref    | primary           | primary   | 98      | test.a.jg_id |     1 | using index |
|  1 | simple      | c     | eq_ref | primary,ind_yh_dm | primary   | 98      | test.b.yh_id |     1 | using where |
|  1 | simple      | d     | ref    | ind_yh_id         | ind_yh_id | 98      | test.b.yh_id |   272 | using index |
+----+-------------+-------+--------+-------------------+-----------+---------+--------------+-------+-------------+

Execution time:

1 row in set (0.74 sec)

Even with the new index on c, the index is still not used and the plan still drives from a. Why does the optimizer keep choosing a as the driving table?

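1): c.yh_id = b.yh_id ---> (primary key (`jg_id`,`yh_id`))

a. With c as the driving table, the join from c to b has to look up b by yh_id. Table b has no index that leads with yh_id (it is only the second column of the primary key), and b is large, so the optimizer estimates that joining c to b would be expensive. (straight_join can be used here to force c to be the driving table.)
b. With a as the driving table, the join from a to b can use the jg_id prefix of b's primary key, so the optimizer estimates that driving from a costs less than driving from c.
So to make c the driving table, all we need is an index on yh_id in b: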

alter table b add index ind_yh_id(yh_id);

2): b.jg_id = a.jg_id ---> (primary key (`jg_id`))

3): c.yh_id = d.yh_id ---> (key `ind_yh_id` (`yh_id`))
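The optimizer's reasoning in step 1) can be checked, before the index on b is added, by comparing its cost estimate for the default plan with the estimate for a plan forced to start from c. The sketch below uses the Last_query_cost session status variable and a shortened select list. The values are in the optimizer's internal cost units and are meaningful only relative to each other.

```sql
-- Default plan chosen by the optimizer.
SELECT c.yh_id, a.jg_id
FROM a
JOIN b ON a.jg_id = b.jg_id
JOIN c ON b.yh_id = c.yh_id
WHERE c.yh_dm = '006939748xx';
SHOW STATUS LIKE 'Last_query_cost';

-- Same query with the join order forced to c -> b -> a
-- (the STRAIGHT_JOIN modifier joins tables in the order they appear in FROM).
SELECT STRAIGHT_JOIN c.yh_id, a.jg_id
FROM c
JOIN b ON b.yh_id = c.yh_id
JOIN a ON a.jg_id = b.jg_id
WHERE c.yh_dm = '006939748xx';
SHOW STATUS LIKE 'Last_query_cost';
```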
Execution plan after adding the index on b:

+----+-------------+-------+--------+-------------------+-----------+---------+--------------+------+-------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref          | rows | extra       |
+----+-------------+-------+--------+-------------------+-----------+---------+--------------+------+-------------+
|  1 | simple      | c     | ref    | primary,ind_yh_dm | ind_yh_dm | 57      | const        |    2 | using where |
|  1 | simple      | d     | ref    | ind_yh_id         | ind_yh_id | 98      | test.c.yh_id |  272 | using index |
|  1 | simple      | b     | ref    | primary,ind_yh_id | ind_yh_id | 98      | test.c.yh_id |  531 | using index |
|  1 | simple      | a     | eq_ref | primary,index_jg  | primary   | 98      | test.b.jg_id |    1 | using where |
+----+-------------+-------+--------+-------------------+-----------+---------+--------------+------+-------------+

Execution time:

1 row in set (0.00 sec)

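The rows estimates in the plan have dropped sharply: the driving table now contributes 2 rows instead of 52616. The execution time fell from the original 750 ms to 0.00 sec as reported by the client.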
