Is the left table of a left join always the driving table?

Date: 2022-07-22

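In day-to-day work I run into many SQL statements that use left join, so this post takes a brief look at that syntax. When I first started with MySQL, I also assumed that in a left join the left table drives the right table. As my understanding of MySQL deepened, I found that this assumption is wrong.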

Let's start with an example:

mysql> create table a(f1 int, f2 int, index(f1))engine=innodb;
Query OK, 0 rows affected (0.01 sec)

mysql> create table b(f1 int, f2 int)engine=innodb;
Query OK, 0 rows affected (0.01 sec)

mysql> insert into a values(1,1),(2,2),(3,3),(4,4),(5,5),(6,6);
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> insert into b values(3,3),(4,4),(5,5),(6,6),(7,7),(8,8);
Query OK, 6 rows affected (0.02 sec)
Records: 6  Duplicates: 0  Warnings: 0

First we create two tables, a and b, with the same columns. Column f1 of table a has an index; table b has no index.

Now consider the following two SQL statements:

select * from a left join b on(a.f1=b.f1) and (a.f2=b.f2); /*SQL 1*/

select * from a left join b on(a.f1=b.f1) where (a.f2=b.f2); /*SQL 2*/

mysql> select * from a left join b on(a.f1=b.f1) and (a.f2=b.f2);
+------+------+------+------+
| f1   | f2   | f1   | f2   |
+------+------+------+------+
|    3 |    3 |    3 |    3 |
|    4 |    4 |    4 |    4 |
|    5 |    5 |    5 |    5 |
|    6 |    6 |    6 |    6 |
|    1 |    1 | NULL | NULL |
|    2 |    2 | NULL | NULL |
+------+------+------+------+
6 rows in set (0.01 sec)

mysql> select * from a left join b on(a.f1=b.f1) where (a.f2=b.f2);
+------+------+------+------+
| f1   | f2   | f1   | f2   |
+------+------+------+------+
|    3 |    3 |    3 |    3 |
|    4 |    4 |    4 |    4 |
|    5 |    5 |    5 |    5 |
|    6 |    6 |    6 |    6 |
+------+------+------+------+
4 rows in set (0.01 sec)

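The results show that the two statements return different rows. SQL 1 keeps every row of table a; where table b has no matching row, the b columns are filled with NULL. SQL 2 returns only the rows that exist in both a and b.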

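The reason for SQL 2's result: the rows of a with f1 = 1 and f1 = 2 have no match in b, so after the left join their b.f2 is NULL. For those rows a.f2 = b.f2 does not evaluate to true, and the where condition filters them out.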
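To reproduce the difference between the two statements without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made for convenience: the original session ran on MySQL, and only the result sets carry over, not the execution plans, because SQLite has its own optimizer. The index name a_f1 and the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables a and b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f1 INT, f2 INT);
    CREATE INDEX a_f1 ON a (f1);
    CREATE TABLE b (f1 INT, f2 INT);
    INSERT INTO a VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6);
    INSERT INTO b VALUES (3,3),(4,4),(5,5),(6,6),(7,7),(8,8);
""")

# SQL 1: both predicates in ON, so unmatched rows of a are kept and NULL-padded.
sql1 = "SELECT * FROM a LEFT JOIN b ON a.f1 = b.f1 AND a.f2 = b.f2 ORDER BY a.f1"
# SQL 2: the second predicate is in WHERE, so NULL-padded rows are filtered out.
sql2 = "SELECT * FROM a LEFT JOIN b ON a.f1 = b.f1 WHERE a.f2 = b.f2 ORDER BY a.f1"

rows_on = conn.execute(sql1).fetchall()
rows_where = conn.execute(sql2).fetchall()

print(len(rows_on))     # 6
print(len(rows_where))  # 4
print(rows_on[0])       # (1, 1, None, None)
```

The ORDER BY is added only to make the row order deterministic for the printout; it is not part of the original statements.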

Next, let's look at the execution plans of the two statements, starting with SQL 1:

mysql> explain select * from a left join b on(a.f1=b.f1) and (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | a     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | NULL                                               |
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    6 |   100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)

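The plan for SQL 1 shows that table a is the driving table and table b is the driven table. We can tell because in the execution plan of a join query each table gets one row; when those rows share the same id value, the table listed first is the driving table and the table listed after it is the driven table. Because b.f1 has no index, MySQL uses the Block Nested-Loop (BNL) algorithm here; if b.f1 had an index, it could use the Index Nested-Loop Join (INLJ) algorithm instead.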
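The two algorithms named above can be illustrated with a toy model in Python. This is a sketch of the general idea only, not MySQL's implementation: the function names, the buffer_rows parameter, and the dictionary used as an "index" are all assumptions made for illustration.

```python
# Rows of tables a and b as (f1, f2) tuples, matching the example data.
a = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
b = [(3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8)]


def block_nested_loop_left_join(driving, driven, buffer_rows):
    """Buffer a block of driving rows, then scan the driven table once per block."""
    result = []
    driven_scans = 0
    for start in range(0, len(driving), buffer_rows):
        chunk = driving[start:start + buffer_rows]  # rows held in the join buffer
        matched = [False] * len(chunk)
        driven_scans += 1
        for r in driven:  # one full scan of the driven table per buffer load
            for i, d in enumerate(chunk):
                if d[0] == r[0] and d[1] == r[1]:
                    result.append(d + r)
                    matched[i] = True
        # Left join semantics: unmatched driving rows are padded with NULLs.
        for i, d in enumerate(chunk):
            if not matched[i]:
                result.append(d + (None, None))
    return result, driven_scans


def index_nested_loop_left_join(driving, driven):
    """Probe an index on the driven table's f1 once per driving row."""
    index = {}
    for r in driven:
        index.setdefault(r[0], []).append(r)
    result = []
    for d in driving:
        hits = [r for r in index.get(d[0], []) if r[1] == d[1]]
        if hits:
            result.extend(d + r for r in hits)
        else:
            result.append(d + (None, None))
    return result


bnl_rows, scans = block_nested_loop_left_join(a, b, buffer_rows=6)
inl_rows = index_nested_loop_left_join(a, b)

print(scans)         # 1
print(bnl_rows[0])   # (3, 3, 3, 3)
print(bnl_rows[-1])  # (2, 2, None, None)
print(sorted(bnl_rows, key=lambda row: row[0]) == inl_rows)  # True
```

When all six rows of a fit in the buffer, this model emits the matched rows first and the NULL-padded rows last, which is the same order as the SQL 1 output shown earlier. With a smaller buffer the driven table is scanned once per block, for example three times with buffer_rows=2.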

Now the execution plan of SQL 2:

mysql> explain select * from a left join b on(a.f1=b.f1) where (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |    6 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | ref  | f1            | f1   | 5       | test.b.f1 |    1 |    16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

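This plan shows table b as the driving table and table a as the driven table. The statement executes as follows: MySQL scans table b sequentially; for each row of b it uses b.f1 to look up matching rows in table a through index f1 (access type ref, with no join buffer listed in the Extra column); for each match it checks whether a.f2=b.f2 holds, and if so returns the row as part of the result set.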

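In SQL 2 the condition sits in the where clause. Comparing NULL with any value, whether for equality or inequality, yields NULL, and a where clause that evaluates to NULL returns no rows, as shown below: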

mysql> select * from a where null;
Empty set (0.00 sec)

mysql> select * from a ;
+------+------+
| f1   | f2   |
+------+------+
|    1 |    1 |
|    2 |    2 |
|    3 |    3 |
|    4 |    4 |
|    5 |    5 |
|    6 |    6 |
+------+------+
6 rows in set (0.00 sec)

So a result set filtered by such a where condition cannot contain the rows whose b columns are NULL.
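The NULL comparison rule can be checked directly. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL (an assumption; the three-valued logic shown here is standard SQL behaviour that both engines share). NULL comes back to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Equality and inequality against NULL both evaluate to NULL, not true or false.
eq, ne = conn.execute("SELECT NULL = 1, NULL <> 1").fetchone()
print(eq, ne)  # None None

# A WHERE clause that evaluates to NULL keeps no rows.
conn.execute("CREATE TABLE a (f1 INT, f2 INT)")
conn.executemany("INSERT INTO a VALUES (?, ?)", [(i, i) for i in range(1, 7)])
rows_null = conn.execute("SELECT * FROM a WHERE NULL").fetchall()
rows_all = conn.execute("SELECT * FROM a").fetchall()

print(len(rows_null))  # 0
print(len(rows_all))   # 6
```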

Now let's look at the warnings produced by the explain of SQL 2:

mysql> explain select * from a left join b on(a.f1=b.f1) where (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |    6 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | ref  | f1            | f1   | 5       | test.b.f1 |    1 |    16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f1` = `test`.`b`.`f1`) and (`test`.`a`.`f2` = `test`.`b`.`f2`)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

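As you can see, the MySQL optimizer rewrote the left join in this statement as a join. The rewrite is safe because the where condition a.f2=b.f2 rejects every NULL-padded row, so the left join and the join return the same rows. Then, because a.f1 has an index, the optimizer chose table b as the driving table so that the lookup into a can use the index on f1.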

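This example illustrates two points:

1. Even if the statement is written as a left join, the join is not necessarily executed from left to right. In other words, with left join the left table is not necessarily the driving table.

2. If you need left join semantics, you must not put columns of the right-hand (driven) table in equality or inequality comparisons in the where clause; those conditions must all go in the on clause.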

What if we change left join to join in the statements above?

select * from a join b on(a.f1=b.f1) and (a.f2=b.f2); /*SQL 3*/

select * from a join b on(a.f1=b.f1) where (a.f2=b.f2); /*SQL 4*/

mysql> explain select * from a  join b on(a.f1=b.f1) and (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |    6 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | ref  | f1            | f1   | 5       | test.b.f1 |    1 |    16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f2` = `test`.`b`.`f2`) and (`test`.`a`.`f1` = `test`.`b`.`f1`)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select * from a  join b on(a.f1=b.f1) where (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref       | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE      | b     | NULL       | ALL  | NULL          | NULL | NULL    | NULL      |    6 |   100.00 | Using where |
|  1 | SIMPLE      | a     | NULL       | ref  | f1            | f1   | 5       | test.b.f1 |    1 |    16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f1` = `test`.`b`.`f1`) and (`test`.`a`.`f2` = `test`.`b`.`f2`)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

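The output shows that the optimizer rewrote both statements into the same join: the two rewritten queries differ only in the order of the predicates in the where clause, and they get the same execution plan.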
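As a cross-check on the result sets (not on the optimizer rewrite itself, which is MySQL-specific), the following sketch runs the two join statements and the earlier left join with the where condition against SQLite through Python's sqlite3 module. SQLite is a stand-in chosen for convenience, and the dictionary labels are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (f1 INT, f2 INT);
    CREATE INDEX a_f1 ON a (f1);
    CREATE TABLE b (f1 INT, f2 INT);
    INSERT INTO a VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6);
    INSERT INTO b VALUES (3,3),(4,4),(5,5),(6,6),(7,7),(8,8);
""")

queries = {
    "join, both predicates in ON":
        "SELECT * FROM a JOIN b ON a.f1 = b.f1 AND a.f2 = b.f2",
    "join, second predicate in WHERE":
        "SELECT * FROM a JOIN b ON a.f1 = b.f1 WHERE a.f2 = b.f2",
    "left join, second predicate in WHERE":
        "SELECT * FROM a LEFT JOIN b ON a.f1 = b.f1 WHERE a.f2 = b.f2",
}

# Sort each result so the comparison does not depend on row order.
results = {label: sorted(conn.execute(sql).fetchall())
           for label, sql in queries.items()}

# Collapse identical result sets; one distinct entry means all three agree.
distinct = {tuple(rows) for rows in results.values()}
print(len(distinct))  # 1
print(results["join, both predicates in ON"])
# [(3, 3, 3, 3), (4, 4, 4, 4), (5, 5, 5, 5), (6, 6, 6, 6)]
```

For an inner join, moving a predicate between on and where does not change the result, which is why a single rewritten form can serve both statements.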