mysql> use school;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from exams;
+-----+-----+------------+-------+
| uid | cid | time | score |
+-----+-----+------------+-------+
| 1 | 1 | 2021-04-09 | 99 |
| 1 | 2 | 2021-04-10 | 80 |
| 2 | 2 | 2021-04-10 | 90 |
| 2 | 3 | 2021-04-12 | 85 |
| 3 | 1 | 2021-04-09 | 56 |
| 3 | 2 | 2021-04-10 | 93 |
| 3 | 3 | 2021-04-12 | 89 |
| 3 | 4 | 2021-04-11 | 100 |
| 4 | 4 | 2021-04-11 | 99 |
| 5 | 2 | 2021-04-10 | 59 |
| 5 | 3 | 2021-04-12 | 94 |
| 5 | 4 | 2021-04-11 | 95 |
+-----+-----+------------+-------+
12 rows in set (0.01 sec)
mysql> select * from course;
+-----+-----------------+--------+
| cid | cname | credit |
+-----+-----------------+--------+
| 1 | C++基础课程 | 5 |
| 2 | C++高级课程 | 10 |
| 3 | C++项目开发 | 8 |
| 4 | C++算法课程 | 12 |
+-----+-----------------+--------+
4 rows in set (0.01 sec)
mysql> select * from student;
+-----+----------+-----+-----+
| uid | name | age | sex |
+-----+----------+-----+-----+
| 1 | zhangsan | 18 | M |
| 2 | gaoyang | 20 | W |
| 3 | chenwei | 22 | M |
| 4 | linfeng | 21 | W |
| 5 | liuxiang | 19 | W |
+-----+----------+-----+-----+
5 rows in set (0.01 sec)
mysql> desc student;
+-------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+----------------+
| uid | int unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(50) | NO | | NULL | |
| age | tinyint unsigned | NO | | NULL | |
| sex | enum('M','W') | NO | | NULL | |
+-------+------------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> desc course;
+--------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+------------------+------+-----+---------+----------------+
| cid | int unsigned | NO | PRI | NULL | auto_increment |
| cname | varchar(50) | NO | | NULL | |
| credit | tinyint unsigned | NO | | NULL | |
+--------+------------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
mysql> desc exams;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| uid | int unsigned | NO | PRI | NULL | |
| cid | int unsigned | NO | PRI | NULL | |
| time | date | NO | | NULL | |
| score | float | NO | | NULL | |
+-------+--------------+------+-----+---------+-------+
4 rows in set (0.01 sec)
mysql>
现在需要查看zhangsan同学的详细信息,选修课程id为2的信息和对应的成绩,需要进行多表查询
分析:
zhangsan同学的详细信息
select uid,name,age,sex from student where uid = 1;
zhangsan选修的课程信息
select cid,cname,credit from course where cid = 2;
成绩信息
select uid,cid,time,score from exams where uid = 1 and cid = 2;
现在,我们需要将上面的三条sql语句结合起来,进行多表查询.
首先,我们需要知道的是,内连接是需要区分大表和小表的,通过数据量进行区分,数据量大的就为大表,数据量小的就为小表,小表总是先进行全局扫描,然后拿着扫描得到的数据到大表中匹对
(所以,小表使用索引的意义是不大的),最后得到需要的数据
下面是内连接的常用方式:
select emp.name dept.name from emp
inner join dept on emp.dept_id = dept.id
inner join ...;
需要注意的点是:inner join
左边的表并不是随意取的,假设将表出现的顺序按A,B,C...这样的顺序起别名,例如这里的emp
别名为A,后面的dept
表别名为B,后面以此类推,这里的A,必须要能够跟后续出现的表有关联,有联系,例如:A表应该要有B表,C表关联的字段,这样才能建立联系,如有B,C表的id字段,这样才能使用on
建立联系。上面的student
,course
,exams
三张表,exams
表中有 uid
字段,可以与 student
表关联;同时,它也有 cid
字段,可以与 course
表关联。因此,exams
表作为起点,可以有效地将 student
和 course
表通过 uid
和 cid
连接起来。所以这里的exams
表应该为A表,至于B,C表的顺序并没有要求,SQL 查询优化器在执行查询时会自动决定最优的执行顺序
所以,最终的sql语句如下:
select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
inner join course b on a.cid = b.cid
inner join student c on a.uid = c.uid
where a.uid = 1 and a.cid = 2;
测试结果:
mysql> select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
-> inner join course b on a.cid = b.cid
-> inner join student c on a.uid = c.uid
-> where a.uid = 1 and a.cid = 2;
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
| uid | name | age | sex | cid | cname | credit | time | score |
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
| 1 | zhangsan | 18 | M | 2 | C++高级课程 | 10 | 2021-04-10 | 80 |
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
1 row in set (0.02 sec)
mysql>
如果是查询zhangsan同学的详细信息,所有选修课程的信息和对应的成绩,则sql应该如下:
select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
inner join course b on a.cid = b.cid
inner join student c on a.uid = c.uid
where a.uid = 1;
测试结果:
mysql> select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
-> inner join course b on a.cid = b.cid
-> inner join student c on a.uid = c.uid
-> where a.uid = 1;
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
| uid | name | age | sex | cid | cname | credit | time | score |
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
| 1 | zhangsan | 18 | M | 1 | C++基础课程 | 5 | 2021-04-09 | 99 |
| 1 | zhangsan | 18 | M | 2 | C++高级课程 | 10 | 2021-04-10 | 80 |
+-----+----------+-----+-----+-----+-----------------+--------+------------+-------+
2 rows in set (0.02 sec)
mysql>
如果是要查看课程id为2,且成绩大于等于90的信息
select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
inner join course b on a.cid = b.cid
inner join student c on a.uid = c.uid
where a.cid = 2 and a.score >= 90;
测试结果:
mysql> select c.uid,c.name,c.age,c.sex, b.cid,b.cname,b.credit,a.time,a.score from exams a
-> inner join course b on a.cid = b.cid
-> inner join student c on a.uid = c.uid
-> where a.cid = 2 and a.score >= 90;
+-----+---------+-----+-----+-----+-----------------+--------+------------+-------+
| uid | name | age | sex | cid | cname | credit | time | score |
+-----+---------+-----+-----+-----+-----------------+--------+------------+-------+
| 2 | gaoyang | 20 | W | 2 | C++高级课程 | 10 | 2021-04-10 | 90 |
| 3 | chenwei | 22 | M | 2 | C++高级课程 | 10 | 2021-04-10 | 93 |
+-----+---------+-----+-----+-----+-----------------+--------+------------+-------+
2 rows in set (0.00 sec)
mysql>
通过课程分组,计算每个课程的选课人数,按降序排序
select b.cid,b.cname,b.credit,count(*) number from exams a
inner join course b on a.cid = b.cid
group by a.cid
order by number desc;
测试结果:
mysql> select b.cid,b.cname,b.credit,count(*) number from exams a
-> inner join course b on a.cid = b.cid
-> group by a.cid
-> order by number desc;
+-----+-----------------+--------+--------+
| cid | cname | credit | number |
+-----+-----------------+--------+--------+
| 2 | C++高级课程 | 10 | 4 |
| 3 | C++项目开发 | 8 | 3 |
| 4 | C++算法课程 | 12 | 3 |
| 1 | C++基础课程 | 5 | 2 |
+-----+-----------------+--------+--------+
4 rows in set (0.00 sec)
mysql>
通过课程分组,计算每个课程成绩大于等于90分的选课人数,按降序排序
select b.cid,b.cname,b.credit,count(*) number from exams a
inner join course b on a.cid = b.cid
where a.score >= 90
group by a.cid
order by number desc;
测试结果:
mysql> select b.cid,b.cname,b.credit,count(*) number from exams a
-> inner join course b on a.cid = b.cid
-> where a.score >= 90
-> group by a.cid
-> order by number desc;
+-----+-----------------+--------+--------+
| cid | cname | credit | number |
+-----+-----------------+--------+--------+
| 4 | C++算法课程 | 12 | 3 |
| 2 | C++高级课程 | 10 | 2 |
| 1 | C++基础课程 | 5 | 1 |
| 3 | C++项目开发 | 8 | 1 |
+-----+-----------------+--------+--------+
4 rows in set (0.01 sec)
mysql>
cid=2这门课程考试成绩的最高分的学生信息和课程信息
select c.uid, c.name, c.age, c.sex, b.cid, b.cname, b.credit, a.score
from exams a
inner join course b on a.cid = b.cid
inner join student c on a.uid = c.uid
where a.cid = 2
and a.score = (
select max(score)
from exams
where cid = 2
);
测试结果
mysql> select c.uid, c.name, c.age, c.sex, b.cid, b.cname, b.credit, a.score
(score)
from exams
where cid = 2
); -> from exams a
-> inner join course b on a.cid = b.cid
-> inner join student c on a.uid = c.uid
-> where a.cid = 2
-> and a.score = (
-> select max(score)
-> from exams
-> where cid = 2
-> );
+-----+---------+-----+-----+-----+-----------------+--------+-------+
| uid | name | age | sex | cid | cname | credit | score |
+-----+---------+-----+-----+-----+-----------------+--------+-------+
| 3 | chenwei | 22 | M | 2 | C++高级课程 | 10 | 93 |
+-----+---------+-----+-----+-----+-----------------+--------+-------+
1 row in set (0.12 sec)
+-------+
| score |
+-------+
| 80 |
| 90 |
| 93 |
| 59 |
+-------+
4 rows in set (0.00 sec)
mysql>
每门课程考试的平均成绩+课程信息
select b.cid,b.cname,b.credit,ROUND(AVG(a.score), 2) average_score from exams a
inner join course b on a.cid = b.cid
group by a.cid
order by average_score desc;
测试结果:
mysql> select b.cid,b.cname,b.credit,ROUND(AVG(a.score), 2) average_score from exams a
-> inner join course b on a.cid = b.cid
-> group by a.cid
-> order by average_score desc;
+-----+-----------------+--------+---------------+
| cid | cname | credit | average_score |
+-----+-----------------+--------+---------------+
| 4 | C++算法课程 | 12 | 98 |
| 3 | C++项目开发 | 8 | 89.33 |
| 2 | C++高级课程 | 10 | 80.5 |
| 1 | C++基础课程 | 5 | 77.5 |
+-----+-----------------+--------+---------------+
4 rows in set (0.01 sec)
mysql>
厉害点的用法:
在学习limit时,我们知道limit的偏移是需要时间的,我们知道可以使用下面这种方式过滤调偏移的时间,这种方式是最理想的
select * from t_user where id > 上一页最后一条数据的id值 limit 10;
问题是,我们有时候并不知道最后一条数据的索引值导致无法使用where,那么我们该如何提高搜索效率呢?
例如下面这种情况
mysql> select count(*) from t_user;
+----------+
| count(*) |
+----------+
| 2000000 |
+----------+
1 row in set (3.07 sec)
mysql> desc t_user;
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| id | int | NO | PRI | NULL | auto_increment |
| email | varchar(255) | YES | | NULL | |
| password | varchar(255) | YES | | NULL | |
+----------+--------------+------+-----+---------+----------------+
3 rows in set (0.11 sec)
mysql> select * from t_user limit 1000000, 10;
+---------+------------------+----------+
| id | email | password |
+---------+------------------+----------+
| 1000001 | [email protected] | 154985 |
| 1000002 | [email protected] | 845017 |
| 1000003 | [email protected] | 154986 |
| 1000004 | [email protected] | 845018 |
| 1000005 | [email protected] | 154987 |
| 1000006 | [email protected] | 845019 |
| 1000007 | [email protected] | 154988 |
| 1000008 | [email protected] | 845020 |
| 1000009 | [email protected] | 154989 |
| 1000010 | [email protected] | 845021 |
+---------+------------------+----------+
10 rows in set (0.78 sec)
mysql> select id from t_user limit 1000000, 10;
+---------+
| id |
+---------+
| 1000001 |
| 1000002 |
| 1000003 |
| 1000004 |
| 1000005 |
| 1000006 |
| 1000007 |
| 1000008 |
| 1000009 |
| 1000010 |
+---------+
10 rows in set (0.27 sec)
mysql>
总所周知,select的字段越多,查询的速度越慢,例如这里只查询id和查询全部数据,两种的时间差是非常明显的。在这个优化查询逻辑的例子中,所以我们的目标是:在查询全部字段的情况下,实现与仅查询 id
相似的效率。通过使用 INNER JOIN
和子查询,我们可以利用索引来加速查询,减少 LIMIT
偏移带来的时间开销。
我们的想法:
- 利用索引查询 ID:首先,通过
LIMIT
和OFFSET
来快速获取目标记录的id
列表。因为id
列通常有索引,所以仅查询id
的效率会非常高,即使偏移量很大(例如LIMIT 1000000, 10
)。 - 使用
INNER JOIN
加速主表查询:然后,我们将这些id
列表作为一个小的临时表(即子查询结果),用INNER JOIN
与原表进行连接。由于我们只连接到已知的id
集合,这个过程非常高效,并且可以避免扫描大数据集。 - 只读取必要的数据行:通过这种方法,我们能够在查询全部字段的同时,仅从原表中检索那些与
id
列表匹配的行,避免了全表扫描。
select * from t_user a
inner join (select id from t_user limit 1000000,10) b on a.id = b.id;
测试结果
mysql> select * from t_user a
-> inner join (select id from t_user limit 1000000,10) b on a.id = b.id;
+---------+------------------+----------+---------+
| id | email | password | id |
+---------+------------------+----------+---------+
| 1000001 | [email protected] | 154985 | 1000001 |
| 1000002 | [email protected] | 845017 | 1000002 |
| 1000003 | [email protected] | 154986 | 1000003 |
| 1000004 | [email protected] | 845018 | 1000004 |
| 1000005 | [email protected] | 154987 | 1000005 |
| 1000006 | [email protected] | 845019 | 1000006 |
| 1000007 | [email protected] | 154988 | 1000007 |
| 1000008 | [email protected] | 845020 | 1000008 |
| 1000009 | [email protected] | 154989 | 1000009 |
| 1000010 | [email protected] | 845021 | 1000010 |
+---------+------------------+----------+---------+
10 rows in set (0.30 sec)
mysql>
mysql> explain select * from t_user a inner join (select id from t_user limit 1000000,10) b on a.id = b.id;
+----+-------------+------------+------------+--------+---------------+---------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+---------------+---------+---------+------+---------+----------+-------------+
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 1000010 | 100.00 | NULL |
| 1 | PRIMARY | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | b.id | 1 | 100.00 | NULL |
| 2 | DERIVED | t_user | NULL | index | NULL | PRIMARY | 4 | NULL | 1977287 | 100.00 | Using index |
+----+-------------+------------+------------+--------+---------------+---------+---------+------+---------+----------+-------------+
3 rows in set, 1 warning (0.07 sec)
mysql>
通过explain我们可以发现,这条sql首先进行了小表的扫描,扫描的行数大概是1000000行,将返回的十条数据作为一张临时表,然后提供给主表查询数据,主表使用索引的方式查询
inner jorn
的过滤条件放在where后面和放在on连接条件里的效果是一致的,MySQL可能将and优化为where以使用索引
select a.*,b.* from exams a inner join
student b on a.uid = b.uid
and b.uid = 3;
select a.*,b.* from exams a inner join
student b on a.uid = b.uid
where b.uid = 3;
测试结果:
mysql> select a.*,b.* from exams a inner join
-> student b on a.uid = b.uid
-> where b.uid = 3;
+-----+-----+------------+-------+-----+---------+-----+-----+
| uid | cid | time | score | uid | name | age | sex |
+-----+-----+------------+-------+-----+---------+-----+-----+
| 3 | 1 | 2021-04-09 | 56 | 3 | chenwei | 22 | M |
| 3 | 2 | 2021-04-10 | 93 | 3 | chenwei | 22 | M |
| 3 | 3 | 2021-04-12 | 89 | 3 | chenwei | 22 | M |
| 3 | 4 | 2021-04-11 | 100 | 3 | chenwei | 22 | M |
+-----+-----+------------+-------+-----+---------+-----+-----+
4 rows in set (0.00 sec)
mysql> select a.*,b.* from exams a inner join
-> student b on a.uid = b.uid
-> and b.uid = 3;
+-----+-----+------------+-------+-----+---------+-----+-----+
| uid | cid | time | score | uid | name | age | sex |
+-----+-----+------------+-------+-----+---------+-----+-----+
| 3 | 1 | 2021-04-09 | 56 | 3 | chenwei | 22 | M |
| 3 | 2 | 2021-04-10 | 93 | 3 | chenwei | 22 | M |
| 3 | 3 | 2021-04-12 | 89 | 3 | chenwei | 22 | M |
| 3 | 4 | 2021-04-11 | 100 | 3 | chenwei | 22 | M |
+-----+-----+------------+-------+-----+---------+-----+-----+
4 rows in set (0.00 sec)
mysql> explain select a.*,b.* from exams a inner join student b on a.uid = b.uid where b.uid = 3;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | b | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | a | NULL | ref | PRIMARY | PRIMARY | 4 | const | 4 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.01 sec)
mysql> explain select a.*,b.* from exams a inner join student b on a.uid = b.uid and b.uid = 3;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | b | NULL | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | a | NULL | ref | PRIMARY | PRIMARY | 4 | const | 4 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.06 sec)