作者:whisper
链接:http://proprogrammar.com:443/article/633
声明:请尊重原作者的劳动,如需转载请注明出处
编写一个 SQL 查询,查找所有至少连续出现三次的数字。
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+
例如,给定上面的
Logs
表,1
是唯一连续出现至少三次的数字。+-----------------+ | ConsecutiveNums | +-----------------+ | 1 | +-----------------+
select distinct l1.Num as ConsecutiveNums
from Logs l1, Logs l2, Logs l3
where l1.Id = l2.Id - 1 and l2.Id = l3.Id - 1 and l1.Num = l2.Num and l2.Num = l3.Num;
同表连接,连接上一行,上上一行
再看另一种比较快的解法
select distinct Num as ConsecutiveNums
from (
select Num,
case
when @prev = Num then @count := @count + 1
when (@prev := Num) is not null then @count := 1
end as CNT
from Logs, (select @prev := null,@count := null) as t
) as temp
where temp.CNT >= 3
使用了变量,有个技巧,(@prev:=Num) is not null,这里在赋值的同时进行了判断,即赋值后会返回赋值的结果
几个朋友来到电影院的售票处,准备预约连续空余座位。
你能利用表 cinema ,帮他们写一个查询语句,获取所有空余座位,并将它们按照 seat_id 排序后返回吗?
| seat_id | free | |---------|------| | 1 | 1 | | 2 | 0 | | 3 | 1 | | 4 | 1 | | 5 | 1 |
对于如上样例,你的查询语句应该返回如下结果。
| seat_id | |---------| | 3 | | 4 | | 5 |
select distinct c1.seat_id
from(
select seat_id
from cinema
where free = 1
) c1 inner join cinema c2 on c2.free = 1 and (c1.seat_id = c2.seat_id + 1 or c1.seat_id = c2.seat_id - 1)
order by c1.seat_id
两张空余座位表,连续即当前位置的前一个位置或后一个位置也是空余的
再看另一种比较快的解法
SELECT distinct(c1.seat_id)
FROM cinema c1 JOIN cinema c2 ON abs(c2.seat_id-c1.seat_id)=1
WHERE c1.free=1 AND c2.free =1
ORDER BY seat_id
与上面的解法大同小异,只是利用绝对值函数来判断前或后一个是空位置
Table:
Failed
+--------------+---------+ | Column Name | Type | +--------------+---------+ | fail_date | date | +--------------+---------+ 该表主键为 fail_date。 该表包含失败任务的天数.
Table:
Succeeded
+--------------+---------+ | Column Name | Type | +--------------+---------+ | success_date | date | +--------------+---------+ 该表主键为 success_date。 该表包含成功任务的天数.
系统 每天 运行一个任务。每个任务都独立于先前的任务。任务的状态可以是失败或是成功。
编写一个 SQL 查询 2019-01-01 到 2019-12-31 期间任务连续同状态 period_state 的起止日期(start_date 和 end_date)。即如果任务失败了,就是失败状态的起止日期,如果任务成功了,就是成功状态的起止日期。
最后结果按照起始日期 start_date 排序
查询结果样例如下所示:
Failed table: +-------------------+ | fail_date | +-------------------+ | 2018-12-28 | | 2018-12-29 | | 2019-01-04 | | 2019-01-05 | +-------------------+ Succeeded table: +-------------------+ | success_date | +-------------------+ | 2018-12-30 | | 2018-12-31 | | 2019-01-01 | | 2019-01-02 | | 2019-01-03 | | 2019-01-06 | +-------------------+ Result table: +--------------+--------------+--------------+ | period_state | start_date | end_date | +--------------+--------------+--------------+ | succeeded | 2019-01-01 | 2019-01-03 | | failed | 2019-01-04 | 2019-01-05 | | succeeded | 2019-01-06 | 2019-01-06 | +--------------+--------------+--------------+ 结果忽略了 2018 年的记录,因为我们只关心从 2019-01-01 到 2019-12-31 的记录 从 2019-01-01 到 2019-01-03 所有任务成功,系统状态为 "succeeded"。 从 2019-01-04 到 2019-01-05 所有任务失败,系统状态为 "failed"。 从 2019-01-06 到 2019-01-06 所有任务成功,系统状态为 "succeeded"。
select r.period_state, case when r.start_date < str_to_date('2019-01-01', '%Y-%m-%d') then str_to_date('2019-01-01', '%Y-%m-%d')
else r.start_date end start_date, case when r.end_date > str_to_date('2019-12-31', '%Y-%m-%d') then str_to_date('2019-12-31', '%Y-%m-%d')
else r.end_date end end_date
from(
select 'failed' period_state, fs.fail_date start_date, fe.fail_date end_date
from(
select fi.fail_date, @s1:=@s1+1 seq
from(
select f.fail_date
from Failed f
where not exists(
select 1
from Failed
where fail_date = date_sub(f.fail_date, interval 1 day)
)
order by f.fail_date
) fi, (select @s1:=0) st
) fs inner join (
select fi.fail_date, @s2:=@s2+1 seq
from(
select f.fail_date
from Failed f
where not exists(
select 1
from Failed
where fail_date = date_add(f.fail_date, interval 1 day)
)
order by f.fail_date
) fi, (select @s2:=0) st
) fe on fs.seq = fe.seq
union
select 'succeeded', ss.success_date, se.success_date
from(
select si.success_date, @s3:=@s3+1 seq
from(
select s.success_date
from Succeeded s
where not exists(
select 1
from Succeeded
where success_date = date_sub(s.success_date, interval 1 day)
)
order by s.success_date
) si, (select @s3:=0) st
) ss inner join (
select si.success_date, @s4:=@s4+1 seq
from(
select s.success_date
from Succeeded s
where not exists(
select 1
from Succeeded
where success_date = date_add(s.success_date, interval 1 day)
)
order by s.success_date
) si, (select @s4:=0) st
) se on ss.seq = se.seq
) r
where r.end_date >= str_to_date('2019-01-01', '%Y-%m-%d') and r.start_date <= str_to_date('2019-12-31', '%Y-%m-%d')
order by r.start_date
上面分别处理失败与成功的情况,通过union连接,对一个日期,如果不存在上一个日期的记录,就为开始日期,如果不存在下一个日期的记录,就为结束日期,同时要限制一下日期在2019年内,学习了变量的使用,not exists的使用,date_add,date_sub的使用,str_to_date的使用,通过变量生成一个额外的列用来联系开始和结束日期,来降低时间消耗
下面看另一种较快的解法
select t3.period_state as period_state,min(date) as start_date,
max(date) as end_date from
(select date,period_state,
if(@prev = period_state ,@tmp := @tmp, @tmp := @tmp + 1) as rank,
@prev := period_state as prev
from
(select date,period_state from (
select fail_date as date,'failed' as period_state from failed
where fail_date between '2019-01-01' and '2019-12-31'
union all
select success_date as date,'succeeded' as period_state from Succeeded
where success_date between '2019-01-01' and '2019-12-31'
) t1 order by t1.date asc) t2,(select @prev := null,@tmp := 0) as init) t3
group by rank
对这种比较复杂的题目,一般还是要用到变量,这里用了迭代的变量@prev,还用一个变量@tmp做加1运算,这是一种常见的用法,这里是对2019的情况排序,多加一列rank,对连续的成功或失败,rank值是一样的, 还用到了between...and,asc来排序(可省略),别名用as引出(可省略)
表:
Logs
+---------------+---------+ | Column Name | Type | +---------------+---------+ | log_id | int | +---------------+---------+ id 是上表的主键。 上表的每一行包含日志表中的一个 ID。
后来一些 ID 从 Logs 表中删除。编写一个 SQL 查询得到 Logs 表中的连续区间的开始数字和结束数字。
将查询表按照 start_id 排序。
查询结果格式如下面的例子:
Logs 表: +------------+ | log_id | +------------+ | 1 | | 2 | | 3 | | 7 | | 8 | | 10 | +------------+ 结果表: +------------+--------------+ | start_id | end_id | +------------+--------------+ | 1 | 3 | | 7 | 8 | | 10 | 10 | +------------+--------------+ 结果表应包含 Logs 表中的所有区间。 从 1 到 3 在表中。 从 4 到 6 不在表中。 从 7 到 8 在表中。 9 不在表中。 10 在表中。
select a.log_id as START_ID ,min(b.log_id) as END_ID from
(select log_id from logs where log_id-1 not in (select * from logs)) a,
(select log_id from logs where log_id+1 not in (select * from logs)) b
where b.log_id>=a.log_id
group by a.log_id
学习not in的用法,开始日期的上一天记录不存在,结束日期的下一天记录不存在,,结束日期不小于开始日期(应为不小于开始日期的所有结束日期中最小的)
下面再看另一种较快的解法
select
s.log_id as START_ID,
min(e.log_id) as END_ID
from
(select distinct a.log_id from Logs a left join Logs b on a.log_id + 1 = b.log_id where b.log_id is null) e,
(select distinct a.log_id from Logs a left join Logs b on a.log_id - 1 = b.log_id where b.log_id is null) s
where s.log_id <= e.log_id
group by 1
大同小异,思想是一样的,学习了left join和group by的用法,group by 1表示按select中的第1列分组,类似的还有group by num1, num2,order by 也有此种用法,order by num1, num2这样,表示按select中的第num1, num2列排序
亲爱的读者:有时间可以点赞评论一下
全部评论