leetcode数据库相似题目总结

通知

此博客运行在jpress系统上，如果你喜欢此博客模板，请加QQ群：1061691290（whimurmur模板/jpress插件），免费下载使用

leetcode数据库相似题目总结

879人浏览 / 0人评论 | 作者：whisper | 分类：数据库 | 标签：数据库 / leetcode |

作者：whisper

链接：http://proprogrammar.com:443/article/801

声明：请尊重原作者的劳动，如需转载请注明出处

今天状态很差（什么时候状态好过呢），因为很久没更新了，所以逼着自己写一点东西，所以就只说一下自己的解题思路，学习一下人家的解题方法

615. 平均工资：部门与公司比较

给如下两个表，写一个查询语句，求出在每一个工资发放日，每个部门的平均工资与公司的平均工资的比较结果（高 / 低 / 相同）。

表： salary
| id | employee_id | amount | pay_date   |
|----|-------------|--------|------------|
| 1  | 1           | 9000   | 2017-03-31 |
| 2  | 2           | 6000   | 2017-03-31 |
| 3  | 3           | 10000  | 2017-03-31 |
| 4  | 1           | 7000   | 2017-02-28 |
| 5  | 2           | 6000   | 2017-02-28 |
| 6  | 3           | 8000   | 2017-02-28 |
employee_id 字段是表 employee 中 employee_id 字段的外键。
| employee_id | department_id |
|-------------|---------------|
| 1           | 1             |
| 2           | 2             |
| 3           | 2             |
对于如上样例数据，结果为：
| pay_month | department_id | comparison  |
|-----------|---------------|-------------|
| 2017-03   | 1             | higher      |
| 2017-03   | 2             | lower       |
| 2017-02   | 1             | same        |
| 2017-02   | 2             | same        |
解释

在三月，公司的平均工资是 (9000+6000+10000)/3 = 8333.33...

由于部门 '1' 里只有一个 employee_id 为 '1' 的员工，所以部门 '1' 的平均工资就是此人的工资 9000 。因为 9000 > 8333.33 ，所以比较结果是 'higher'。

第二个部门的平均工资为 employee_id 为 '2' 和 '3' 两个人的平均工资，为 (6000+10000)/2=8000 。因为 8000 < 8333.33 ，所以比较结果是 'lower' 。

在二月用同样的公式求平均工资并比较，比较结果为 'same' ，因为部门 '1' 和部门 '2' 的平均工资与公司的平均工资相同，都是 7000 。

select depS.month pay_month, depS.department_id, case when depS.depAvg > comS.comAvg then 'higher'
    when depS.depAvg < comS.comAvg then 'lower' else 'same' end comparison
from(
    select e.department_id, m.month, avg(s.amount) depAvg
    from employee e inner join salary s on e.employee_id = s.employee_id
    inner join (
        select distinct date_format(pay_date, '%Y-%m') month
        from salary
    ) m on m.month = date_format(s.pay_date, '%Y-%m')
    group by e.department_id, m.month
) depS inner join (
    select m.month, avg(s.amount) comAvg
    from salary s inner join (
        select distinct date_format(pay_date, '%Y-%m') month
        from salary
    ) m on m.month = date_format(s.pay_date, '%Y-%m')
    group by m.month
) comS on depS.month = comS.month

我的写法思路很清楚，先分别求出每个部门的平均工资与公司的平均工资，然后比较，有点技巧的就是日期的处理，从年月日的日期转换个年月

看一下效率较高的写法

select t2.pay_month,t2.department_id, 
case 
    when t1.company_salary<t2.department_salary then 'higher'
    when t1.company_salary>t2.department_salary then 'lower'
    else 'same' 
end as comparison
from 
(select employee.department_id,date_format(pay_date,'%Y-%m') as pay_month,avg(amount) as department_salary from salary
left join employee on salary.employee_id=employee.employee_id group by department_id,pay_month) t2 
join
(select date_format(pay_date,'%Y-%m')as pay_month,avg(amount) as company_salary  from salary group by pay_month) t1 
on t1.pay_month=t2.pay_month
order by department_id,t2.pay_month;

与我写得思路也差不多，有一点可以注意下，group by中的pay_month并不是实际的字段，而是pay_date处理后的字段，可见group by中的排序字段可以是来自于select中处理过的字段，而并不一定要是实际表中的字段

1251. 平均售价

Table: Prices

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| start_date    | date    |
| end_date      | date    |
| price         | int     |
+---------------+---------+
(product_id，start_date，end_date) 是 Prices 表的主键。
Prices 表的每一行表示的是某个产品在一段时期内的价格。
每个产品的对应时间段是不会重叠的，这也意味着同一个产品的价格时段不会出现交叉。

Table: UnitsSold

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| purchase_date | date    |
| units         | int     |
+---------------+---------+
UnitsSold 表没有主键，它可能包含重复项。
UnitsSold 表的每一行表示的是每种产品的出售日期，单位和产品 id。

编写SQL查询以查找每种产品的平均售价。
average_price 应该四舍五入到小数点后两位。
查询结果格式如下例所示：

Prices table:
+------------+------------+------------+--------+
| product_id | start_date | end_date   | price  |
+------------+------------+------------+--------+
| 1          | 2019-02-17 | 2019-02-28 | 5      |
| 1          | 2019-03-01 | 2019-03-22 | 20     |
| 2          | 2019-02-01 | 2019-02-20 | 15     |
| 2          | 2019-02-21 | 2019-03-31 | 30     |
+------------+------------+------------+--------+
 
UnitsSold table:
+------------+---------------+-------+
| product_id | purchase_date | units |
+------------+---------------+-------+
| 1          | 2019-02-25    | 100   |
| 1          | 2019-03-01    | 15    |
| 2          | 2019-02-10    | 200   |
| 2          | 2019-03-22    | 30    |
+------------+---------------+-------+

Result table:
+------------+---------------+
| product_id | average_price |
+------------+---------------+
| 1          | 6.96          |
| 2          | 16.96         |
+------------+---------------+
平均售价 = 产品总价 / 销售的产品数量。
产品 1 的平均售价 = ((100 * 5)+(15 * 20) )/ 115 = 6.96
产品 2 的平均售价 = ((200 * 15)+(30 * 30) )/ 230 = 16.96

select u.product_id, round(sum(p.price * u.units) / sum(u.units), 2) average_price
from UnitsSold u, Prices p
where u.product_id = p.product_id and u.purchase_date >= p.start_date and u.purchase_date <= p.end_date
group by u.product_id

按产品分组，根据某时间段获取价格，平均价格=总价/总数，其中总价又是各个时间段单价*数量之和，可以想一想为什么average_price可以round(sum(p.price * u.units) / sum(u.units), 2)这样求

下面看一种效率高的

select 
product_id, round(sum(total)/sum(units),2) as average_price
from 
(
    select 
    B.product_id, A.price * B.units as total, B.units
    from Prices A
    join UnitsSold B on A.product_id = B.product_id 
    and B.purchase_date >= A.start_date 
    and B.purchase_date <= A.end_date
) A
group by product_id

这种写法还没有我写的简洁，不知道为什么效率会高，leetcode的问题吧（是不是升级服务器了）

603. 连续空余座位

几个朋友来到电影院的售票处，准备预约连续空余座位。

你能利用表 cinema ，帮他们写一个查询语句，获取所有空余座位，并将它们按照 seat_id 排序后返回吗？
| seat_id | free |
|---------|------|
| 1       | 1    |
| 2       | 0    |
| 3       | 1    |
| 4       | 1    |
| 5       | 1    |
对于如上样例，你的查询语句应该返回如下结果。
| seat_id |
|---------|
| 3       |
| 4       |
| 5       |
注意：

seat_id 字段是一个自增的整数，free 字段是布尔类型（'1' 表示空余， '0' 表示已被占据）。
连续空余座位的定义是大于等于 2 个连续空余的座位。

select distinct c1.seat_id
from(
        select seat_id
        from cinema
        where free = 1
    ) c1 inner join cinema c2 on c2.free = 1 and (c1.seat_id = c2.seat_id + 1 or c1.seat_id = c2.seat_id - 1)
order by c1.seat_id

cinema两表进行自连接，找出相邻都空的，由于找出的是组合（可能是1，2空，也可能是2，1空），所以最后去一下重

看一种效率较高的解法

SELECT DISTINCT a.seat_id
FROM cinema as a join cinema as b
ON abs(a.seat_id-b.seat_id)=1
AND a.free=1 and b.free=1
ORDER by a.seat_id

这种简单题没有多少个人发挥的空间，解法大同小异，不过这里用了一个abs函数，而且去掉了子查询，更简洁

180. 连续出现的数字

编写一个 SQL 查询，查找所有至少连续出现三次的数字。
+----+-----+
| Id | Num |
+----+-----+
| 1  |  1  |
| 2  |  1  |
| 3  |  1  |
| 4  |  2  |
| 5  |  1  |
| 6  |  2  |
| 7  |  2  |
+----+-----+
例如，给定上面的 Logs 表， 1 是唯一连续出现至少三次的数字。
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+

select distinct l1.Num as ConsecutiveNums
from Logs l1, Logs l2, Logs l3
where l1.Id = l2.Id - 1 and l2.Id = l3.Id - 1 and l1.Num = l2.Num and l2.Num = l3.Num;

既然连续出现三次，那就自连接两次，id相邻且数字相同，同样由于查出的是组合，要去重一下

下面看一下另一种解法

select distinct Num as ConsecutiveNums
from (
    select Num, if(@pre=Num, @a:=@a+1, @a:=1) as n, @pre:=Num midVal
    from Logs, (select @a:=0, @pre:='a') t
    ) as temp
where n >= 3
order by Num desc;

利用两个变量处理相邻每一行，一个用于存值比较，一个用于计数，这里补充一下，对于select中的结果，如字段，变量，如果用于select结果中的select子查询，是不可以用于多层select子查询中的，也就是说只能用在最外层，这是工作中发现一个事情

613. 直线上的最近距离

表 point 保存了一些点在 x 轴上的坐标，这些坐标都是整数。

写一个查询语句，找到这些点中最近两个点之间的距离。
| x   |
|-----|
| -1  |
| 0   |
| 2   |
最近距离显然是 '1' ，是点 '-1' 和 '0' 之间的距离。所以输出应该如下：
| shortest|
|---------|
| 1       |
注意：每个点都与其他点坐标不同，表 table 不会有重复坐标出现。

进阶：如果这些点在 x 轴上从左到右都有一个编号，输出结果时需要输出最近点对的编号呢？

select p.dis shortest
from(
    select abs(p1.x - p2.x) dis
    from point p1, point p2
    where p1.x != p2.x
) p
order by p.dis
limit 1

自连接求出每两点之间的距离，排个序，取最小的

看另一种解法

select convert(sub,decimal(10,0))  'shortest' from (select @sub:=(case when @pre is null then 0 else x-@pre end) sub, @pre:=x from `point` ,(select @sub:=0,@pre:=null) n order by x) s order by sub limit 1,1

还是使用两个变量，没有用连接，这里可以看出对select结果中sub的处理是在order by x排序之后，要注意一下最后limit 1, 1，因为第一个是0，要排除一下

612. 平面上的最近距离

表 point_2d 保存了所有点（多于 2 个点）的坐标 (x,y) ，这些点在平面上两两不重合。

写一个查询语句找到两点之间的最近距离，保留 2 位小数。
| x  | y  |
|----|----|
| -1 | -1 |
| 0  | 0  |
| -1 | -2 |
最近距离在点 (-1,-1) 和(-1,2) 之间，距离为 1.00 。所以输出应该为：
| shortest |
|----------|
| 1.00     |
注意：任意点之间的最远距离小于 10000 。

select round(sqrt(p2.pow2), 2) shortest
from(
    select (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y) pow2
    from point_2d p1, point_2d p2
    where p1.x != p2.x or p1.y != p2.y
) p2
order by p2.pow2
limit 1

我的思路都很直接粗暴，先求出所有点的距离，然后排序取最小的，这里有个技巧是先求距离的平方，最后再开方取距离

看另一种解法

select round(min(sqrt(pow(a.x-b.x,2)+pow(a.y-b.y,2))),2) as shortest
from point_2d as a, point_2d as b 
where a.x!=b.x or a.y!=b.y

这里直接用min函数求最小，而不是像上面分步来做，先求距离，再取最小，所以多掌握一些常用函数还是很有用的，把问题交给sql内部去处理，而不是自己来处理

亲爱的读者：有时间可以点赞评论一下

我慒慒懂懂过了一年，这一年似乎没有改变，守着一个人的世界，空空如也

点赞(1) 打赏

全部评论

还没有评论！

注册登陆

搜索

泽被生民，一路走好

月份	原创文章数
202212	3
202103	10
202102	14
202010	3
202009	3
202008	8
202007	7
202006	10
202005	11
202004	22
202003	52
202002	44
202001	83
201912	52
201911	29
201910	41
201909	99
201908	35
201907	73
201906	121
201811	1
201810	2
201804	1
201803	1
201802	1
201707	1

月份	转载文章数
202212	4
202103	2
202102	3
202009	5
202008	3
202007	17
202005	17
202004	23
202003	11
202002	2
201912	13
201909	1