Table: Failed
+--------------+---------+ | Column Name | Type | +--------------+---------+ | fail_date | date | +--------------+---------+ fail_date is the primary key (column with unique values) for this table. This table contains the days of failed tasks.
Table: Succeeded
+--------------+---------+ | Column Name | Type | +--------------+---------+ | success_date | date | +--------------+---------+ success_date is the primary key (column with unique values) for this table. This table contains the days of succeeded tasks.
A system is running one task every day. Every task is independent of the previous tasks. The tasks can fail or succeed.
Write a solution to report the period_state
for each continuous interval of days in the period from 2019-01-01
to 2019-12-31
.
period_state
is 'failed'
if tasks in this interval failed or 'succeeded'
if tasks in this interval succeeded. Interval of days are retrieved as start_date
and end_date.
Return the result table ordered by start_date
.
The result format is in the following example.
Example 1:
Input: Failed table: +-------------------+ | fail_date | +-------------------+ | 2018-12-28 | | 2018-12-29 | | 2019-01-04 | | 2019-01-05 | +-------------------+ Succeeded table: +-------------------+ | success_date | +-------------------+ | 2018-12-30 | | 2018-12-31 | | 2019-01-01 | | 2019-01-02 | | 2019-01-03 | | 2019-01-06 | +-------------------+ Output: +--------------+--------------+--------------+ | period_state | start_date | end_date | +--------------+--------------+--------------+ | succeeded | 2019-01-01 | 2019-01-03 | | failed | 2019-01-04 | 2019-01-05 | | succeeded | 2019-01-06 | 2019-01-06 | +--------------+--------------+--------------+ Explanation: The report ignored the system state in 2018 as we care about the system in the period 2019-01-01 to 2019-12-31. From 2019-01-01 to 2019-01-03 all tasks succeeded and the system state was "succeeded". From 2019-01-04 to 2019-01-05 all tasks failed and the system state was "failed". From 2019-01-06 to 2019-01-06 all tasks succeeded and the system state was "succeeded".
Solution 1: Union + Window Function + Group By
We can merge the two tables into one table with a field st
representing the status, where failed
indicates failure and succeeded
indicates success. Then, we can use a window function to group the records with the same status into one group, and calculate the difference between each date and its rank within the group as pt
, which serves as the identifier for the same continuous status. Finally, we can group by st
and pt
, and calculate the minimum and maximum dates for each group, and sort by the minimum date.
# Write your MySQL query statement below
WITH
T AS (
SELECT fail_date AS dt, 'failed' AS st
FROM Failed
WHERE YEAR(fail_date) = 2019
UNION ALL
SELECT success_date AS dt, 'succeeded' AS st
FROM Succeeded
WHERE YEAR(success_date) = 2019
)
SELECT
st AS period_state,
MIN(dt) AS start_date,
MAX(dt) AS end_date
FROM
(
SELECT
*,
SUBDATE(
dt,
RANK() OVER (
PARTITION BY st
ORDER BY dt
)
) AS pt
FROM T
) AS t
GROUP BY 1, pt
ORDER BY 2;