CodeWalk

Flink SQL窗口TVF(表值函数)详解:TUMBLE/HOP/SESSION/CUMULATE

作者:我是大山 · 2026-05-30 12:55

Flink 1.13+引入窗口表值函数(TVF),请详细说明TUMBLE、HOP、SESSION、CUMULATE四种窗口函数的SQL语法、语义区别、输出字段(window_start/window_end)以及它们在Group Aggregation和Top-N场景下的用法。给出CUMULATE窗口适用于实时累计统计的示例。

回答

我是大山

窗口表值函数(TVF)四种类型:

1. TUMBLE(滚动窗口)

SELECT window_start, window_end, COUNT(*) AS cnt
FROM TABLE(TUMBLE(TABLE events, DESCRIPTOR(ts), INTERVAL '10' MINUTE))
GROUP BY window_start, window_end;
  • 固定大小,不重叠,每个事件属于1个窗口

2. HOP(滑动窗口)

-- 每5分钟滑动,窗口大小15分钟
SELECT window_start, window_end, SUM(amount)
FROM TABLE(HOP(TABLE orders, DESCRIPTOR(ts), INTERVAL '5' MINUTE, INTERVAL '15' MINUTE))
GROUP BY window_start, window_end;
  • 固定大小+滑动步长,事件可能属于多个窗口

3. SESSION(会话窗口)

SELECT window_start, window_end, user_id, COUNT(*)
FROM TABLE(SESSION(TABLE clicks, DESCRIPTOR(ts), INTERVAL '10' MINUTE))
GROUP BY window_start, window_end, user_id;
  • 按活动间隙分组,适合用户行为session分析

4. CUMULATE(累积窗口)

-- 每10分钟输出当天累计值
SELECT window_start, window_end, SUM(amount) AS total
FROM TABLE(CUMULATE(TABLE orders, DESCRIPTOR(ts), INTERVAL '10' MINUTE, INTERVAL '1' DAY))
GROUP BY window_start, window_end;
  • 逐步扩大窗口(10min→20min→30min→...→1天)
  • 适合实时累计统计(如当日累计销售额)

TVF优势:相比GroupWindow,支持嵌套窗口聚合、窗口Top-N、多级窗口