首先,我们来看看Oracle对于其的解释:
可见,pivot是数据仓库中的关键技术,它利用交叉查询(crosstabulation query)将行转换为列。
基本语法如下:
- SELECT ....
- FROM <table-expr>
- PIVOT
- (
- aggregate-function(<column>)
- FOR <pivot-column> IN (<value1>, <value2>,..., <valuen>)
- ) AS <alias>
- WHERE .....
下面我们来通过具体的案例对其进行阐述。
首先,构造案例所需的数据,
1> 创建视图,以EMP表的数据作为源数据。
- CREATE VIEW emp_view AS
- SELECT
- deptno,job,to_char(hiredate,'yyyy') hiredate,
- count(*) cnt,sum(sal) sum_sal
- FROM emp
- GROUP BY deptno,job,to_char(hiredate,'yyyy');
其中,deptno为部门号,job为工作的类型(即工种),hiredate为雇佣的日期,cnt为特定部门,特定工种在特定年份雇佣的员工的总数,sum_sal为特定部门,特定工种,特定年份雇佣的员工的工资的总和。
2> 视图的数据如下:
- SQL> select * from emp_view;
- DEPTNO JOB HIRE CNT SUM_SAL
- ---------- --------- ---- ---------- ----------
- CLERK 1980 1 800
- ANALYST 1981 1 3000
- ANALYST 1987 1 3000
- CLERK 1981 1 950
- MANAGER 1981 1 2850
- MANAGER 1981 1 2450
- SALESMAN 1981 4 5600
- MANAGER 1981 1 2975
- PRESIDENT 1981 1 5000
- CLERK 1982 1 1300
- CLERK 1987 1 1100
- rows selected.
应用场景一:
基本的Pivot转换
例1:
- SELECT * FROM
- ( SELECT deptno,hiredate,cnt
- FROM emp_view
- ) PIVOT (SUM(cnt)
- FOR hiredate IN ('1980' AS "1980",'1981' AS "1981",
- '1982' AS "1982",'1987' AS "1987"))
- ORDER BY deptno;
- DEPTNO 1980 1981 1982 1987
- ---------- ---------- ---------- ---------- ----------
- 2 1
- 1 2 2
- 6
- rows selected.
例2:
- SELECT * FROM
- ( SELECT deptno,job,cnt
- FROM emp_view
- ) PIVOT (SUM(cnt)
- FOR job IN ('CLERK','ANALYST','MANAGER','SALESMAN','PRESIDENT'))
- ORDER BY deptno;
- DEPTNO 'CLERK' 'ANALYST' 'MANAGER' 'SALESMAN' 'PRESIDENT'
- ---------- ---------- ---------- ---------- ---------- -----------
- 1 1 1
- 2 2 1
- 1 1 4
- rows selected.
两例以不同的列进行统计,前者是hiredate,后者是job。
除此之外,前者用了别名,后面没有用别名,两者的显示效果也是不一样的。
应用场景二:
对多列进行Pivot转换
- SELECT * FROM
- ( SELECT deptno,job,hiredate,cnt
- FROM emp_view
- ) PIVOT (SUM(cnt)
- FOR (job,hiredate) IN
- (('CLERK','1980') AS clerk_1980,
- ('CLERK','1981') AS clerk_1981,
- ('ANALYST','1987') AS analyst_1987,
- ('MANAGER','1981') AS manager_1981
- )
- )
- ORDER by deptno;
- DEPTNO CLERK_1980 CLERK_1981 ANALYST_1987 MANAGER_1981
- ---------- ---------- ---------- ------------ ------------
- 1
- 1 1 1
- 1 1
- rows selected.
限于篇幅,FOR (job,hiredate) IN语句中没有列出更多组合,只列出了四组,当然,我们可以根据实际场景需要罗列更多的组合。
从本例中可以看出,对两个列进行Pivot转换可从三个维度呈现统计结果。
应用场景三:
用Pivot实现多个聚合
- SELECT * FROM
- ( SELECT deptno,hiredate,cnt,sum_sal
- FROM emp_view
- ) PIVOT ( SUM(cnt) AS cnt,
- SUM(sum_sal) AS sum_sal
- FOR hiredate IN ('1980','1981','1982','1987'))
- ORDER BY deptno;
- DEPTNO '1980'_CNT '1980'_SUM_SAL '1981'_CNT '1981'_SUM_SAL '1982'_CNT '1982'_SUM_SAL '1987'_CNT '1987'_SUM_SAL
- ---------- ---------- -------------- ---------- -------------- ---------- -------------- ---------- --------------
- 2 7450 1 1300
- 1 800 2 5975 2 4100
- 6 9400
- rows selected.
'1981'_CNT指的是1981年雇佣的员工的总数,'1981'_SUM_SAL指的是1981年雇佣员工所开出的工资。
具体到本例中,即1981年10号部门招聘了2位员工,开出的工资合计为7450元,20号部门招聘了2位员工,开出的工资合计为5975元,30号部门招聘了6名员工,开出的工资合计为9400元,依次类推。
既然有pivot将行转换为列,同样也有unpivot操作将聚合后的列转换为行。
UNPIVOT
以上述应用场景三的结果作为源数据进行操作
- CREATE TABLE T1 AS
- SELECT * FROM
- ( SELECT deptno,hiredate,cnt,sum_sal
- FROM emp_view
- ) PIVOT ( SUM(cnt) AS cnt,
- SUM(sum_sal) AS sum_sal
- FOR hiredate IN ('1980' AS "1980",'1981' AS "1981",
- '1982' AS "1982",'1987' AS "1987"))
- ORDER BY deptno
表T1的结果为:
- SQL> select * from t1;
- DEPTNO 1980_CNT 1980_SUM_SAL 1981_CNT 1981_SUM_SAL 1982_CNT 1982_SUM_SAL 1987_CNT 1987_SUM_SAL
- ---------- ---------- ------------ ---------- ------------ ---------- ------------ ---------- ------------
- 2 7450 1 1300
- 1 800 2 5975 2 4100
- 6 9400
- rows selected.
首先进行一维unpivot
- SELECT deptno,DECODE(hiredate,'1980_CNT','1980','1981_CNT','1981','1982_CNT','1982','1987_CNT','1987') AS hiredate,cnt
- FROM T1
- UNPIVOT INCLUDE NULLS
- ( cnt
- FOR hiredate IN ("1980_CNT","1981_CNT","1982_CNT","1987_CNT"));
- DEPTNO HIRE CNT
- ---------- ---- ----------
- 1980
- 1981 2
- 1982 1
- 1987
- 1980 1
- 1981 2
- 1982
- 1987 2
- 1980
- 1981 6
- 1982
- 1987
- rows selected.
输出的结果为不同部门在不同年份的雇佣人数,
注意:上述SQL语句中UNPIVOT后加了INCLUDE NULLS,当然也可以指定为EXCLUDE NULLS,即排除cnt为空的值,如果不指定,则默认为EXCLUDE NULLS。
UNPIVOT后不指定INCLUDE NULLS的输入结果为:
- DEPTNO HIRE CNT
- ---------- ---- ----------
- 1981 2
- 1982 1
- 1980 1
- 1981 2
- 1987 2
- 1981 6
- rows selected.
下面,我们再进行二维unpivot
- SELECT deptno,hiredate,cnt,sum_sal
- FROM T1
- UNPIVOT
- ( (cnt,sum_sal)
- FOR hiredate IN (("1980_CNT","1980_SUM_SAL") AS 1980,
- ("1981_CNT","1981_SUM_SAL") AS 1981,
- ("1982_CNT","1982_SUM_SAL") AS 1982,
- ("1987_CNT","1987_SUM_SAL") AS 1987));
- DEPTNO HIREDATE CNT SUM_SAL
- ---------- ---------- ---------- ----------
- 1981 2 7450
- 1982 1 1300
- 1980 1 800
- 1981 2 5975
- 1987 2 4100
- 1981 6 9400
- rows selected.
输入结果为T1表列转行的结果。
参考文档:
SQL for Analysis and Reporting