How to Pivot Data in SQL?

Date: 2021-08-01

1. Introduction

If "pivot data" sounds too much like jargon, we can describe the same thing as interchanging rows and columns. Not necessarily, but usually, the data is also aggregated during the process.

Different people may have different ideas when they hear "pivot data":

If you are an Excel expert, you may think of it as creating an Excel pivot table and dragging fields around with the mouse.

If you use R with the tidyverse packages, you may think of it as changing tidy data into un-tidy data. It is like making a long table wide or making a wide table long. This can be done with pivot_wider() or pivot_longer().

If you use Python with the pandas package, this can be done with the pd.pivot_table() function. pandas also has a function called pd.melt() for the reverse direction, but I don't use it much.

I won't go into detail about the cases above, because this article focuses on SQL: how can we pivot data in SQL?

2. The problem
Suppose we have a table like the one below (we call it Products):

What can we do if we want to change it into a table like the one below (we call it Results)?

Ideally, the second kind of table should never exist, because it is not suitable for further analysis. But in some situations it is useful for presenting results.
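Since the two tables are not reproduced here as text, the following is a minimal sketch of what Products might look like, using Python's built-in sqlite3 module. The column names product_id, store and price are taken from the queries in this article; the rows themselves are invented sample data.

```python
import sqlite3

# Hypothetical sample rows: (product_id, store, price). Only the column
# names come from the article; the values are invented.
SAMPLE_ROWS = [
    (0, "store1", 95),
    (0, "store2", 100),
    (0, "store3", 105),
    (1, "store1", 70),
    (1, "store3", 80),
]


def make_products(conn):
    """Create and fill a small Products table in long format."""
    conn.execute("create table Products (product_id int, store text, price int)")
    conn.executemany("insert into Products values (?, ?, ?)", SAMPLE_ROWS)


conn = sqlite3.connect(":memory:")
make_products(conn)

# One row per (product, store) pair: this is the "long" shape.
long_rows = conn.execute(
    "select product_id, store, price from Products order by product_id, store"
).fetchall()
for row in long_rows:
    print(row)
```

With this sample data, Results would have one row per product and one column per store, with a null where product 1 has no price for store2.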

3. SQL process on pivot

3.1 Use "group by" with conditional summarise

A pivot table is nothing miraculous. It is just a few independent "group by" summary results joined together from left to right.

We can reproduce this process step by step ourselves.

select product_id, sum(price) as store1
from Products
where store = 'store1'
group by product_id

Now we will have:

We can repeat this process for each store one by one. This is essentially what a pivot program or a pivot function from a library does under the hood.

A "group by" summary with a "where" clause is a so-called "conditional summarise". Another, equivalent way of writing a "conditional summarise" is to use "case when".

For example, we can rewrite the first query as:
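The "one store at a time, then join left to right" idea can be sketched as follows. This is only an illustration on SQLite with invented sample data; store_totals is a hypothetical helper name, and the join is done in Python to keep each step visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Products (product_id int, store text, price int)")
# Hypothetical sample data, not taken from the article.
conn.executemany(
    "insert into Products values (?, ?, ?)",
    [(0, "store1", 95), (0, "store2", 100), (0, "store3", 105),
     (1, "store1", 70), (1, "store3", 80)],
)


def store_totals(store):
    """Run the filtered "group by" query for a single store."""
    rows = conn.execute(
        "select product_id, sum(price) from Products "
        "where store = ? group by product_id",
        (store,),
    ).fetchall()
    return dict(rows)  # {product_id: summed price}


stores = ["store1", "store2", "store3"]
per_store = {s: store_totals(s) for s in stores}

# Join the independent results from left to right on product_id.
# A product missing from one store's result gets None in that column.
product_ids = sorted({pid for totals in per_store.values() for pid in totals})
pivoted = [(pid, *(per_store[s].get(pid) for s in stores)) for pid in product_ids]
print(pivoted)
```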

select product_id, sum(case store when 'store1' then price else null end) as store1
from Products
group by product_id

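Now we get the same result for every product sold in store1. The only difference is that a product with no store1 rows now appears with a null instead of being left out:

This method is a little more useful because it lets us avoid a separate "where" clause for each store, so we can write them all in one query.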

select product_id, 
    sum(case store when 'store1' then price end) as store1,
    sum(case store when 'store2' then price end) as store2,
    sum(case store when 'store3' then price end) as store3
from Products
group by product_id
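To try the combined query, here is a small sketch that runs it on SQLite through Python's sqlite3 module. The sample rows are invented, and an "order by" is added only to make the printed output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Products (product_id int, store text, price int)")
# Hypothetical sample data, not taken from the article.
conn.executemany(
    "insert into Products values (?, ?, ?)",
    [(0, "store1", 95), (0, "store2", 100), (0, "store3", 105),
     (1, "store1", 70), (1, "store3", 80)],
)

# The article's query; "case" without "else" yields null, which sum() skips.
PIVOT_SQL = """
select product_id,
    sum(case store when 'store1' then price end) as store1,
    sum(case store when 'store2' then price end) as store2,
    sum(case store when 'store3' then price end) as store3
from Products
group by product_id
order by product_id
"""

cursor = conn.execute(PIVOT_SQL)
columns = [d[0] for d in cursor.description]
results = cursor.fetchall()
print(columns)
for row in results:
    print(row)
```

Product 1 has no row for store2 in the sample data, so its store2 column comes back as null (None in Python).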

 

3.2 Use the pivot operator

MS SQL Server has a relational operator called PIVOT (written pivot() below). It can be used like this:

select *
from Products pivot(sum(price) for store in (store1, store2, store3)) pt

We will get our expected result as well:

This is a useful feature, but do not use it before you really understand what pivot is doing. Otherwise you will easily confuse yourself. You can see how much this short expression hides from us. For instance, it groups implicitly by every input column that is not mentioned inside pivot(), which here is product_id.
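To make visible what the short pivot() expression stands for, the sketch below expands a list of store names into the conditional sums from section 3.1. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, the sample data is invented, and build_pivot_sql is a hypothetical helper, not part of any library.

```python
import sqlite3


def build_pivot_sql(stores):
    """Expand store names into one conditional sum per store."""
    columns = ",\n    ".join(
        # The store value is bound as a "?" parameter; the column alias is
        # written as a double-quoted identifier with embedded quotes escaped.
        'sum(case store when ? then price end) as "{}"'.format(
            name.replace('"', '""')
        )
        for name in stores
    )
    return (
        "select product_id,\n    " + columns
        + "\nfrom Products\ngroup by product_id\norder by product_id"
    )


conn = sqlite3.connect(":memory:")
conn.execute("create table Products (product_id int, store text, price int)")
# Hypothetical sample data, not taken from the article.
conn.executemany(
    "insert into Products values (?, ?, ?)",
    [(0, "store1", 95), (0, "store2", 100), (0, "store3", 105),
     (1, "store1", 70), (1, "store3", 80)],
)

# Discover the stores instead of hard-coding them.
stores = [
    row[0]
    for row in conn.execute("select distinct store from Products order by store")
]
sql = build_pivot_sql(stores)
print(sql)

cursor = conn.execute(sql, stores)
columns = [d[0] for d in cursor.description]
results = cursor.fetchall()
print(columns, results)
```

The explicit "group by product_id" in the generated query is the part that pivot() does for us without saying so.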

3.3 The backward question

We may also meet situations where we want to go backward, that is, turn the Results table back into the Products table.

It is a tricky one, but once you know the process it is easy to reuse in the future.

-- single quotes make 'store1' a string literal; in standard SQL, "store1" would mean the column
select product_id, 'store1' as store, store1 as price
from Results
where store1 is not null
union
select product_id, 'store2' as store, store2 as price
from Results
where store2 is not null
union
select product_id, 'store3' as store, store3 as price
from Results
where store3 is not null

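The "union" keyword stacks the query results on top of each other, top to bottom. That is what we need in this situation. Since the three parts can never produce the same row here, "union all" would give the same rows and skip the duplicate-removal step.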
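The backward step can be tried out with the sketch below, again on SQLite through Python's sqlite3 module and with an invented Results table. The string literals use single quotes, because in standard SQL (and in SQLite) double quotes denote identifiers such as column names. The "order by" is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Results (product_id int, store1 int, store2 int, store3 int)"
)
# Hypothetical wide-format sample data; product 1 has no price in store2.
conn.executemany(
    "insert into Results values (?, ?, ?, ?)",
    [(0, 95, 100, 105), (1, 70, None, 80)],
)

# One "select" per store column, stacked with "union"; null cells are dropped.
UNPIVOT_SQL = """
select product_id, 'store1' as store, store1 as price
from Results
where store1 is not null
union
select product_id, 'store2' as store, store2 as price
from Results
where store2 is not null
union
select product_id, 'store3' as store, store3 as price
from Results
where store3 is not null
order by product_id, store
"""

products = conn.execute(UNPIVOT_SQL).fetchall()
for row in products:
    print(row)
```

Each non-null cell of Results becomes one (product_id, store, price) row, so the two products in the sample yield five rows.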

 

  

Original article: https://www.cnblogs.com/drvongoosewing/p/14885542.html