用Python實現(xiàn)數(shù)據(jù)的透視表的方法

2024-09-09 19:02:25

字體：大中小

供稿：網(wǎng)友

在處理數(shù)據(jù)時，經(jīng)常需要對數(shù)據(jù)分組計算均值或者計數(shù)，在Microsoft Excel中，可以通過透視表輕易實現(xiàn)簡單的分組運算。而對于更加復(fù)雜的分組運算，Python中pandas包可以幫助我們實現(xiàn)。

1 數(shù)據(jù)

首先引入幾個重要的包：

import pandas as pdimport numpy as npfrom pandas import DataFrame,Series

通過代碼構(gòu)造數(shù)據(jù)集：

data=DataFrame({'key1':['a','b','c','a','c','a','b','a','c','a','b','c'],'key2':['one','two','three','two','one','one','three','one','two','three','one','two'],'num1':np.random.rand(12),'num2':np.random.randn(12)})

得到數(shù)據(jù)集如下：

data key1 key2  num1  num20 a one 0.268705 0.0840911 b two 0.876707 0.2177942 c three 0.229999 0.5744023 a two 0.707990 -1.4444154 c one 0.786064 0.3432445 a one 0.587273 1.2123916 b three 0.927396 1.5053727 a one 0.295271 -0.4976338 c two 0.292721 0.0988149 a three 0.369788 -1.157426

2 交叉表―分類計數(shù)

按照不同類進(jìn)行計數(shù)統(tǒng)計是最常見透視功能，可以通

（1）crosstab

#函數(shù)：crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, dropna=True, normalize=False)

crosstab的index和columns是必須要指定復(fù)制的參數(shù)：

pd.crosstab(data.key1,data.key2)

結(jié)果如下：

key2 one three twokey1     a  3  1 1b  0  1 1c  1  1 1

想要在邊框處增加匯總項可以指定margin的值為True：

pd.crosstab(data.key1,data.key2,margins=True)

結(jié)果：

key2 one three two Allkey1      a  3  1 1 5b  1  1 1 3c  1  1 2 4All  5  3 4 12

（2）pivot_table

函數(shù)：

pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

使用pivot_table函數(shù)同樣可以實現(xiàn)，運算函數(shù)默認(rèn)值aggfunc='mean',指定為aggfunc='count'即可：

data.pivot_table('num1',index='key1',columns='key2',aggfunc='count')

結(jié)果相同：

key2 one three twokey1     a  3  1 1b  1  1 1c  1  1 2

（3）groupby

通過groupby相對來說會更加復(fù)雜，首先需要對data按照key1和key2進(jìn)行聚類，然后進(jìn)行count運算，再將key2的index重塑為columns：

data.groupby(['key1','key2'])['num1'].count().unstack()

結(jié)果：

key2 one three twokey1     a  3  1 1b  1  1 1c  1  1 2

3 其它透視表運算

（1）pivot_table

pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

要進(jìn)行何種運算，只需要指定aggfunc即可。

默認(rèn)計算均值：

data.pivot_table(index='key1',columns='key2')

out：

上一篇：python里dict變成list實例方法

下一篇：python實現(xiàn)的多任務(wù)版udp聊天器功能案例

學(xué)習(xí)交流

如何查找有故障的配件

如何查找有故障的配件...

熱門圖片

猜你喜歡的新聞

猜你喜歡的關(guān)注

麻豆小视频在线观看_中文黄色一级片_久久久成人精品_成片免费观看视频大全_午夜精品久久久久久久99热浪潮_成人一区二区三区四区

用Python實現(xiàn)數(shù)據(jù)的透視表的方法