
2024-06-29 06:15 | Source: compiled from the web

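54_Pandas: Converting a DataFrame or Series to a dictionary (to_dict)

A pandas.DataFrame or pandas.Series can be converted to a dictionary (a dict object) with the to_dict() method.

For pandas.DataFrame, the orient argument specifies how the row labels (index), column labels (columns), and values are assigned to the keys and values of the dictionary.

For pandas.Series, the result is a dictionary whose keys are the labels.

This article covers the following topics.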

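- pandas.DataFrame to_dict() method
  - Specifying the dictionary format: the orient argument
  - Converting to a type other than dict: the into argument
- Creating a dictionary from any two columns of a pandas.DataFrame
- pandas.Series to_dict() method
  - Converting to dict
  - Converting to a type other than dict: the into argument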

The following pandas.DataFrame is used as an example.

import pandas as pd
import pprint
from collections import OrderedDict

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

print(df)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊

pprint is imported to make the output easier to read, and OrderedDict is imported to demonstrate specifying the result type with the into argument.

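pandas.DataFrame to_dict() method

Calling to_dict() on a pandas.DataFrame converts it, by default, to a dictionary (a dict object) as shown below.

d = df.to_dict()
pprint.pprint(d)
# {'col1': {'row1': 1, 'row2': 2, 'row3': 3},
#  'col2': {'row1': 'a', 'row2': 'x', 'row3': '啊'}}

print(type(d))
# <class 'dict'>

Specifying the dictionary format: the orient argument

The orient argument selects the format of the result, that is, how the row labels (index), column labels (columns), and values of the pandas.DataFrame are assigned to the keys and values of the dictionary.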
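As a quick overview before the individual formats, the following minimal sketch loops over the six orient values described in this article and records the type of the top-level object each one returns. The names orients and result_types are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# The six orient values covered in this article.
orients = ['dict', 'list', 'series', 'split', 'records', 'index']

# Record the type name of the top-level object returned for each orient.
result_types = {o: type(df.to_dict(orient=o)).__name__ for o in orients}

for orient, type_name in result_types.items():
    print(f'{orient:8} -> {type_name}')
```

Only orient='records' returns a list at the top level; the other five return a dict.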

dict

With orient='dict', each key is a column label and each value is a dictionary mapping row labels to values. This is the format used when the orient argument is omitted (the default).

{column -> {index -> value}}

d_dict = df.to_dict(orient='dict')
pprint.pprint(d_dict)
# {'col1': {'row1': 1, 'row2': 2, 'row3': 3},
#  'col2': {'row1': 'a', 'row2': 'x', 'row3': '啊'}}

print(d_dict['col1'])
# {'row1': 1, 'row2': 2, 'row3': 3}

print(type(d_dict['col1']))
# <class 'dict'>

list

With orient='list', each key is a column label and each value is a list of that column's values. The row labels are lost.

{column -> [values]}

d_list = df.to_dict(orient='list')
pprint.pprint(d_list)
# {'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']}

print(d_list['col1'])
# [1, 2, 3]

print(type(d_list['col1']))
# <class 'list'>

series

With orient='series', each key is a column label and each value is a pandas.Series that carries the row labels and values.

{column -> Series(values)}

d_series = df.to_dict(orient='series')
pprint.pprint(d_series)
# {'col1': row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64,
#  'col2': row1    a
# row2    x
# row3    啊
# Name: col2, dtype: object}

print(d_series['col1'])
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

print(type(d_series['col1']))
# <class 'pandas.core.series.Series'>
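Because each value is still a pandas.Series, the usual Series methods remain available on the dictionary entries. A minimal sketch, in which the variable names total and upper are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

d_series = df.to_dict(orient='series')

# Numeric reduction on the Series stored under 'col1'.
total = d_series['col1'].sum()
print(total)
# 6

# String accessor on the Series stored under 'col2'; row labels are kept.
upper = d_series['col2'].str.upper()
print(list(upper.index))
# ['row1', 'row2', 'row3']
```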

split

With orient='split', the keys are 'index', 'columns', and 'data', and the values are the list of row labels, the list of column labels, and the list of row values, respectively.

{index -> [index], columns -> [columns], data -> [values]}

d_split = df.to_dict(orient='split')
pprint.pprint(d_split)
# {'columns': ['col1', 'col2'],
#  'data': [[1, 'a'], [2, 'x'], [3, '啊']],
#  'index': ['row1', 'row2', 'row3']}

print(d_split['columns'])
# ['col1', 'col2']

print(type(d_split['columns']))
# <class 'list'>
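One practical property of the split format is that its three keys have the same names as keyword arguments of the pd.DataFrame constructor. The sketch below relies on that to rebuild a DataFrame by unpacking the dictionary; the round trip is an illustration added here, not part of the original walkthrough, and restored is a name chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

d_split = df.to_dict(orient='split')

# 'index', 'columns', and 'data' are passed as keyword arguments.
restored = pd.DataFrame(**d_split)
print(restored)

# Both the labels and the values survive the round trip.
print(list(restored.index))
# ['row1', 'row2', 'row3']
```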

records

With orient='records', the result is a list whose elements are dictionaries, one per row, in which each key is a column label and each value is the corresponding value. The row labels are lost.

[{column -> value}, ... , {column -> value}]

l_records = df.to_dict(orient='records')
pprint.pprint(l_records)
# [{'col1': 1, 'col2': 'a'}, {'col1': 2, 'col2': 'x'}, {'col1': 3, 'col2': '啊'}]

print(type(l_records))
# <class 'list'>

print(l_records[0])
# {'col1': 1, 'col2': 'a'}

print(type(l_records[0]))
# <class 'dict'>
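To make the loss of row labels concrete, the sketch below rebuilds a DataFrame from the records and then shows one possible way to keep the labels: moving the index into an ordinary column before converting. The column name 'row' is a hypothetical choice made for this example.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# Rebuilding from plain records yields a default 0, 1, 2 index.
l_records = df.to_dict(orient='records')
rebuilt = pd.DataFrame(l_records)
print(list(rebuilt.index))
# [0, 1, 2]

# Name the index 'row', turn it into a regular column, then convert.
l_with_labels = df.rename_axis('row').reset_index().to_dict(orient='records')
print(l_with_labels[0])
# {'row': 'row1', 'col1': 1, 'col2': 'a'}
```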

index

With orient='index', each key is a row label and each value is a dictionary mapping column labels to values.

{index -> {column -> value}}

d_index = df.to_dict(orient='index')
pprint.pprint(d_index)
# {'row1': {'col1': 1, 'col2': 'a'},
#  'row2': {'col1': 2, 'col2': 'x'},
#  'row3': {'col1': 3, 'col2': '啊'}}

print(d_index['row1'])
# {'col1': 1, 'col2': 'a'}

print(type(d_index['row1']))
# <class 'dict'>

Converting to a type other than dict: the into argument

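By passing a type to the into argument, you can get a subclass such as OrderedDict instead of a plain dictionary (dict).

The dictionaries stored as values inside the result are also of the specified type.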
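The into argument can also be combined with orient. As a minimal sketch, pairing it with orient='records' (a combination chosen here for illustration) shows that the outer container stays a list while each row becomes the requested mapping type.

```python
from collections import OrderedDict

import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# `into` controls the mapping type of each row, not the outer list.
records = df.to_dict(orient='records', into=OrderedDict)

print(type(records).__name__)
# list

print(type(records[0]).__name__)
# OrderedDict
```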

od = df.to_dict(into=OrderedDict)
pprint.pprint(od)
# OrderedDict([('col1', OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])),
#              ('col2',
#               OrderedDict([('row1', 'a'), ('row2', 'x'), ('row3', '啊')]))])

print(type(od))
# <class 'collections.OrderedDict'>

print(od['col1'])
# OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])

print(type(od['col1']))
# <class 'collections.OrderedDict'>

Creating a dictionary from any two columns of a pandas.DataFrame

You can also build a dictionary from any two of the index and the data columns by using dict() and zip().
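The same pattern works for two data columns. The sketch below maps col2 to col1 and also shows an alternative route through set_index() and Series.to_dict(); the names d_cols and d_cols_alt are chosen for this example. Note that with dict(zip(...)), a repeated key keeps only the value from the last matching row.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'x', '啊']},
                  index=['row1', 'row2', 'row3'])

# Use the values of 'col2' as keys and the values of 'col1' as values.
d_cols = dict(zip(df['col2'], df['col1']))
print(d_cols)
# {'a': 1, 'x': 2, '啊': 3}

# Alternative: make 'col2' the index, then convert the 'col1' Series.
d_cols_alt = df.set_index('col2')['col1'].to_dict()
print(d_cols == d_cols_alt)
# True
```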

print(df.index)
# Index(['row1', 'row2', 'row3'], dtype='object')

print(df['col1'])
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

d_col = dict(zip(df.index, df['col1']))
print(d_col)
# {'row1': 1, 'row2': 2, 'row3': 3}

pandas.Series to_dict() method

Converting to dict

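The following pandas.Series is used as an example.

print(df)
#       col1 col2
# row1     1    a
# row2     2    x
# row3     3    啊

s = df['col1']
print(s)
# row1    1
# row2    2
# row3    3
# Name: col1, dtype: int64

print(type(s))
# <class 'pandas.core.series.Series'>

Calling to_dict() on a pandas.Series creates a dictionary whose keys are the labels and whose values are the Series values.

d = s.to_dict()
print(d)
# {'row1': 1, 'row2': 2, 'row3': 3}

print(type(d))
# <class 'dict'>

Converting to a type other than dict: the into argument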

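The to_dict() method of pandas.Series also accepts the into argument, so you can get a subclass such as OrderedDict instead of a plain dictionary (dict). Passing it by keyword, as into=OrderedDict, keeps the call explicit.

od = df['col1'].to_dict(into=OrderedDict)
print(od)
# OrderedDict([('row1', 1), ('row2', 2), ('row3', 3)])

print(type(od))
# <class 'collections.OrderedDict'>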
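The into argument is not limited to OrderedDict. As a further sketch, a collections.defaultdict can be used as well, with one caveat: it has to be passed as an initialized instance rather than as the class, so that its default factory is known. The choice of int as the default factory and the key 'missing' are assumptions made for this example.

```python
from collections import defaultdict

import pandas as pd

s = pd.Series([1, 2, 3], index=['row1', 'row2', 'row3'], name='col1')

# Pass an initialized defaultdict, not the defaultdict class itself.
dd = s.to_dict(into=defaultdict(int))

print(dd['row1'])
# 1

# An unknown key falls back to the default factory, here int().
print(dd['missing'])
# 0
```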
