The info() and describe() Functions in Pandas

2023-08-14 13:28 | Source: compiled from the web

For both functions, I will start from the explanations in the official documentation: info() and describe().

1. The info() function

info() prints a concise summary of a DataFrame: the dtype of the index and of each column, the number of non-null values, and the memory usage.

1.1 Parameters of info()

DataFrame.info(self, verbose=None, buf=None, max_cols=None, memory_usage=None, null_counts=None)

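Parameter: Description

self
    self appears only in the methods of a class; ordinary functions do not take it. For more on self, see https://www.cnblogs.com/huangbiquan/p/7741016.html

verbose: bool, optional
    Whether to print the full summary. If True, information on every column is shown; if False, part of it is omitted. By default, the setting in pandas.options.display.max_info_columns is followed.

buf: writable buffer, defaults to sys.stdout
    Where to send the output. By default the output is printed to sys.stdout. Pass a writable buffer if you need to process the output further. For storing the result of DataFrame.info() in a variable, see https://blog.csdn.net/qq_34105362/article/details/90056765

max_cols: int, optional
    When to switch from verbose to truncated output. If the DataFrame has more than max_cols columns, the truncated output is used. By default, the setting in pandas.options.display.max_info_columns is used.

memory_usage: bool, str, optional
    Whether to display the total memory usage of the DataFrame elements (including the index). True always shows memory usage; False never shows it. By default this follows the pandas.options.display.memory_usage setting, which is True unless changed. The string 'deep' is equivalent to True with deep introspection of the data.

null_counts: bool, optional
    Whether to show the non-null counts. True always shows the counts; False never shows them. By default, they are shown only if the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns. (This parameter was deprecated in pandas 1.2 in favor of show_counts and removed in pandas 2.0.)

1.2 Example of info()

    import pandas as pd

    # (1) Define a DataFrame
    int_values = [1, 2, 3, 4, 5]
    text_values = ['alpha', 'beta', 'gamma', 'delta', 'epsilon']
    float_values = [0.0, 0.25, 0.5, 0.75, 1.0]
    df = pd.DataFrame({"int_col": int_values,
                       "text_col": text_values,
                       "float_col": float_values})
    df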

Output:

       int_col text_col  float_col
    0        1    alpha       0.00
    1        2     beta       0.25
    2        3    gamma       0.50
    3        4    delta       0.75
    4        5  epsilon       1.00

    # (2) Call info()
    df.info(verbose=True)

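Output:

    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 5 entries, 0 to 4
    Data columns (total 3 columns):
    int_col      5 non-null int64
    text_col     5 non-null object
    float_col    5 non-null float64
    dtypes: float64(1), int64(1), object(1)
    memory usage: 248.0+ bytes

2. The describe() function

describe() generates descriptive statistics. For numeric data these include the mean, standard deviation, maximum, minimum, and percentiles; for categorical data they include the count, the number of distinct categories, the most frequent category, and how often it occurs. The output varies depending on what is provided.

2.1 Parameters of describe()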

DataFrame.describe(self: ~FrameOrSeries, percentiles=None, include=None, exclude=None)

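Parameter: Description

percentiles: list-like of numbers, optional
    The percentiles to include in the output. All values should fall between 0 and 1. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.

include: 'all', list-like of dtypes or None (default), optional
    A whitelist of data types to include in the result. 'all': all columns are included in the output. A list of dtypes: the result is limited to the given data types. By default, the result includes all numeric columns.

exclude: list-like of dtypes or None (default), optional
    A blacklist of data types to omit from the result. A list of dtypes: the given data types are excluded from the result. By default, nothing is excluded.

2.2 Examples of describe()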

2.2.1 Describing a numeric Series.

    s = pd.Series([1, 2, 3])
    s.describe()

Output:

    count    3.0
    mean     2.0
    std      1.0
    min      1.0
    25%      1.5
    50%      2.0
    75%      2.5
    max      3.0
    dtype: float64
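
The percentiles parameter from section 2.1 can be tried on the same Series. The following is a minimal sketch, not part of the original examples; the choice of the 10th, 50th, and 90th percentiles is only an illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# Ask for the 10th, 50th and 90th percentiles instead of the default
# quartiles. Every value passed must lie between 0 and 1.
summary = s.describe(percentiles=[0.1, 0.5, 0.9])
print(summary)

# The result is itself a Series, so individual statistics can be read
# by label.
print(summary['10%'], summary['90%'])
```

The rows labelled 25% and 75% no longer appear, because only the requested percentiles are reported.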

2.2.2 Describing a categorical Series.

    s = pd.Series(['a', 'a', 'b', 'c'])
    s.describe()

Output:

    count     4
    unique    3
    top       a
    freq      2
    dtype: object
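
To make the four rows of this summary concrete, the sketch below recomputes each of them with other Series methods. It is an illustration added here, assuming only the standard count(), nunique(), and value_counts() methods; it is not how describe() is implemented internally.

```python
import pandas as pd

s = pd.Series(['a', 'a', 'b', 'c'])
summary = s.describe()

# Frequency of each distinct value, most frequent first.
counts = s.value_counts()

print(summary['count'], s.count())        # number of non-null values
print(summary['unique'], s.nunique())     # number of distinct values
print(summary['top'], counts.idxmax())    # most frequent value
print(summary['freq'], counts.max())      # how often that value occurs
```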

2.2.3 Describing a DataFrame. By default only numeric fields are returned.

    df = pd.DataFrame({'categorical': pd.Categorical(['d', 'e', 'f']),
                       'numeric': [1, 2, 3],
                       'object': ['a', 'b', 'c']})
    df.describe()

Output:

           numeric
    count      3.0
    mean       2.0
    std        1.0
    min        1.0
    25%        1.5
    50%        2.0
    75%        2.5
    max        3.0
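
The include and exclude parameters from section 2.1 select columns by dtype before the statistics are computed. The sketch below rebuilds the same DataFrame and shows three selections; the variable names are hypothetical and chosen only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'categorical': pd.Categorical(['d', 'e', 'f']),
                   'numeric': [1, 2, 3],
                   'object': ['a', 'b', 'c']})

# Whitelist: keep only numeric columns (same columns as the default).
only_numeric = df.describe(include=[np.number])

# Whitelist: keep only columns of categorical dtype.
only_categorical = df.describe(include=['category'])

# Blacklist: drop numeric columns and describe whatever remains.
no_numeric = df.describe(exclude=[np.number])

print(only_numeric.columns.tolist())
print(only_categorical.columns.tolist())
print(no_numeric.columns.tolist())
```

When only non-numeric columns are selected, the summary rows are count, unique, top, and freq rather than the mean and percentiles.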

2.2.4 Describing all columns of a DataFrame regardless of data type.

    df.describe(include='all')

Output:

           categorical  numeric object
    count            3      3.0      3
    unique           3      NaN      3
    top              f      NaN      c
    freq             1      NaN      1
    mean           NaN      2.0    NaN
    std            NaN      1.0    NaN
    min            NaN      1.0    NaN
    25%            NaN      1.5    NaN
    50%            NaN      2.0    NaN
    75%            NaN      2.5    NaN
    max            NaN      3.0    NaN
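
Unlike describe(), which returns its result as an object, info() prints its summary and returns None. The buf parameter described in section 1.1 is the way to keep that text. The following is a minimal sketch, assuming an in-memory io.StringIO buffer; the names buffer and info_text are hypothetical.

```python
import io

import pandas as pd

df = pd.DataFrame({"int_col": [1, 2, 3, 4, 5],
                   "text_col": ['alpha', 'beta', 'gamma', 'delta', 'epsilon'],
                   "float_col": [0.0, 0.25, 0.5, 0.75, 1.0]})

# info() writes into the buffer instead of sys.stdout.
buffer = io.StringIO()
df.info(buf=buffer)

# The summary is now an ordinary string that can be searched or saved.
info_text = buffer.getvalue()
print(info_text)
```

The exact layout of the text differs between pandas versions, so code that parses it should not rely on fixed column positions.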
