python

2023-05-25 05:51 | Source: compiled from the web

 

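Create a new data file such as c_data.xlsx (extension .xlsx), then right-click it, choose Rename, and change the extension along with the name so the file becomes "c_data.csv".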

Read the data in the file:

data = pd.read_csv('E:/python_workspace/data_space/c_data.csv')

This raises the following error:

Traceback (most recent call last):
  File "E:/python_workspace/Demo/pandas_pratices.py", line 3, in <module>
    data = pd.read_csv('E:/python_workspace/data_space/c_data.csv')
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 2157, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 918, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 905, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 7, saw 2

 

I tried several fixes found online, and none of them solved the problem.

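Then it occurred to me: the file was originally created with the .xlsx extension and only renamed to .csv, so perhaps it is in essence still an Excel file, and that is why read_csv fails.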
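One way to check this suspicion before touching pandas is to look at the file's first bytes. An .xlsx workbook is a ZIP archive, and ZIP files normally start with the signature PK\x03\x04, whereas a real CSV starts with plain text. The sketch below uses only the standard library; the helper name looks_like_xlsx is made up for this illustration, and the check only shows that the file is a ZIP container, not that it is a valid workbook.

```python
from pathlib import Path

# Local file header that starts an ordinary ZIP archive (and therefore an .xlsx file).
ZIP_SIGNATURE = b"PK\x03\x04"


def looks_like_xlsx(path):
    """Return True if the file starts with the ZIP signature.

    Hypothetical helper for illustration: it ignores the file extension
    and inspects only the first four bytes of the file.
    """
    with open(path, "rb") as f:
        return f.read(4) == ZIP_SIGNATURE


if __name__ == "__main__":
    # Path taken from the article; adjust it to your own file.
    target = Path("E:/python_workspace/data_space/c_data.csv")
    if target.exists():
        print(target.name, "looks like an Excel workbook:", looks_like_xlsx(target))
```

If this prints True for a file whose name ends in .csv, the extension is misleading and read_csv is the wrong reader for it.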

 

Open c_data.csv and use Save As to save it as a genuine .csv file named "a_data.csv", which is used in the comparison tests below.

 

(1) demo1: First, assume that c_data.csv is still an Excel file even though its extension was changed, and read it with read_excel:

import pandas as pd

data = pd.read_excel('E:/python_workspace/data_space/c_data.csv')
print(data)

Running it raises no error; the console prints the data that was read:

   id name  score
0   1   小米   78.0
1   2   小白   88.0
2   3   小新   99.0
3   4   小圆   99.0
4   5   小羊    NaN

 

(2) demo2: Read the new CSV file "a_data.csv" with read_csv:

import pandas as pd

data = pd.read_csv('E:/python_workspace/data_space/a_data.csv')
print(data)

After running it, the earlier error no longer appears; only an encoding problem remains:

Traceback (most recent call last):
  File "pandas\_libs\parsers.pyx", line 1119, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1244, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1259, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1450, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 2: invalid continuation byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:/python_workspace/MyWriter/CDA_Demo/pandas_pratices.py", line 3, in <module>
    data = pd.read_csv('E:/python_workspace/data_space/a_data.csv')
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "E:\python_workspace\MyWriter\venv\lib\site-packages\pandas\io\parsers.py", line 2157, in read
    data = self._reader.read(nrows)
  File "pandas\_libs\parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas\_libs\parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas\_libs\parsers.pyx", line 941, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 1073, in pandas._libs.parsers.TextReader._convert_column_data
  File "pandas\_libs\parsers.pyx", line 1126, in pandas._libs.parsers.TextReader._convert_tokens
  File "pandas\_libs\parsers.pyx", line 1244, in pandas._libs.parsers.TextReader._convert_with_dtype
  File "pandas\_libs\parsers.pyx", line 1259, in pandas._libs.parsers.TextReader._string_convert
  File "pandas\_libs\parsers.pyx", line 1450, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 2: invalid continuation byte

 

Add the encoding setting:

import pandas as pd

data = pd.read_csv('E:/python_workspace/data_space/a_data.csv', encoding='gbk')
print(data)

After running it, the console prints:

   id name  score
0   1   小米   78.0
1   2   小白   88.0
2   3   小新   99.0
3   4   小圆   99.0
4   5   小羊    NaN
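When it is not known in advance whether a CSV was saved as UTF-8 or GBK, the trial-and-error above can be automated. The following is a minimal sketch, not part of the original tests: the helper name first_working_encoding and the candidate list are assumptions made for this example. It tries each encoding in order and returns the first one that decodes the whole file. Order matters, because GBK will happily decode many byte sequences that are not really GBK, so UTF-8 is tried first.

```python
def first_working_encoding(path, candidates=("utf-8", "gbk")):
    """Return the first encoding in `candidates` that decodes the entire file.

    Hypothetical helper for illustration. It reads the file as raw bytes and
    attempts a strict decode with each candidate, in the order given.
    """
    with open(path, "rb") as f:
        raw = f.read()
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot decode the file; try the next one
        return name
    raise ValueError(f"none of {candidates} can decode {path}")
```

The result can then be passed straight to pandas, for example pd.read_csv(path, encoding=first_working_encoding(path)). This reads the file twice, which is fine for small files like the one in this post.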

 

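Taken together, the two tests show that renaming a .xlsx file to .csv changes only the file name, not its contents. The file is still an Excel workbook, so read_csv cannot parse it, while read_excel reads it without trouble.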
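Building on that conclusion, a loader can choose the pandas reader from the file's contents instead of trusting its extension. The sketch below is one possible approach under stated assumptions: pick_reader_name and load_table are hypothetical names, zipfile.is_zipfile only detects ZIP-based workbooks such as .xlsx (a legacy .xls file is not a ZIP and would be treated as CSV here), and whether read_excel accepts a file with a .csv extension may depend on the pandas version and the installed Excel engine.

```python
import zipfile


def pick_reader_name(path):
    """Return "read_excel" for ZIP-based workbooks, otherwise "read_csv".

    Hypothetical helper: the decision is based on file contents only.
    """
    return "read_excel" if zipfile.is_zipfile(path) else "read_csv"


def load_table(path, **kwargs):
    """Load `path` with the pandas reader that matches its contents.

    Extra keyword arguments are forwarded unchanged, so pass only options
    that the selected reader understands.
    """
    import pandas as pd  # imported here so pick_reader_name works without pandas

    reader = getattr(pd, pick_reader_name(path))
    return reader(path, **kwargs)
```

With the files from this post, load_table would be expected to route c_data.csv to read_excel and a_data.csv to read_csv; the latter would still need encoding='gbk' passed through.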

 

That's all for this post.

 


