Counting the Words in a Text with Python
title: Counting the Words in a Text with Python
date: 2018-6-30 15:12:43
categories: Python
tags:
- python
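My Python code for counting the words in a text has been upgraded step by step. This post is a review of that progression; perhaps later I will be able to write code that follows the conventions more closely.

1 Code written right after finishing *Python Crash Course* (《Python编程:从入门到实践》)

While learning Python, I worked through the text-analysis material in chapter 10 of the book, and at the time I wrote code that counts how often a single word appears:

```python
# 10-10 Common words
def row_count(filename):
    try:
        with open(filename) as f_obj:
            content = f_obj.read()
    except FileNotFoundError:
        msg = "The file " + filename + " does not exist."
        print(msg)
    else:
        # Replace a few punctuation marks with spaces before splitting
        content = content.replace(',', ' ')
        content = content.replace('.', ' ')
        content = content.replace('-', ' ')
        content = content.strip().lower()
        words = content.split()
        # Count how many times the word 'row' appears in the text
        number = words.count('row')
        print('row : %d' % number)


filename = 'Heart of Darkness.txt'
row_count(filename)
```

The output is:

    row : 9

This code only counts the occurrences of one word, and it has some problems. For example, punctuation other than `,`, `.` and `-` stays attached to words, so tokens such as `[a` and `(b` still appear.

2 After counting a single word, extending the code to count and sort all words

When I finished the exercise, I wondered whether I could count every word and sort the results. I looked up some material online and wrote the following code:

```python
from operator import itemgetter


def words_list(filename):
    try:
        with open(filename) as f_obj:
            content = f_obj.read()
    except FileNotFoundError:
        msg = "The file " + filename + " does not exist."
        print(msg)
        # Return an empty list so that the caller does not receive None
        return []
    else:
        # Replace common punctuation marks with spaces before splitting
        content = content.replace(',', ' ')
        content = content.replace('.', ' ')
        content = content.replace('!', ' ')
        content = content.replace('-', ' ')
        content = content.replace('_', ' ')
        content = content.replace('(', ' ')
        content = content.replace(')', ' ')
        content = content.strip()
        words = [word.lower() for word in content.split()]
        return words


def count_results(filename):
    words = words_list(filename)
    # Create one key per distinct word
    words_count = dict.fromkeys(words)
    for word in words_count.keys():
        number = words.count(word)
        words_count[word] = number
    # Sort the (word, count) pairs by count, highest first
    words_count = sorted(words_count.items(), key=itemgetter(1), reverse=True)
    return words_count


if __name__ == '__main__':
    filename = 'Heart of Darkness.txt'
    words_count = count_results(filename)
    for word, word_count in words_count[:10]:
        print('{0: 
```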
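The same count-and-sort idea can also be sketched with the standard library's `collections.Counter` and a regular expression, which addresses the leftover punctuation such as `[a` and `(b` mentioned earlier. This is only a minimal sketch and not the code from the original post: the function names `tokenize` and `top_words`, the sample sentence, and the print format are all assumptions made for illustration, and the pattern only recognizes ASCII letters with optional apostrophes.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the lowercase words in text.

    Assumption: a word is a run of ASCII letters, optionally joined by
    apostrophes (as in "don't"). Everything else acts as a separator.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def top_words(text, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


if __name__ == '__main__':
    # Hypothetical sample text; a real run would read the file contents.
    sample = "Row, row, row your boat; [a (b don't stop."
    for word, count in top_words(sample, 3):
        # Illustrative output format, not the one used in the post.
        print(f'{word}: {count}')
```

`Counter` counts every word in a single pass over the list, whereas calling `words.count(word)` once per distinct word rescans the whole list each time, so this version should scale better on a long text.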