Using tf with sklearn

2024-06-22 03:33 | Source: compiled from the web

Demo1

TfidfTransformer + CountVectorizer = TfidfVectorizer
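The equivalence above can be checked directly. The following is a minimal sketch, assuming default parameters for all three classes and reusing this demo's four-sentence corpus; the variable names `counts`, `two_step`, `one_step`, and `same` are illustrative only.

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

corpus = [
    'This This is the first document.',
    'This This is the second second document.',
    'And the third one.',
    'Is this the first document?',
]

# Two-step route: raw term counts first, then tf-idf weighting.
counts = CountVectorizer().fit_transform(corpus)
two_step = TfidfTransformer().fit_transform(counts)

# One-step route: TfidfVectorizer does both at once.
one_step = TfidfVectorizer().fit_transform(corpus)

# With default settings the two matrices should agree element-wise.
same = np.allclose(two_step.toarray(), one_step.toarray())
print(same)
```

The two-step route is mainly useful when the raw count matrix is needed for something else as well; otherwise TfidfVectorizer is the shorter path.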

from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    'This This is the first document.',
    'This This is the second second document.',
    'And the third one.',
    'Is this the first document?',
]

# Learn the vocabulary and idf weights, then build the tf-idf matrix.
tfidf_model = TfidfVectorizer()
tfidf_matrix = tfidf_model.fit_transform(corpus)

# get_feature_names() was removed in scikit-learn 1.2;
# get_feature_names_out() is its replacement and returns a NumPy array.
word_dict = tfidf_model.get_feature_names_out().tolist()
print(word_dict)
print(tfidf_matrix)

['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
  (0, 1)    0.3493402123185688
  (0, 2)    0.431504661587479
  (0, 6)    0.2856085141790751
  (0, 3)    0.3493402123185688
  (0, 8)    0.6986804246371376
  (1, 5)    0.7717016211057586
  (1, 1)    0.24628357422338598
  (1, 6)    0.20135295972313796
  (1, 3)    0.24628357422338598
  (1, 8)    0.49256714844677196
  (2, 4)    0.5528053199908667
  (2, 7)    0.5528053199908667
  (2, 0)    0.5528053199908667
  (2, 6)    0.2884767487500274
  (3, 1)    0.4387767428592343
  (3, 2)    0.5419765697264572
  (3, 6)    0.35872873824808993
  (3, 3)    0.4387767428592343
  (3, 8)    0.4387767428592343
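To see where these numbers come from, the first row can be recomputed by hand. This is a sketch under the assumption that the defaults are in effect (smoothed idf, ln((1 + n) / (1 + df)) + 1, followed by L2 normalization of each row); the dictionaries `counts` and `doc_freq` are typed in manually from the corpus and are not part of the sklearn API.

```python
import math

# Term counts for the first document, "This This is the first document."
counts = {"document": 1, "first": 1, "is": 1, "the": 1, "this": 2}

# Number of documents (out of 4) in which each term appears.
doc_freq = {"document": 3, "first": 2, "is": 3, "the": 4, "this": 3}
n_docs = 4

# Smoothed idf: ln((1 + n) / (1 + df)) + 1
idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

# Raw tf-idf is count * idf; each row is then scaled to unit L2 length.
raw = {t: counts[t] * idf[t] for t in counts}
norm = math.sqrt(sum(v * v for v in raw.values()))
weights = {t: v / norm for t, v in raw.items()}

for term, weight in sorted(weights.items()):
    print(term, weight)
```

The value for 'this' is the largest in the row because it occurs twice in the document, while 'the' is the smallest because it appears in every document and so gets the minimum idf of 1.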

Parameter settings

About the parameters:

input: string {'filename', 'file', 'content'}
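The three values of input control what fit_transform expects to receive: 'content' (the default) takes the strings themselves, 'file' takes objects with a read() method, and 'filename' takes paths that the vectorizer opens and reads. A minimal sketch follows, assuming a two-document sample and temporary files created only for illustration (the names `docs`, `doc0.txt`, and `doc1.txt` are hypothetical).

```python
import io
import os
import tempfile

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["This is the first document.", "And the third one."]

# input='content' (default): pass the text directly.
from_content = TfidfVectorizer(input="content").fit_transform(docs)

# input='file': pass file-like objects; read() is called on each.
file_objs = [io.StringIO(d) for d in docs]
from_file = TfidfVectorizer(input="file").fit_transform(file_objs)

# input='filename': pass paths; each file is opened and read.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, text in enumerate(docs):
        path = os.path.join(tmp, f"doc{i}.txt")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text)
        paths.append(path)
    from_filename = TfidfVectorizer(input="filename").fit_transform(paths)

# All three routes should yield the same tf-idf matrix.
print(np.allclose(from_content.toarray(), from_file.toarray()))
print(np.allclose(from_content.toarray(), from_filename.toarray()))
```

The 'filename' mode is convenient for corpora stored as one document per file, since the texts do not have to be loaded into a list first.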
