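Install the library:
conda install nltk
Then import it in Python: import nltk
On first use, run nltk.download() in Python to fetch the data packages.
You can choose where the downloaded data is stored.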
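A minimal sketch of choosing the download location, assuming you want the data under a directory of your own. The helper names use_data_dir and fetch and the example path are hypothetical; nltk.download(..., download_dir=...) and the nltk.data.path search list are part of NLTK.

```python
import os

import nltk


def use_data_dir(target_dir):
    """Create target_dir if needed and add it to NLTK's resource search path."""
    os.makedirs(target_dir, exist_ok=True)
    if target_dir not in nltk.data.path:
        nltk.data.path.append(target_dir)
    return target_dir


def fetch(resource, target_dir):
    """Download one resource (for example "punkt") into target_dir."""
    use_data_dir(target_dir)
    return nltk.download(resource, download_dir=target_dir)


# Example call (needs network access; the path is only an illustration):
# fetch("punkt", os.path.expanduser("~/nltk_data_custom"))
```

Registering the directory in nltk.data.path matters because NLTK only looks for resources in the locations on that list.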
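NLTK does not support the vector conversion here. For Chinese text, jieba can do the word segmentation and vector conversion, and NLTK can then handle the final processing.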
# -*- coding: utf-8 -*-
import nltk

text = 'Join thousands of learners from around the world who are improving their English listening skills with our online courses. Join thousands of learners from around the world who are improving their English listening skills with our online courses.'

# Sentence splitting: a period must be followed by a space to be treated as a boundary
sens = nltk.sent_tokenize(text, language='english')
print(sens)

# Word tokenization, one list of tokens per sentence
words = []
for sent in sens:
    words.append(nltk.word_tokenize(sent))
print(words)

# Part-of-speech tagging
tags = []
for token in words:
    tags.append(nltk.pos_tag(token))
print(tags)

textzh = '本人喜欢折腾,倒腾大数据和AI人工智能滴一些技术。'
# From a quick test, this does not seem to handle Chinese; a space after the period is also required
sens_zh = nltk.sent_tokenize(textzh)
print(sens_zh)
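Since sent_tokenize does not appear to split Chinese text, a small regular-expression splitter can serve as a workaround. This is only a sketch using the standard library, not an NLTK feature. The function name split_zh_sentences is hypothetical, and the set of sentence-ending marks (。!?) is an assumed, non-exhaustive choice.

```python
import re

# A run of non-terminator characters, optionally followed by one
# full-width sentence-ending mark (。 ! ?).
_ZH_SENTENCE = re.compile(r'[^。!?]+[。!?]?')


def split_zh_sentences(text):
    """Split Chinese text into sentences, keeping the ending punctuation."""
    return [s.strip() for s in _ZH_SENTENCE.findall(text) if s.strip()]


print(split_zh_sentences('今天天气很好。我们去公园吧!你觉得呢?'))
```

Each resulting sentence can then be segmented into words with a Chinese tokenizer before any further processing.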