question about the tokenizer and keywords extractor tool #13

jiangliqin · 2021-12-27T06:43:08Z

Hi, I use the default jieba tokenizer and the gensim/jieba keyword extractors to preprocess the corpus, but my results are not as good as yours. For example:
Mine: ['杨清', '孩子', '网友', '母亲', '小孩', '失望透顶', '父母', '发消息']
Yours: ["王乐乐", "杨清柠", "奶粉", "外孙", "分手", "孩子"]

Could you explain in more detail which tokenizer and keyword extractor you use?

yahiko-l · 2022-03-14T07:04:23Z

stop words??
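
If the difference does come from stop-word filtering, the idea can be sketched roughly as below. This is only an illustration in plain Python, not the repository's actual preprocessing: the stop-word list, the `top_keywords` helper, and the sample tokens are all hypothetical, and the tokens stand in for whatever a tokenizer such as jieba would return.

```python
from collections import Counter

# Hypothetical stop-word list; the list actually used by the authors
# is not given in this thread.
STOP_WORDS = {"的", "了", "是", "网友", "发消息"}


def top_keywords(tokens, stop_words=STOP_WORDS, k=5):
    """Drop stop words and single-character tokens, then rank by frequency."""
    kept = [t for t in tokens if t not in stop_words and len(t) > 1]
    return [word for word, _ in Counter(kept).most_common(k)]


# Made-up token list standing in for tokenizer output.
tokens = ["孩子", "的", "母亲", "发消息", "孩子", "网友", "奶粉", "了", "孩子", "奶粉"]
print(top_keywords(tokens))
```

With a filter like this, generic words such as '网友' or '发消息' would no longer appear among the keywords, which is one possible reason the two lists above differ.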
