Performance: the first load is slow and later loads get faster. Why? #1

Open
Valuebai opened this issue Nov 16, 2019 · 2 comments

Comments

@Valuebai
Owner

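Performance issue: loading the jieba segmentation model takes about 1 s

Performance: the first load takes a long time and later loads get progressively faster. Why is that?
Cause identified: the first time jieba is used for segmentation, loading its model takes 1.309 s.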

Building prefix dict from the default dictionary ...
Dumping model to file cache C:\Users\AppData\Local\Temp\jieba.cache
Loading model cost 1.309 seconds.
Prefix dict has been built succesfully.
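
One way to confirm this diagnosis is to time several successive tokenizer calls: only the first should carry the loading cost. The sketch below is a minimal illustration, not the project's code. `time_calls` is a hypothetical helper, and `LazyStandIn` is a hypothetical stand-in that only simulates a one-time load so the example runs without jieba installed.

```python
import time


def time_calls(func, *args, repeats=3):
    """Call func(*args) `repeats` times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        durations.append(time.perf_counter() - start)
    return durations


class LazyStandIn:
    """Hypothetical stand-in that mimics a lazily loaded tokenizer (not jieba itself)."""

    def __init__(self, load_seconds=0.05):
        self.load_seconds = load_seconds
        self.loaded = False

    def lcut(self, text):
        if not self.loaded:
            time.sleep(self.load_seconds)  # simulated one-time dictionary load
            self.loaded = True
        return text.split()


tokenizer = LazyStandIn()
for i, seconds in enumerate(time_calls(tokenizer.lcut, "a b c"), start=1):
    print(f"call {i}: {seconds:.4f} s")
```

With jieba installed, passing jieba.lcut in place of the stand-in should show the same pattern. Note that jieba.cut is a generator function, so timing the bare call would not trigger the load; jieba.lcut returns a list and does.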

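Solution:

  • If you don't want the dictionary to be loaded on every run, initialize jieba once and keep the process running in the background.
  • For example, when using Flask, initialize jieba in the file that initializes the app, and have the rest of the program use that already-initialized instance. This will be covered later, in the Flask discussion.
jieba uses lazy loading: import jieba and jieba.Tokenizer() do not immediately trigger loading of the dictionary.
The dictionary is loaded and the prefix dict is built only when first needed. If you prefer, you can also initialize jieba manually: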

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import jieba

# Load the dictionary now instead of on the first segmentation call
jieba.initialize()

【Me】https://github.com/Valuebai/

【References】
1. Disabling jieba's lazy loading in a distributed Spark environment, and other optimizations (3): https://blog.csdn.net/macanv/article/details/87860691
2. Solving the jieba lazy-loading problem: https://blog.csdn.net/yjs17125/article/details/81739382

@Valuebai
Owner Author

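P.S. Another cause

This server is hosted overseas, which also slows down how quickly data is returned:
http://139.180.217.25:8188/TextSummarization/

On the Alibaba Cloud server, response times are normal:
http://39.100.3.165:8188/TextSummarization/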

@mango99

mango99 commented Oct 18, 2020

Hello, why does the interface show only a template after I open this project?
