Commit
* Replace PaiEas LLM with LLI-integration and upgrade python to 3.11 (#148) * Replace PaiEas LLM with LLI-integration and upgrade python version to 3.11 * Replace MyFCDashScope with OpenAILike class * Fix pyproject dependency * bug fix (#149) * Support postgresql load user dict (#150) * make format * Allow not install extension pg_jieba * table name data_default * Convert raptor processor to TransformComponent (#151) * udpate raptor using transform * modify raptor with transform * modify raptor and dataloader --------- Co-authored-by: Yue Fei <[email protected]> * Add clip model (#130) * Update * Add clip model * Fix oss cache * Fix cache * Pdf reader upload image * Add multimodal * Update config * Use two embedding * Add text_image node * Add tests * Fix tests * fix multi_modal_vector --------- Co-authored-by: 燃夏 <[email protected]> * Fix docker base image (#152) * change insert to be sync (#153) * Personal/ranxia/fix image readme (#155) * fix multi_modal and readme * fix multi_modal and readme * fix multi_modal and readme * fix multi_modal image (#156) * Support Agentic RAG with intent and functioncalling (#154) * Add intent detection module * Remove LlmQuery class * Support API * Refactor agent module and format toml * Refactor module tool * Refactor query api * Add demo and UI * remove * Fix reviews * Add test for intent and api * Add web search (#161) * Add web search * Fix lint * Fix bug * Update timeout * Fix bug * Fix jieba bug (#163) * Support PAI-EAS MultiModal LLM (#168) * Support minicpm * Fix issue * Bugfix: PaiEas LLM endpoint & max_tokens (#171) * Fix dashscope interface (#172) * Fix dashscope llm * Fix bug * Fix test bug (#174) * add minerU (#160) * add minerU * add minerU * add minerU * Fix nodes id and simi_topK * remove image url from text * remove image url from text * remove image url from text * Support FAQ query w/o image (#162) * Support FAQ query w/o image * Using LLM when query w/o images * Personal/ranxia/mineru enhancement (#164) * 
remove repeat nodes * show multiple pictures in media * show multiple pictures in media * Install miner with poetry (#165) * fix retriever * Support OSS Data Loader (#166) * Support oss data loader * Skip file which has been uploaded * Support oss prefix via api * 1. change image size (#167) 2. limit image number 3. fix retriever answer ui format * adjust image score (#169) * merge feature * merge feature * merge feature * merge feature * Fix bug (#173) * Support chunk text-overflow display (#170) * Fix bugs * Support text-overflow * Support text-overflow * Support load MinerU config file automatically (#175) * Support load MinerU config file automatically * Modify * Direct writing the config rather than copying * Fix multi_modal build docker (#176) * fix load_config (#177) * change multimodal prompt (#178) * Test Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix test bug (#174) (#179) Co-authored-by: Yue Fei <[email protected]> * Fix Dockerfile (#180) * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix Dockerfile * Fix docker env (#181) * Fix Dockerfile * Fix bugs * Fix docker env * Fix docker env * Fix docker env (#183) * Fix Dockerfile * Fix bugs * Fix docker env * Fix docker env * Fix docker env * Fix docker env * Fix docker env * Bugfix * Bugfix for EAS (#184) * Fix Dockerfile * Fix bugs * Fix docker env * Fix docker env * Fix docker env * Fix docker env * Fix docker env * Bugfix * Bugfix * Fix detectron link (#182) * Update detectron dependency (#185) * Update dependency * udpate poetry lock * fix multimodal_config and prompt (#186) * fix MinerU readme (#189) * Add timeout and more logs (#188) * Personal/ranxia/fix miner u readme (#190) * fix MinerU readme * fix MinerU readme * Personal/ranxia/fix miner u readme (#191) * fix MinerU readme * fix MinerU 
readme * fix MinerU config * fix MinerU bug (#192) * Personal/ranxia/fix test and review bug (#193) * fix MinerU bug * fix MinerU bug * fix MinerU bug * fix MinerU bug * fix MinerU bug * fix MinerU bug * fix MinerU bug --------- Co-authored-by: 筱文 <[email protected]> Co-authored-by: Yue Fei <[email protected]> * fix multimodal readme and config (#195) * nl2sql refactoring (#194) * change insert to be sync * add nl2sql * nl2sql setting * nl2sql setting * fix test bug * fix bugs * data analysis retriever and synthesizer * fix tests bugs * add data_analysis ui * update poetry.lock * remove unnecessary comment * add fault tolerance if no file provided * add minor fault tolerance * add upload_datasheet * nl2sql refactor and add db ui * restore retriever & synthesizer * update poetry.lock * Fix list merge * bug fix * add default display --------- Co-authored-by: 陆逊 <[email protected]> * Personal/xi/nl2sql UI (#196) * change insert to be sync * add nl2sql * nl2sql setting * nl2sql setting * fix test bug * fix bugs * data analysis retriever and synthesizer * fix tests bugs * add data_analysis ui * update poetry.lock * remove unnecessary comment * add fault tolerance if no file provided * add minor fault tolerance * add upload_datasheet * nl2sql refactor and add db ui * restore retriever & synthesizer * update poetry.lock * Fix list merge * bug fix * add default display * data_analysis ui update --------- Co-authored-by: 陆逊 <[email protected]> * Personal/ranxia/change max new tokens (#199) * set multimodal llm max_new_tokens * set multimodal llm max_new_tokens * Add trace (#197) * Add trace * Fix bug * Push to hangzhou region by default * Fix default-configuration bug for tables and descriptions (#198) * change insert to be sync * add nl2sql * nl2sql setting * nl2sql setting * fix test bug * fix bugs * data analysis retriever and synthesizer * fix tests bugs * add data_analysis ui * update poetry.lock * remove unnecessary comment * add fault tolerance if no file provided * add minor fault tolerance * 
add upload_datasheet * nl2sql refactor and add db ui * restore retriever & synthesizer * update poetry.lock * Fix list merge * bug fix * add default display * data_analysis ui update * fix table & description & query_output bugs * fix inconsistency between frontend and backend data structures --------- Co-authored-by: 陆逊 <[email protected]> * Fix nginx routing (#200) * Fix nginx routing (#202) * Fix nginx routing * Fix nginx config * add data_analysis doc (#201) Co-authored-by: Yue Fei <[email protected]> * Resolve conflict --------- Co-authored-by: wwxxzz <[email protected]> Co-authored-by: aero-xi <[email protected]> Co-authored-by: zt2645802240 <[email protected]> Co-authored-by: 燃夏 <[email protected]>
1 parent 1af37d2 commit dac813e
Showing 127 changed files with 13,901 additions and 5,032 deletions.
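@@ -0,0 +1,64 @@
# Model Configuration

On the Settings page of the web UI, select the LLM you need in the lower-left area. If you choose DashScope (the Tongyi API), the qwen-max model is recommended; if you choose PaiEas (a self-deployed open-source model), the qwen2-72b-instruct model is recommended.

Click the button on the right to update the model in use.

![llm_selection](/docs/figures/data_analysis/llm_selection.png)

Click Data Analysis at the top of the web UI to open the data analysis page. Two types of data analysis are supported: analysis over a connected database (MySQL) and analysis of an uploaded spreadsheet file (Excel/CSV).

![data_analysis_overview](/docs/figures/data_analysis/data_analysis_overview.png)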
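
# Database Analysis Configuration

## Database Connection

To connect to a database, set the data analysis type at the upper left to database. The database connection form appears, as shown below:

![db_config](/docs/figures/data_analysis/db_config.png)

Where:

- Dialect is the database type. Only mysql is currently supported, and it is the default.
- Username and Password are the database user name and password.
- Host is the address of the local or remote database, and Port is the port number (default 3306).
- DBname is the name of the target database to analyze.
- Tables lists the tables to analyze, in the format: table_A, table_B,... . It is empty by default, in which case all tables in the target database are used.
- Descriptions holds a supplementary description for each table in the target database, for example a further explanation of its fields, which can improve analysis quality. The format is: {"table_A":"field a means xxx, the data in field b is formatted as yyy","table_B":"this table is mainly used for zzz"}. Note: the value must be written as a dictionary using ASCII (half-width) punctuation for the double quotes, colons, and commas. It is empty by default.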
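
The field formats above can be checked before they are pasted into the form. The following is a minimal sketch, not the project's own code: the helper names, the sample credentials, and the SQLAlchemy-style URL layout are all assumptions made for illustration. It shows how the Tables string splits into a list and how a Descriptions value typed with full-width punctuation fails JSON parsing.

```python
import json
from urllib.parse import quote_plus


def build_connection_url(dialect, username, password, host, port, dbname):
    """Assemble a SQLAlchemy-style URL (assumed layout, not confirmed by the source)."""
    return (
        f"{dialect}://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}:{port}/{dbname}"
    )


def parse_tables(tables_field):
    """Split 'table_A, table_B' into names; an empty field yields an empty list."""
    return [name.strip() for name in tables_field.split(",") if name.strip()]


def parse_descriptions(descriptions_field):
    """Parse the Descriptions field as a JSON object mapping table name to text."""
    if not descriptions_field.strip():
        return {}
    # json.loads raises a ValueError subclass on full-width quotes, colons, or commas.
    parsed = json.loads(descriptions_field)
    if not isinstance(parsed, dict):
        raise ValueError("Descriptions must map table names to description text")
    return parsed


# Hypothetical sample values, used only to exercise the helpers.
url = build_connection_url("mysql", "analyst", "p@ss word", "127.0.0.1", 3306, "my_pets")
tables = parse_tables("pets, student")
descriptions = parse_descriptions('{"pets": "PetType stores English names such as Dog"}')
```

Special characters in the user name and password are percent-encoded so that they cannot be mistaken for URL delimiters.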
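
After filling in the fields above, click the Connect Database button at the lower left. When the Connection info appears as shown below, the connection has succeeded and you can ask questions in the chatbot on the right.

![db_connect](/docs/figures/data_analysis/db_connect.png)

To update the database connection, fill in the fields above again and click Connect Database.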
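
## Improving Query Results

When the meaning of a field in a table is unclear, or the format of the values it stores is unclear, you can add a matching explanation to Descriptions to help the LLM extract the table contents more accurately. The example here uses the my_pets database from the public Spider dataset, whose pets table contains the following data:

![table_example](/docs/figures/data_analysis/table_example.png)

Comparison of question-answering results:

When the description is empty, the SQL generated for the question "有几只狗" ("How many dogs are there?") is: SELECT COUNT(\*) FROM pets WHERE PetType = '狗', which matches no rows.

![db_query_no_desc](/docs/figures/data_analysis/db_query_no_desc.png)

After a simple description is added, the generated SQL is: SELECT COUNT(\*) FROM pets WHERE PetType = 'Dog', and the question is answered correctly.

![db_query_desc](/docs/figures/data_analysis/db_query_desc.png)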
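
The difference between the two generated queries can be reproduced locally. This is a rough sketch that uses an in-memory SQLite database as a stand-in for MySQL; the table layout and the three rows are invented for illustration and are not the actual Spider data.

```python
import sqlite3

# In-memory stand-in for the pets table; the rows below are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (PetID INTEGER, PetType TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?)",
    [(1, "Cat"), (2, "Dog"), (3, "Dog")],
)


def count_pets(pet_type):
    """Run the same COUNT query with a given PetType literal."""
    row = conn.execute(
        "SELECT COUNT(*) FROM pets WHERE PetType = ?", (pet_type,)
    ).fetchone()
    return row[0]


without_description = count_pets("狗")   # literal taken from the Chinese question
with_description = count_pets("Dog")     # literal matching the stored values
print(without_description, with_description)  # prints: 0 2
```

The query itself is well-formed in both cases; only the literal differs, which is why a short note about how PetType values are stored is enough to fix the answer.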
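
If the query results improve noticeably, you can persist the supplementary descriptions by adding them in the database as comments on the corresponding tables or columns.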
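
One way to persist table-level descriptions is to turn the Descriptions dictionary into MySQL `ALTER TABLE ... COMMENT` statements. The sketch below only builds the SQL text for review. The function name is hypothetical, and column-level comments are left out because MySQL requires restating the full column definition to change them.

```python
def table_comment_statements(descriptions):
    """Build one ALTER TABLE ... COMMENT statement per described table."""
    statements = []
    for table, text in descriptions.items():
        # Escape backticks in the identifier, and backslashes and single
        # quotes in the comment string literal.
        safe_table = table.replace("`", "``")
        safe_text = text.replace("\\", "\\\\").replace("'", "''")
        statements.append(f"ALTER TABLE `{safe_table}` COMMENT = '{safe_text}';")
    return statements


# Hypothetical description, mirroring the Descriptions format used in the form.
statements = table_comment_statements({"pets": "PetType is 'Dog' or 'Cat'"})
for statement in statements:
    print(statement)
```

Review the generated statements before running them against a real database, since they modify table metadata.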
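
# Spreadsheet File Analysis Configuration

Configuring spreadsheet file analysis is simpler. Set the analysis type at the upper left to datafile, and the following screen appears:

![sheet_upload](/docs/figures/data_analysis/sheet_upload.png)

Click the upload area in the middle of the left panel and upload one spreadsheet file at a time (Excel or CSV format). After the upload succeeds, a preview of the first few rows of the file appears at the lower left, as shown below:

![sheet_data_preview](/docs/figures/data_analysis/sheet_data_preview.png)

After uploading a spreadsheet file, you can ask questions directly in the chatbot on the right. To switch to a different spreadsheet, upload the new one.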
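
To check a CSV file before uploading it, a preview like the one the UI displays can be approximated with the standard library. This is a sketch under assumptions: the function name, the five-row default, and the sample data are illustrative, and Excel files would need a third-party reader that is not shown here.

```python
import csv
import io


def preview_rows(csv_text, limit=5):
    """Return the header row and up to `limit` data rows from CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # zip stops as soon as range(limit) is exhausted, so no extra row is read.
    rows = [row for _, row in zip(range(limit), reader)]
    return header, rows


# Made-up sample data for illustration.
sample = "PetID,PetType\n1,Cat\n2,Dog\n3,Dog\n"
header, rows = preview_rows(sample, limit=2)
print(header)
print(rows)
```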