This project was a homework assignment from my studies at the Microsoft AI School 2020. You can find the paper in "Method to Compare Two Funds Issued Before and After Stock Crash". The idea is to eliminate the influence of fluctuations in the overall stock market. For example, funds issued in May 2020 gained a lot simply because of the bull market, which makes it hard to compare them fairly with older issues.
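To make the problem concrete, here is a minimal sketch of why issue date alone distorts a comparison. It is not the method from the paper: the index levels are invented for illustration, and `return_since` is a hypothetical helper that assumes both funds track the market exactly.

```python
# Hypothetical market index levels by month (illustrative numbers only,
# not real data and not taken from this project).
market_index = {
    "2019-12": 100.0,
    "2020-03": 80.0,   # after the crash
    "2020-05": 90.0,
    "2020-08": 110.0,  # latest observation
}


def return_since(issue_month: str, latest_month: str = "2020-08") -> float:
    """Return of a fund that tracks the index exactly, measured from its issue month."""
    return market_index[latest_month] / market_index[issue_month] - 1.0


old_fund_return = return_since("2019-12")  # issued before the crash
new_fund_return = return_since("2020-05")  # issued during the recovery

print(f"old issue: {old_fund_return:.2%}, new issue: {new_fund_return:.2%}")
```

Both hypothetical funds hold the same market, yet the newer one shows a higher return since inception purely because of when it was issued. This is the effect the adjusted factor is meant to remove.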
git clone https://github.com/wangershi/analyzeChineseFund.git
Run the scripts with Python 3 and install the packages below.
pip install -r requirements.txt
Use the command below to crawl the data.
python src/crawlFundData.py crawlAllFundData --ifCrawlBasicInformation=True --ifCrawlPortfolio=True --ifCrawlHistoricalValue=True
Dump the historical data into binary format using Qlib.
python src/dump_bin.py dump_all --csv_path data/historicalValue --qlib_dir data/bin --freq day --date_field_name Date --exclude_fields Dividends --fund_to_specify_date 000934
Prepare the training dataset.
python src/trainGBDT.py prepareTrainDataset --ifSavePortfolioIndex=False
For more details, please refer to dataPrepare.
You can train and evaluate the model like this.
python src/trainGBDT.py trainModel
Get the adjusted factor up to the latest day.
python src/trainGBDT.py testModel
This gives adjustFactorToLatestDay as a function of dayInStandard.
I tried to use Optuna to tune the hyperparameters automatically, but the results were not good, so I abandoned it.
python src/trainGBDT.py autoFineTune
After getting the adjusted factor, we can evaluate the data again.
python src/analyzeData.py getAverageSlopeForFundsInSameRange --ifUseAdjustFactorToLatestDay=True
The model flattens the distribution of average return.
The standard deviation of the average return drops from 0.0520 to 0.0175.
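As a quick check of how large that change is, the relative reduction can be computed directly from the two reported values. The variable names below are only for illustration.

```python
# Standard deviations of the average return, as reported above.
std_before = 0.0520  # without the adjusted factor
std_after = 0.0175   # with the adjusted factor

# Fraction of the original spread that was removed.
relative_reduction = 1.0 - std_after / std_before

print(f"relative reduction: {relative_reduction:.1%}")
```

In other words, the spread of average returns shrinks by roughly two thirds after adjustment.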
Besides, we can get the return and risk after adjustment.
python src/analyzeData.py analyzeHistoricalValue --ifUseNewIssues=True --ifUseOldIssues=True --ifUseWatchList=False --ifUseAdjustFactorToLatestDay=True --ifPrintFundCode=False
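For readers unfamiliar with the terms, the sketch below shows one common way to compute return and risk from a series of net values. It is an assumption for illustration, not the implementation in `src/analyzeData.py`: the function name `return_and_risk`, the sample values, and the choice of annualized volatility with 252 trading days as the risk measure are all hypothetical.

```python
import math
import statistics


def return_and_risk(net_values, periods_per_year=252):
    """Return (total return, annualized volatility) for a series of net values.

    Assumption: risk is the sample standard deviation of period-over-period
    returns, scaled by sqrt(periods_per_year). The project may define it differently.
    """
    if len(net_values) < 3:
        raise ValueError("need at least three net values")
    # Simple return between each pair of consecutive observations.
    period_returns = [b / a - 1.0 for a, b in zip(net_values, net_values[1:])]
    total_return = net_values[-1] / net_values[0] - 1.0
    risk = statistics.stdev(period_returns) * math.sqrt(periods_per_year)
    return total_return, risk


# Invented net values, for illustration only.
sample_values = [1.00, 1.02, 0.99, 1.05]
total_return, risk = return_and_risk(sample_values)
print(f"return: {total_return:.2%}, risk: {risk:.2%}")
```

The same function can be applied to raw or adjusted net values, which makes it easy to compare the two side by side.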