
Changes between December 12th and December 18th

@KarelZe KarelZe released this 18 Dec 19:58
· 293 commits to main since this release
20791d1

What's Changed

Empirical Study ⚗️

  • Add CLNV results 🎯 by @KarelZe in #82. Adds results for the CLNV method, as discussed in the meeting with @CaroGrau.
  • Add learning curves for CatBoost 🐈 by @KarelZe in #83. Helps detect over- and underfitting. Learning curves are now also logged/tracked.
  • Improve accuracy [~1.2 %] by @KarelZe in #79. Most of the time went into improving the accuracy of the first model (GBM). Planned an improvement of 4 %; achieved 1.2 % compared to the previous week. Obtaining this improvement required a deep dive into gradient boosting, the CatBoost library, and a bit of feature engineering. Roughly one third of the gain comes from improved feature engineering, one third from early stopping, and one third from larger ensembles, finer-grained quantization, and sample weighting. Tried to link the quantization used in gradient boosting with the quantile transformation from feature engineering, but it didn't work out. Ran sanity checks such as comparing the implementation with LightGBM, a time-consistency analysis, and an updated adversarial validation.
  • Also spent quite a bit of time researching feature-engineering techniques, focusing on features that cannot be synthesized by neural nets or tree-based approaches.
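Since early stopping accounts for roughly a third of the accuracy gain above, here is a minimal, library-free sketch of the patience-based rule behind it (CatBoost exposes the same idea through its `early_stopping_rounds` fit parameter; the function name and the loss values below are illustrative, not from the experiments):

```python
def early_stopping_round(val_losses, patience=3):
    """Return the round at which training should stop: `patience`
    rounds after the best validation loss so far, or the final
    round if the loss never stopped improving."""
    best_round = 0
    for i, loss in enumerate(val_losses):
        if loss < val_losses[best_round]:
            best_round = i  # new best; reset the patience window
        elif i - best_round >= patience:
            return i  # no improvement for `patience` rounds
    return len(val_losses) - 1

# Validation loss improves, plateaus, then rises: training stops early
# instead of overfitting to the training set.
losses = [0.70, 0.55, 0.48, 0.47, 0.49, 0.50, 0.52, 0.51]
print(early_stopping_round(losses))  # best at round 3, stop at round 6
```

Tracking the per-round validation losses that drive this rule is exactly what the logged learning curves from #83 make visible.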
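The adversarial-validation sanity check mentioned above can be sketched in a simplified, per-feature form: score how well each feature alone separates train rows from test rows. An AUC near 0.5 means the feature's distribution is stable across the split; values near 0 or 1 flag drift. This rank-based (Mann-Whitney) variant is an assumption-laden simplification of fitting a full train-vs-test classifier, not the actual implementation from #79:

```python
def rank_auc(train_vals, test_vals):
    """AUC of a single feature as a score for telling test rows
    (positives) apart from train rows (negatives), computed from
    pairwise rank comparisons (Mann-Whitney statistic)."""
    wins = ties = 0
    for t in test_vals:
        for s in train_vals:
            if t > s:
                wins += 1
            elif t == s:
                ties += 1
    return (wins + 0.5 * ties) / (len(train_vals) * len(test_vals))

# Identical distributions: AUC = 0.5, no drift detected.
print(rank_auc([1, 2, 3, 4], [1, 2, 3, 4]))  # 0.5
# Shifted test distribution: AUC = 1.0, strong drift.
print(rank_auc([1, 2, 3, 4], [5, 6, 7, 8]))  # 1.0
```

In practice one would run this over all engineered features and inspect the ones with extreme AUCs before trusting the validation split.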

Writing 📖

  • Add reworked TOC and drafts 🎆 by @KarelZe in #80, as requested by @CaroGrau.
  • Draft for chapters on trees, ordered boosting, and imputation 🌮 by @KarelZe in #81. Continued researching and drafting the chapters on decision trees, gradient boosting, and feature scaling and imputation. Requires more work, e.g., the derivation of the loss function in gradient boosting for classification was more involved than expected. The draft is not as streamlined as it could be.

Outlook 🎆

  • Focus only on drafting the chapters on gradient boosting, basic transformer architectures, and specialized architectures.
  • Train transformers until the meeting with @CaroGrau, but spend no time optimizing/improving them.

Full Changelog: v0.2.5...v0.2.6