Changes between November 21st and November 27th
What's Changed
Empirical Study ⚗️
- Add TabTransformer baseline 🤖 by @KarelZe in #34. Involved the implementation and documentation of the model, early stopping, the data set, and the data loader. Most notably, I was able to speed up the implementation of https://github.com/kathrinse/TabSurvey/ by a factor of 9.8 (see notebook) through an improved data loader, decoupling of training and data loading, and mixed-precision support. Fused operations, pre-loading, and the use of pinned memory were also tested. An analysis with the PyTorch profiler reveals that the GPU now idles less. Training on the entire data set is theoretically possible.
- Fix classical rules 🐞 by @KarelZe in #41. The issue came up during last week's discussion with @CaroGrau. The differences in accuracy are tiny, usually < 1 %.
- Add test cases for the classical classifier ⛑️ by @KarelZe in #42. The tests are formal, e.g., checking correct shapes of predictions or fitting behaviour.
- Add implementation of the CLNV method 🏖️ by @KarelZe in #43
- Add tests for TabTransformer ⛑️ by @KarelZe in #44. Tests cover the shapes of predictions, parameter updates, and convergence.
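The speed-up in #34 came from combining a pinned-memory data loader with mixed-precision training. A minimal sketch of that combination is shown below; the dataset, model, and sizes are placeholders, not the actual TabTransformer pipeline.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Illustrative stand-ins for the real tabular data and model.
X = torch.randn(256, 10)
y = torch.randint(0, 2, (256,)).float()

loader = DataLoader(
    TensorDataset(X, y),
    batch_size=64,
    shuffle=True,
    pin_memory=torch.cuda.is_available(),  # page-locked host memory speeds up host-to-GPU copies
    num_workers=0,  # increase to decouple data loading from the training loop
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1)).to(device)
opt = torch.optim.Adam(model.parameters())
# GradScaler guards against underflow of fp16 gradients; it is a no-op on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=device.type == "cuda")

for xb, yb in loader:
    # non_blocking overlaps the copy with compute when the batch is pinned
    xb = xb.to(device, non_blocking=True)
    yb = yb.to(device, non_blocking=True)
    opt.zero_grad()
    with torch.autocast(device_type=device.type, enabled=device.type == "cuda"):
        loss = nn.functional.binary_cross_entropy_with_logits(
            model(xb).squeeze(1), yb
        )
    scaler.scale(loss).backward()
    scaler.step(opt)
    scaler.update()
```

On CPU the autocast and scaler branches disable themselves, so the same loop runs unchanged on both devices.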
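For context on #43: CLNV (Chakrabarty, Li, Nguyen, Van Ness) is a hybrid of the quote and tick rules. The sketch below uses the 30 %/70 % spread cut-offs as I recall them from the paper; the thresholds and the tie handling are assumptions, and the implementation merged in #43 is authoritative.

```python
def clnv_classify(price: float, bid: float, ask: float, prev_price: float) -> int:
    """Toy CLNV-style classification: quote rule near the quotes, tick rule between.

    Returns +1 for a buy, -1 for a sell. The 30 %/70 % cut-offs are my reading
    of Chakrabarty et al. and may differ from the actual implementation.
    """
    spread = ask - bid
    if spread > 0 and price >= bid + 0.7 * spread:
        return 1   # close to the ask -> quote rule says buy
    if spread > 0 and price <= bid + 0.3 * spread:
        return -1  # close to the bid -> quote rule says sell
    # middle of the spread (or crossed quotes): fall back to the tick rule;
    # ties (price == prev_price) are simplified to a sell here
    return 1 if price > prev_price else -1
```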
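The formal tests in #42 and #44 (prediction shapes, parameter updates) follow a pattern like the one below, illustrated on a stand-in linear model rather than the actual classifiers; all names are placeholders.

```python
import torch
from torch import nn


def test_predictions_and_updates():
    """Check the output shape and that one training step updates every parameter."""
    torch.manual_seed(0)
    model = nn.Linear(10, 1)  # stand-in for the real model under test
    x = torch.randn(8, 10)
    y = torch.randn(8, 1)

    out = model(x)
    assert out.shape == (8, 1)  # one prediction per sample

    before = [p.clone() for p in model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = nn.functional.mse_loss(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # every parameter should have moved after one optimiser step
    for p0, p1 in zip(before, model.parameters()):
        assert not torch.equal(p0, p1)
```

A convergence test extends this by looping the step until the loss falls below a tolerance.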
Writing 📖
- Add questions for this week's meeting ❓ by @KarelZe in #39
- Researched techniques and new papers on speeding up transformers
Outlook 🔭
- Read more again and minimize the stack of open papers (40+)
- Better connect existing ideas in zettelkasten
- Finish exploratory data analysis, i.e., include new features, refactor to use training data only, and do CV to better understand features
- Improve test coverage, i.e., for the data loader and classical rules
Full Changelog: v0.2.2...v0.2.3