A PyTorch implementation of MCA, based on the PRICAI 2022 paper Weakly-supervised Temporal Action Localization with Multi-head Cross-modal Attention.
Clone the corresponding repos, replace their files with the ones provided here, then run the code according to the README of each repo.
For example, to train HAM-Net on the THUMOS14 dataset:
git clone https://github.com/asrafulashiq/hamnet.git
mv AGCT/hamnet/* hamnet/
python main.py
To evaluate HAM-Net on the THUMOS14 dataset:
python main.py --test --ckpt [checkpoint_path]
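The file-replacement step above can also be done from Python. The following is a minimal sketch, not part of this repo: `overlay_files` is a hypothetical helper name, and it copies rather than moves, so the provided files stay in place. Unlike a plain `mv`, it merges into sub-directories that already exist in the cloned repo. It assumes Python 3.8 or newer for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def overlay_files(patch_dir, repo_dir):
    """Copy every file under patch_dir into repo_dir, overwriting same-named files.

    Hypothetical helper for illustration only. Files that exist only in
    repo_dir are left untouched.
    """
    patch_dir = Path(patch_dir)
    repo_dir = Path(repo_dir)
    if not patch_dir.is_dir():
        raise FileNotFoundError(f"patch directory not found: {patch_dir}")
    if not repo_dir.is_dir():
        raise FileNotFoundError(f"cloned repo not found: {repo_dir}")

    # dirs_exist_ok=True merges into the existing tree instead of failing.
    shutil.copytree(patch_dir, repo_dir, dirs_exist_ok=True)

    # Return the relative paths that were written, for a quick sanity check.
    return sorted(
        p.relative_to(patch_dir).as_posix()
        for p in patch_dir.rglob("*")
        if p.is_file()
    )


# Example call mirroring the shell commands above (paths taken from them):
# overlay_files("AGCT/hamnet", "hamnet")
```

The returned list makes it easy to confirm which files were replaced before starting training.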
The models are trained on one NVIDIA GeForce TITAN X GPU (12 GB). All hyper-parameters are left at their default values.
Results on THUMOS14:

| Method | mAP@0.1 | mAP@0.2 | mAP@0.3 | mAP@0.4 | mAP@0.5 | mAP@0.6 | mAP@0.7 | mAP@AVG | Download |
|---|---|---|---|---|---|---|---|---|---|
| HAM-Net | 66.8 | 60.9 | 52.2 | 42.9 | 33.4 | 22.7 | 12.2 | 41.6 | OneDrive |
| CoLA | 67.5 | 60.6 | 51.9 | 43.2 | 34.2 | 24.2 | 13.9 | 42.2 | OneDrive |
| CO2-Net | 70.8 | 64.7 | 55.7 | 46.8 | 39.8 | 26.5 | 13.8 | 45.4 | OneDrive |

mAP@AVG is the average mAP under the thresholds 0.1:0.1:0.7.
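As a quick check of this definition, the sketch below recomputes mAP@AVG from the seven per-threshold values in the THUMOS14 table. The numbers are copied from the table; the dictionary and function names are only illustrative.

```python
# mAP (%) at IoU thresholds 0.1, 0.2, ..., 0.7, copied from the THUMOS14 table.
thumos14_map = {
    "HAM-Net": [66.8, 60.9, 52.2, 42.9, 33.4, 22.7, 12.2],
    "CoLA": [67.5, 60.6, 51.9, 43.2, 34.2, 24.2, 13.9],
    "CO2-Net": [70.8, 64.7, 55.7, 46.8, 39.8, 26.5, 13.8],
}


def average_map(values):
    """Plain arithmetic mean of the per-threshold mAP values."""
    return sum(values) / len(values)


for method, values in thumos14_map.items():
    print(f"{method}: mAP@AVG = {average_map(values):.1f}")
# HAM-Net: mAP@AVG = 41.6
# CoLA: mAP@AVG = 42.2
# CO2-Net: mAP@AVG = 45.4
```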
Results on ActivityNet 1.2:

| Method | mAP@0.5 | mAP@0.75 | mAP@0.95 | mAP@AVG | Download |
|---|---|---|---|---|---|
| HAM-Net | 41.3 | 25.2 | 5.5 | 25.4 | OneDrive |
| CoLA | 41.0 | 27.5 | 4.2 | 26.4 | OneDrive |
| CO2-Net | 44.4 | 27.0 | 5.4 | 27.1 | OneDrive |

mAP@AVG is the average mAP under the thresholds 0.5:0.05:0.95.
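The notation 0.5:0.05:0.95 denotes ten IoU thresholds, of which the ActivityNet 1.2 table shows only three. The sketch below enumerates the grid and illustrates, using the HAM-Net row, that averaging only the three displayed values does not reproduce the reported mAP@AVG. Variable names are illustrative.

```python
# Enumerate the IoU thresholds 0.5, 0.55, ..., 0.95 (start:step:stop notation).
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
print(thresholds)

# HAM-Net values displayed in the ActivityNet 1.2 table (three of ten thresholds).
hamnet_shown = [41.3, 25.2, 5.5]
hamnet_reported_avg = 25.4  # mAP@AVG column, averaged over all ten thresholds

mean_of_shown = sum(hamnet_shown) / len(hamnet_shown)
print(f"mean of displayed columns: {mean_of_shown:.1f}")  # 24.0
print(f"reported mAP@AVG:          {hamnet_reported_avg:.1f}")  # 25.4
```

The gap between the two numbers comes from the seven thresholds that are not listed in the table.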