---
title: Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation
abstract: Transformers have shown great success in medical image segmentation. However, transformers may exhibit a limited generalization ability due to the underlying single-scale self-attention (SA) mechanism. In this paper, we address this issue by introducing a Multi-scale hiERarchical vIsion Transformer (MERIT) backbone network, which improves the generalizability of the model by computing SA at multiple scales. We also incorporate an attention-based decoder, namely Cascaded Attention Decoding (CASCADE), for further refinement of multi-stage features generated by MERIT. Finally, we introduce an effective multi-stage feature mixing loss aggregation (MUTATION) method for better model training via implicit ensembling. Our experiments on two widely used medical image segmentation benchmarks (i.e., Synapse Multi-organ, ACDC) demonstrate the superior performance of MERIT over state-of-the-art methods. Our MERIT architecture and MUTATION loss aggregation can be used with downstream medical image and semantic segmentation tasks.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: rahman24a
month: 0
tex_title: Multi-scale Hierarchical Vision Transformer with Cascaded Attention Decoding for Medical Image Segmentation
firstpage: 1526
lastpage: 1544
page: 1526-1544
order: 1526
cycles: false
bibtex_author: Rahman, Md Mostafijur and Marculescu, Radu
author:
- given: Md Mostafijur
  family: Rahman
- given: Radu
  family: Marculescu
date: 2024-01-23
address:
container-title: Medical Imaging with Deep Learning
volume: '227'
genre: inproceedings
issued:
  date-parts:
  - 2024
  - 1
  - 23
pdf:
extras:
---