Awesome-video-moment-retrieval

A personal paper list on Video Moment Retrieval (VMR), also known as Natural Language Video Localization (NLVL), Temporal Sentence Grounding in Videos (TSGV), or Natural Language Query (NLQ).

Keywords: moment retrieval, temporal grounding, video/language/moment grounding/localization, sentence grounding, etc.

1 Papers List

Summarized by year (plist-by-year.md), by methods (plist-by-methods.md), and by people (plist-by-people.md).

2 Quick references

Survey

A Survey of Research on Video Moment Retrieval (视频片段检索研究综述). Journal of Software (软件学报), 2020
A Survey on Temporal Sentence Grounding in Videos. arXiv, 2021
The Elements of Temporal Sentence Grounding in Videos: A Survey and Future Directions. arXiv, 2022

Datasets

Dataset	Video Source	Domain
TACoS	Kitchen	Cooking
Charades-STA	Homes	Indoor Activity
ActivityNet Captions	YouTube	Open
DiDeMo	Flickr	Open
MAD (CVPR 2022)	Movie	Open

More statistics, as reported in this paper:

Dataset	# Videos	# VL pairs (train)	# VL pairs (val)	# VL pairs (test)	Vocab size
ActivityNet Captions	14926	37421	17505	17031	15406
TACoS	127	10146	4589	4083	2255
DiDeMo	10642	33005	4180	4021	7523
Charades-STA	6670	12404	-	3720	1289
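
As a quick sanity check on the table above, the following minimal sketch totals the video-language pairs per dataset and divides by the number of videos. The dictionary layout and the helper names `total_pairs` and `pairs_per_video` are illustrative, not part of any dataset toolkit, and the ratio assumes that the video count covers all splits, which the table does not state.

```python
# Per-split counts of video-language pairs, copied from the table above.
# A split that the table marks with "-" is stored as None.
STATS = {
    "ActivityNet Captions": {"videos": 14926, "train": 37421, "val": 17505, "test": 17031},
    "TACoS": {"videos": 127, "train": 10146, "val": 4589, "test": 4083},
    "DiDeMo": {"videos": 10642, "train": 33005, "val": 4180, "test": 4021},
    "Charades-STA": {"videos": 6670, "train": 12404, "val": None, "test": 3720},
}


def total_pairs(name):
    """Sum the pairs over all splits, treating a missing split as zero."""
    row = STATS[name]
    return sum(row[split] or 0 for split in ("train", "val", "test"))


def pairs_per_video(name):
    """Average number of pairs per video (assumes the video count spans all splits)."""
    return total_pairs(name) / STATS[name]["videos"]


for name in STATS:
    print(f"{name}: {total_pairs(name)} pairs, {pairs_per_video(name):.1f} per video")
```

The ratio makes the difference in annotation density visible: TACoS has few videos but many annotated moments per video, while the other datasets have only a handful of pairs per video.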

Normally, the top three datasets are the most widely used. Features are usually pre-extracted as follows:

Visual: 1) 3D ConvNets, e.g. C3D, I3D; 2) 2D ConvNets, e.g. VGG

Text: 1) pre-trained word embeddings, e.g. GloVe; 2) pre-trained language models, e.g. BERT

New: for MAD, both visual and text features are extracted with CLIP.
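
To make the word-embedding option for text concrete, here is a minimal sketch of turning a query into a sequence of vectors from a GloVe-style text file (one token per line followed by its floats). The helper names `load_glove` and `embed_query`, the toy 3-dimensional vectors, the whitespace tokenization, and the zero vector for unknown words are all assumptions made for illustration; real pipelines differ in tokenization and out-of-vocabulary handling.

```python
import io


def load_glove(handle):
    """Parse GloVe's plain-text format: a token, then its space-separated floats."""
    table = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        table[parts[0]] = [float(x) for x in parts[1:]]
    return table


def embed_query(query, table, dim):
    """Map each lowercased whitespace token to its vector; unknown tokens get zeros."""
    return [table.get(tok, [0.0] * dim) for tok in query.lower().split()]


# Toy 3-dimensional vectors, invented for illustration only.
TOY_GLOVE = io.StringIO(
    "person 0.1 0.2 0.3\n"
    "opens 0.4 0.5 0.6\n"
    "door 0.7 0.8 0.9\n"
)

table = load_glove(TOY_GLOVE)
features = embed_query("Person opens the door", table, dim=3)
print(len(features), features[2])
```

With real GloVe vectors, the in-memory `io.StringIO` would be replaced by an opened vector file and `dim` by that file's dimensionality.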

ZhenZHAO/awesome-video-moment-retrieval

Folders and files

Latest commit

History

Repository files navigation

Awesome-video-moment-retrieval

1 Papers List

2 Quick references

Survey

Datasets

Process

Performance Comparisons

3 Resources

About

Topics

Resources

Stars

Watchers

Forks