Tencent-MVSE is a large-scale benchmark dataset for the multi-modal video similarity evaluation task. The features of Tencent-MVSE includes
You can download the Tencent-MVSE dataset from here.
The folder is in the following architecture:
tencent-mvse/ ├── annotations │ ├── pairwise.json │ ├── pairwise.tsv │ ├── pointwise.json │ ├── test-dev.json │ ├── test_dev.tsv │ ├── test-std.json │ └── test_std.tsv └── features ├── clip │ ├── pairwise.zip │ ├── pointwise_0.zip │ ├── pointwise_1.zip │ ├── ... │ ├── pointwise_20.zip │ ├── test-dev.zip │ └── test-std.zip ├── efficientnetb3 │ └── ... └── resnet50 └── ...
where *.json store the meta-data, *.tsv are annotation scores, and *.zip contain video features.
If you intend to publish results based on the Tencent-MVSE dataset, please kindly include the following reference:
@inproceedings{zeng2019tencent, title={Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity Evaluation}, author={Zhaoyang Zeng, Yongsheng Luo, Zhenhua Liu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2022} }