Tencent-MVSE

Overview

Tencent-MVSE is a large-scale benchmark dataset for the multi-modal video similarity evaluation task. Its features include:

A total of 1,135,705 videos, including 1 million videos for unsupervised pre-training, 63,613 videos for training, and 63,960 videos for evaluation.
Rich meta-data for each video, including title, ASR text, categories, and tags.
67,854 annotated video pairs for training and 67,887 annotated video pairs for evaluation.
328 categories and 64,903 tags, all manually annotated.

Download

You can download the Tencent-MVSE dataset from here.

The folder is organized as follows:

tencent-mvse/
├── annotations
│   ├── pairwise.json
│   ├── pairwise.tsv
│   ├── pointwise.json
│   ├── test-dev.json
│   ├── test_dev.tsv
│   ├── test-std.json
│   └── test_std.tsv
└── features
    ├── clip
    │   ├── pairwise.zip
    │   ├── pointwise_0.zip
    │   ├── pointwise_1.zip
    │   ├── ...
    │   ├── pointwise_20.zip
    │   ├── test-dev.zip
    │   └── test-std.zip
    ├── efficientnetb3
    │   └── ...
    └── resnet50
        └── ...

Here, the *.json files store the meta-data, the *.tsv files contain the annotation scores, and the *.zip archives contain the video features.
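As a quick sanity check after downloading, the layout above can be inspected with a few lines of Python. This is a minimal sketch, not an official loader: the helper names (`inventory`, `peek_tsv`, `list_zip_members`) and the local path `tencent-mvse` are assumptions for illustration. It deliberately makes no assumption about the columns of the *.tsv files or the contents of the *.zip archives; it only lists what is there.

```python
import csv
import zipfile
from pathlib import Path


def inventory(root):
    """Group the files under root by suffix (.json, .tsv, .zip)."""
    root = Path(root)
    found = {".json": [], ".tsv": [], ".zip": []}
    for p in sorted(root.rglob("*")):
        if p.is_file() and p.suffix in found:
            found[p.suffix].append(p.relative_to(root).as_posix())
    return found


def peek_tsv(path, n=3):
    """Return the first n rows of a tab-separated file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="\t")
        return [row for _, row in zip(range(n), reader)]


def list_zip_members(path, n=3):
    """Return the names of the first n entries of a zip archive without extracting it."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()[:n]


if __name__ == "__main__":
    root = Path("tencent-mvse")  # assumed download location; adjust as needed
    if root.is_dir():
        files = inventory(root)
        for suffix, paths in files.items():
            print(suffix, len(paths), "file(s)")
        for rel in files[".tsv"]:
            print(rel, peek_tsv(root / rel))
        for rel in files[".zip"]:
            print(rel, list_zip_members(root / rel))
```

Listing archive members with `namelist()` avoids unpacking the feature archives, which is useful for checking a download before committing disk space to extraction.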

Statistics

Citation

If you intend to publish results based on the Tencent-MVSE dataset, please kindly include the following reference:

@inproceedings{zeng2019tencent,
  title={Tencent-MVSE: A Large-Scale Benchmark Dataset for Multi-Modal Video Similarity Evaluation},
  author={Zhaoyang Zeng and Yongsheng Luo and Zhenhua Liu and Fengyun Rao and Dian Li and Weidong Guo and Zhen Wen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Copyright © 2022. Tencent QQ Browser Lab. All rights reserved.