language:
- en
license: apache-2.0
size_categories:
- 1K<n<10K
task_categories:
- video-classification
- text-to-video
- text-classification
pretty_name: Veo 3.1 Human Preferences
dataset_info:
features:
- name: prompt dtype: string
- name: video1 dtype: string
- name: video2 dtype: string
- name: weighted_results1_Alignment dtype: float64
- name: weighted_results2_Alignment dtype: float64
- name: detailedResults_Alignment
list:
- name: userDetails
struct:
- name: age dtype: string
- name: country dtype: string
- name: gender dtype: string
- name: language dtype: string
- name: occupation dtype: string
- name: userScores
struct:
- name: global dtype: float64
- name: votedFor dtype: string
- name: userDetails
struct:
- name: weighted_results1_Coherence dtype: float64
- name: weighted_results2_Coherence dtype: float64
- name: detailedResults_Coherence
list:
- name: userDetails
struct:
- name: age dtype: string
- name: country dtype: string
- name: gender dtype: string
- name: language dtype: string
- name: occupation dtype: string
- name: userScores
struct:
- name: global dtype: float64
- name: votedFor dtype: string
- name: userDetails
struct:
- name: weighted_results1_Preference dtype: float64
- name: weighted_results2_Preference dtype: float64
- name: detailedResults_Preference
list:
- name: userDetails
struct:
- name: age dtype: string
- name: country dtype: string
- name: gender dtype: string
- name: language dtype: string
- name: occupation dtype: string
- name: userScores
struct:
- name: global dtype: float64
- name: votedFor dtype: string
- name: userDetails
struct:
- name: file_name1 dtype: string
- name: file_name2 dtype: string
- name: model1 dtype: string
- name: model2 dtype: string
splits:
- name: train num_bytes: 6227078 num_examples: 1643
download_size: 660798
dataset_size: 6227078
configs:
- config_name: default
data_files:
- split: train path: data/train-*
tags:
- videos
- t2v
- text-2-video
- text2video
- text-to-video
- human
- annotations
- preferences
- likert
- coherence
- alignment
- wan
- wan 2.1
- veo2
- veo
- pikka
- alpha
- sora
- hunyuan
- veo3
- mochi-1
- seedance-1-pro
- seedance
- seedance 1
- Marey
- moonvalley
- sora2
- openai
- veo 3.1
Rapidata Video Generation Veo 3.1 Human Preference
This dataset contains ~74k human responses from ~23k human annotators, collected to evaluate the Veo 3.1 video generation model on our benchmark. It was collected using the Rapidata Python API, which is accessible to anyone and well suited to large-scale data annotation.
Explore our latest model rankings on our website.
If you get value from this dataset and would like to see more in the future, please consider liking it ❤️
Overview
This dataset contains ~74k human responses from ~23k human annotators, collected to evaluate the Veo 3.1 video generation model on our benchmark. It was collected in roughly 30 minutes using the Rapidata Python API, which is accessible to anyone and well suited to large-scale data annotation. The benchmark data is accessible directly on Hugging Face.
Explanation of the columns
The dataset contains paired video comparisons. Each entry includes 'video1' and 'video2' fields, which contain links to downscaled GIFs for easy viewing. The full-resolution videos can be found here.
The weighted_results columns (weighted_results1_* and weighted_results2_*, one pair each for Alignment, Coherence, and Preference) contain scores ranging from 0 to 1, representing aggregated user responses for video1 and video2 respectively. Individual user responses can be found in the matching detailedResults_* column.
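To make the relationship between the per-user responses and an aggregated score more concrete, here is a minimal sketch. It assumes, without confirmation from this card, that each vote is weighted by the annotator's `userScores.global` value and that `votedFor` holds an identifier for the chosen video. The function name `weighted_share` and the toy values are hypothetical and are not part of the dataset or the Rapidata API.

```python
from collections import defaultdict


def weighted_share(detailed_results):
    """Return each option's share of the total userScores.global weight.

    Assumption: votes are weighted by userScores.global. The card does not
    state how the weighted_results columns are actually computed.
    """
    totals = defaultdict(float)
    for response in detailed_results:
        weight = response["userScores"]["global"]
        totals[response["votedFor"]] += weight

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no usable votes
    return {option: weight / grand_total for option, weight in totals.items()}


# Toy responses shaped like entries of a detailedResults_* list (values invented).
toy_responses = [
    {"userDetails": {"country": "DE"}, "userScores": {"global": 0.5}, "votedFor": "a.mp4"},
    {"userDetails": {"country": "US"}, "userScores": {"global": 0.25}, "votedFor": "b.mp4"},
    {"userDetails": {"country": "BR"}, "userScores": {"global": 0.25}, "votedFor": "a.mp4"},
]

shares = weighted_share(toy_responses)
print(shares)
```

If the recomputed shares do not match the published weighted_results values for a row, the weighting assumption above is probably wrong for this dataset and the published columns should be treated as authoritative.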
Alignment
The alignment score quantifies how well a video matches its prompt. Users were asked: "Which video fits the description better?".
Coherence
The coherence score measures whether the generated video is logically consistent and free from artifacts or visual glitches. Without seeing the original prompt, users were asked: "Which video has more glitches and is more likely to be AI generated?"
Preference
The preference score reflects how visually appealing participants found each video, independent of the prompt. Users were asked: "Which video do you prefer aesthetically?"
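Because each row pairs two models (`model1`, `model2`) with two preference scores, a simple per-model win rate can be tallied from the rows. The sketch below assumes that the higher `weighted_results*_Preference` value marks the preferred video; the model names and scores in the toy rows are invented, and `preference_win_rates` is a hypothetical helper, not the ranking method used on the Rapidata website.

```python
from collections import Counter


def preference_win_rates(rows):
    """Fraction of comparisons each model wins on the Preference criterion.

    Assumption: the higher weighted Preference score marks the preferred video.
    Ties count as a comparison for both models but as a win for neither.
    """
    wins, games = Counter(), Counter()
    for row in rows:
        score1 = row["weighted_results1_Preference"]
        score2 = row["weighted_results2_Preference"]
        games[row["model1"]] += 1
        games[row["model2"]] += 1
        if score1 > score2:
            wins[row["model1"]] += 1
        elif score2 > score1:
            wins[row["model2"]] += 1
    return {model: wins[model] / games[model] for model in games}


# Toy rows using the column names from the schema (all values invented).
toy_rows = [
    {"model1": "veo-3.1", "model2": "model-b",
     "weighted_results1_Preference": 0.7, "weighted_results2_Preference": 0.3},
    {"model1": "model-b", "model2": "veo-3.1",
     "weighted_results1_Preference": 0.4, "weighted_results2_Preference": 0.6},
    {"model1": "veo-3.1", "model2": "model-c",
     "weighted_results1_Preference": 0.45, "weighted_results2_Preference": 0.55},
    {"model1": "model-c", "model2": "model-b",
     "weighted_results1_Preference": 0.5, "weighted_results2_Preference": 0.5},
]

rates = preference_win_rates(toy_rows)
for model, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{model}: {rate:.2f}")
```

The same tally works for the Alignment columns by swapping the column names. For Coherence, check the score direction first, since that question asks which video has more glitches.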
About Rapidata
Rapidata's technology makes collecting human feedback at scale faster and more accessible than ever before. Visit rapidata.ai to learn more about how we're revolutionizing human feedback collection for AI development.
Other Datasets
We run a benchmark of the major video generation models; the results can be found on our website. We rank the models according to their coherence/plausibility, their alignment with the given prompt, and style preference. The underlying 2M+ annotations can be found here:
- Link to the Rich Video Annotation dataset
- Link to the Coherence dataset
- Link to the Text-2-Image Alignment dataset
- Link to the Preference dataset