[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recog...
Source code for "Taming Visually Guided Sound Generation" (Oral at the B...
Tools for movie and video research
STEP: Spatio-Temporal Progressive Learning for Video Action Detection. C...
Temporally Efficient Vision Transformer for Video Instance Segmentation,...
ActionVLAD for video action classification (CVPR 2017)
The 2nd place solution to the YouTube-8M Video Understanding Challenge b...
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term...
PyTorch implementation of "Object level Visual Reasoning in Videos", F. ...
Paper list for activity prediction and related areas
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transfor...
Video Contrastive Learning with Global Context, ICCVW 2021
Easily convert RGB video data (e.g. .avi) to the TensorFlow tfrecords fi...
PyTorch implementation of BEVT (CVPR 2022) https://arxiv.org/abs/2112.01529
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirection...