Anurag Arnab
University of Oxford
H-index: 26
Europe-United Kingdom
Top articles of Anurag Arnab
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Streaming Dense Video Captioning | arXiv preprint arXiv:2404.01297 | Xingyi Zhou Anurag Arnab Shyamal Buch Shen Yan Austin Myers | 2024/4/1 |
Pre-Training a Model Using Unlabeled Videos | | | 2024/4/18 |
An automated method for tendon image segmentation on ultrasound using grey-level co-occurrence matrix features and hidden Gaussian Markov random fields | Computers in Biology and Medicine | Isabelle Scott David Connell Derek Moulton Sarah Waters Ana Namburete | 2024/2/1 |
Does Visual Pretraining Help End-to-End Reasoning? | Advances in Neural Information Processing Systems | Chen Sun Calvin Luo Xingyi Zhou Anurag Arnab Cordelia Schmid | 2024/2/13 |
Time-, Memory- and Parameter-Efficient Visual Adaptation | arXiv preprint arXiv:2402.02887 | Otniel-Bogdan Mercea Alexey Gritsenko Cordelia Schmid Anurag Arnab | 2024/2/5 |
Token Turing machines | | Michael S Ryoo Keerthana Gopalakrishnan Kumara Kahatapitiya Ted Xiao Kanishka Rao | 2023 |
VicTR: Video-conditioned text representations for activity recognition | arXiv preprint arXiv:2304.02560 | Kumara Kahatapitiya Anurag Arnab Arsha Nagrani Michael S Ryoo | 2023/4/5 |
PaLI-X: On scaling up a multilingual vision and language model | arXiv preprint arXiv:2305.18565 | Xi Chen Josip Djolonga Piotr Padlewski Basil Mustafa Soravit Changpinyo | 2023/5/29 |
Pixel aligned language models | arXiv preprint arXiv:2312.09237 | Jiarui Xu Xingyi Zhou Shen Yan Xiuye Gu Anurag Arnab | 2023/12/14 |
Attention Bottlenecks for Multimodal Fusion | | | 2023/6/8 |
Video Summarization: Towards Entity-Aware Captions | arXiv preprint arXiv:2312.02188 | Hammad A Ayyubi Tianqi Liu Arsha Nagrani Xudong Lin Mingda Zhang | 2023/12/1 |
UnLoc: a unified framework for video localization tasks | | Anurag Arnab Arsha Nagrani Cordelia Schmid David Ross Shen Yan | 2023 |
CAT-Seg: Cost aggregation for open-vocabulary semantic segmentation | arXiv preprint arXiv:2303.11797 | Seokju Cho Heeseong Shin Sunghwan Hong Seungjun An Seungjun Lee | 2023/3/21 |
End-to-end spatio-temporal action localisation with video transformers | arXiv preprint arXiv:2304.12160 | Alexey Gritsenko Xuehan Xiong Josip Djolonga Mostafa Dehghani Chen Sun | 2023/4/24 |
Scaling vision transformers to 22 billion parameters | | Mostafa Dehghani Josip Djolonga Basil Mustafa Piotr Padlewski Jonathan Heek | 2023/7/3 |
How can objects help action recognition? | | Xingyi Zhou Anurag Arnab Chen Sun Cordelia Schmid | 2023 |
Optimizing ViViT training: Time and memory reduction for action recognition | arXiv preprint arXiv:2306.04822 | Shreyank N Gowda Anurag Arnab Jonathan Huang | 2023/6/7 |
Adaptive computation with elastic input sequence | | Fuzhao Xue Valerii Likhosherstov Anurag Arnab Neil Houlsby Mostafa Dehghani | 2023/7/3 |
Audiovisual masked autoencoders | | Mariana-Iuliana Georgescu Eduardo Fonseca Radu Tudor Ionescu Mario Lucic Cordelia Schmid | 2023 |
Computer vision neural networks with learned tokenization | | | 2023/12/21 |