SAM 3

SAM 3: Segment Anything with Concepts

Nicolas Carion, Laura Gustafson, Yuan-Ting Hu, Shoubhik Debnath, Ronghang Hu, Didac Suris, Chaitanya Ryali, Kalyan Vasudev Alwala, Haitham Khedr, Andrew Huang, Jie Lei, Tengyu Ma, Baishan Guo, Arpit Kalla, Markus Marks, Joseph Greer, Meng Wang, Peize Sun, Roman Rädle, Triantafyllos Afouras, Effrosyni Mavroudi, Katherine Xu, Tsung-Han Wu, Yu Zhou, Liliane Momeni, Rishi Hazra, Shuangrui Ding, Sagar Vaze, Francois Porcher, Feng Li, Siyuan Li, Aishwarya Kamath, Ho Kei Cheng, Piotr Dollár, Nikhila Ravi, Kate Saenko, Pengchuan Zhang, Christoph Feichtenhofer
ICLR 2026
arXiv demo code

Perception Encoder

Perception Encoder: The best visual embeddings are not at the output of the network

Daniel Bolya, Po-Yao Huang, Peize Sun, Jang Hyun Cho, Andrea Madotto, Chen Wei, Tengyu Ma, Jiale Zhi, Jathushan Rajasegaran, Hanoona Rasheed, Junke Wang, Marco Monteiro, Hu Xu, Shiyu Dong, Nikhila Ravi, Daniel Li, Piotr Dollár, Christoph Feichtenhofer
NeurIPS 2025
Spotlight
arXiv code

PerceptionLM

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi, Triantafyllos Afouras, Tushar Nagarajan, Muhammad Maaz, Yale Song, Tengyu Ma, Shuming Hu, Suyog Jain, Miguel Martin, Huiyu Wang, Hanoona Rasheed, Peize Sun, Po-Yao Huang, Daniel Bolya, Nikhila Ravi, Shashank Jain, Tammy Stark, Shane Moon, Babak Damavandi, Vivian Lee, Andrew Westbury, Salman Khan, Philipp Krähenbühl, Piotr Dollár, Lorenzo Torresani, Kristen Grauman, Christoph Feichtenhofer
NeurIPS 2025
arXiv code

SAM 2: Segment Anything in Images and Videos

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman Rädle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Dollár, Christoph Feichtenhofer
ICLR 2025
Oral Presentation Best Paper Honourable Mention Top 6 of 11,672 submissions
arXiv demo code

Segment Anything

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex C. Berg, Wan-Yen Lo, Piotr Dollàr, Ross Girshick
ICCV 2023
Oral Presentation Best Paper Honourable Mention Top 3 of 8,620 submissions
arXiv website code

Omni3D

Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari
CVPR 2023
arXiv website code

Shape and Layout

Learning 3D Object Shape and Layout without 3D Supervision

Georgia Gkioxari, Nikhila Ravi, Justin Johnson
CVPR 2022
arXiv website

ViewSeg

Recognizing Scenes from Novel Viewpoints

Shengyi Qian, Alexander Kirillov, Nikhila Ravi, Devendra Singh Chaplot, Justin Johnson, David F. Fouhey, Georgia Gkioxari
arXiv website

Omnivore

Omnivore: A Single Model for Many Visual Modalities

Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra
CVPR 2022 Oral Presentation
arXiv website code

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image

Ronghang Hu, Nikhila Ravi, Alex Berg, Deepak Pathak
ICCV 2021 Oral Presentation
arXiv website code

PyTorch3D

Accelerating 3D Deep Learning with PyTorch3D

Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari
NeurIPS 2020 WiML Workshop, one of six selected talks
arXiv website code

C3DPO

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure from Motion

David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedaldi
ICCV 2019 Oral Presentation
arXiv website code