CVPR2023

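CVPR 2023 Accepted Papers

CVPR 2023 statistics:

Submitted: 9155 papers

Accepted: 2360 papers (25.8% acceptance rate)

Highlights: 235 papers (10% of accepted papers, 2.6% of submitted papers)

Award candidates: 12 papers (0.51% of accepted papers, 0.13% of submitted papers)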
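
The percentages above follow directly from the four reported counts. As a minimal sketch (the variable names and the `pct` helper are illustrative assumptions, not part of the official statistics), the arithmetic can be reproduced like this:

```python
# Counts as reported in the statistics above.
submitted = 9155
accepted = 2360
highlights = 235
award_candidates = 12


def pct(part, whole, digits):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100 * part / whole, digits)


# Hypothetical container for the derived rates, keyed by description.
rates = {
    "acceptance rate": pct(accepted, submitted, 1),
    "highlights / accepted": pct(highlights, accepted, 1),
    "highlights / submitted": pct(highlights, submitted, 1),
    "award candidates / accepted": pct(award_candidates, accepted, 2),
    "award candidates / submitted": pct(award_candidates, submitted, 2),
}

for name, value in rates.items():
    print(f"{name}: {value}%")
```

Note that 235 of 2360 is about 9.96%, which the list rounds to 10%.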


List of accepted papers (pending plagiarism and dual-submission checks):

Generating Human Motion from Textual Descriptions with High Quality Discrete Representation

Jianrong Zhang · Yangsong Zhang · Xiaodong Cun · Yong Zhang · Hongwei Zhao · Hongtao Lu · Xi Shen · Ying Shan

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Wenxuan Zhang · Xiaodong Cun · Xuan Wang · Yong Zhang · Xi Shen · Yu Guo · Ying Shan · Fei Wang

Explicit Visual Prompting for Low-Level Structure Segmentations

Weihuang Liu · Xi Shen · Chi-Man Pun · Xiaodong Cun

Privacy-preserving Adversarial Facial Features

Zhibo Wang · He Wang · Shuaifan Jin · Wenwen Zhang · Jiahui Hu · Yan Wang · Peng Sun · Wei Yuan · Kaixin Liu · Kui Ren

NeRF-RPN: A general framework for object detection in NeRFs

Benran Hu · Junkai Huang · Yichen Liu · Yu-Wing Tai · Chi-Keung Tang

Category Query Learning for Human-Object Interaction Classification

Chi Xie · Fangao Zeng · Yue Hu · Shuang Liang · Yichen Wei

A Unified Pyramid Recurrent Network for Video Frame Interpolation

Xin Jin · Long Wu · Jie Chen · Chen Youxin · Jay Koo · Cheul-hee Hahm

SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

Chong Bao · Yinda Zhang · Bangbang Yang · Tianxing Fan · Zesong Yang · Hujun Bao · Guofeng Zhang · Zhaopeng Cui

PATS: Patch Area Transportation with Subdivision for Local Feature Matching

Junjie Ni · Yijin Li · Zhaoyang Huang · Hongsheng Li · Zhaopeng Cui · Hujun Bao · Guofeng Zhang

DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation

Ying-Tian Liu · Zhifei Zhang · Yuan-Chen Guo · Matthew Fisher · Zhaowen Wang · Song-Hai Zhang

Towards Robust Tampered Text Detection in Document Image: New dataset and New Solution

Chenfan Qu · Chongyu Liu · Yuliang Liu · Xinhong Chen · Dezhi Peng · Fengjun Guo · Lianwen Jin

PanoSwin: a Pano-style Swin Transformer for Panorama Understanding

Zhixin Ling · Zhen Xing · Xiangdong Zhou · Man Cao · Guichun Zhou

SVFormer: Semi-supervised Video Transformer for Action Recognition

Zhen Xing · Qi Dai · Han Hu · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang

Multi-Object Manipulation via Object-Centric Neural Scattering Functions

Stephen Tian · Yancheng Cai · Hong-Xing Yu · Sergey Zakharov · Katherine Liu · Adrien Gaidon · Yunzhu Li · Jiajun Wu

RealImpact: A Dataset of Impact Sound Fields for Real Objects

Samuel Clarke · Ruohan Gao · Mason L Wang · Mark Rau · Julia Xu · Jui-Hsien Wang · Doug James · Jiajun Wu

3D Neural Field Generation using Triplane Diffusion

Jesse Shue · Eric Chan · Ryan Po · Zachary Ankner · Jiajun Wu · Gordon Wetzstein

Putting People in Their Place: Affordance-Aware Human Insertion into Scenes

Sumith Kulal · Tim Brooks · Alex Aiken · Jiajun Wu · Jimei Yang · Jingwan Lu · Alexei A. Efros · Krishna Kumar Singh

Towards Effective Visual Representations for Partial-Label Learning

Shiyu Xia · Jiaqi Lyu · Ning Xu · Gang Niu · Xin Geng

AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation

Zhen Li · Zuo-Liang Zhu · Ling-Hao Han · Qibin Hou · Chunle Guo · Ming-Ming Cheng

DNF: Decouple and Feedback Network for Seeing in the Dark

Xin Jin · Ling-Hao Han · Zhen Li · Chunle Guo · Zhi Chai · Chongyi Li

Spectral Enhanced Rectangle Transformer for Hyperspectral Image Denoising

Miaoyu Li · Ji Liu · Ying Fu · Yulun Zhang · Dejing Dou

Dynamic Aggregated Network for Gait Recognition

Kang Ma · Ying Fu · Dezhi Zheng · Chunshui Cao · Xuecai Hu · Yongzhen Huang

LG-BPN: Local and Global Blind-Patch Network for Self-Supervised Real-World Denoising

ZiChun Wang · Ying Fu · Ji Liu · Yulun Zhang

Real-Time Neural Light Field on Mobile Devices

Junli Cao · Huan Wang · Pavlo Chemerys · Vladislav Shakhrai · Ju Hu · Yun Fu · Denys Makoviichuk · Sergey Tulyakov · Jian Ren

ScaleDet: A Scalable Multi-Dataset Object Detector

Yanbei Chen · Manchen Wang · Abhay Mittal · Zhenlin Xu · Paolo Favaro · Joseph Tighe · Davide Modolo

All in One: Exploring Unified Video-Language Pre-training

Jinpeng Wang · Yixiao Ge · Rui Yan · Yuying Ge · Kevin Qinghong Lin · Satoshi Tsutsui · Xudong Lin · Guanyu Cai · Jianping Wu · Ying Shan · Xiaohu Qie · Mike Zheng Shou

Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

Ziyun Zeng · Yuying Ge · Xihui Liu · Bin Chen · Ping Luo · Shu-Tao Xia · Yixiao Ge

KD-GAN: Data Limited Image Generation via Knowledge Distillation

Kaiwen Cui · Yingchen Yu · Fangneng Zhan · Shengcai Liao · Shijian Lu · Eric Xing

Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision

Xinyi Ying · Li Liu · Yingqian Wang · Ruojing Li · Nuo Chen · Zaiping Lin · Weidong Sheng · Shilin Zhou

Logical Consistency and Greater Descriptive Power for Facial Hair Attribute Learning

Haiyu Wu · Grace Bezold · Aman Bhatta · Kevin Bowyer

Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding

Gyeongman Kim · Hajin Shim · Hyunsu Kim · Yunjey Choi · Junho Kim · Eunho Yang

3D Video Object Detection with Learnable Object-Centric Global Optimization

Jiawei He · Yuntao Chen · Naiyan Wang · Zhaoxiang Zhang

BEVFormer v2: Adapting Modern Image Backbones to Bird’s-Eye-View Recognition via Perspective Supervision

Chenyu Yang · Yuntao Chen · Hao Tian · Chenxin Tao · Xizhou Zhu · Zhaoxiang Zhang · Gao Huang · Hongyang Li · Yu Qiao · Lewei Lu · Jie Zhou · Jifeng Dai

MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds

Jiahui Liu · Chirui Chang · Jianhui Liu · Xiaoyang Wu · Lan Ma · Xiaojuan Qi

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

Zhisheng Zhong · Jiequan Cui · Yibo Yang · Xiaoyang Wu · Xiaojuan Qi · Xiangyu Zhang · Jiaya Jia

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

Bohao Peng · Zhuotao Tian · Xiaoyang Wu · Chengyao Wang · Shu Liu · Jingyong Su · Jiaya Jia

Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning

Xiaoyang Wu · Xin Wen · Xihui Liu · Hengshuang Zhao

Self-Correctable and Adaptable Inference for Generalizable Human Pose Estimation

Zhehan Kan · Shuoshuo Chen · Ce Zhang · Yushun Tang · Zhihai He

Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation

Yushun Tang · Ce Zhang · Heng Xu · Shuoshuo Chen · Jie Cheng · Luziwei Leng · Qinghai Guo · Zhihai He

Noisy Correspondence Learning with Meta Similarity Correction

Haochen Han · Kaiyao Miao · Qinghua Zheng · Minnan Luo

Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency

Xiaogeng Liu · Minghui Li · Haoyu Wang · Shengshan Hu · Dengpan Ye · Hai Jin · Libing Wu · Chaowei Xiao

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation

Jiang Liu · Hui Ding · Zhaowei Cai · Yuting Zhang · Ravi Satzoda · Vijay Mahadevan · R. Manmatha

Glocal Energy-based Learning for Few-Shot Open-Set Recognition

Haoyu Wang · Guansong Pang · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang

PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection

Linfeng Zhang · Runpei Dong · Hung-Shuo Tai · Kaisheng Ma

LipFormer: High-fidelity and Generalizable Talking Face Generation with A Pre-learned Facial Codebook

Jiayu Wang · Kang Zhao · Shiwei Zhang · Yingya Zhang · Yujun Shen · Deli Zhao · Jingren Zhou

High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

Chao Xu · Junwei Zhu · Jiangning Zhang · Yue Han · Wenqing Chu · Ying Tai · Chengjie Wang · Zhifeng Xie · Yong Liu

EC^2: Emergent Communication for Embodied Control

Yao Mu · Shunyu Yao · Mingyu Ding · Ping Luo · Chuang Gan

Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss

Anas Mahmoud · Jordan Sir Kwang Hu · Tianshu Kuai · Ali Harakeh · Liam Paull · Steven Waslander

Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection

Vibashan Vishnukumar Sharmini · Poojan Oza · Vishal Patel

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

Vibashan Vishnukumar Sharmini · Ning Yu · Chen Xing · Can Qin · Mingfei Gao · Juan Carlos Niebles · Vishal Patel · Ran Xu

STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition

Xiaoyu Zhu · Po-Yao Huang · Junwei Liang · Celso de Melo · Alexander Hauptmann

DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks

Qiangqiang Wu · Tianyu Yang · Ziquan Liu · Baoyuan Wu · Ying Shan · Antoni Chan

TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization

Ziquan Liu · Yi Xu · Xiangyang Ji · Antoni Chan

Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting

Wei Lin · Antoni Chan

Music-Driven Group Choreography

Nhat Le · Trong Thang Pham · Tuong Do · Erman Tjiputra · Quang Tran · Anh Nguyen

Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

Mengmeng Xu · Yanghao Li · Cheng-Yang Fu · Bernard Ghanem · Tao Xiang · Juan-Manuel Perez-Rua

Rotation-Invariant Transformer for Point Cloud Matching

Hao Yu · Zheng Qin · Ji Hou · Mahdi Saleh · Dongsheng Li · Benjamin Busam · Slobodan Ilic

Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors

Ji Hou · Xiaoliang Dai · Zijian He · Angela Dai · Matthias Niessner

Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data

Yuhao Chen · Xin Tan · Borui Zhao · ZhaoWei Chen · Renjie Song · Jiajun Liang · Xuequan Lu

Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization

Shichao Dong · Jin Wang · Renhe Ji · Jiajun Liang · Haoqiang Fan · Zheng Ge

EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision

Jiahui Lei · Congyue Deng · Karl Schmeckpeper · Leonidas Guibas · Kostas Daniilidis

SemiCVT: Semi-Supervised Convolutional Vision Transformer for Semantic Segmentation

Huimin Huang · Shiao Xie · Lanfen Lin · Tong Ruofeng · Yen-wei Chen · Yuexiang Li · Hong Wang · Yawen Huang · Yefeng Zheng

CNVid-3.5M: Build, Filter, and Pre-train the Large-scale Public Chinese Video-text Dataset

Tian Gan · Qing Wang · Xingning Dong · Xiangyuan Ren · Liqiang Nie · Qingpei Guo

Disentangling Writer and Character Styles for Handwriting Generation

Gang Dai · Yifan Zhang · Qingfeng Wang · Qing Du · Zhuliang Yu · Zhuoman Liu · Shuangping Huang

A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image

Changlong Jiang · Yang Xiao · Cunlin Wu · Mingyang Zhang · Jinghong Zheng · Zhiguo Cao · Joey Zhou

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

Hao Li · Jinguo Zhu · Xiaohu Jiang · Xizhou Zhu · Hongsheng Li · Chun Yuan · Xiaohua Wang · Yu Qiao · Xiaogang Wang · Wenhai Wang · Jifeng Dai

ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations

Panos Achlioptas · Ian Huang · Minhyuk Sung · Sergey Tulyakov · Leonidas Guibas

Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR

Feng Li · Ailing Zeng · Shilong Liu · Hao Zhang · Hongyang Li · Lionel Ni · Lei Zhang

Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation

Feng Li · Hao Zhang · Huaizhe Xu · Shilong Liu · Lei Zhang · Lionel Ni · Heung-Yeung Shum

MP-Former: Mask-Piloted Transformer for Image Segmentation

Hao Zhang · Feng Li · Huaizhe Xu · Shijia Huang · Shilong Liu · Lionel Ni · Lei Zhang

Enlarging Instance-specific and Class-specific Information for Open-set Action Recognition

Jun Cen · Shiwei Zhang · Xiang Wang · Yixuan Pei · Zhiwu Qing · Yingya Zhang · Qifeng Chen

MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition

Xiang Wang · Shiwei Zhang · Zhiwu Qing · Changxin Gao · Yingya Zhang · Deli Zhao · Nong Sang

PCR: Proxy-based Contrastive Replay for Online Class-Incremental Continual Learning

Huiwei Lin · Baoquan Zhang · Shanshan Feng · Xutao Li · Yunming Ye

Building Rearticulable Models for Arbitrary 3D Objects from 4D Point Clouds

Shaowei Liu · Saurabh Gupta · Shenlong Wang

Slide-Transformer: Hierarchical Vision Transformer with Local Self-Attention

Xuran Pan · Tianzhu Ye · Zhuofan Xia · Shiji Song · Gao Huang

Compressing Volumetric Radiance Fields to 1 MB

Lingzhi Li · Zhen Shen · Zhongshu Wang · Li Shen · Liefeng Bo

REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory

Ziniu Hu · Ahmet Iscen · Chen Sun · Zirui Wang · Kai-Wei Chang · Yizhou Sun · Cordelia Schmid · David Ross · Alireza Fathi

Improving Image Recognition by Retrieving from Web-Scale Image-Text Data

Ahmet Iscen · Alireza Fathi · Cordelia Schmid

Learning to Name Classes for Vision and Language Models

Sarah Parisot · Yongxin Yang · Steven McDonagh

SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory

Sicheng Li · Hao Li · Yue Wang · Yiyi Liao · Lu Yu

Semi-Supervised Video Inpainting with Cycle Consistency Constraints

Zhiliang Wu · Han Xuan · Changchang Sun · Weili Guan · Kang Zhang · Yan Yan

Deep Stereo Video Inpainting

Zhiliang Wu · Changchang Sun · Han Xuan · Yan Yan

VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

Siteng Huang · Biao Gong · Yulin Pan · Jianwen Jiang · Yiliang Lv · Yuyuan Li · Donglin Wang

NeRF-Supervised Deep Stereo

Fabio Tosi · Alessio Tonioni · Daniele Gregorio · Matteo Poggi

Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding

Zihang Lin · Chaolei Tan · Jian-Fang Hu · Zhi Jin · Tiancai Ye · Wei-Shi Zheng

Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding

Chaolei Tan · Zihang Lin · Jian-Fang Hu · Wei-Shi Zheng · Jianhuang Lai

Combining Implicit-Explicit View Correlation for Light Field Semantic Segmentation

Ruixuan Cong · Da Yang · Rongshan Chen · Sizhe Wang · Zhenglong Cui · Hao Sheng

Improving Robustness of Vision Transformers by Reducing Sensitivity to Patch Corruptions

Yong Guo · David Stutz · Bernt Schiele

DF-Platter: Multi-Face Heterogeneous Deepfake Dataset

Kartik Narayan · Harsh Agarwal · Kartik Thakral · Surbhi Mittal · Mayank Vatsa · Richa Singh

Metadata-Based RAW Reconstruction via Implicit Neural Functions

Leyi Li · Huijie Qiao · Qi Ye · Qinmin Yang

I^2-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs

Jingsen Zhu · Yuchi Huo · Qi Ye · Fujun Luan · Jifan Li · Dianbing Xi · Lisha Wang · Rui Tang · Wei Hua · Hujun Bao · Rui Wang

Polarized Color Image Denoising

Zhuoxiao Li · Haiyang Jiang · Mingdeng Cao · Yinqiang Zheng

NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination

Haoqian Wu · Zhipeng Hu · Lincheng Li · Yongqiang Zhang · Changjie Fan · Xin Yu

Balanced Energy Regularization Loss for Out-of-distribution Detection

Hyunjun Choi · Hawook Jeong · Jin Choi

DeCo: Decomposition and Reconstruction for Compositional Temporal Grounding via Coarse-to-Fine Contrastive Ranking

Lijin Yang · Quan Kong · Hsuan-Kung Yang · Wadim Kehl · Yoichi Sato · Norimasa Kobori

CREPE: Can Vision-Language Foundation Models Reason Compositionally?

Zixian Ma · Jerry Hong · Mustafa Omer Gul · Mona Gandhi · Irena Gao · Ranjay Krishna

Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask

Shangzhan Zhang · Sida Peng · Tianrun Chen · Linzhan Mou · Haotong Lin · Kaicheng Yu · Yiyi Liao · Xiaowei Zhou

Learning 3D-aware Image Synthesis with Unknown Pose Distribution

Zifan Shi · Yujun Shen · Yinghao Xu · Sida Peng · Yiyi Liao · Sheng Guo · Qifeng Chen · Dit-Yan Yeung

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

Jiazhi Guan · Zhanwang Zhang · Hang Zhou · Tianshu Hu · Kaisiyuan Wang · Dongliang He · Haocheng Feng · Jingtuo Liu · Errui Ding · Ziwei Liu · Jingdong Wang

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

Zhiheng Li · Ivan Evtimov · Albert Gordo · Caner Hazirbas · Tal Hassner · Cristian Canton · Chenliang Xu · Mark Ibrahim

Cooperation or Competition: Avoiding Player Domination for Multi-target Robustness by Adaptive Budgets

Yimu Wang · Dinghuai Zhang · Yihan Wu · Heng Huang · Hongyang Zhang

Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues

Stefanie Walz · Mario Bijelic · Andrea Ramazzina · Amanpreet Walia · Fahim Mannan · Felix Heide

SliceMatch: Geometry-guided Aggregation for Cross-View Pose Estimation

Zimin Xia · Holger Caesar · Julian Kooij · Ted Lentsch

Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations

Lei Hsiung · Yun-Yun Tsai · Pin-Yu Chen · Tsung-Yi Ho

StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer

Sasikarn Khwanmuang · Pakkapon Phongthawee · Patsorn Sangkloy · Supasorn Suwajanakorn

Learning Geometric-aware Properties in 2D Representation Using Lightweight CAD Models, or Zero Real 3D Pairs

Pattaramanee Arsomngern · Sarana Nutanong · Supasorn Suwajanakorn

Visibility Constrained Wide-band Illumination Spectrum Design for Seeing-in-the-Dark

Muyao Niu · Zhuoxiao Li · Zhihang Zhong · Yinqiang Zheng

ToThePoint: Efficient Contrastive Learning of 3D Point Clouds via Recycling

Xinglin Li · Jiajing Chen · Jinhui Ouyang · Hanhui Deng · Senem Velipasalar · Di Wu

AUNet: Learning Relations Between Action Units for Face Forgery Detection

Weiming Bai · Yufan Liu · Zhipeng Zhang · Bing Li · Weiming Hu

Physical-World Optical Adversarial Attacks on 3D Face Recognition

Yanjie Li · Yiquan Li · Xuelong Dai · Songtao Guo · Bin Xiao

Robust Single Image Reflection Removal Against Adversarial Attacks

Zhenbo Song · Zhenyuan Zhang · Kaihao Zhang · Wenhan Luo · Zhaoxin Fan · Wenqi Ren · Jianfeng Lu

The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training

Junhao Dong · Seyed-Mohsen Moosavi-Dezfooli · Jianhuang Lai · Xiaohua Xie

Boosting Accuracy and Robustness of Student Models via Adaptive Adversarial Distillation

Bo Huang · Mingyang Chen · Yi Wang · Junda Lu · Minhao Cheng · Wei Wang

Introducing Competition to Boost the Transferability of Targeted Adversarial Examples through Clean Feature Mixup

Junyoung Byun · Myung-Joon Kwon · Seungju Cho · Yoonji Kim · Changick Kim

Angelic Patches for Improving Third-Party Object Detector Performance

Wenwen Si · Shuo Li · Sangdon Park · Insup Lee · Osbert Bastani

Sibling-Attack: Rethinking Transferable Adversarial Attacks against Face Recognition

Zexin Li · Bangjie Yin · Taiping Yao · Junfeng Guo · Shouhong Ding · Simin Chen · Cong Liu

A Practical Upper Bound for the Worst-Case Attribution Deviations

Fan Wang · Adams Kong

You Are Catching My Attention: Are Vision Transformers Bad Learners under Backdoor Attacks?

Zenghui Yuan · Pan Zhou · Kai Zou · Yu Cheng

Architectural Backdoors in Neural Networks

Mikel Bober-Irizar · Ilia Shumailov · Yiren Zhao · Robert Mullins · Nicolas Papernot

The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection

Simin Chen · Hanlin Chen · Mirazul Haque · Cong Liu · Wei Yang

StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning

Yuqian Fu · Yu Xie · Yanwei Fu · Yu-Gang Jiang

Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment

Yiyou Sun · Yaojie Liu · Xiaoming Liu · Yixuan Li · Vincent Chu

Make Landscape Flatter in Differentially Private Federated Learning

Yifan Shi · Yingqi Liu · Kang Wei · Li Shen · Xueqian Wang · Dacheng Tao

Confidence-aware Personalized Federated Learning via Variational Expectation Maximization

Junyi Zhu · Xingchen Ma · Matthew Blaschko

ScaleFL: Resource-Adaptive Federated Learning with Heterogeneous Clients

Fatih Ilhan · Gong Su · Ling Liu

MetaMix: Towards Corruption-Robust Continual Learning with Temporally Self-Adaptive Data Transformation

Zhenyi Wang · Li Shen · Donglin Zhan · Qiuling Suo · Yanjun Zhu · Tiehang Duan · Mingchen Gao

Revisiting Reverse Distillation for Anomaly Detection

Tran Dinh Tien · Anh Tuan Nguyen · Nguyen Tran · Huy Ta · Soan Duong · Chanh Nguyen · Steven Truong

Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping

Zuhao Liu · Xiao-Ming Wu · Dian Zheng · Kun-Yu Lin · Wei-Shi Zheng

Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection

Xincheng Yao · Ruoqi Li · Jing Zhang · Jun Sun · Chongyang Zhang

Towards Universal Fake Image Detectors that Generalize Across Generative Models

Utkarsh Ojha · Yuheng Li · Yong Jae Lee

Edges to Shapes to Concepts: Adversarial Augmentation for Robust Vision

Aditay Tripathi · Rishubh Singh · Anirban Chakraborty · Pradeep Shenoy

Sequential training of GANs against GAN-classifiers reveals correlated “knowledge gaps” present among independently trained GAN instances

Arkanath Pathak · Nicholas Dufour

Masked Auto-Encoders Meet Generative Adversarial Networks and Beyond

Zhengcong Fei · Mingyuan Fan · Li Zhu · Junshi Huang · Xiaoming Wei · Xiaolin Wei

Vector Quantization with Self-attention for Quality-independent Representation Learning

Zhou Yang · Weisheng Dong · Xin Li · Mengluan Huang · Yulin Sun · Guangming Shi

PD-Quant: Post-Training Quantization Based on Prediction Difference Metric

Jiawei Liu · Lin Niu · Zhihang Yuan · Dawei Yang · Xinggang Wang · Wenyu Liu

Hard Sample Matters a Lot in Zero-Shot Quantization

Huantong Li · Xiangmiao Wu · Fanbing Lv · Daihai Liao · Thomas Li · Yonggang Zhang · Bo Han · Mingkui Tan

Fair Scratch Tickets: Finding Fair Sparse Networks without Weight Training

Pengwei Tang · Wei Yao · Zhicong Li · Yong Liu

Understanding Deep Generative Models with Generalized Empirical Likelihoods

Suman Ravuri · Mélanie Rey · Shakir Mohamed · Marc Deisenroth

Deep Deterministic Uncertainty: A New Simple Baseline

Jishnu Mukhoti · Andreas Kirsch · Joost van Amersfoort · Philip Torr · Yarin Gal

Compacting Binary Neural Networks by Sparse Kernel Selection

Yikai Wang · Wenbing Huang · Yinpeng Dong · Fuchun Sun · Anbang Yao

Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures

Eugenia Iofinova · Alexandra Peste · Dan Alistarh

X-Pruner: eXplainable Pruning for Vision Transformers

Lu Yu · Wei Xiang

Deep Graph Reprogramming

Yongcheng Jing · Chongbin Yuan · Li Ju · Yiding Yang · Xinchao Wang · Dacheng Tao

FlowGrad: Controlling the Output of Generative ODEs with Gradients

Xingchao Liu · Lemeng Wu · Shujian Zhang · Chengyue Gong · Wei Ping · Qiang Liu

Exploring Data Geometry for Continual Learning

Zhi Gao · Chen Xu · Feng Li · Yunde Jia · Mehrtash Harandi · Yuwei Wu

Improving Generalization with Domain Convex Game

Fangrui Lv · Jian Liang · Shuang Li · Jinming Zhang · Di Liu

SLACK: Stable Learning of Augmentations with Cold-start and KL regularization

Juliette Marrie · Michael Arbel · Diane Larlus · Julien Mairal

Critical Learning Periods for Multisensory Integration in Deep Networks

Michael Kleinman · Alessandro Achille · Stefano Soatto

Preserving Linear Separability in Continual Learning by Backward Feature Projection

Qiao Gu · Dongsub Shim · Florian Shkurti

Multi-level Logit Distillation

Ying Jin · Jiaqi Wang · Dahua Lin

Data-Free Knowledge Distillation via Feature Exchange and Activation Region Constraint

Shikang Yu · Jiachen Chen · Hu Han · Shuqiang Jiang

Masked Autoencoders Enable Efficient Knowledge Distillers

Yutong Bai · Zeyu Wang · Junfei Xiao · Chen Wei · Huiyu Wang · Alan Yuille · Yuyin Zhou · Cihang Xie

DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning

Xinyuan Gao · Yuhang He · SongLin Dong · Jie Cheng · Xing Wei · Yihong Gong

BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning

Changdae Oh · Hyeji Hwang · Hee-young Lee · YongTaek Lim · Geunyoung Jung · Jiyoung Jung · Hosik Choi · Kyungwoo Song

PIVOT: Prompting for Video Continual Learning

Andres Villa · Juan Leon Alcazar · Motasem Alfarra · Kumail Alhamoud · Julio Hurtado · Fabian Caba · Alvaro Soto · Bernard Ghanem

MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering

Jingjing Jiang · Nanning Zheng

NIFF: Alleviating Forgetting in Generalized Few-Shot Object Detection via Neural Instance Feature Forging

Karim Guirguis · Johannes Meier · George Eskandar · Matthias Kayser · Bin Yang · Jürgen Beyerer

Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning

Zeyin Song · Yifan Zhao · Yujun Shi · Peixi Peng · Li Yuan · Yonghong Tian

Improved Test-Time Adaptation for Domain Generalization

Liang Chen · Yong Zhang · Yibing Song · Ying Shan · Lingqiao Liu

TIPI: Test Time Adaptation with Transformation Invariance

Anh Tuan Nguyen · Thanh Nguyen-Tang · Ser-Nam Lim · Philip Torr

ActMAD: Activation Matching to Align Distributions for Test-Time-Training

Muhammad Mirza Mirza · Pol Jane Soneira · Wei Lin · Mateusz Kozinski · Horst Possegger · Horst Bischof

Modality-Agnostic Debiasing for Single Domain Generalization

Sanqing Qu · Yingwei Pan · Guang Chen · Ting Yao · Changjun Jiang · Tao Mei

ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization

Jintao Guo · Na Wang · Lei Qi · Yinghuan Shi

C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation

Nazmul Karim · Niluthpol Chowdhury Mithun · Abhinav Rajvanshi · Han-pang Chiu · Supun Samarasekera · Nazanin Rahnavard

Adjustment and Alignment for Unbiased Open Set Domain Adaptation

Wuyang Li · Jie Liu · Bo Han · Yixuan Yuan

Semi-Supervised Domain Adaptation with Source Label Adaptation

Yu-Chu Yu · Hsuan-Tien Lin

Dynamically Instance-Guided Adaptation: A Backward-free Approach for Test-Time Domain Adaptive Semantic Segmentation

Wei Wang · Zhun Zhong · Weijie Wang · Xi Chen · Charles Ling · Boyu Wang · Nicu Sebe

FCC: Feature Clusters Compression for Long-Tailed Visual Recognition

Jian Li · Ziyao Meng · Daqian Shi · Rui Song · Xiaolei Diao · Jingwen Wang · Hao Xu

DISC: Learning from Noisy Labels via Dynamic Instance-Specific Selection and Correction

Yifan Li · Hu Han · Shiguang Shan · Xilin Chen

Superclass Learning with Representation Enhancement

Zeyu Gan · Suyun Zhao · Jinlong Kang · Liyuan Shang · Hong Chen · Cuiping Li

Improving Selective Visual Question Answering by Learning from Your Peers

Corentin Dancette · Spencer Whitehead · Rishabh Maheshwary · Shanmukha Ramakrishna Vedantam · Stefan Scherer · Xinlei Chen · Matthieu Cord · Marcus Rohrbach

Difficulty-based Sampling for Debiased Contrastive Representation Learning

Taeuk Jang · Xiaoqian Wang

Token Boosting for Robust Self-Supervised Visual Transformer Pre-training

Tianjiao Li · Lin Geng Foo · Ping Hu · Xindi Shang · Hossein Rahmani · Zehuan Yuan · Jun Liu

HyperMatch: Noise-Tolerant Semi-Supervised Learning via Relaxed Contrastive Constraint

Beitong Zhou · Jing Lu · Kerui Liu · Yunlu Xu · Zhanzhan Cheng · Yi Niu

Open-Set Likelihood Maximization for Few-Shot Learning

Malik Boudiaf · Etienne Bennequin · Myriam Tami · Antoine Toubhans · Pablo Piantanida · Celine Hudelot · Ismail Ayed

Transductive Few-Shot Learning with Prototypes Label-Propagation by Iterative Graph Refinement

Hao Zhu · Piotr Koniusz

Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric

Pengxin Zeng · Yunfan Li · Peng Hu · Dezhong Peng · Jiancheng Lv · Xi Peng

On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering

Daniel J. Trosten · Sigurd Løkse · Robert Jenssen · Michael Kampffmeyer

Sample-level Multi-view Graph Clustering

Yuze Tan · Yixi Liu · Shudong Huang · Wentao Feng · Jiancheng Lv

Discriminating Known from Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder

Aming Wu · Cheng Deng

GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection

Xixi Liu · Yaroslava Lochman · Christopher Zach

RankMix: Data Augmentation for Weakly Supervised Learning of Classifying Whole Slide Images with Diverse Sizes and Imbalanced Categories

Yuan-Chih Chen · Chun-Shien Lu

Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data

Paul Hager · Martin J. Menten · Daniel Rueckert

DeGPR: Deep Guided Posterior Regularisation For Multi-Class Cell Detection And Counting

Aayush Tyagi · Chirag Mohapatra · Prasenjit Das · Govind Makharia · Lalita Mehra · Prathosh AP · Mausam

OCELOT: Overlapped Cell on Tissue Dataset for Histopathology

Jeongun Ryu · Aaron Valero Puche · JaeWoong Shin · Seonwook Park · Biagio Brattoli · Jinhee Lee · Wonkyung Jung · Soo Ick Cho · Kyunghyun Paeng · Chan-Young Ock · Donggeun Yoo · Sérgio Pereira

SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection

Tiange Xiang · Yixiao Zhang · Yongyi Lu · Alan Yuille · Chaoyi Zhang · Weidong Cai · Zongwei Zhou

Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization

Mingze Yuan · Yingda Xia · Hexin Dong · Zifan Chen · Jiawen Yao · Mingyan Qiu · Ke Yan · Xiaoli Yin · Yu Shi · Xin Chen · Zaiyi Liu · Bin Dong · Jingren Zhou · Le Lu · Ling Zhang · Li Zhang

MagicNet: Semi-Supervised Multi-Organ Segmentation via Magic-Cube Partition and Recovery

Duowen Chen · Yunhao Bai · Wei Shen · Qingli Li · Lequan Yu · Yan Wang

(ML)^2P-Encoder: On Exploration of Channel-class Correlation for Multi-label Zero-shot Learning

Ziming Liu · Song Guo · Xiaocheng Lu · Jingcai Guo · Jiewei Zhang · Yue Zeng · Fushuo Huo

Out-of-Distributed Semantic Pruning for Robust Semi-Supervised Learning

Yu Wang · Pengchong Qiao · Chang Liu · Guoli Song · Xiawu Zheng · Jie Chen

Contrastive Mean Teacher for Domain Adaptive Object Detectors

Shengcao Cao · Dhiraj Joshi · Liangyan Gui · Yu-Xiong Wang

Harmonious Teacher for Cross-domain Object Detection

Jinhong Deng · Dongli Xu · Wen Li · Lixin Duan

Hierarchical Supervision and Shuffle Data Augmentation for 3D Semi-Supervised Object Detection

Chuandong Liu · Chenqiang Gao · Fangcen Liu · Pengcheng Li · Deyu Meng · Xinbo Gao

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

Jiacheng Zhang · Xiangru Lin · Wei Zhang · Kuo Wang · Xiao Tan · Junyu Han · Errui Ding · Jingdong Wang · Guanbin Li

Continual Detection Transformer for Incremental Object Detection

Yaoyao Liu · Bernt Schiele · Andrea Vedaldi · Christian Rupprecht

DA-DETR: Domain Adaptive Detection Transformer with Information Fusion

Jingyi Zhang · Jiaxing Huang · Zhipeng Luo · Gongjie Zhang · Xiaoqin Zhang · Shijian Lu

CIGAR: Cross-Modality Graph Reasoning for Domain Adaptive Object Detection

Yabo Liu · Jinghua Wang · Chao Huang · Yaowei Wang · Yong Xu

Box-Level Active Detection

Mengyao Lyu · Jundong Zhou · Hui Chen · Yi-Jie Huang · Dongdong Yu · Yaqian Li · Yandong Guo · Yuchen Guo · Liuyu Xiang · Guiguang Ding

Enhanced Training of Query-Based Object Detection via Selective Query Recollection

Fangyi Chen · Han Zhang · Kai Hu · Yu-Kai Huang · Chenchen Zhu · Marios Savvides

Vision Transformers are Good Mask Auto-Labelers

Shiyi Lan · Xitong Yang · Zhiding Yu · Zuxuan Wu · Jose Alvarez · Anima Anandkumar

Weakly Supervised Posture Mining for Fine-grained Classification

Zhenchao Tang · Hualin Yang · Calvin Yu-Chian Chen

IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients

Ruo Yang · Binghui Wang · Mustafa Bilgic

Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm

Yichen Xie · Han Lu · Junchi Yan · Xiaokang Yang · Masayoshi Tomizuka · Wei Zhan

Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation

Zhen Zhao · Sifan Long · Jimin Pi · Jingdong Wang · Luping Zhou

Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation

Yan Jin · Mengke LI · Yang Lu · Yiu-ming Cheung · Hanzi Wang

Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

Chaohui Yu · Qiang Zhou · Jingliang Li · Jianlong Yuan · Zhibin Wang · Fan Wang

Out-of-Candidate Rectification for Weakly Supervised Semantic Segmentation

Zesen Cheng · Pengchong Qiao · Kehan Li · Siheng Li · Pengxu Wei · Xiangyang Ji · Li Yuan · Chang Liu · Jie Chen

FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation

Junjie He · Pengyu Li · Yifeng Geng · Xuansong Xie

On Calibrating Semantic Segmentation Models: Analyses and An Algorithm

Dongdong Wang · Boqing Gong · Liqiang Wang

Content-aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers

Chenyang Lu · Daan de Geus · Gijs Dubbelman

Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark

Deyi Ji · Feng Zhao · Hongtao Lu · Mingyuan Tao · Jieping Ye

Few-shot Semantic Image Synthesis with Class Affinity Transfer

Marlene Careil · Jakob Verbeek · Stéphane Lathuilière

Network-free, unsupervised semantic segmentation with synthetic images

Qianli Feng · Raghudeep Gadde · Wentong Liao · Eduard Ramon · Aleix Martinez

MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence

Yixuan Sun · Yiwen Huang · HaiJing Guo · Yuzhou Zhao · Runmin Wu · Yizhou Yu · Weifeng Ge · Wenqiang Zhang

GRES: Generalized Referring Expression Segmentation

Chang Liu · Henghui Ding · Xudong Jiang

Semantic Prompt for Few-Shot Image Recognition

Wentao Chen · Chenyang Si · Zhang Zhang · Liang Wang · Zilei Wang · Tieniu Tan

Contrastive Grouping with Transformer for Referring Image Segmentation

Jiajin Tang · Ge Zheng · Cheng Shi · Sibei YANG

Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning

Xiaocheng Lu · Song Guo · Ziming Liu · Jingcai Guo

GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning

Zhenyu Xie · Zaiyu Huang · Xin Dong · Fuwei Zhao · Haoye Dong · Xijin Zhang · Feida Zhu · Xiaodan Liang

OvarNet: Towards Open-vocabulary Object Attribute Recognition

Keyan Chen · Xiaolong Jiang · Yao Hu · Xu Tang · Yan Gao · Jianqi Chen · Weidi Xie

HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models

Shan Ning · Longtian Qiu · Yongfei Liu · Xuming He

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment

Lewei Yao · Jianhua Han · Xiaodan Liang · Dan Xu · Wei Zhang · Zhenguo Li · Hang Xu

Data-efficient Large Scale Place Recognition with Graded Similarity Supervision

Maria Leyva-Vallina · Nicola Strisciuglio · Nicolai Petkov

ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing

Zequn Zeng · Hao Zhang · Zhengjue Wang · Ruiying Lu · Dongsheng Wang · Bo Chen

Deep Hashing with Minimal-Distance-Separated Hash Centers

Liangdao Wang · Yan Pan · Cong Liu · Hanjiang Lai · Jian Yin · Ye Liu

Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment

Runqi Wang · Hao ZHENG · Xiaoyue Duan · Jianzhuang Liu · Yuning Lu · Tian Wang · Songcen Xu · Baochang Zhang

Masked Autoencoding Does Not Help Natural Language Supervision at Scale

Floris Weers · Vaishaal Shankar · Angelos Katharopoulos · Yinfei Yang · Tom Gunter

Improving Cross-Modal Retrieval with Set of Diverse Embeddings

Dongwon Kim · Namyup Kim · Suha Kwak

Revisiting Self-Similarity: Structural Embedding for Image Retrieval

Seongwon Lee · Suhyeon Lee · Hongje Seong · Euntai Kim

LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

Jihye Park · Sunwoo Kim · Soohyun Kim · Seokju Cho · Jaejun Yoo · Youngjung Uh · Seungryong Kim

Scaling Language-Image Pre-training via Masking

Yanghao Li · Haoqi Fan · Ronghang Hu · Christoph Feichtenhofer · Kaiming He

Variational Distribution Learning for Unsupervised Text-to-Image Generation

MINSOO KANG · Doyup Lee · Jiseob Kim · Saehoon Kim · Bohyung Han

Semantic-Conditional Diffusion Networks for Image Captioning

Jianjie Luo · Yehao Li · Yingwei Pan · Ting Yao · Jianlin Feng · Hongyang Chao · Tao Mei

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

Fengyin Lin · Mingkang Li · Da Li · Timothy Hospedales · Yi-Zhe Song · Yonggang Qi

MAGVLT: Masked Generative Vision-and-Language Transformer

Sungwoong Kim · Daejin Jo · Donghoon Lee · Jongmin Kim

SketchXAI: A First Look at Explainability for Human Sketches

Zhiyu Qu · Yulia Gryaditskaya · Ke Li · Kaiyue Pang · Tao Xiang · Yi-Zhe Song

Learning Geometry-aware Representations by Sketching

Hyundo Lee · Inwoo Hwang · Hyunsung Go · Won-Seok Choi · Kibeom Kim · Byoung-Tak Zhang

Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training

Dezhao Luo · Jiabo Huang · Shaogang Gong · Hailin Jin · Yang Liu

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

Syed Talal Wasim · Muhammad Muzammal Naseer · Salman Khan · Fahad Khan · Mubarak Shah

Query-Dependent Video Representation for Moment Retrieval and Highlight Detection

WonJun Moon · Sangeek Hyun · SangUk Park · Dongchan Park · Jae-Pil Heo

Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning

Wei Ji · Renjie Liang · Zhedong Zheng · Wenqiao Zhang · Shengyu Zhang · Juncheng Li · Mengze Li · Tat-Seng Chua

Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels

Jingqiu Zhou · Linjiang Huang · Liang Wang · Si Liu · Hongsheng Li

PivoTAL: Prior-Driven Supervision for Weakly-Supervised Temporal Action Localization

Mamshad Nayeem Rizve · Gaurav Mittal · Ye Yu · Matthew Hall · Sandra Sajeev · Mubarak Shah · Mei Chen

Open Set Action Recognition via Multi-Label Evidential Learning

Chen Zhao · Dawei Du · Anthony Hoogs · Christopher Funk

Object Discovery from Motion-Guided Tokens

Zhipeng Bao · Pavel Tokmakov · Yu-Xiong Wang · Adrien Gaidon · Martial Hebert

Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling

Ryo Hachiuma · Fumiaki Sato · Taiki Sekii

Video Test-Time Adaptation for Action Recognition

Wei Lin · Muhammad Mirza Mirza · Mateusz Kozinski · Horst Possegger · Hilde Kuehne · Horst Bischof

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline

Tiantian Geng · Teng WANG · Jinming Duan · Runmin Cong · Feng Zheng

A Light Weight Model for Active Speaker Detection

Junhua Liao · Haihan Duan · Kanghui Feng · WanBing Zhao · Yanbing Yang · Liangyin Chen

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

Paul Hongsuck Seo · Arsha Nagrani · Cordelia Schmid

Egocentric Audio-Visual Object Localization

Chao Huang · Yapeng Tian · Anurag Kumar · Chenliang Xu

An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling

Tsu-Jui Fu · Linjie Li · Zhe Gan · Kevin Lin · William Yang Wang · Lijuan Wang · Zicheng Liu

Towards End-to-End Generative Modeling of Long Videos with Memory-Efficient Bidirectional Transformers

Jaehoon Yoo · Semin Kim · Doyup Lee · Chiheon Kim · Seunghoon Hong

Unifying Short and Long-Term Tracking with Graph Hierarchies

Orcun Cetintas · Guillem Braso · Laura Leal-Taixé

Hierarchical Neural Memory Network for Low Latency Event Processing

Ryuhei Hamaguchi · Yasutaka Furukawa · Masaki Onishi · Ken Sakurada

Mask-Free Video Instance Segmentation

Lei Ke · Martin Danelljan · Henghui Ding · Yu-Wing Tai · Chi-Keung Tang · Fisher Yu

Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection

Shengyang Sun · Xiaojin Gong

Breaking the “Object” in Video Object Segmentation

Pavel Tokmakov · Jie Li · Adrien Gaidon

VideoTrack: Learning to Track Objects via Video Transformer

Fei Xie · Lei Chu · Jiahao Li · Yan Lu · Chao Ma

Recurrence without Recurrence: Stable Video Landmark Detection with Deep Equilibrium Models

Paul Micaelli · Arash Vahdat · Hongxu Yin · Jan Kautz · Pavlo Molchanov

Unbiased Scene Graph Generation in Videos

Sayak Nag · Kyle Min · Subarna Tripathi · Amit Roy-Chowdhury

Graph Representation for Order-aware Visual Transformation

Yue Qiu · Yanjun Sun · Fumiya Matsuzawa · Kenji Iwata · Hirokatsu Kataoka

Prototype-based Embedding Network for Scene Graph Generation

Chaofan Zheng · Xinyu Lyu · Lianli Gao · Bo Dai · Jingkuan Song

Efficient Mask Correction for Click-Based Interactive Image Segmentation

Fei Du · Jianlong Yuan · Zhibin Wang · Fan Wang

G-MSM: Unsupervised Multi-Shape Matching with Graph-based Affinity Priors

Marvin Eisenberger · Aysim Toker · Laura Leal-Taixé · Daniel Cremers

Shape-Erased Feature Learning for Visible-Infrared Person Re-Identification

Jiawei Feng · Ancong Wu · Wei-Shi Zheng

Mixed Autoencoder for Self-supervised Visual Representation Learning

Kai Chen · Zhili LIU · Lanqing HONG · Hang Xu · Zhenguo Li · Dit-Yan Yeung

Stare at What You See: Masked Image Modeling without Reconstruction

Hongwei Xue · Peng Gao · Hongyang Li · Yu Qiao · Hao Sun · Houqiang Li · Jiebo Luo

ResFormer: Scaling ViTs with Multi-Resolution Training

Rui Tian · Zuxuan Wu · Qi Dai · Han Hu · Yu Qiao · Yu-Gang Jiang

Seeing Beyond the Brain: Conditional Diffusion Model with Sparse Masked Modeling for Vision Decoding

Zijiao Chen · Jiaxin Qing · Tiange Xiang · Wan Lin Yue · Juan Zhou Zhou

DropKey for Vision Transformer

Bonan Li · Yinhan Hu · Xuecheng Nie · Congying Han · Xiangjian Jiang · Tiande Guo · Luoqi Liu

Vision Transformer with Super Token Sampling

Huaibo Huang · Xiaoqiang Zhou · Jie Cao · Ran He · Tieniu Tan

Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers

Cong Wei · Brendan Duke · Ruowei Jiang · Parham Aarabi · Graham Taylor · Florian Shkurti

All are Worth Words: A ViT Backbone for Diffusion Models

Fan Bao · Shen Nie · Kaiwen Xue · Yue Cao · Chongxuan Li · Hang Su · Jun Zhu

Boost Vision Transformer with GPU-Friendly Sparsity and Quantization

Chong Yu · Tao Chen · Zhongxue Gan · Jiayuan Fan

DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training

Yihao Chen · Xianbiao Qi · Jianan Wang · Lei Zhang

Structured Sparsity Learning for Efficient Video Super-Resolution

Bin Xia · Jingwen He · Yulun Zhang · Yitong Wang · Yapeng Tian · Wenming Yang · Luc Van Gool

Efficient Semantic Segmentation by Altering Resolutions for Compressed Videos

Yubin Hu · Yuze He · Yanghao Li · Jisheng Li · Yuxing Han · jiangtao wen · Yong-jin Liu

Neural Video Compression with Diverse Contexts

Jiahao Li · Bin Li · Yan Lu

Large-capacity and Flexible Video Steganography via Invertible Neural Network

Chong Mou · Youmin Xu · Jiechong Song · Chen Zhao · Bernard Ghanem · Jian Zhang

Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization

Mengqi Huang · Zhendong Mao · Zhuowei Chen · Yongdong Zhang

Binary Latent Diffusion

Ze Wang · Jiang Wang · Zicheng Liu · Qiang Qiu

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

Andreas Blattmann · Robin Rombach · Huan Ling · Tim Dockhorn · Seung Wook Kim · Sanja Fidler · Karsten Kreis

Diffusion Probabilistic Model Made Slim

Xingyi Yang · Daquan Zhou · Jiashi Feng · Xinchao Wang

Solving 3D Inverse Problems from Pre-trained 2D Diffusion Models

Hyungjin Chung · Dohoon Ryu · Michael McCann · Marc Klasky · Jong Ye

EDICT: Exact Diffusion Inversion via Coupled Transformations

Bram Wallace · Akash Gokul · Nikhil Naik

Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models

Patrick Schramowski · Manuel Brack · Björn Deiseroth · Kristian Kersting

GLIGEN: Open-Set Grounded Text-to-Image Generation

Yuheng Li · Haotian Liu · Qingyang Wu · Fangzhou Mu · Jianwei Yang · Jianfeng Gao · Chunyuan Li · Yong Jae Lee

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Nataniel Ruiz · Yuanzhen Li · Varun Jampani · Yael Pritch · Michael Rubinstein · Kfir Aberman

LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

Guangcong Zheng · Xianpan Zhou · Xuewei Li · Zhongang Qi · Ying Shan · Xi Li

Affordance Diffusion: Synthesizing Hand-Object Interactions

Yufei Ye · Xueting Li · Abhinav Gupta · Shalini De Mello · Stan Birchfield · Jiaming Song · Shubham Tulsiani · Sifei Liu

SceneComposer: Any-Level Semantic Image Synthesis

Yu Zeng · Zhe Lin · Jianming Zhang · Qing Liu · John Collomosse · Jason Kuen · Vishal Patel

Handwritten Text Generation from Visual Archetypes

Vittorio Pippi · Silvia Cascianelli · Rita Cucchiara

Referring Image Matting

Jizhizi Li · Jing Zhang · Dacheng Tao

Neural Transformation Fields for Arbitrary-Styled Font Generation

Bin Fu · Junjun He · Jianjun Wang · Yu Qiao

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model

Shaoan Xie · Zhifei Zhang · Zhe Lin · Tobias Hinz · Kun Zhang

Masked and Adaptive Transformer for Exemplar Based Image Translation

chang jiang · Fei Gao · Biao Ma · Lin Yuhao · Nannan Wang · Gang Xu

Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

Thuan Nguyen · Thanh Le · Anh Tran

RWSC-Fusion: Region-Wise Style-Controlled Fusion Network for the Prohibited X-ray Security Image Synthesis

luwen duan · Min Wu · Lijian Mao · Jun Yin · Xiong Jianping · Xi Li

Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method

Ran Yi · Haoyuan Tian · Zhihao Gu · Yu-Kun Lai · Paul Rosin

Omni Aggregation Networks for Lightweight Image Super-Resolution

Hang Wang · Xuanhong Chen · Bingbing Ni · Yutian Liu · Jinfan Liu

Activating More Pixels in Image Super-Resolution Transformer

Xiangyu Chen · Xintao Wang · Jiantao Zhou · Yu Qiao · Chao Dong

Spatial-Frequency Mutual Learning for Face Super-Resolution

Chenyang Wang · Junjun Jiang · Zhiwei Zhong · Xianming Liu

Kernel Aware Resampler

Michael Bernasconi · Abdelaziz Djelouah · Farnood Salehi · Markus Gross · Christopher Schroers

RGB no more: Minimally-decoded JPEG Vision Transformers

Jeongsoo Park · Justin Johnson

Multi-Realism Image Compression with a Conditional Generator

Eirikur Agustsson · David Minnen · George Toderici · Fabian Mentzer

Learning to Exploit the Sequence-Specific Prior Knowledge for Image Processing Pipelines Optimization

Haina Qin · Longfei Han · Weihua Xiong · Juan Wang · Wentao Ma · Bing Li · Weiming Hu

Quality-aware Pre-trained Models for Blind Image Quality Assessment

Kai Zhao · Kun Yuan · Ming Sun · Mading Li · Xing Wen

Robust Unsupervised StyleGAN Image Restoration

Yohan Poirier-Ginter · Jean-Francois Lalonde

RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors

Rui-Qi Wu · Zheng-Peng Duan · Chunle Guo · Zhi Chai · Chongyi Li

Toward Stable, Interpretable, and Lightweight Hyperspectral Super-resolution

Wenjin Guo · Weiying Xie · Kai Jiang · Yunsong Li · Jie Lei · Leyuan Fang

Residual Degradation Learning Unfolding Framework with Mixing Priors across Spectral and Spatial for Compressive Spectral Imaging

Yubo Dong · Dahua Gao · Tian Qiu · Yuyan Li · Minxi Yang · Guangming Shi

Learning a Simple Low-light Image Enhancer from Paired Low-light Instances

Zhenqi Fu · Yan Yang · Xiaotong Tu · Yue Huang · Xinghao Ding · Kai-Kuang Ma

Learning a Deep Color Difference Metric for Photographic Images

Haoyu Chen · Zhihua Wang · Yang Yang · Qilin Sun · Kede Ma

Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models

Cheng Guo · Leidong Fan · Ziyu Xue · Xiuhua Jiang

BiasBed – Rigorous Texture Bias Evaluation

Nikolai Kalischek · Rodrigo Daudt · Torben Peters · Reinhard Furrer · Jan D. Wegner · Konrad Schindler

A Unified HDR Imaging Method with Pixel and Patch Level

Qingsen Yan · Weiye Chen · song zhang · Yu Zhu · Jinqiu Sun · Yanning Zhang

Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement

Nancy Mehta · Akshay Dudhane · Subrahmanyam Murala · Syed Waqas Zamir · Salman Khan · Fahad Khan

Deep Discriminative Spatial and Temporal Network for Efficient Video Deblurring

Jinshan Pan · Boming Xu · Jiangxin Dong · Jianjun Ge · Jinhui Tang

1000 FPS HDR Video with a Spike-RGB Hybrid Camera

Yakun Chang · Chu Zhou · Yuchen Hong · hu liwen · Chao Xu · Tiejun Huang · Boxin Shi

Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

Kun Zhou · Wenbo Li · Xiaoguang Han · Jiangbo Lu

Range-nullspace Video Frame Interpolation with Focalized Motion Estimation

Zhiyang Yu · Yu Zhang · Dongqing Zou · Xijun Chen · Jimmy Ren · Shunqing Ren

Deep Polarization Reconstruction with PDAVIS Events

Haiyang Mei · Zuowen Wang · Xin Yang · Xiaopeng Wei · Tobi Delbruck

Unsupervised space-time network for temporally-consistent segmentation of multiple motions

Etienne Meunier · Patrick Bouthemy

NeMo: Learning 3D Neural Motion Fields from Multiple Video Instances of the Same Action

Kuan-Chieh Wang · Zhenzhen Weng · Maria Xenochristou · Joao Araujo · Jeffrey Gu · Karen Liu · Serena Yeung

TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification

Haocong Rao · Chunyan Miao

FLAG3D: A 3D Fitness Activity Dataset with Language Instruction

Yansong Tang · Jinpeng Liu · Aoyang Liu · Bin Yang · Wenxun Dai · Yongming Rao · Jiwen Lu · Jie Zhou · Xiu Li

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation

Bowen Zhang · Chenyang Qi · Pan Zhang · Bo Zhang · HsiangTao Wu · Dong Chen · Qifeng Chen · Yong Wang · Fang Wen

Feature Representation Learning with Adaptive Displacement Generation and Transformer Fusion for Micro-Expression Recognition

Zhijun Zhai · Jianhui Zhao · Chengjiang Long · Wenju Xu · He Shuangjiang · huijuan zhao

Clothing-Change Feature Augmentation for Person Re-Identification

Ke Han · Shaogang Gong · Yan Huang · Liang Wang · Tieniu Tan

MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors

Yuang Zhang · Tiancai Wang · Xiangyu Zhang

Camouflaged Object Detection with Feature Decomposition and Edge Reconstruction

Chunming He · Kai Li · Yachao Zhang · Longxiang Tang · Yulun Zhang · Zhenhua Guo · Xiu Li

Source-free Adaptive Gaze Estimation with Uncertainty Reduction

Xin Cai · Jiabei Zeng · Shiguang Shan · Xilin CHEN

PyPose: A Library for Robot Learning with Physics-based Optimization

Chen Wang · Dasong Gao · Kuan Xu · Junyi Geng · Yaoyu Hu · Yuheng Qiu · Bowen Li · Fan Yang · Brady Moon · Abhinav Pandey · Aryan FNU · Jiahe Xu · Tianhao Wu · Haonan He · Daning Huang · Zhongqiang Ren · Shibo Zhao · Taimeng Fu · Pranay Reddy Anthireddy · Xiao Lin · Wenshan Wang · Jingnan Shi · Rajat Talak · Kun Cao · Yi Du · Han Wang · Huai Yu · Shanzhao Wang · Siyu Chen · Ananth Kashyap · Rohan Bandaru · Karthik Dantu · Jiajun Wu · Lihua Xie · Luca Carlone · Marco Hutter · Sebastian Scherer

Stimulus Verification is a Universal and Effective Sampler in Multi-modal Human Trajectory Prediction

Jianhua Sun · Yuxuan Li · Liang Chai · Cewu Lu

StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments

Sean Kulinski · Nicholas Waytowich · James Hare · David I. Inouye

ProphNet: Efficient Agent-Centric Motion Forecasting with Anchor-Informed Proposals

Xishun Wang · Tong Su · Fang Da · Xiaodong Yang

Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving

Xiaosong Jia · Penghao Wu · Li Chen · Jiangwei Xie · Conghui He · Junchi Yan · Hongyang Li

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

SHIXIANG TANG · Cheng Chen · Meilin Chen · Qingsong Xie · Yizhou Wang · Yuanzheng Ci · LEI BAI · Feng Zhu · Haiyang Yang · Li Yi · Rui Zhao · Wanli Ouyang

BEV-Guided Multi-Modality Fusion for Driving Perception

Yunze Man · Liangyan Gui · Yu-Xiong Wang

Robust and Scalable Gaussian Process Regression and Its Applications

Yifan Lu · Jiayi Ma · Leyuan Fang · Xin Tian · Junjun Jiang

Tangentially Elongated Gaussian Belief Propagation for Event-based Incremental Optical Flow Estimation

Jun Nagata · Yusuke Sekikawa

Adaptive Annealing for Robust Geometric Estimation

Sidhartha Chitturi · Lalit Manam · Venu Madhav Govindu

Iterative Geometry Encoding Volume for Stereo Matching

Xu Gangwei · Xianqi Wang · Xiaohuan Ding · Xin Yang

PMatch: Paired Masked Image Modeling for Dense Geometric Matching

Shengjie Zhu · Xiaoming Liu

Adaptive Spot-Guided Transformer for Consistent Local Feature Matching

Jiahuan Yu · Jiahao Chang · Jianfeng He · Tianzhu Zhang · Jiyang Yu · Feng Wu

Learning Rotation-Equivariant Features for Visual Correspondence

Jongmin Lee · Byungjin Kim · Seungwook Kim · Minsu Cho

UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement

Sisi You · Hantao Yao · Bing-Kun BAO · Changsheng Xu

Conjugate Product Graphs for Globally Optimal 2D-3D Shape Matching

Paul Rötzer · Zorah Laehner · Florian Bernard

LP-DIF: Learning Local Pattern-specific Deep Implicit Function for 3D Objects and Scenes

Meng Wang · Yushen Liu · Yue Gao · Kanle Shi · Yi Fang · Zhizhong Han

HGNet: Learning Hierarchical Geometry from Points, Edges, and Surfaces

Ting Yao · Yehao Li · Yingwei Pan · Tao Mei

Neural Intrinsic Embedding for Non-rigid Point Cloud Matching

puhua jiang · Mingze Sun · Ruqi Huang

PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering

Fuchen Long · Ting Yao · Zhaofan Qiu · Lusong Li · Tao Mei

Self-positioning Point-based Transformer for Point Cloud Understanding

Jinyoung Park · Sanghyeok Lee · Sihyeon Kim · Yunyang Xiong · Hyunwoo Kim

PointConvFormer: Revenge of the Point-Based Convolution

Wenxuan Wu · Li Fuxin · Qi Shan

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Renrui Zhang · Liuhui Wang · Yu Qiao · Peng Gao · Hongsheng Li

Geometry and Uncertainty-Aware 3D Point Cloud Class-Incremental Semantic Segmentation

Yuwei Yang · Munawar Hayat · Zhao Jin · Chao Ren · Yinjie Lei

Learning Weather-General and Weather-Specific Features for Image Restoration Under Multiple Adverse Weather Conditions

Yurui Zhu · Tianyu Wang · Xueyang Fu · Xuanyu Yang · Xin Guo · Jifeng Dai · Yu Qiao · Xiaowei Hu

PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models

Minghua Liu · Yinhao Zhu · Hong Cai · Shizhong Han · Zhan Ling · Fatih Porikli · Hao Su

Semi-Weakly Supervised Object Kinematic Motion Prediction

Gengxin Liu · Qian Sun · Haibin Huang · Chongyang Ma · Yulan Guo · Li Yi · Hui Huang · Ruizhen Hu

Implicit Surface Contrastive Clustering for LiDAR Point Clouds

Zaiwei Zhang · Min Bai · Li Erran Li

LaserMix for Semi-Supervised LiDAR Semantic Segmentation

Lingdong Kong · Jiawei Ren · Liang Pan · Ziwei Liu

MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving

Jiale Li · Hang Dai · Hao Han · Yong Ding

GraVoS: Voxel Selection for 3D Point-Cloud Detection

Oren Shrout · Yizhak Ben-Shabat · Ayellet Tal

VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking

Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia

Virtual Sparse Convolution for Multimodal 3D Object Detection

Hai Wu · Chenglu Wen · Shaoshuai Shi · Xin Li · Cheng Wang

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

Yang Jiao · ZEQUN JIE · Shaoxiang Chen · Jingjing Chen · Lin Ma · Yu-Gang Jiang

OrienterNet: Visual Localization in 2D Public Maps with Neural Matching

Paul-Edouard Sarlin · Daniel DeTone · Tsun-Yi Yang · Armen Avetisyan · Julian Straub · Tomasz Malisiewicz · Samuel Rota Bulò · Richard Newcombe · Peter Kontschieder · Vasileios Balntas

Uncertainty-aware Vision-based Metric Cross-view Geolocalization

Florian Fervers · Sebastian Bullinger · Christoph Bodensteiner · Michael Arens · Rainer Stiefelhagen

BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection

Lei Yang · Kaicheng Yu · tao tang · Jun Li · Kun Yuan · Li Wang · Xinyu Zhang · Peng Chen

Understanding the Robustness of 3D Object Detection with Bird’s-Eye-View Representations in Autonomous Driving

Zijian Zhu · Yichi Zhang · Hai Chen · Yinpeng Dong · Shu Zhao · Wenbo Ding · Jiachen Zhong · Shibao Zheng

Object Detection with Self-Supervised Scene Adaptation

ZEKUN ZHANG · Minh Hoai

AeDet: Azimuth-invariant Multi-view 3D Object Detection

Chengjian Feng · ZEQUN JIE · Yujie Zhong · Xiangxiang Chu · Lin Ma

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection

Kaixin Xiong · Shi Gong · Xiaoqing Ye · Xiao Tan · Ji Wan · Errui Ding · Jingdong Wang · Xiang Bai

VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

Ziqin Wang · Bowen Cheng · Lichen Zhao · Dong Xu · Yang Tang · Lyu Sheng

Modality-invariant Visual Odometry for Embodied Vision

Marius Memmel · Roman Bachmann · Amir Zamir

Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes

Rui Li · Dong Gong · Wei Yin · Hao Chen · Yu Zhu · Kaixuan Wang · Xiaozhi Chen · Jinqiu Sun · Yanning Zhang

OmniVidar: Omnidirectional Depth Estimation from Multi-Fisheye Images

Sheng Xie · Daochuan Wang · Yun-Hui Liu

DINN360: Deformable Invertible Neural Networks for Latitude-aware 360° Image Rescaling

Yichen Guo · Mai Xu · Lai Jiang · Ning Li · Leon Sigal · Yunjin Chen

GeoMVSNet: Learning Multi-View Stereo with Geometry Perception

Zhe Zhang · Rui Peng · Yuxi Hu · Ronggang Wang

A Practical Stereo Depth System for Smart Glasses

Jialiang Wang · Daniel Scharstein · Akash Bapat · Kevin Blackburn-Matzen · Matthew Yu · Jonathan Lehman · Suhib Alsisan · Yanghan Wang · Sam Tsai · Jan-Michael Frahm · Zijian He · Peter Vajda · Michael Cohen · Matt Uyttendaele

DC²: Dual-Camera Defocus Control by Learning to Refocus

Hadi AlZayer · Abdullah Abuolaim · Leung Chun Chan · Yang Yang · Ying Lou · Jia-Bin Huang · Abhishek Kar

iDisc: Internal Discretization for Monocular Depth Estimation

Luigi Piccinelli · Christos Sakaridis · Fisher Yu

SfM-TTR: Using Structure from Motion for Test-Time Refinement of Single-View Depth Networks

Sergio Izquierdo · Javier Civera

Inverting the Imaging Process by Learning an Implicit Camera Model

Xin Huang · Qi Zhang · Ying Feng · Hongdong Li · Qing Wang

Learning to Measure the Point Cloud Reconstruction Loss in a Representation Space

Tianxin Huang · Zhonggan Ding · Jiangning Zhang · Ying Tai · Zhenyu Zhang · Mingang Chen · Chengjie Wang · Yong Liu

Better “CMOS” Produces Clearer Images: Learning Space-Variant Blur Estimation for Blind Image Super-Resolution

Xuhai Chen · Jiangning Zhang · Chao Xu · Yabiao Wang · Chengjie Wang · Yong Liu

Delivering Arbitrary-Modal Semantic Segmentation

Jiaming Zhang · Ruiping Liu · Hao Shi · Kailun Yang · Simon Reiß · Haodong Fu · Kunyu Peng · Kaiwei Wang · Rainer Stiefelhagen

Efficient Hierarchical Entropy Model for Learned Point Cloud Compression

Rui Song · Chunyang Fu · Shan Liu · Ge Li

Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

Ruyang Liu · Jingjia Huang · Ge Li · Jiashi Feng · Xinglong Wu · Thomas Li

Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

Feng Liang · Bichen Wu · Xiaoliang Dai · Kunpeng Li · Yinan Zhao · Hang Zhang · Peizhao Zhang · Peter Vajda · Diana Marculescu

Imagic: Text-Based Real Image Editing with Diffusion Models

Bahjat Kawar · Shiran Zada · Oran Lang · Omer Tov · Huiwen Chang · Tali Dekel · Inbar Mosseri · michal Irani

Neumann Network with Recursive Kernels for Single Image Defocus Deblurring

Yuhui Quan · Zicong Wu · Hui Ji

Transfer4D: A framework for frugal motion capture and deformation transfer

Shubh Maheshwari · Rahul Narain · Ramya Hebbalaguppe

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

Meng Cao · Fangyun Wei · Can Xu · Xiubo Geng · Long Chen · Can Zhang · Yuexian Zou · Tao Shen · Daxin Jiang

X³KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection

Marvin Klingner · Shubhankar Borse · Varun Ravi Kumar · Behnaz Rezaei · Venkatraman Narayanan · Senthil Yogamani · Fatih Porikli

AnyFlow: Arbitrary Scale Optical Flow with Implicit Neural Representation

Hyunyoung Jung · Zhuo Hui · Lei Luo · Haitao Yang · Feng Liu · Sungjoo Yoo · Rakesh Ranjan · Denis Demandolx

IterativePFN: True Iterative Point Cloud Filtering

Dasith de Silva Edirimuni · Xuequan Lu · Zhiwen Shao · Gang Li · Antonio Robles-Kelly · Ying He

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

Mert Bulent Sariyildiz · Karteek Alahari · Diane Larlus · Yannis Kalantidis

Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness

Zhijie Shen · Zishuo Zheng · Chunyu Lin · Lang Nie · Kang Liao · Shuai Zheng · Yao Zhao

Exploring Incompatible Knowledge Transfer in Few-shot Image Generation

Yunqing Zhao · Chao Du · Milad Abdollahzadeh · Tianyu Pang · Min Lin · Shuicheng YAN · Ngai-man Cheung

OmniObject3D: Large Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation

Tong Wu · Jiarui Zhang · Xiao Fu · Yuxin WANG · Jiawei Ren · Liang Pan · Wenyan Wu · Lei Yang · Jiaqi Wang · Chen Qian · Dahua Lin · Ziwei Liu

CelebV-Text: A Large-Scale Facial Text-Video Dataset

Jianhui Yu · Hao Zhu · Liming Jiang · CHEN CHANGE LOY · Weidong Cai · Wenyan Wu

TensoIR: Tensorial Inverse Rendering

Haian Jin · Isabella Liu · Peijia Xu · Xiaoshuai Zhang · Songfang Han · Sai Bi · Xiaowei Zhou · Zexiang Xu · Hao Su

Simultaneously Short- and Long-Term Temporal Modeling for Semi-Supervised Video Semantic Segmentation

Jiangwei Lao · Weixiang Hong · Xin Guo · Yingying Zhang · Wang Jian · Jingdong Chen · Wei Chu

Integral Neural Networks

Kirill Solodskikh · Azim Kurbanov · Ruslan Aydarkhanov · Irina Zhelavskaya · Yury Parfenov · Dehua Song · Stamatios Lefkimmiatis

FEND: A Future Enhanced Distribution-Aware Contrastive Learning Framework For Long-tail Trajectory Prediction

Yuning Wang · Pu Zhang · LEI BAI · Jianru Xue

NeuralEditor: Editing Neural Radiance Fields via Manipulating Point Clouds

Junkun Chen · Jipeng Lyu · Yu-Xiong Wang

3D Line Mapping Revisited

Shaohui Liu · Yifan Yu · Rémi Pautrat · Marc Pollefeys · Viktor Larsson

Single View Scene Scale Estimation using Scale Field

Byeong-Uk Lee · Jianming Zhang · Yannick Hold-Geoffroy · In So Kweon

PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes

Ruoyu Wang · Zehao Yu · Shenghua Gao

Self-supervised Super-plane for Neural 3D Reconstruction

Botao Ye · Sifei Liu · Xueting Li · Ming-Hsuan Yang

NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization

Zhixiang Min · Bingbing Zhuang · Samuel Schulter · Buyu Liu · Enrique Dunn · Manmohan Chandraker

Multi-sensor large-scale dataset for multi-view 3D reconstruction

Oleg Voynov · Gleb Bobrovskikh · Pavel Karpyshev · Saveliy Galochkin · Andrei-Timotei Ardelean · Arseniy Bozhenko · Ekaterina Karmanova · Pavel Kopanev · Yaroslav Labutin-Rymsho · Ruslan Rakhimov · Aleksandr Safin · Valerii Serpiva · Alexey Artemov · Evgeny Burnaev · Dzmitry Tsetserukou · Denis Zorin

AutoRecon: Automated 3D Object Discovery and Reconstruction

Yuang Wang · Xingyi He · Sida Peng · Haotong Lin · Hujun Bao · Xiaowei Zhou

A Large-Scale Homography Benchmark

Daniel Barath · Dmytro Mishkin · Michal Polic · Wolfgang Förstner · Jiri Matas

SparsePose: Sparse-View Camera Pose Regression and Refinement

Samarth Sinha · Jason Zhang · Andrea Tagliasacchi · Igor Gilitschenski · David Lindell

Few-shot Geometry-Aware Keypoint Localization

Xingzhe He · Gaurav Bharaj · David Ferman · Helge Rhodin · Pablo Garrido

Self-Supervised Representation Learning for CAD

Benjamin Jones · Michael Hu · Milin Kodnongbua · Vladimir Kim · Adriana Schulz

IMP: Iterative Matching and Pose Estimation with Adaptive Pooling

Fei XUE · Ignas Budvytis · Roberto Cipolla

SMOC-Net: Leveraging Camera Pose for Self-Supervised Monocular Object Pose Estimation

Tao Tan · Qiulei Dong

Markerless Camera-to-Robot Pose Estimation via Self-supervised Sim-to-Real Transfer

Jingpei Lu · Florian Richter · Michael Yip

TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation

Taeyeop Lee · Jonathan Tremblay · Valts Blukis · Bowen Wen · Byeong-Uk Lee · Inkyu Shin · Stan Birchfield · In So Kweon · Kuk-Jin YOON

3D-POP – An automated annotation approach to facilitate markerless 2D-3D tracking of freely moving birds with marker-based motion capture

Hemal Naik · Hoi Hang Chan · Junran Yang · Mathilde Delacoux · Iain Couzin · Fumihiro Kano · Máté Nagy

Delving into Discrete Normalizing Flows on SO(3) Manifold for Probabilistic Rotation Modeling

Yulin Liu · Haoran Liu · Yingda Yin · Yang Wang · Baoquan Chen · He Wang

PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers

Zhongwei Qiu · Yang Qiansheng · Jian Wang · Haocheng Feng · Junyu Han · Errui Ding · Chang Xu · Dongmei Fu · Jingdong Wang

Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos

Yilin Wen · Hao Pan · Lei Yang · Jia Pan · Taku Komura · Wenping Wang

GarmentTracking: Category-Level Garment Pose Tracking

Han Xue · Wenqiang Xu · Jieyi Zhang · Tutian Tang · Yutong Li · Wenxin Du · Ruolin Ye · Cewu Lu

Towards Transferable Targeted Adversarial Examples

Zhibo Wang · Hongshan Yang · Yunhe Feng · Peng Sun · Hengchang Guo · Zhifei Zhang · Kui Ren

Proximal Splitting Adversarial Attack for Semantic Segmentation

Jérôme Rony · Jean-Christophe Pesquet · Ismail Ayed

T-SEA: Transfer-based Self-Ensemble Attack on Object Detection

Hao Huang · Ziyan Chen · Huanran Chen · Yongtao Wang · Kevin Zhang

Reinforcement Learning-Based Black-Box Model Inversion Attacks

Gyojin Han · Jaehyun Choi · Haeil Lee · Junmo Kim

Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks

Bingxu Mu · Zhenxing Niu · Le Wang · xue wang · Qiguang Miao · Rong Jin · Gang Hua

MEDIC: Remove Model Backdoors via Importance Driven Cloning

Qiuling Xu · Guanhong Tao · Jean Honorio · Yingqi Liu · Shengwei An · Guangyu Shen · Siyuan Cheng · Xiangyu Zhang

Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection

Lianyu Wang · Meng Wang · Daoqiang Zhang · Huazhu Fu

Adversarially Masking Synthetic to Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation

Guangrui Li · Guoliang Kang · Xiaohan Wang · Yunchao Wei · Yi Yang

Instance-Aware Domain Generalization for Face Anti-Spoofing

Qianyu Zhou · Ke-Yue Zhang · Taiping Yao · Xuequan Lu · Ran Yi · Shouhong Ding · Lizhuang Ma

Bias-Eliminating Augmentation Learning for Debiased Federated Learning

Yuan-Yi Xu · Ci-Siang Lin · Yu-Chiang Frank Wang

Adaptive Channel Sparsity for Federated Learning under System Heterogeneity

Dongping Liao · Xitong Gao · Yiren Zhao · Cheng-zhong Xu

Reliable and Interpretable Personalized Federated Learning

Zixuan Qin · Liu Yang · Qilong Wang · Yahong Han · Qinghua Hu

DaFKD: Domain-aware Federated Knowledge Distillation

Haozhao Wang · Yichen Li · Wenchao Xu · Ruixuan Li · Yufeng Zhan · Zhigang Zeng

SimpleNet: A Simple Network for Image Anomaly Detection and Localization

Zhikang Liu · Yiming Zhou · Yuansheng Xu · Zilei Wang

A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation

Congqi Cao · Yue Lu · PENG WANG · Yanning Zhang

Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers

Bin Ren · Yahui Liu · Yue Song · Wei Bi · Rita Cucchiara · Nicu Sebe · Wei Wang

ImageNet-E: Benchmarking Neural Network Robustness against Attribute Editing

Xiaodan Li · YUEFENG CHEN · Yao Zhu · Shuhui Wang · Rong Zhang · Hui Xue’

Private Image Generation with Dual-Purpose Auxiliary Classifier

Chen Chen · Daochang Liu · Siqi Ma · Surya Nepal · Chang Xu

Discriminator-Cooperated Feature Map Distillation for GAN Compression

Tie Hu · Mingbao Lin · Lizhou You · Fei Chao · Rongrong Ji

TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation

DEVAVRAT TOMAR · Guillaume Vray · Behzad Bozorgtabar · Jean-Philippe Thiran

Practical Network Acceleration with Tiny Sets

Guo-Hua Wang · Jianxin Wu

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers

Yijiang Liu · Huanrui Yang · ZHEN DONG · Kurt Keutzer · Li Du · Shanghang Zhang

Bias Mimicking: A Simple Sampling Approach for Bias Mitigation

Maan Qraitem · Kate Saenko · Bryan Plummer

Masked Images Are Counterfactual Samples for Robust Fine-tuning

Yao Xiao · Ziyi Tang · Pengxu Wei · Cong Liu · Liang Lin

Samples with Low Loss Curvature Improve Data Efficiency

Isha Garg · Kaushik Roy

Defining and Quantifying the Emergence of Sparse Concepts in DNNs

Jie Ren · Mingjie Li · Qirui Chen · Huiqi Deng · Quanshi Zhang

Network Expansion For Practical Training Acceleration

Ning Ding · Yehui Tang · Kai Han · Chao Xu · Yunhe Wang

AstroNet: When Astrocyte Meets Artificial Neural Network

Mengqiao Han · Liyuan Pan · Xiabi Liu

Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization

Xingxuan Zhang · Renzhe Xu · Han Yu · Hao Zou · Peng Cui

Re-basin via implicit Sinkhorn differentiation

Fidel A Guerrero Pena · Heitor Medeiros · Thomas Dubail · Masih Aminbeidokhti · Eric Granger · Marco Pedersoli

Tunable Convolutions with Parametric Multi-Loss Optimization

Matteo Maggioni · Thomas Tanay · Francesca Babiloni · Steven McDonagh · Ales Leonardis

Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning

Qiang He · Huangyuan Su · Jieyu Zhang · Xinwen Hou

Simulated Annealing in Early Layers Leads to Better Generalization

Amirmohammad Sarfi · Zahra Karimpour · Muawiz Chaudhary · Nasir Khalid · Mirco Ravanelli · Sudhir Mudur · Eugene Belilovsky

On the Stability-Plasticity Dilemma of Class-Incremental Learning

Dongwan Kim · Bohyung Han

Decoupling Learning and Remembering: a Bilevel Memory Framework with Knowledge Projection for Task-Incremental Learning

Wenju Sun · Qingyong Li · Jing Zhang · Wen Wang · Yangliao Geng

Generalization Matters: Loss Minima Flattening via Parameter Hybridization for Efficient Online Knowledge Distillation

Tianli Zhang · Mengqi Xue · Jiangtao Zhang · Haofei Zhang · Yu Wang · Lechao Cheng · Jie Song · Mingli Song

Regularizing Second-Order Influences for Continual Learning

Zhicheng Sun · Yadong MU · Gang Hua

Rethinking Feature-based Knowledge Distillation for Face Recognition

Jingzhi Li · Zidong Guo · Hui Li · Seungju Han · Ji-won Baek · Min Yang · Ran Yang · Sungjoo Suh

ERM-KTP: Knowledge-level Machine Unlearning via Knowledge Transfer

Shen Lin · Xiaoyu Zhang · Chenyang Chen · Xiaofeng Chen · Willy Susilo

Partial Network Cloning

Jingwen Ye · Songhua Liu · Xinchao Wang

Rebalancing Batch Normalization for Exemplar-based Class-Incremental Learning

Sungmin Cha · Sungjun Cho · Dasol Hwang · Sunwon Hong · Moontae Lee · Taesup Moon

1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions

Dongshuo Yin · Yiran Yang · Zhechao Wang · Hongfeng Yu · kaiwen wei · Xian Sun

MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models

Dohwan Ko · Joonmyung Choi · Hyeong Kyu Choi · Kyoung-Woon On · Byungseok Roh · Hyunwoo Kim

MDL-NAS: A Joint Multi-domain Learning framework for Vision Transformer

Shiguang Wang · TAO XIE · Jian Cheng · Xingcheng ZHANG · Haijun Liu

Independent Component Alignment for Multi-Task Learning

Dmitry Senushkin · Nikolay Patakin · Arsenii Kuznetsov · Anton Konushin

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

Fei Zhou · Peng Wang · Lei Zhang · Wei Wei · Yanning Zhang

Feature Alignment and Uniformity for Test Time Adaptation

Shuai Wang · Daoan Zhang · Zipei YAN · Jianguo Zhang · Rui Li

MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning

shicai wei · Chunbo Luo · Yang Luo

PMR: Prototypical Modal Rebalance for Multimodal Learning

Yunfeng FAN · Wenchao Xu · Haozhao Wang · Junxiao Wang · Song Guo

Upcycling Models under Domain and Category Shift

Sanqing Qu · Tianpei Zou · Florian Röhrbein · Cewu Lu · Guang Chen · Dacheng Tao · changjun jiang

MHPL: Minimum Happy Points Learning for Active Source Free Domain Adaptation

Fan Wang · Zhongyi Han · Zhiyan Zhang · Rundong He · Yilong Yin

COT: Unsupervised Domain Adaptation with Clustering and Optimal Transport

Yang Liu · Zhipeng Zhou · Baigui Sun

FREDOM: Fairness Domain Adaptation Approach to Semantic Scene Understanding

Thanh-Dat Truong · Ngan Le · Bhiksha Raj · Jackson Cothren · Khoa Luu

Transfer Knowledge from Head to Tail: Uncertainty Calibration under Long-tailed Distribution

Jiahao Chen · Bing Su

Balanced Product of Calibrated Experts for Long-Tailed Recognition

Emanuel Sanchez Aimar · Arvi Jonnarth · Michael Felsberg · Marco Kuhlmann

Why is the winner the best?

Matthias Eisenmann · Annika Reinke · Vivienn Weru · Minu Tizabi · Fabian Isensee · Tim Adler · Sharib Ali · Vincent Andrearczyk · Marc Aubreville · Ujjwal Baid · Spyridon Bakas · Niranjan Balu · Sophia Bano · Jorge Bernal · Sebastian Bodenstedt · Alessandro Casella · Veronika Cheplygina · Marie Daum · Marleen de Bruijne · Adrien Depeursinge · Reuben Dorent · Jan Egger · David Ellis · Sandy Engelhardt · Melanie Ganz · Noha Ghatwary · Gabriel Girard · Patrick Godau · Anubha Gupta · Lasse Hansen · Kanako Harada · Mattias Heinrich · Nicholas Heller · Alessa Hering · Arnaud Huaulmé · Pierre Jannin · Ali Emre Kavur · Oldřich Kodym · Michal Kozubek · Jianning Li · Hongwei Li · Jun Ma · Carlos Isla · bjoern menze · Alison Noble · Valentin Oreiller · Nicolas Padoy · Sarthak Pati · Kelly Payette · Tim Rädsch · Jonathan Rafael-Patino · Vivek Bawa · Stefanie Speidel · Carole Sudre · Kimberlin van Wijnen · Martin Wagner · Donglai Wei · Amine Yamlahi · Moi Hoon Yap · Chun Yuan · Maximilian Zenk · Aneeq Zia · David Zimmerer · Dogu Baran Aydogan · Binod Bhattarai · Louise Bloch · Raphael Brüngel · Jihoon Cho · Chanyeol Choi · DOU QI · Ivan Ezhov · Christoph M. Friedrich · Clifton Fuller · Rebati Gaire · Adrian Galdran · Álvaro García Faura · Maria Grammatikopoulou · SeulGi Hong · Mostafa Jahanifar · Ikbeom Jang · Abdolrahim Kadkhodamohammadi · Inha Kang · Florian Kofler · Satoshi Kondo · Hugo Kuijf · Mingxing Li · Huan Luu · Tomaž Martinčič · Pedro Morais · Mohamed Naser · Bruno Oliveira · David Owen · Subeen Pang · Jinah Park · Sung-Hong Park · Szymon Plotka · Elodie Puybareau · Nasir Rajpoot · Kanghyun Ryu · Numan Saeed · Adam Shephard · Pengcheng Shi · Dejan Štepec · Ronast Subedi · Guillaume Tochon · Helena Torres · Helene Urien · João Vilaça · Kareem Wahid · haojie wang · jiacheng wang · Liansheng Wang · Xiyue Wang · Benedikt Wiestler · Marek Wodzinski · Fangfang Xia · Juanying Xie · Zhiwei Xiong · Sen Yang · Yanwu Yang · Zixuan Zhao · Klaus Maier-Hein · Paul Jaeger · Annette Kopp-Schneider · Lena Maier-hein

SuperDisco: Super-Class Discovery Improves Visual Recognition for the Long-Tail

Yingjun Du · Jiayi Shen · Xiantong Zhen · Cees Snoek

Learning from Noisy Labels with Decoupled Meta Label Purifier

Yuanpeng Tu · Boshen Zhang · Yuxi Li · Liang Liu · Jian Li · Yabiao Wang · Chengjie Wang · Cai Zhao

Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos

Rohit Gupta · Anirban Roy · Sujeong Kim · Claire Christensen · Todd Grindal · Sarah Gerard · Madeline Cincebeaux · Ajay Divakaran · Mubarak Shah

MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset

Chen Feng · Ioannis Patras

HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization

Sungyeon Kim · Boseung Jeong · Suha Kwak

Bi-directional Distribution Alignment for Transductive Zero Shot Learning

Zhicai Wang · YANBIN HAO · Tingting Mu · Ouxiang Li · Shuo Wang · Xiangnan He

BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency

Shuo Yang · xu Pan · Kai Wang · Yang You · Hongxun Yao · Tongliang Liu · Min Xu

Exploring and Exploiting Uncertainty for Incomplete Multi-View Classification

Mengyao Xie · Zongbo Han · Changqing Zhang · Yichen Bai · Qinghua Hu

GCFAgg: Global and Cross-view Feature Aggregation for Multi-view Clustering

Weiqing Yan · Yuanyang Zhang · Chenlei Lv · Chang Tang · Guanghui Yue · Liang Liao · Weisi Lin

LINe: Out-of-Distribution Detection by Leveraging Important Neurons

Yong Hyun Ahn · Gyeong-Moon Park · Seong Tae Kim

Visual prompt tuning for generative transfer learning

Kihyuk Sohn · Huiwen Chang · Jose Lezama · Luisa Polania Cabrera · Han Zhang · Yuan Hao · Irfan Essa · Lu Jiang

Interventional Bag Multi-Instance Learning On Whole-Slide Pathological Images

Tiancheng Lin · Yu Zhimiao · Hongyu Hu · Yi Xu · Chang-Wen Chen

Image Quality-aware Diagnosis via Meta-knowledge Co-embedding

Haoxuan Che · Siyu Chen · Hao Chen

KiUT: Knowledge-injected U-Transformer for Radiology Report Generation

Zhongzhen Huang · Xiaofan Zhang · Shaoting Zhang

Hierarchical discriminative learning improves visual representations of biomedical microscopy

Cheng Jiang · Xinhai Hou · Akhil Kondepudi · Asadur Chowdury · Christian Freudiger · Daniel Orringer · Honglak Lee · Todd Hollon

Pseudo-label Guided Contrastive Learning for Semi-supervised Medical Image Segmentation

Hritam Basak · Zhaozheng Yin

FFF: Fragment-Guided Flexible Fitting for Building Complete Protein Structures

Weijie Chen · Xinyan Wang · Yuhang Wang

Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images

Ming Y. Lu · Bowen Chen · Andrew Zhang · Drew Williamson · Richard Chen · Tong Ding · Long Le · Yung-Sung Chuang · Faisal Mahmood

ProD: Prompting-to-disentangle Domain Knowledge for Cross-domain Few-shot Image Classification

Tianyi Ma · Yifan Sun · Zongxin Yang · Yi Yang

Open-Set Representation Learning through Combinatorial Embedding

Geeho Kim · Junoh Kang · Bohyung Han

Multiclass Confidence and Localization Calibration for Object Detection

Bimsara Pathiraja · Malitha Gunawardhana · Muhammad Haris Khan

Distilling Scale-Aware Knowledge in Small Object Detector

Yichen Zhu · Qiqi Zhou · Ning Liu · Zhiyuan Xu · Zhicai Ou · mou xiaofeng · Jian Tang

Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection

Jingyi Xu · Hieu Le · Dimitris Samaras

DETRs with Hybrid Matching

Ding Jia · Yuhui Yuan · Haodi He · Xiaopei Wu · Haojun Yu · Weihong Lin · Lei Sun · Chao Zhang · Han Hu

Adaptive Sparse Pairwise Loss for Object Re-Identification

Xiao Zhou · Yujie Zhong · Zhen Cheng · Fan Liang · Lin Ma

CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection

Shuailei Ma · Yuefeng Wang · Ying Wei · Jiaqi Fan · Thomas Li · Hongli Liu · fanbing Lv

Weak-shot Object Detection through Mutual Knowledge Transfer

Xuanyi Du · Weitao Wan · Chong Sun · Chen Li

Modeling the Distributional Uncertainty for Salient Object Detection Models

Jing Zhang · Mochu Xiang · Yuchao Dai · Xinyu Tian

Supervised Masked Knowledge Distillation for Few-Shot Transformers

Han Lin · Guangxing Han · Jiawei Ma · Shiyuan Huang · Xudong Lin · Shih-Fu Chang

Co-Salient Object Detection with Uncertainty-aware Group Exchange-Masking

Yang Wu · Huihui Song · Bo Liu · Kaihua Zhang · Dong Liu

Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation

Dahyun Kang · Piotr Koniusz · Minsu Cho · Naila Murray

DualRel: Semi-Supervised Mitochondria Segmentation from A Prototype Perspective

Huayu Mai · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu

WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation

Jongheon Jeong · Yang Zou · Taewan Kim · DongQing Zhang · Avinash Ravichandran · Onkar Dabeer

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization

Lian Xu · Wanli Ouyang · Mohammed Bennamoun · Farid Boussaid · Dan Xu

Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation

Zicheng Wang · Zhen Zhao · Xiaoxia Xing · Dong Xu · Xiangyu Kong · Luping Zhou

Boundary-enhanced Co-training for Weakly Supervised Semantic Segmentation

Shenghai Rong · Bohai Tu · Zilei Wang · Junjie Li

Balancing Logit Variation for Long-tailed Semantic Segmentation

Yuchao Wang · Jingjing Fei · Haochen Wang · Wei Li · Tianpeng Bao · Liwei Wu · Rui Zhao · Yujun Shen

Leveraging Hidden Positives for Unsupervised Semantic Segmentation

Hyun Seok Seong · WonJun Moon · Su Been Lee · Jae-Pil Heo

PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers

Jiacong Xu · Zixiang Xiong · Shankar P Bhattacharyya

AttentionShift: Iteratively Estimated Part-based Attention Map for Pointly Supervised Instance Segmentation

Mingxiang Liao · Zonghao Guo · Yuze Wang · Peng Yuan · bailan feng · Fang Wan

Principles of Forgetting in Domain-Incremental Semantic Segmentation in Adverse Weather Conditions

Tobias Kalb · Jürgen Beyerer

Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation

SHUTING HE · Henghui Ding · Wei Jiang

Interactive Segmentation as Gaussion Process Classification

Minghao Zhou · Hong Wang · Qian Zhao · Yuexiang Li · Yawen Huang · Deyu Meng · Yefeng Zheng

Meta Compositional Referring Expression Segmentation

Li Xu · Mark Huang · Xindi Shang · Zehuan Yuan · Ying Sun · Jun Liu

DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction

Shubhankar Borse · Debasmit Das · Hyojin Park · Hong Cai · Risheek Garrepalli · Fatih Porikli

Zero-shot Referring Image Segmentation with Global-Local Context Features

seonghoon yu · Paul Hongsuck Seo · Jeany Son

FreeSeg: Unified, Universal and Open-Vocabulary Image Segmentation

Jie Qin · Jie Wu · Pengxiang Yan · Ming Li · Yuxi Ren · Xuefeng Xiao · Yitong Wang · Rui Wang · Shilei Wen · Xin Pan · Xingang Wang

Semantic Human Parsing via Scalable Semantic Transfer over Multiple Label Domains

Jie Yang · Chaoqun Wang · Zhen Li · Junle Wang · Ruimao Zhang

Open Vocabulary Semantic Segmentation with Patch Aligned Contrastive Learning

Jishnu Mukhoti · Tsung-Yu Lin · Omid Poursaeed · Rui Wang · Ashish Shah · Philip Torr · Ser-Nam Lim

Neural Congealing: Aligning Images to a Joint Semantic Atlas

Dolev Ofri-Amar · Michal Geyer · Yoni Kasten · Tali Dekel

Open-Category Human-Object Interaction Pre-training via Language Modeling Framework

Sipeng Zheng · Boshen Xu · Qin Jin

Open-set Fine-grained Retrieval via Prompting Vision-Language Evaluator

Shijie Wang · Jianlong Chang · Haojie Li · Zhihui Wang · Wanli Ouyang · Qi Tian

R2Former: Unified Retrieval and Reranking Transformer for Place Recognition

Sijie Zhu · Linjie Yang · Chen Chen · Mubarak Shah · Xiaohui Shen · Heng Wang

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

Yuxin Fang · Wen Wang · Binhui Xie · Quan Sun · Ledell Wu · Xinggang Wang · Tiejun Huang · Xinlong Wang · Yue Cao

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

Maoyuan Ye · Jing Zhang · Shanshan Zhao · Juhua Liu · Tongliang Liu · Bo Du · Dacheng Tao

Finetune like you pretrain: Improved finetuning of zero-shot vision models

Sachin Goyal · Ananya Kumar · Sankalp Garg · J Kolter · Aditi Raghunathan

Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

Zhiqiu Lin · Samuel Yu · Zhiyi Kuang · Deepak Pathak · Deva Ramanan

DATE: Domain Adaptive Product Seeker for E-commerce

Haoyuan Li · Hao Jiang · Tao Jin · Mengyan Li · Yan Chen · Zhijie Lin · Yang Zhao · Zhou Zhao

Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval

Kuniaki Saito · Kihyuk Sohn · Xiang Zhang · Chun-Liang Li · Chen-Yu Lee · Kate Saenko · Tomas Pfister

Text-guided Unsupervised Latent Transformations for Multi-attribute Image Manipulation

Xiwen Wei · Zhen Xu · Cheng Liu · Si Wu · Zhiwen Yu · Hau-San Wong

Fine-grained Image-text Matching by Cross-modal Hard Aligning Network

pan zhengxin · Fangyu Wu · Bailing Zhang

RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-training

Chen-Wei Xie · Siyang Sun · Xiong Xiong · Yun Zheng · Deli Zhao · Jingren Zhou

Unifying Vision, Language, Layout and Tasks for Universal Document Processing

Zineng Tang · Ziyi Yang · Guoxin Wang · Yuwei Fang · Yang Liu · Chenguang Zhu · Michael Zeng · Cha Zhang · Mohit Bansal

MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID

Jianyang Gu · Kai Wang · Hao Luo · Chen Chen · Wei Jiang · Yuqiang Fang · Shanghang Zhang · Yang You · Jian ZHAO

EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding

Yanmin Wu · Xinhua Cheng · Renrui Zhang · Zesen Cheng · Jian Zhang

L-CoIns: Language-based Colorization with Instance Awareness

Zheng Chang · Shuchen Weng · Peixuan Zhang · Yu Li · Si Li · Boxin Shi

Learning Visual Representations via Language-Guided Sampling

Mohamed Samir Mahmoud Hussein Elbanani · Karan Desai · Justin Johnson

Shepherding Slots to Objects: Towards Stable and Robust Object-Centric Learning

Jinwoo Kim · Janghyuk Choi · Ho-Jin Choi · Seon Joo Kim

Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification

Yue Yang · Artemis Panagopoulou · Shenghao Zhou · Daniel Jin · Chris Callison-Burch · Mark Yatskar

Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks

Wenhui Wang · Hangbo Bao · Li Dong · Johan Bjorck · Zhiliang Peng · Qiang Liu · Kriti Aggarwal · Owais Khan Mohammed · Saksham Singhal · Subhojit Som · Furu Wei

Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations

Ziyan Yang · Kushal Kafle · Franck Dernoncourt · Vicente Ordonez

Leveraging per Image-Token Consistency for Vision-Language Pre-training

Yunhao GOU · Tom Ko · Hansi Yang · James Kwok · Yu Zhang · Mingxuan Wang

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension

Jiamu Sun · Gen Luo · Yiyi Zhou · Xiaoshuai Sun · GUANNAN JIANG · Zhiyu Wang · Rongrong Ji

Understanding and Improving Visual Prompting: A Label-Mapping Perspective

Aochuan Chen · Yuguang Yao · Pin-Yu Chen · Yihua Zhang · Sijia Liu

Meta-Personalizing Vision-Language Models to Find Named Instances in Video

Chun-Hsiao Yeh · Bryan Russell · Josef Sivic · Fabian Caba · Simon Jenni

MaPLe: Multi-modal Prompt Learning

Muhammad Uzair Khattak · Hanoona Bangalath · Muhammad Maaz · Salman Khan · Fahad Khan

VQACL: A Novel Visual Question Answering Continual Learning Setting

Xi Zhang · Feifei Zhang · Changsheng Xu

Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language

Chuanhao Li · Zhen Li · Chenchen Jing · Yunde Jia · Yuwei Wu

Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge

Steven Spratley · Krista A. Ehinger · Tim Miller

Token Turing Machines

Michael Ryoo · Keerthana Gopalakrishnan · Kumara Kahatapitiya · Ted Xiao · Kanishka Rao · Austin Stone · Yao Lu · Julian Ibarz · Anurag Arnab

Policy Adaptation from Foundation Model Feedback

Yuying Ge · Annabella Macaluso · Li Erran Li · Ping Luo · Xiaolong Wang

LANA: A Language-Capable Navigator for Instruction Following and Generation

Xiaohan Wang · Wenguan Wang · Jiayi shao · Yi Yang

LEGO-Net: Learning Regular Rearrangements of Objects in Rooms

Qiuhong Anna Wei · Sijie Ding · Jeong Joon Park · Rahul Sajnani · Adrien Poulenard · Srinath Sridhar · Leonidas Guibas

Discovering the Real Association: Multimodal Causal Reasoning in Video Question Answering

Chuanqi Zang · Hanqing Wang · Mingtao Pei · Wei Liang

CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

Yiting Cheng · Fangyun Wei · Jianmin Bao · Dong Chen · Wenqiang Zhang

Context De-confounded Emotion Recognition

Dingkang Yang · Zhaoyu Chen · Yuzheng Wang · Shunli Wang · Mingcheng Li · Liu Siao · Xiao Zhao · Shuai Huang · Zhiyan Dong · Peng Zhai · Lihua Zhang

Learning Emotion Representations from Verbal and Nonverbal Communication

Sitao Zhang · Yimu Pan · James Wang

CLIPPING: Distilling CLIP-Based Models with a Student Base for Video-Language Retrieval

Renjing Pei · Jianzhuang Liu · Weimian Li · Bin Shao · Songcen Xu · Peng Dai · Juwei Lu · Youliang Yan

Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval

Xiaoshuai Hao · Wanqian Zhang · Dayan Wu · Fei Zhu · Bo Li

StepFormer: Self-supervised Step Discovery and Localization in Instructional Videos

Nikita Dvornik · Isma Hadji · Ran Zhang · Konstantinos Derpanis · Rick Wildes · Allan Jepson

Text with Knowledge Graph Augmented Transformer for Video Captioning

Xin Gu · Guang Chen · Yufei Wang · Libo Zhang · Tiejian Luo · Longyin Wen

RILS: Masked Visual Reconstruction in Language Semantic Space

Shusheng Yang · Yixiao Ge · Kun Yi · Dian Li · Ying Shan · Xiaohu Qie · Xinggang Wang

DegAE: A New Pretraining Paradigm for Low-level Vision

Yihao Liu · Jingwen He · Jinjin Gu · Xiangtao Kong · Yu Qiao · Chao Dong

Teacher-generated spatial-attention labels boost robustness and accuracy of contrastive models

Yushi Yao · Chang Ye · Gamaleldin Elsayed · Junfeng He

CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose

Xu Zhang · Wen Wang · Zhe Chen · Yufei Xu · Jing Zhang · Dacheng Tao

MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model

Yatai Ji · Junjie Wang · Yuan Gong · Lin Zhang · yanru Zhu · WANG HongFa · Jiaxing Zhang · Tetsuya Sakai · Yujiu Yang

Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models

qu tang · Xiangyu Zhu · Zhen Lei · Zhaoxiang Zhang

Position-guided Text Prompt for Vision-Language Pre-training

Jinpeng Wang · Pan Zhou · Mike Zheng Shou · Shuicheng YAN

LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models

Adrian Bulat · Georgios Tzimiropoulos

Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training

Junfan Lin · Jianlong Chang · Lingbo Liu · Guanbin Li · Liang Lin · Qi Tian · Chang-Wen Chen

GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation

Jingyang Huo · Qiang Sun · Boyan Jiang · Haitao Lin · Yanwei Fu

MetaCLUE: Towards Comprehensive Visual Metaphors Research

Arjun Akula · Brendan Driscoll · Pradyumna Narayana · Soravit Changpinyo · Zhiwei Jia · Suyash Damle · Garima Pruthi · S Basu · Leonidas Guibas · William Freeman · Yuanzhen Li · Varun Jampani

ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos

Zhou Yu · Lixiang Zheng · Zhou Zhao · Fei Wu · Jianping Fan · Kui Ren · Jun Yu

Where We Are and What We’re Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes

Brandon Clark · Alec Kerrigan · Parth Parag Kulkarni · Vicente Vivanco Cepeda · Mubarak Shah

CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation

Samir Yitzhak Gadre · Mitchell Wortsman · Gabriel Ilharco · Ludwig Schmidt · Shuran Song

Accelerating Vision-Language Pretraining with Free Language Modeling

Teng WANG · Yixiao Ge · Feng Zheng · Ran Cheng · Ying Shan · Xiaohu Qie · Ping Luo

Joint Visual Grounding and Tracking with Natural Language Specification

Li Zhou · Zikun Zhou · Kaige Mao · Zhenyu He

CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition with Variational Alignment

Jiangbin Zheng · Yile Wang · Cheng Tan · Siyuan Li · Ge Wang · Jun Xia · Yidong Chen · Stan Li

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling

Linjie Li · Zhe Gan · Kevin Lin · Chung-Ching Lin · Zicheng Liu · Ce Liu · Lijuan Wang

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

Davide Moltisanti · Frank Keller · Hakan Bilen · Laura Sevilla-Lara

WINNER: Weakly-supervised hIerarchical decompositioN and aligNment for spatio-tEmporal video gRounding

Mengze Li · Han Wang · Wenqiao Zhang · Jiaxu Miao · Zhou Zhao · Shengyu Zhang · Wei Ji · Fei Wu

HierVL: Learning Hierarchical Video-Language Embeddings

Kumar Ashutosh · Rohit Girdhar · Lorenzo Torresani · Kristen Grauman

Hierarchical Video-Moment Retrieval and Step-Captioning

Abhay Zala · Jaemin Cho · Satwik Kottur · Xilun Chen · Barlas Oguz · Yashar Mehdad · Mohit Bansal

AutoAD: Movie Description in Context

Tengda Han · Max Bain · Arsha Nagrani · Gul Varol · Weidi Xie · Andrew Zisserman

SViTT: Temporal Learning of Sparse Video-Text Transformers

Yi Li · Kyle Min · Subarna Tripathi · Nuno Vasconcelos

Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training

Yifei Huang · Lijin Yang · Yoichi Sato

Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies

Bei Gan · Xiujun Shu · Ruizhi Qiao · Haoqian Wu · Keyu Chen · Hanjun Li · Bo Ren

Weakly Supervised Video Emotion Detection and Prediction via Cross-Modal Temporal Erasing Network

Zhicheng Zhang · Lijuan Wang · Jufeng Yang

Two-Stream Networks for Weakly-Supervised Temporal Action Localization with Semantic-Aware Mechanisms

Yu Wang · Yadong Li · Hongbin Wang

Hybrid Active Learning via Deep Clustering for Video Action Detection

Aayush Jung B Rana · Yogesh Rawat

TriDet: Temporal Action Detection with Relative Boundary Modeling

Dingfeng Shi · Yujie Zhong · Qiong Cao · Lin Ma · Jia Li · Dacheng Tao

HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions

Anshul Shah · Aniket Roy · Ketul Shah · Shlok Mishra · David Jacobs · Anoop Cherian · Rama Chellappa

Post-Processing Temporal Action Detection

Sauradip Nag · Xiatian Zhu · Yi-Zhe Song · Tao Xiang

Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception

Junyu Gao · Mengyuan Chen · Changsheng Xu

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision

Xubo Liu · Egor Lakomkin · Konstantinos Vougioukas · Pingchuan Ma · Honglie Chen · Ruiming Xie · Morrie Doulaty · Niko Moritz · Jachym Kolar · Stavros Petridis · Maja Pantic · Christian Fuegen

ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration

Wei-Ning Hsu · Tal Remez · Bowen Shi · Jacob Donley · Yossi Adi

Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring

Joanna Hong · Minsu Kim · Jeongsoo Choi · Yong Man Ro

Temporal Attention Unit: Towards Efficient Spatiotemporal Predictive Learning

Cheng Tan · Zhangyang Gao · Lirong Wu · Yongjie Xu · Jun Xia · Siyuan Li · Stan Li

Latency Matters: Real-Time Action Forecasting Transformer

Harshayu Girase · Nakul Agarwal · Chiho Choi · Karttikeya Mangalam

Efficient Movie Scene Detection using State-Space Transformers

Md Mohaiminul Islam · Mahmudul Hasan · Kishan Shamsundar Athrey · Tony Braskich · Gediminas Bertasius

TarViS: A Unified Approach for Target-based Video Segmentation

Ali Athar · Alexander Hermans · Jonathon Luiten · Deva Ramanan · Bastian Leibe

HOOD: Hierarchical Graphs for Generalized Modelling of Clothing Dynamics

Artur Grigorev · Bernhard Thomaszewski · Michael Black · Otmar Hilliges

Structured 3D Features for Reconstructing Controllable Avatars

Enric Corona · Mihai Zanfir · Thiemo Alldieck · Eduard Bazavan · Andrei Zanfir · Cristian Sminchisescu

MonoHuman: Animatable Human Neural Field from Monocular Video

Zhengming Yu · Wei Cheng · Xian Liu · Wenyan Wu · Kwan-Yee Lin

JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields

Xi WANG · Robin Courant · Jinglei Shi · Eric Marchand · Marc Christie

InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds

Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges

X-Avatar: Expressive Human Avatars

Kaiyue Shen · Chen Guo · Manuel Kaufmann · Juan Zarate · Julien Valentin · Jie Song · Otmar Hilliges

OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering

Zhiyuan Ma · Xiangyu Zhu · Guo-Jun Qi · Zhen Lei · Lei Zhang

Learning Personalized High Quality Volumetric Head Avatars from Monocular RGB Videos

Ziqian Bai · Feitong Tan · Zeng Huang · Kripasindhu Sarkar · Danhang Tang · Di Qiu · Abhimitra Meka · Ruofei Du · Mingsong Dou · Sergio Orts-Escolano · Rohit Pandey · Ping Tan · Thabo Beeler · Sean Fanello · Yinda Zhang

AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction

Aggelina Chatziagapi · Dimitris Samaras

NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images

Mingwu Zheng · Haiyu Zhang · Hongyu Yang · Di Huang

Continuous Landmark Detection with 3D Queries

Prashanth Chandran · Gaspard Zoss · Paulo Gotardo · Derek Bradley

GlassesGAN: Eyewear Personalization using Synthetic Appearance Discovery and Targeted Subspace Modeling

Richard Plesh · Peter Peer · Vitomir Struc

High-Res Facial Appearance Capture from Polarized Smartphone Images

Dejan Azinovic · Olivier Maury · Christophe Hery · Matthias Niessner · Justus Thies

Interactive Cartoonization with Controllable Perceptual Factors

Namhyuk Ahn · Patrick Kwon · Jihye Back · Kibeom Hong · Mark Kim

SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations

Pu Li · Jianwei Guo · Xiaopeng Zhang · Dong-ming Yan

TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision

Jiacheng Wei · Hao Wang · Jiashi Feng · Guosheng Lin · Kim-Hui Yap

High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition

Tianyu Luan · Yuanhao Zhai · Jingjing Meng · Zhong Li · Zhang Chen · Yi Xu · Junsong Yuan

Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process

Yuhan Li · Yishun Dou · Xuanhong Chen · Bingbing Ni · Yilin Sun · Yutian Liu · Fuzhen Wang

Consistent View Synthesis with Pose-Guided Diffusion Models

Hung-Yu Tseng · Qinbo Li · Changil Kim · Suhib Alsisan · Jia-Bin Huang · Johannes Kopf

Patch-based 3D Natural Scene Generation from a Single Example

Weiyu Li · Xuelin Chen · Jue Wang · Baoquan Chen

Diffusion-based Generation, Optimization, and Planning in 3D Scenes

Siyuan Huang · Zan Wang · Puhao Li · Baoxiong Jia · Tengyu Liu · Yixin Zhu · Wei Liang · Song-Chun Zhu

DA Wand: Distortion-Aware Selection using Neural Mesh Parameterization

Richard Liu · Noam Aigerman · Vladimir Kim · Rana Hanocka

Neural Vector Fields: Implicit Representation by Explicit Learning

Xianghui Yang · Guosheng Lin · Zhenghao Chen · Luping Zhou

Octree Guided Unoriented Surface Reconstruction

Chamin Hewa Koneputugodage · Yizhak Ben-Shabat · Stephen Gould

Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction

Mingfang Zhang · Jinglu Wang · Xiao Li · Yifei Huang · Yoichi Sato · Yan Lu

Multi-View Reconstruction using Signed Ray Distance Functions (SRDF)

Pierre Zins · Yuanlu Xu · Edmond Boyer · Stefanie Wuhrer · Tony Tung

VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction

Yufan Ren · Fangjinhua Wang · Tong Zhang · Marc Pollefeys · Sabine Süsstrunk

TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering

Jaehoon Choi · Dongki Jung · Taejae Lee · SangWook Kim · YoungDong Jung · Dinesh Manocha · Donghwan Lee

RelightableHands: Efficient Neural Relighting of Articulated Hand Models

Shun Iwase · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Timur Bagautdinov · Rohan Joshi · Fabian Prada · Takaaki Shiratori · Yaser Sheikh · Jason Saragih

Computational Flash Photography through Intrinsics

Sepideh Sarajian Maralan · Chris Careaga · Yagiz Aksoy

PixHt-Lab: Pixel Height Based Light Effect Generation for Image Compositing

Yichen Sheng · Jianming Zhang · Julien Philip · Yannick Hold-Geoffroy · Xin Sun · He Zhang · Lu Ling · Bedrich Benes

Tensor4D: Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering

Ruizhi Shao · Zerong Zheng · Hanzhang Tu · Boning Liu · Hongwen Zhang · Yebin Liu

UV Volumes for Real-time Rendering of Editable Free-view Human Performance

Yue Chen · Xuan Wang · Xingyu Chen · Qi Zhang · Xiaoyu Li · Yu Guo · Jue Wang · Fei Wang

HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling

Benjamin Attal · Jia-Bin Huang · Christian Richardt · Johannes Kopf · Michael Zollhöfer · Matthew O’Toole · Changil Kim

Complementary Intrinsics from Neural Radiance Fields and CNNs for Outdoor Scene Relighting

Siqi Yang · Xuanning Cui · Yongjie Zhu · Jiajun Tang · Si Li · Zhaofei Yu · Boxin Shi

Balanced Spherical Grid for Egocentric View Synthesis

Changwoon Choi · Sang Min Kim · Young Min Kim

pCON: Polarimetric Coordinate Networks for Neural Scene Representations

Henry Peters · Yunhao Ba · Achuta Kadambi

MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures

Zhiqin Chen · Thomas Funkhouser · Peter Hedman · Andrea Tagliasacchi

ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field

Zhe Jun Tang · Tat-Jen Cham · Haiyu Zhao

NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds

Chen Yang · Peihao Li · Zanwei Zhou · Shanxin Yuan · Bingbing Liu · Xiaokang Yang · Weichao Qiu · Wei Shen

Progressively Optimized Local Radiance Fields for Robust View Synthesis

Andreas Meuleman · Yu-Lun Liu · Chen Gao · Jia-Bin Huang · Changil Kim · Min H. Kim · Johannes Kopf

Removing Objects From Neural Radiance Fields

Silvan Weder · Guillermo Garcia-Hernando · Aron Monszpart · Marc Pollefeys · Gabriel Brostow · Michael Firman · Sara Vicente

SCADE: Space Carving with Ambiguity-aware Depth Estimates

Mikaela Uy · Ricardo Martin Brualla · Leonidas Guibas · Ke Li

ContraNeRF: Generalizable Neural Radiance Fields for Synthetic-to-real Novel View Synthesis via Contrastive Learning

Hao Yang · Lanqing Hong · Aoxue Li · Tianyang Hu · Zhenguo Li · Gim Lee · Liwei Wang

JacobiNeRF: NeRF Shaping with Mutual Information Gradients

Xiaomeng Xu · Yanchao Yang · Kaichun Mo · Boxiao Pan · Li Yi · Leonidas Guibas

Fresnel Microfacet BRDF: Unification of Polari-Radiometric Surface-Body Reflection

Tomoki Ichikawa · Yoshiki Fukao · Shohei Nobuhara · Ko Nishino

DartBlur: Privacy Preservation with Detection Artifacts Suppression

Baowei Jiang · Bing Bai · Haozhe Lin · Yu Wang · Yuchen Guo · Lu Fang

Evading Forensic Classifiers with Attribute-Conditioned Adversarial Faces

Fahad Shamshad · Koushik Srivatsan · Karthik Nandakumar

RIATIG: Reliable and Imperceptible Adversarial Text-to-Image Generation with Natural Prompts

Han Liu · Yuhao Wu · Shixuan Zhai · Bo Yuan · Ning Zhang

Improving Robust Generalization by Direct PAC-Bayesian Bound Minimization

Zifan Wang · Nan Ding · Tomer Levinboim · Xi Chen · Radu Soricut

Randomized Adversarial Training via Taylor Expansion

Gaojie Jin · Xinping Yi · Dengyu Wu · Ronghui Mu · Xiaowei Huang

Adversarial Counterfactual Visual Explanations

Guillaume Jeanneret · Loic Simon · Frederic Jurie

Transferable Adversarial Attacks on Vision Transformers with Token Gradient Regularization

Jianping Zhang · Yizhan Huang · Weibin Wu · Michael Lyu

Dynamic Generative Targeted Attacks with Pattern Injection

Weiwei Feng · Nanqing Xu · Tianzhu Zhang · Yongdong Zhang

Turning Strengths into Weaknesses: A Certified Robustness Inspired Attack Framework against Graph Neural Networks

Binghui Wang · Meng Pang · Yun Dong

Re-thinking Model Inversion Attacks Against Deep Neural Networks

Ngoc-Bao Nguyen · Keshigeyan Chandrasegaran · Milad Abdollahzadeh · Ngai-man Cheung

Can’t Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders

Zeyang Sha · Xinlei He · Ning Yu · Michael Backes · Yang Zhang

Detecting Backdoors in Pre-trained Encoders

Shiwei Feng · Guanhong Tao · Siyuan Cheng · Guangyu Shen · Xiangzhe Xu · Yingqi Liu · Kaiyuan Zhang · Shiqing Ma · Xiangyu Zhang

STDLens: Model Hijacking-resilient Federated Learning for Object Detection

Ka-Ho Chow · Ling Liu · Wenqi Wei · Fatih Ilhan · Yanzhao Wu

Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations

Hagay Michaeli · Tomer Michaeli · Daniel Soudry

FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning

Yuanhao Xiong · Ruochen Wang · Minhao Cheng · Felix Yu · Cho-Jui Hsieh

Rethinking Federated Learning with Domain Shift: A Prototype View

Wenke Huang · Mang Ye · Zekun Shi · He Li · Bo Du

Fair Federated Medical Image Segmentation via Client Contribution Estimation

Meirui Jiang · Holger Roth · Wenqi Li · Dong Yang · Can Zhao · Vishwesh Nath · Daguang Xu · Dou Qi · Ziyue Xu

Class Balanced Adaptive Pseudo Labeling for Federated Semi-Supervised Learning

Ming Li · Qingli Li · Yan Wang

Prototypical Residual Networks for Anomaly Detection and Localization

Hui Zhang · Zuxuan Wu · Zheng Wang · Zhineng Chen · Yu-Gang Jiang

Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection

Chen Zhang · Guorong Li · Yuankai Qi · Shuhui Wang · Laiyun Qing · Qingming Huang · Ming-Hsuan Yang

A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories

Reza Akbarian Bafghi · Danna Gurari

Boosting Verified Training for Robust Image Classifications via Abstraction

Zhaodi Zhang · Zhiyi Xue · Yang Chen · Si Liu · Yueling Zhang · Jing Liu · Min Zhang

Soft Augmentation for Image Classification

Yang Liu · Shen Yan · Laura Leal-Taixé · James Hays · Deva Ramanan

Re-GAN: Data-Efficient GANs Training via Architectural Reconfiguration

Divya Saxena · Jiannong Cao · Jiahao Xu · Tarun Kulshrestha

AdaptiveMix: Improving GAN Training via Feature Space Shrinkage

Haozhe Liu · Wentian Zhang · Bing Li · Haoqian Wu · Nanjun He · Yawen Huang · Yuexiang Li · Bernard Ghanem · Yefeng Zheng

Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck

Jongheon Jeong · Sihyun Yu · Hankook Lee · Jinwoo Shin

Bit-shrinking: Limiting Instantaneous Sharpness for Improving Post-training Quantization

Lin Chen · Bo Peng · Zheyang Li · Wenming Tan · Ye Ren · Jun Xiao · Shiliang Pu

Robust Generalization against Photon-Limited Corruptions via Worst-Case Sharpness Minimization

Zhuo Huang · Miaoxi Zhu · Xiaobo Xia · Li Shen · Jun Yu · Chen Gong · Bo Han · Bo Du · Tongliang Liu

OT-Filter: An Optimal Transport Filter for Learning with Noisy Labels

Chuanwen Feng · Yilong Ren · Xike Xie

Don’t Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis

Thomas Fel · Melanie Ducoffe · David Vigouroux · Remi Cadene · Mikaël Capelle · Claire Nicodeme · Thomas Serre

Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations

Alexander Binder · Leander Weber · Sebastian Lapuschkin · Grégoire Montavon · Klaus Muller · Wojciech Samek

ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders

Sanghyun Woo · Shoubhik Debnath · Ronghang Hu · Xinlei Chen · Zhuang Liu · In So Kweon · Saining Xie

Regularization of polynomial networks for image recognition

Grigorios Chrysos · Bohan Wang · Jiankang Deng · Volkan Cevher

Stitchable Neural Networks

Zizheng Pan · Jianfei Cai · Bohan Zhuang

DepGraph: Towards Any Structural Pruning

Gongfan Fang · Xinyin Ma · Mingli Song · Michael Bi Mi · Xinchao Wang

Meta-Learning with a Geometry-Adaptive Preconditioner

Suhyun Kang · Duhun Hwang · Moonjung Eo · Taesup Kim · Wonjong Rhee

Class Adaptive Network Calibration

Bingyuan Liu · Jérôme Rony · Adrian Galdran · Jose Dolz · Ismail Ayed

Differentiable Architecture Search with Random Features

Zhang Xuanyang · Yonggang Li · Xiangyu Zhang · Yongtao Wang · Jian Sun

DART: Diversify-Aggregate-Repeat Training Improves Generalization of Neural Networks

Samyak Jain · Sravanti Addepalli · Pawan Sahu · Priyam Dey · Venkatesh Babu Radhakrishnan

NICO++: Towards better benchmarks for Out-of-Distribution Generalization

Xingxuan Zhang · Yue He · Renzhe Xu · Han Yu · Zheyan Shen · Peng Cui

Bilateral Memory Consolidation for Continual Learning

Xing Nie · Shixiong Xu · Xiyan Liu · Gaofeng Meng · Chunlei Huo · Shiming Xiang

CafeBoost: Causal Feature Boost to Eliminate Task-Induced Bias for Class Incremental Learning

Benliu Qiu · Hongliang Li · Haitao Wen · Heqian Qiu · Lanxiao Wang · Fanman Meng · Qingbo Wu · Lili Pan

Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval

Yi Xie · Huaidong Zhang · Xuemiao Xu · Jianqing Zhu · Shengfeng He

Generic-to-Specific Distillation of Masked Autoencoders

Wei Huang · Zhiliang Peng · Li Dong · Furu Wei · Jianbin Jiao · Qixiang Ye

Heterogeneous Continual Learning

Divyam Madaan · Hongxu Yin · Wonmin Byeon · Jan Kautz · Pavlo Molchanov

Manipulating Transfer Learning for Property Inference

Yulong Tian · Fnu Suya · Anshuman Suri · Fengyuan Xu · David Evans

Adapting Shortcut with Normalizing Flow: An Efficient Tuning Framework for Visual Recognition

Yaoming Wang · Bowen Shi · Xiaopeng Zhang · Jin Li · Yuchen Liu · Wenrui Dai · Chenglin Li · Hongkai Xiong · Qi Tian

A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation

Hui Tang · Kui Jia

Switchable Representation Learning Framework with Self-compatibility

Shengsen Wu · Yan Bai · Yihang Lou · Xiongkun Linghu · Jianzhong He · Lingyu Duan

Domain Expansion of Image Generators

Yotam Nitzan · Michael Gharbi · Richard Zhang · Taesung Park · Jun-Yan Zhu · Daniel Cohen-Or · Eli Shechtman

Robust Test-Time Adaptation in Dynamic Scenarios

Longhui Yuan · Binhui Xie · Shuang Li

Train/Test-Time Adaptation with Retrieval

Luca Zancato · Alessandro Achille · Tian Yu Liu · Matthew Trager · Pramuditha Perera · Stefano Soatto

Bi-level Meta-learning for Few-shot Domain Generalization

Xiaorong Qin · Xinhang Song · Shuqiang Jiang

Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information

Weijie Su · Xizhou Zhu · Chenxin Tao · Lewei Lu · Bin Li · Gao Huang · Yu Qiao · Xiaogang Wang · Jie Zhou · Jifeng Dai

Multi-modal Learning with Missing Modality via Shared-Specific Feature Modeling

Hu Wang · Yuanhong Chen · Congbo Ma · Jodie Avery · M. Louise Hull · Gustavo Carneiro

DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation

Fengyi Shen · Akhil Gurram · Ziyuan Liu · He Wang · Alois Knoll

Progressive Open Space Expansion for Open Set Model Attribution

Tianyun Yang · Danding Wang · Fan Tang · Xinying Zhao · Juan Cao · Sheng Tang

DLBD: A Self-Supervised Direct-Learned Binary Descriptor

Bin Xiao · Yang Hu · Bo Liu · Xiuli Bi · Weisheng Li · Xinbo Gao

DAA: A Delta Age AdaIN operation for age estimation via binary code transformer

Ping Chen · Xingpeng Zhang · Ye Li · Ju Tao · Bin Xiao · Bing Wang · Zongjie Jiang

Curvature-Balanced Feature Manifold Learning for Long-Tailed Classification

Yanbiao Ma · Licheng Jiao · Fang Liu · Shuyuan Yang · Xu Liu · Lingling Li

Global and Local Mixture Consistency Cumulative Learning for Long-tailed Visual Recognitions

Fei Du · Peng Yang · Qi Jia · Fengtao Nan · Xiaoting Chen · Yun Yang

No One Left Behind: Improving the Worst Categories in Long-Tailed Learning

Yingxiao Du · Jianxin Wu

Learning Imbalanced Data with Vision Transformers

Zhengzhuo Xu · Ruikang Liu · Shuo Yang · Zenghao Chai · Chun Yuan

Ranking Regularization for Critical Rare Classes: Minimizing False Positives at a High True Positive Rate

Kiarash Mohammadi · He Zhao · Mengyao Zhai · Frederick Tung

MarginMatch: Using Training Dynamics of Unlabeled Data for Semi-Supervised Learning

Tiberiu Sosea · Cornelia Caragea

CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning

Jianlong Wu · Haozhe Yang · Tian Gan · Ning Ding · Feijun Jiang · Liqiang Nie

Boosting Transductive Few-Shot Fine-tuning with Margin-based Uncertainty Weighting and Probability Regularization

Ran Tao · Hao Chen · Marios Savvides

Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning

Yun-Hao Cao · Peiqin Sun · Shuchang Zhou

Towards Bridging the Performance Gaps of Joint Energy-based Models

Xiulong Yang · Qing Su · Shihao Ji

Siamese DETR

Zeren Chen · Gengshi Huang · Wei Li · Jianing Teng · Kun Wang · Jing Shao · Chen Change Loy · Lyu Sheng

Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-view Clustering

Jie Wen · Chengliang Liu · Gehui Xu · Zhihao Wu · Chao Huang · Lunke Fei · Yong Xu

Block Selection Method for Using Feature Norm in Out-of-Distribution Detection

Yeonguk Yu · Sungho Shin · Seongju Lee · Changhyun Jun · Kyoobin Lee

Causally-Aware Intraoperative Imputation for Overall Survival Time Prediction

Xiang Li · Xuelin Qian · Litian Liang · Lingjie Kong · Qiaole Dong · Chen Jiejun · Dingxia Liu · Xiuzhong Yao · Yanwei Fu

PEFAT: Boosting Semi-supervised Medical Image Classification via Pseudo-loss Estimation and Feature Adversarial Training

Zeng Qingjie · Yutong Xie · Lu Zilin · Yong Xia

Histopathology Whole Slide Image Analysis with Heterogeneous Graph Representation Learning

Tsai Chan Chan · Fernando Julio Cendra · Lan Ma · Guosheng Yin · Lequan Yu

MCF: Mutual Correction Framework for Semi-Supervised Medical Image Segmentation

Yongchao Wang · Bin Xiao · Xiuli Bi · Weisheng Li · Xinbo Gao

DoNet: Deep De-overlapping Network for Cytology Instance Segmentation

Hao Jiang · Rushan Zhang · Yanning Zhou · Yumeng Wang · Hao Chen

Weakly supervised segmentation with point annotations for histopathology images via contrast-based variational model

Hongrun Zhang · Liam Burrows · Yanda Meng · Declan Sculthorpe · Abhik Mukherjee · Sarah Coupland · Ke Chen · Yalin Zheng

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Mido Assran · Quentin Duval · Pascal Vincent · Ishan Misra · Piotr Bojanowski · Michael Rabbat · Yann LeCun · Nicolas Ballas

Boosting Detection in Crowd Analysis via Underutilized Output Features

Shaokai Wu · Fengyu Yang

Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection

Jiakang Yuan · Bo Zhang · Xiangchao Yan · Tao Chen · Botian Shi · Yikang Li · Yu Qiao

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection

Chang Liu · Weiming Zhang · Xiangru Lin · Wei Zhang · Xiao Tan · Junyu Han · Xiaomao Li · Errui Ding · Jingdong Wang

Large-scale Training Data Search for Object Re-identification

Yue Yao · Tom Gedeon · Liang Zheng

SOOD: Towards Semi-Supervised Oriented Object Detection

Wei Hua · Dingkang Liang · Jingyu Li · Xiaolong Liu · Zhikang Zou · Xiaoqing Ye · Xiang Bai

Zero-Shot Object Counting

Jingyi Xu · Hieu Le · Vu Nguyen · Viresh Ranjan · Dimitris Samaras

SAP-DETR: Bridging the Gap between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency

Yang Liu · Yao Zhang · Yixin Wang · Yang Zhang · Jiang Tian · Zhongchao Shi · Jianping Fan · Zhiqiang He

Knowledge Combination to Learn Rotated Detection Without Rotated Annotation

Tianyu Zhu · Bryce Ferenczi · Pulak Purkait · Tom Drummond · Hamid Rezatofighi · Anton Hengel

The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector

Caixia Zhou · Yaping Huang · Mengyang Pu · Qingji Guan · Li Huang · Haibin Ling

Decoupled Semantic Prototypes enable learning from arbitrary annotation types for semi-weakly segmentation in expert-driven domains

Simon Reiß · Constantin Seibold · Alexander Freytag · Erik Rodner · Rainer Stiefelhagen

Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt

Hao Li · Dingwen Zhang · Nian Liu · Lechao Cheng · Yalun Dai · Chao Zhang · Xinggang Wang · Junwei Han

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection

Zhenglin Zhou · Huaxia Li · Hong Liu · Nanyang Wang · Gang Yu · Rongrong Ji

Fuzzy Positive Learning for Semi-supervised Semantic Segmentation

Pengchong Qiao · Zhidan Wei · Yu Wang · Zhennan Wang · Guoli Song · Fan Xu · Xiangyang Ji · Chang Liu · Jie Chen

Sparsely Annotated Semantic Segmentation with Adaptive Gaussian Mixtures

Linshan Wu · Zhun Zhong · Leyuan Fang · Xingxin He · Qiang Liu · Jiayi Ma · Hao Chen

Spatial-temporal Concept based Explanation of 3D ConvNets

Ying Ji · Yu Wang · Jien Kato

Weakly-Supervised Domain Adaptive Semantic Segmentation with Prototypical Contrastive Learning

Anurag Das · Yongqin Xian · Dengxin Dai · Bernt Schiele

Exemplar-FreeSOLO: Enhancing Unsupervised Instance Segmentation with Exemplars

Taoseef Ishtiak · Qing En · Yuhong Guo

Decoupling Human and Camera Motion from Videos in the Wild

Vickie Ye · Georgios Pavlakos · Jitendra Malik · Angjoo Kanazawa

CIRCLE: Capture In Rich Contextual Environments

Joao Araujo · Jiaman Li · Karthik Vetrivel · Rishi Agarwal · Deepak Gopinath · Jiajun Wu · Alexander Clegg · Karen Liu

CARTO: Category and Joint Agnostic Reconstruction of ARTiculated Objects

Nick Heppert · Muhammad Zubair Irshad · Sergey Zakharov · Katherine Liu · Rareș Ambruș · Jeannette Bohg · Abhinav Valada · Thomas Kollar

DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects

Chen Bao · Helin Xu · Yuzhe Qin · Xiaolong Wang

FLEX: Full-Body Grasping Without Full-Body Grasps

Purva Tendulkar · Didac Suris Coll-Vinent · Carl Vondrick

Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes

Jihyun Lee · Minhyuk Sung · Honggyu Choi · Tae-Kyun Kim

One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer

Jing Lin · Ailing Zeng · Haoqian Wang · Lei Zhang · Yu Li

Implicit 3D Human Mesh Recovery using Consistency with Pose and Shape from Unseen-view

Hanbyel Cho · Yooshin Cho · Jaesung Ahn · Junmo Kim

Flow supervision for Deformable NeRF

Chaoyang Wang · Lachlan MacDonald · Laszlo Jeni · Simon Lucey

FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views

Vinoj Yasanga Jayasundara Magalle Hewa · Amit Agrawal · Nicolas Heron · Abhinav Shrivastava · Larry Davis

POEM: Reconstructing Hand in a Point Embedded Multi-view Stereo

Lixin Yang · Jian Xu · Licheng Zhong · Xinyu Zhan · Zhicheng Wang · Kejian Wu · Cewu Lu

Clothed Human Performance Capture with a Double-layer Neural Radiance Fields

Kangkan Wang · Guofeng Zhang · Suxu Cong · Jian Yang

VGFlow: Visibility guided Flow Network for Human Reposing

Rishabh Jain · Krishna Kumar Singh · Mayur Hemani · Jingwan Lu · Mausoom Sarkar · Duygu Ceylan · Balaji Krishnamurthy

HandNeRF: Neural Radiance Fields for Animatable Interacting Hands

Zhiyang Guo · Wengang Zhou · Min Wang · Li Li · Houqiang Li

PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters

Shuhong Chen · Kevin Zhang · Yichun Shi · Heng Wang · Yiheng Zhu · Guoxian Song · Sizhe An · Janus Kristjansson · Xiao Yang · Matthias Zwicker

PointAvatar: Deformable Point-based Head Avatars from Videos

Yufeng Zheng · Wang Yifan · Gordon Wetzstein · Michael Black · Otmar Hilliges

Ham2Pose: Animating Sign Language Notation into Pose Sequences

Rotem Shalev Arkushin · Amit Moryossef · Ohad Fried

Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence

Yonggan Fu · Yuecheng Li · Chenghui Li · Jason Saragih · Peizhao Zhang · Xiaoliang Dai · Yingyan Lin

Learning Locally Editable Virtual Humans

Hsuan-I Ho · Lixin Xue · Jie Song · Otmar Hilliges

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

Rui Zhao · Wei Li · Zhipeng Hu · Lincheng Li · Zhengxia Zou · Zhenwei Shi · Changjie Fan

Learning Neural Parametric Head Models

Simon Giebenhain · Tobias Kirschstein · Markos Georgopoulos · Martin Rünz · Lourdes Agapito · Matthias Niessner

Next3D: Generative Neural Texture Rasterization for 3D-Aware Head Avatars

Jingxiang Sun · Xuan Wang · Lizhen Wang · Xiaoyu Li · Yong Zhang · Hongwen Zhang · Yebin Liu

Graphics Capsule: Learning Hierarchical 3D Face Representations from 2D Images

Chang Yu · Xiangyu Zhu · Xiaomei Zhang · Zhaoxiang Zhang · Zhen Lei

Parameter Efficient Local Implicit Image Function Network for Face Segmentation

Mausoom Sarkar · Nikitha S R · Mayur Hemani · Rishabh Jain · Balaji Krishnamurthy

StyleGene: Crossover and Mutation of Region-level Facial Genes for Kinship Face Synthesis

Hao Li · Xianxu Hou · Zepeng Huang · Linlin Shen

PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360°

Sizhe An · Hongyi Xu · Yichun Shi · Guoxian Song · Umit Ogras · Linjie Luo

Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion

Yushi Lan · Xuyi Meng · Shuai Yang · Chen Change Loy · Bo Dai

3D Highlighter: Localizing Regions on 3D Shapes via Text Descriptions

Dale Decatur · Itai Lang · Rana Hanocka

Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

Jiale Xu · Xintao Wang · Weihao Cheng · Yan-Pei Cao · Ying Shan · Xiaohu Qie · Shenghua Gao

Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations

Thomas Tanay · Ales Leonardis · Matteo Maggioni

Diffusion-Based Signed Distance Fields for 3D Shape Generation

Jaehyeok Shim · Changwoo Kang · Kyungdon Joo

Persistent Nature: A Generative Model of Unbounded 3D Worlds

Lucy Chai · Richard Tucker · Zhengqi Li · Phillip Isola · Noah Snavely

OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields

Haim Sawdayee · Amir Vaxman · Amit Bermano

Sphere-Guided Training of Neural Implicit Surfaces

Andreea Dogaru · Andrei-Timotei Ardelean · Savva Ignatyev · Egor Zakharov · Evgeny Burnaev

NeuralUDF: Learning Unsigned Distance Fields for Multi-view Reconstruction of Surfaces with Arbitrary Topologies

Xiaoxiao Long · Cheng Lin · Lingjie Liu · Yuan Liu · Peng Wang · Christian Theobalt · Taku Komura · Wenping Wang

Looking Through the Glass: Neural Surface Reconstruction Against High Specular Reflections

Jiaxiong Qiu · Peng-Tao Jiang · Yifan Zhu · Ze-Xin Yin · Ming-Ming Cheng · Bo Ren

Teleidoscopic Imaging System for Microscale 3D Shape Reconstruction

Ryo Kawahara · Meng-Yu Kuo · Shohei Nobuhara

The Differentiable Lens: Compound Lens Search over Glass Surfaces and Materials for Object Detection

Geoffroi Côté · Fahim Mannan · Simon Thibault · Jean-Francois Lalonde · Felix Heide

SunStage: Portrait Reconstruction and Relighting using the Sun as a Light Stage

Yifan Wang · Aleksander Holynski · Xiuming Zhang · Cecilia Zhang

Nighttime smartphone reflective flare removal using optical center symmetry prior

Yuekun Dai · Yihang Luo · Shangchen Zhou · Chongyi Li · Chen Change Loy

ORCA: Glossy Objects as Radiance Field Cameras

Kushagra Tiwary · Akshat Dave · Nikhil Behari · Tzofi Klinghoffer · Ashok Veeraraghavan · Ramesh Raskar

ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects

Marco Toschi · Riccardo De Matteo · Riccardo Spezialetti · Daniele Gregorio · Luigi Di Stefano · Samuele Salti

Neural Scene Chronology

Haotong Lin · Qianqian Wang · Ruojin Cai · Sida Peng · Hadar Averbuch-Elor · Xiaowei Zhou · Noah Snavely

DyNCA: Real-time Dynamic Texture Synthesis Using Neural Cellular Automata

Ehsan Pajouheshgar · Yitao Xu · Tong Zhang · Sabine Süsstrunk

TriVol: Point Cloud Rendering via Triple Volumes

Tao Hu · Xiaogang Xu · Ruihang Chu · Jiaya Jia

Occlusion-Free Scene Recovery via Neural Radiance Fields

Chengxuan Zhu · Renjie Wan · Yunkai Tang · Boxin Shi

Transforming Radiance Field with Lipschitz Network for Photorealistic 3D Scene Stylization

Zicheng Zhang · Yinglu Liu · Congying Han · Yingwei Pan · Tiande Guo · Ting Yao

PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields

Zhengfei Kuang · Fujun Luan · Sai Bi · Zhixin Shu · Gordon Wetzstein · Kalyan Sunkavalli

Masked Wavelet Representation for Compact Neural Radiance Fields

Daniel Rho · Byeonghyeon Lee · Seungtae Nam · Joo Chan Lee · Jong Hwan Ko · Eunbyung Park

SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields

Ashkan Mirzaei · Tristan Aumentado-Armstrong · Konstantinos Derpanis · Jonathan Kelly · Marcus Brubaker · Igor Gilitschenski · Alex Levinshtein

MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs

Seunghyeon Seo · Donghoon Han · Yeonjin Chang · Nojun Kwak

GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images

Jianchuan Chen · Wentao Yi · Liqian Ma · Xu Jia · Huchuan Lu

NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors

Congyue Deng · Chiyu Jiang · Charles R. Qi · Xinchen Yan · Yin Zhou · Leonidas Guibas · Dragomir Anguelov

RobustNeRF: Ignoring Distractors with Robust Losses

Sara Sabour · Suhani Vora · Daniel Duckworth · Ivan Krasin · David Fleet · Andrea Tagliasacchi

High-fidelity Event-Radiance Recovery via Transient Event Frequency

Jin Han · Yuta Asano · Boxin Shi · Yinqiang Zheng · Zhihang Zhong

TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization

Fabrizio Guillaro · Davide Cozzolino · Avneesh Sud · Nicholas Dufour · Luisa Verdoliva

CLIP2Protect: Protecting Facial Privacy using Text-Guided Makeup via Adversarial Latent Search

Fahad Shamshad · Muhammad Muzammal Naseer · Karthik Nandakumar

Discrete Point-wise Attack Is Not Enough: Generalized Manifold Adversarial Attack for Face Recognition

Qian Li · Yuxiao Hu · Ye Liu · Dongxiao Zhang · Xin Jin · Yuntian Chen

Generalist: Decoupling Natural and Robust Generalization

Hongjun Wang · Yisen Wang

AGAIN: Adversarial Training with Attribution Span Enlargement and Hybrid Feature Fusion

Shenglin Yin · Kelu Yao · Sheng Shi · Yangzhou Du · Zhen Xiao

HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation

Jian Ding · Nan Xue · Gui-Song Xia · Bernt Schiele · Dengxin Dai

Pruning Parameterization with Bi-level Optimization for Efficient Semantic Segmentation on the Edge

Changdi Yang · Pu Zhao · Yanyu Li · Wei Niu · Jiexiong Guan · Hao Tang · Minghai Qin · Bin Ren · Xue Lin · Yanzhi Wang

Towards Open-World Segmentation of Parts

Tai-Yu Pan · Qing Liu · Wei-Lun Chao · Brian Price

SegLoc: Learning Segmentation-based Representations for Privacy-Preserving Visual Localization

Maxime Pietrantoni · Martin Humenberger · Torsten Sattler · Gabriela Csurka

GeoNet: Benchmarking Unsupervised Adaptation across Geographies

Tarun Kalluri · Wangdong Xu · Manmohan Chandraker

Modeling Entities as Semantic Points for Visual Information Extraction in the Wild

Zhibo Yang · Rujiao Long · Pengfei Wang · Sibo Song · Humen Zhong · Wenqing Cheng · Xiang Bai · Cong Yao

DPF: Learning Dense Prediction Fields with Weak Supervision

Xiaoxue Chen · Yuhang Zheng · Yupeng Zheng · Qiang Zhou · Hao Zhao · Guyue Zhou · Ya-Qin Zhang

Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

Man Liu · Feng Li · Chunjie Zhang · Yunchao Wei · Huihui Bai · Yao Zhao

Universal Instance Perception as Object Discovery and Retrieval

Bin Yan · Yi Jiang · Jiannan Wu · Dong Wang · Ping Luo · Zehuan Yuan · Huchuan Lu

Learning Attention as Disentangler for Compositional Zero-shot Learning

Shaozhe Hao · Kai Han · Kwan-Yee K. Wong

CLIP is Also an Efficient Segmenter: A Text-Driven Approach for Weakly Supervised Semantic Segmentation

Yuqi Lin · Minghao Chen · Wenxiao Wang · Boxi Wu · Ke Li · Binbin Lin · Haifeng Liu · Xiaofei He

Self-supervised Implicit Glyph Attention for Text Recognition

Tongkun Guan · Chaochen Gu · Jingzheng Tu · Xue Yang · Qi Feng · Yudi Zhao · Wei Shen

Visual Recognition by Request

Chufeng Tang · Lingxi Xie · Xiaopeng Zhang · Xiaolin Hu · Qi Tian

Aligning Bag of Regions for Open-Vocabulary Object Detection

Size Wu · Wenwei Zhang · Sheng Jin · Wentao Liu · Chen Change Loy

CLIP^2: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data

Yihan Zeng · Chenhan Jiang · Jiageng Mao · Jianhua Han · Chaoqiang Ye · Qingqiu Huang · Dit-Yan Yeung · Zhen Yang · Xiaodan Liang · Hang Xu

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining

Yanxin Long · Youpeng Wen · Jianhua Han · Hang Xu · Pengzhen Ren · Wei Zhang · Shen Zhao · Xiaodan Liang

Towards Unified Scene Text Spotting based on Sequence Generation

Taeho Kil · Seonghyeon Kim · Sukmin Seo · Yoonsik Kim · Daehee Kim

Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Renrui Zhang · Xiangfei Hu · Bohao Li · Siyuan Huang · Hanqiu Deng · Yu Qiao · Peng Gao · Hongsheng Li

Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval

Tan Pan · Furong Xu · Xudong Yang · Sifeng He · Chen Jiang · Qingpei Guo · Feng Qian · Xiaobo Zhang · Yuan Cheng · Lei Yang · Wei Chu

Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!

Zaid Khan · Vijay Kumar B G · Samuel Schulter · Xiang Yu · Yun Fu · Manmohan Chandraker

ConStruct-VL: Data-Free Continual Structured VL Concepts Learning

James Smith · Paola Cascante-Bonilla · Assaf Arbelle · Donghyun Kim · Rameswar Panda · David Cox · Diyi Yang · Zsolt Kira · Rogerio Feris · Leonid Karlinsky

À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting

Benjamin Bowman · Alessandro Achille · Luca Zancato · Matthew Trager · Pramuditha Perera · Giovanni Paolini · Stefano Soatto

Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering

Zhenwei Shao · Zhou Yu · Meng Wang · Jun Yu

Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning

Zhuowan Li · Xingrui Wang · Elias Stengel-Eskin · Adam Kortylewski · Wufei Ma · Benjamin Van Durme · Alan Yuille

Visual Programming: Compositional visual reasoning without training

Tanmay Gupta · Aniruddha Kembhavi

Multimodal Prompting with Missing Modalities for Visual Recognition

Yi-Lun Lee · Yi-Hsuan Tsai · Wei-Chen Chiu · Chen-Yu Lee

EXCALIBUR: Encouraging and Evaluating Embodied Exploration

Hao Zhu · Raghav Kapoor · So Yeon Min · Winson Han · Jiatai Li · Kaiwen Geng · Graham Neubig · Yonatan Bisk · Aniruddha Kembhavi · Luca Weihs

Iterative Vision-and-Language Navigation

Jacob Krantz · Shurjo Banerjee · Wang Zhu · Jason Corso · Peter Anderson · Stefan Lee · Jesse Thomason

Adaptive Zone-aware Hierarchical Planner for Vision-Language Navigation

Chen Gao · Xingyu Peng · Mi Yan · He Wang · Lirong Yang · Haibing Ren · Hongsheng Li · Si Liu

SkyEye: Self-Supervised Bird’s-Eye-View Semantic Mapping Using Monocular Frontal View Images

Nikhil Gosala · Kürsat Petek · Paulo Drews-Jr · Wolfram Burgard · Abhinav Valada

Natural Language-Assisted Sign Language Recognition

Ronglai Zuo · Fangyun Wei · Brian Mak

Learning to Predict Situation Hyper-Graphs for Video Question Answering

Aisha Urooj · Hilde Kuehne · Bo Wu · Kim Chheu · Walid Bousselham · Chuang Gan · Niels Lobo · Mubarak Shah

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

Bo He · Jun Wang · Jielin Qiu · Trung Bui · Abhinav Shrivastava · Zhaowen Wang

Clover: Towards A Unified Video-Language Alignment and Fusion Model

Jingjia Huang · Yinan Li · Jiashi Feng · Xinglong Wu · Xiaoshuai Sun · Rongrong Ji

Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval

Xudong Lin · Simran Tiwari · Shiyuan Huang · Manling Li · Mike Zheng Shou · Heng Ji · Shih-Fu Chang

PDPP: Projected Diffusion for Procedure Planning in Instructional Videos

Hanlin Wang · Yilu Wu · Sheng Guo · Limin Wang

Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations

Yiwu Zhong · Licheng Yu · Yang Bai · Shangwen Li · Xueting Yan · Yin Li

Text-Visual Prompting for Efficient 2D Temporal Video Grounding

Yimeng Zhang · Xin Chen · Jinghan Jia · Sijia Liu · Ke Ding

Language-Guided Music Recommendation for Video via Prompt Analogies

Daniel McKee · Justin Salamon · Josef Sivic · Bryan Russell

MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering

Difei Gao · Luowei Zhou · Lei Ji · Linchao Zhu · Yi Yang · Mike Zheng Shou

Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization

Chen Ju · Kunhao Zheng · Jinxiang Liu · Peisen Zhao · Ya Zhang · Jianlong Chang · Qi Tian · Yanfeng Wang

Cascade Evidential Learning for Open-world Weakly-supervised Temporal Action Localization

Mengyuan Chen · Junyu Gao · Changsheng Xu

STMixer: A One-Stage Sparse Action Detector

Tao Wu · Mengqi Cao · Ziteng Gao · Gangshan Wu · Limin Wang

The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction

Alexandros Stergiou · Dima Damen

A Large-scale Robustness Analysis of Video Action Recognition Models

Madeline Chantry · Naman Biyani · Prudvi Kamtam · Shruti Vyas · Hamid Palangi · Vibhav Vineet · Yogesh Rawat

Learning to Dub Movies via Hierarchical Prosody Models

Gaoxiang Cong · Liang Li · Yuankai Qi · Zheng-Jun Zha · Qi Wu · Wenyu Wang · Bin Jiang · Ming-Hsuan Yang · Qingming Huang

iQuery: Instruments as Queries for Audio-Visual Sound Separation

Jiaben Chen · Renrui Zhang · Dongze Lian · Jiaqi Yang · Ziyao Zeng · Jianbo Shi

Egocentric Auditory Attention Localization in Conversations

Fiona Ryan · Hao Jiang · Abhinav Shukla · James Rehg · Vamsi Krishna Ithapu

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Jiadong Wang · Xinyuan Qian · Malu Zhang · Robby Tan · Haizhou Li

Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning

Kai Li · Deep A Patel · Erik Kruus · Martin Min

Referring Multi-Object Tracking

Dongming Wu · Wencheng Han · Tiancai Wang · Xingping Dong · Xiangyu Zhang · Jianbing Shen

A Generalized Framework for Video Instance Segmentation

Miran Heo · Sukjun Hwang · Jeongseok Hyun · Hanjung Kim · Seoung Wug Oh · Joon-Young Lee · Seon Joo Kim

LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection

Jinsheng Xiao · Yuanxu Wu · Yunhua Chen · Shurui Wang · Zhongyuan Wang · Jiayi Ma

Streaming Video Model

Yucheng Zhao · Chong Luo · Chuanxin Tang · Dongdong Chen · Noel Codella · Zheng-Jun Zha

Video Event Restoration Based on Keyframes for Video Anomaly Detection

Zhiwei Yang · Jing Liu · Zhaoyang Wu · Peng Wu · Xiaotao Liu

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Long Lian · Zhirong Wu · Stella Yu

SeqTrack: Sequence to Sequence Learning for Visual Object Tracking

Xin Chen · Houwen Peng · Dong Wang · Huchuan Lu · Han Hu

VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

Limin Wang · Bingkun Huang · Zhiyu Zhao · Zhan Tong · Yinan He · Yi Wang · Yali Wang · Yu Qiao

Iterative Next Boundary Detection for Instance Segmentation of Tree Rings in Microscopy Images of Shrub Cross Sections

Alexander Gillert · Giulia Resente · Alba Anadon-Rosell · Martin Wilmking · Uwe Freiherr von Lukas

Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

Mingyu Ding · Yikang Shen · Lijie Fan · Zhenfang Chen · Zitian Chen · Ping Luo · Joshua Tenenbaum · Chuang Gan

SimpSON: Simplifying Photo Cleanup with Single-Click Distracting Object Segmentation Network

Chuong Huynh · Yuqian Zhou · Zhe Lin · Connelly Barnes · Eli Shechtman · Sohrab Amirghodsi · Abhinav Shrivastava

AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders

Wele Bandara Bandara · Naman Patel · Ali Gholami · Mehdi Nikkhah · Motilal Agrawal · Vishal Patel

FlexiViT: One Model for All Patch Sizes

Lucas Beyer · Pavel Izmailov · Alexander Kolesnikov · Mathilde Caron · Simon Kornblith · Xiaohua Zhai · Matthias Minderer · Michael Tschannen · Ibrahim Alabdulmohsin · Filip Pavetic

Improving Visual Representation Learning through Perceptual Understanding

Samyakh Tukra · Fred Hoffman · Ken Chatfield

Revealing the Dark Secrets of Masked Image Modeling

Zhenda Xie · Zigang Geng · Jingcheng Hu · Zheng Zhang · Han Hu · Yue Cao

Non-Contrastive Unsupervised Learning of Physiological Signals from Video

Jeremy Speth · Nathan Vance · Patrick Flynn · Adam Czajka

High-resolution image reconstruction with latent diffusion models from human brain activity

Yu Takagi · Shinji Nishimoto

RIFormer: Keep Your Vision Backbone Effective But Removing Token Mixer

Jiahao Wang · Songyang Zhang · Yong Liu · Taiqiang Wu · Yujiu Yang · Xihui Liu · Kai Chen · Ping Luo · Dahua Lin

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference

Haoran You · Yunyang Xiong · Xiaoliang Dai · Peizhao Zhang · Bichen Wu · Haoqi Fan · Peter Vajda · Yingyan Lin

EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention

Xinyu Liu · Houwen Peng · Ningxin Zheng · Yuqing Yang · Han Hu · Yixuan Yuan

InternImage: Exploring Large-Scale Vision Fundamental Models with Deformable Convolutions

Wenhai Wang · Jifeng Dai · Zhe Chen · Zhenhang Huang · Zhiqi Li · Xizhou Zhu · Xiaowei Hu · Tong Lu · Lewei Lu · Hongsheng Li · Xiaogang Wang · Yu Qiao

Memory-friendly Scalable Super-resolution via Rewinding Lottery Ticket Hypothesis

林 锦 · Xiaotong Luo · ming Hong · Yanyun Qu · Yuan Xie · Zongze Wu

Learned Image Compression with Mixed Transformer-CNN Architectures

Jinming Liu · Heming Sun · Jiro Katto

NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling

Shishira Maiya · Sharath Girish · Max Ehrlich · Hanyu Wang · Kwot Sin Lee · Patrick Poirson · Pengxiang Wu · Chen Wang · Abhinav Shrivastava

Complexity-guided Slimmable Decoder for Efficient Deep Video Compression

Zhihao Hu · Dong Xu

Context-Based Trit-Plane Coding for Progressive Image Compression

Seungmin Jeon · KWANG PYO CHOI · YOUNGO PARK · Chang-Su Kim

End-to-end Video Matting with Trimap Propagation

Wei-Lun Huang · Ming-Sui Lee

Rethinking Image Super Resolution from Long-Tailed Distribution Learning Perspective

Yuanbiao Gou · Peng Hu · Jiancheng Lv · Hongyuan Zhu · Xi Peng

Shape-aware Text-driven Layered Video Editing

Yao-Chih Lee · Ji-Ze Jang · Yi-Ting Chen · Elizabeth Qiu · Jia-Bin Huang

Dimensionality-Varying Diffusion Process

Han Zhang · Ruili Feng · Zhantao Yang · Lianghua Huang · Yu Liu · Yifei Zhang · Yujun Shen · Deli Zhao · Jingren Zhou · Fan Cheng

On Distillation of Guided Diffusion Models

Chenlin Meng · Robin Rombach · Ruiqi Gao · Diederik Kingma · Stefano Ermon · Jonathan Ho · Tim Salimans

Towards Flexible Multi-modal Document Models

Naoto Inoue · Kotaro Kikuchi · Edgar Simo-Serra · Mayu Otani · Kota Yamaguchi

Toward verifiable and reproducible human evaluation for text-to-image generation

Mayu Otani · Riku Togashi · Yu Sawai · Ryosuke Ishigami · Yuta Nakashima · Esa Rahtu · Janne Heikkila · Shin’ichi Satoh

Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models to Learn Any Unseen Style

Haoming Lu · Hazarapet Tunanyan · Kai Wang · Shant Navasardyan · Zhangyang Wang · Humphrey Shi

Freestyle Layout-to-Image Synthesis

Han Xue · Zhiwu Huang · Qianru Sun · Li Song · Wenjun Zhang

ReCo: Region-Controlled Text-to-Image Generation

Zhengyuan Yang · Jianfeng Wang · Zhe Gan · Linjie Li · Kevin Lin · Chenfei Wu · Nan Duan · Zicheng Liu · Ce Liu · Michael Zeng · Lijuan Wang

Conditional Text Image Generation with Diffusion Models

Yuanzhi Zhu · Zhaohai Li · Tianwei Wang · Mengchao He · Cong Yao

Fix the Noise: Disentangling Source Feature for Controllable Domain Translation

Dongyeun Lee · Jae Young Lee · Doyeon Kim · Jaehyun Choi · Jaejun Yoo · Junmo Kim

GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis

Ming Tao · Bing-Kun BAO · Hao Tang · Changsheng Xu

DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model

Gwanghyun Kim · Se Young Chun

NÜWA-LIP: Language-guided Image Inpainting with Defect-free VQGAN

Minheng Ni · Xiaoming Li · Wangmeng Zuo

Neural Preset for Color Style Transfer

Zhanghan Ke · Yuhao LIU · Lei Zhu · Nanxuan Zhao · Rynson Lau

Restoration of Hand-Drawn Architectural Drawings using Latent Space Mapping with Degradation Generator

Nakkwan Choi · Seungjae Lee · Yongsik Lee · Seungjoon Yang

Neural Fourier Filter Bank

Zhijie Wu · Yuhe Jin · Kwang Moo Yi

PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow

Jiarui Lei · Xiaobo Hu · Yue Wang · Dong Liu

PHA: Patch-wise High-frequency Augmentation for Transformer-based Person Re-identification

Guiwei Zhang · Yongfei Zhang · Tianyu Zhang · Bo Li · Shiliang Pu

Comprehensive and Delicate: An Efficient Transformer for Image Restoration

Haiyu Zhao · Yuanbiao Gou · Boyun Li · Dezhong Peng · Jiancheng Lv · Xi Peng

Ultrahigh Resolution Image/Video Matting with Spatio-Temporal Sparsity

Yanan SUN · Chi-Keung Tang · Yu-Wing Tai

Equivalent Transformation and Dual Stream Network Construction for Mobile Image Super-Resolution

Jiahao Chao · Zhou Zhou · Hongfan Gao · Jiali Gong · Zhengfeng Yang · Zhenbing Zeng · Lydia Dehbi

Real-time 6K Image Rescaling with Rate-distortion Optimization

Chenyang Qi · XIN YANG · Ka Leong Cheng · Ying-Cong Chen · Qifeng Chen

Human Guided Ground-truth Generation for Realistic Image Super-resolution

Du Chen · Jie Liang · Xindong Zhang · Ming Liu · Hui Zeng · Lei Zhang

Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Weixia Zhang · Guangtao Zhai · Ying Wei · Xiaokang Yang · Kede Ma

Visual Recognition-Driven Image Restoration for Multiple Degradation with Intrinsic Semantics Recovery

Zizheng Yang · Jie Huang · Jiahao Chang · man zhou · Hu Yu · Jinghao Zhang · Feng Zhao

ShadowDiffusion: When Degradation Prior Meets Diffusion Model for Shadow Removal

Lanqing Guo · Chong Wang · Wenhan Yang · Siyu Huang · Yufei Wang · Hanspeter Pfister · Bihan Wen

Probability-based Global Cross-modal Upsampling for Pan-sharpening

Zeyu Zhu · Xiangyong Cao · man zhou · Junhao Huang · Deyu Meng

Real-time Controllable Denoising for Image and Video

Zhaoyang Zhang · Yitong Jiang · Wenqi Shao · Xiaogang Wang · Ping Luo · Kaimo Lin · Jinwei Gu

Zero-Shot Noise2Noise: Efficient Image Denoising without any Data

Youssef Mansour · Reinhard Heckel

Rawgment: Noise-Accounted RAW Augmentation Enables Recognition in a Wide Variety of Environments

Masakazu Yoshimura · Junji Otsuka · Atsushi Irie · Takeshi Ohashi

Structure Aggregation for Cross-Spectral Stereo Image Guided Denoising

Zehua Sheng · Zhu Yu · Xiongwei Liu · Siyuan Cao · Yuqi Liu · Hui-liang Shen · Huaqi Zhang

Self-supervised Blind Motion Deblurring with Deep Expectation Maximization

Ji Li · Weixi Wang · YUESONG NAN · Hui Ji

Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset

Shuaizheng Liu · Xindong Zhang · Lingchen Sun · Zhetong Liang · Hui Zeng · Lei Zhang

MetaFusion: Infrared and Visible Image Fusion via Meta-Feature Embedding from Object Detection

Wenda Zhao · Shigeng Xie · Fan Zhao · You He · Huchuan Lu

FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER

Ce Zheng · Matias Mendieta · Taojiannan Yang · Guo-Jun Qi · Chen Chen

Joint Video Multi-Frame Interpolation and Deblurring under Unknown Exposure Time

Wei Shang · Dongwei Ren · yi yang · Hongzhi Zhang · Kede Ma · Wangmeng Zuo

Learning Event Guided High Dynamic Range Video Reconstruction

Yixin Yang · Jin Han · Jinxiu Liang · Zhihang Zhong · Boxin Shi

Multi Domain Learning for Motion Magnification

JASDEEP SINGH · Subrahmanyam Murala · G Sankara Kosuru

EvShutter: Transforming Events for Unconstrained Rolling Shutter Correction

Julius Erbach · Stepan Tulyakov · Patricia Vitoria · Alfredo Bochicchio · YUANYOU LI

Continuous Intermediate Token Learning with Implicit Motion Manifold for Keyframe Based Motion Interpolation

Clinton Mo · Kun Hu · Chengjiang Long · Zhiyong Wang

Recurrent Vision Transformers for Object Detection with Event Cameras

Mathias Gehrig · Davide Scaramuzza

MoDi: Unconditional Motion Synthesis from Diverse Data

Sigal Raab · Inbal Leibovitch · Peizhuo Li · Kfir Aberman · Olga Sorkine-Hornung · Daniel Cohen-Or

Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry

Jiaxu Zhang · Junwu Weng · Di Kang · Fang Zhao · Shaoli Huang · Xuefei Zhe · Linchao Bao · Ying Shan · Jue Wang · Zhigang Tu

Real-time Multi-person Eyeblink Detection in the Wild for Untrimmed Video

Wenzheng Zeng · Yang Xiao · Sicheng Wei · Jinfang Gan · Xintao Zhang · Zhiguo Cao · Zhiwen Fang · Joey Zhou

SelfME: Self-Supervised Motion Learning for Micro-Expression Recognition

Xinqi Fan · Xueli CHEN · Mingjie Jiang · Ali Shahid · Hong Yan

An In-depth Exploration of Person Re-identification and Gait Recognition in Cloth-Changing Conditions

Weijia Li · Saihui Hou · Chunjie Zhang · Chunshui Cao · Xu Liu · Yongzhen Huang · Yao Zhao

Simple Cues Lead to a Strong Multi-Object Tracker

Jenny Seidenschwarz · Guillem Braso · Víctor Castro Serrano · Ismail Elezi · Laura Leal-Taixé

Tracking through Containers and Occluders in the Wild

Basile Van Hoorick · Pavel Tokmakov · Simon Stent · Jie Li · Carl Vondrick

Indiscernible Object Counting in Underwater Scenes

Guolei Sun · Zhaochong An · Yun Liu · Ce Liu · Christos Sakaridis · Deng-Ping Fan · Luc Van Gool

Affordances from Human Videos as a Versatile Representation for Robotics

Shikhar Bahl · Russell Mendonca · Lili Chen · Unnat Jain · Deepak Pathak

Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-Per-Second

Vincent-Pierre Berges · Andrew Szot · Devendra Singh Chaplot · Aaron Gokaslan · Roozbeh Mottaghi · Dhruv Batra · Eric Undersander

Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

Davis Rempe · Zhengyi Luo · Xue Bin Peng · Ye Yuan · Kris Kitani · Karsten Kreis · Sanja Fidler · Or Litany

FJMP: Factorized Joint Multi-Agent Motion Prediction over Learned Directed Acyclic Interaction Graphs

Luke Rowe · Martin Ethier · Eli-Henry Dykhne · Krzysztof Czarnecki

Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

Shaofei Cai · Zihao Wang · Xiaojian Ma · Anji Liu · Yitao Liang

ReasonNet: End-to-End Driving with Temporal and Global Reasoning

Hao Shao · Letian Wang · Ruobing Chen · Steven Waslander · Hongsheng Li · Yu Liu

V2V4Real: A large-scale real-world dataset for Vehicle-to-Vehicle Cooperative Perception

Runsheng Xu · Xin Xia · JINLONG LI · Hanzhao Li · Shuo Zhang · Zhengzhong Tu · Zonglin Meng · Hao Xiang · Xiaoyu Dong · Rui Song · Hongkai Yu · Bolei Zhou · Jiaqi Ma

Bayesian posterior approximation with stochastic ensembles

Oleksandr Balabanov · Bernhard Mehlig · Hampus Linander

DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling

Jisoo Jeong · Hong Cai · Risheek Garrepalli · Fatih Porikli

Sliced optimal partial transport

Yikun Bai · Bernhard Schmitzer · Matthew Thorpe · Soheil Kolouri

Unsupervised Deep Asymmetric Stereo Matching with Spatially-Adaptive Self-Similarity

Taeyong Song · Sunok Kim · Kwanghoon Sohn

Similarity Metric Learning For RGB-Infrared Group Re-Identification

Jianghao Xiong · Jianhuang Lai

Generalizable Local Feature Pre-training for Deformable Shape Analysis

SOUHAIB ATTAIKI · Lei Li · Maks Ovsjanikov

Quantum Multi-Model Fitting

Matteo Farina · Luca Magri · Willi Menapace · Elisa Ricci · Vladislav Golyanik · Federica Arrigoni

Bridging Search Region Interaction with Template for RGB-T Tracking

Tianrui Hui · Zizheng Xun · Fengguang Peng · Junshi Huang · Xiaoming Wei · Xiaolin Wei · Jiao Dai · Jizhong Han · Si Liu

Local Connectivity-Based Density Estimation for Face Clustering

Junho Shin · Hyo-Jun Lee · Hyunseop Kim · Jong-Hyeon Baek · Daehyun Kim · Yeong Jun Koh

Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration

Guofeng Mei · Hao Tang · Xiaoshui Huang · Weijie Wang · Juan Liu · Jian Zhang · Luc Van Gool · Qiang Wu

NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud

Xiangyu Zhu · Dong Du · Weikai Chen · Zhiyou Zhao · Yinyu Nie · Xiaoguang Han

SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds

Qing Li · Huifang Feng · Kanle Shi · Yue Gao · Yi Fang · Yushen Liu · Zhizhong Han

AnchorFormer: Point Cloud Completion from Discriminative Nodes

ZHIKAI CHEN · Fuchen Long · Zhaofan Qiu · Ting Yao · Wengang Zhou · Jiebo Luo · Tao Mei

GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training

Xiaoyu Tian · Haoxi Ran · Yue Wang · Hang Zhao

Symmetric Shape-Preserving Autoencoder for Unsupervised Real Scene Point Cloud Completion

Changfeng Ma · Yinuo Chen · Pengxiao Guo · Jie Guo · Chongjun Wang · Yanwen Guo

ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution

Tuan Ngo · Binh-Son Hua · Khoi Nguyen

itKD: Interchange Transfer-based Knowledge Distillation for 3D Object Detection

Hyeon Cho · Junyong Choi · Geonwoo Baek · Wonjun Hwang

DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets

Haiyang Wang · Chen Shi · Shaoshuai Shi · Meng Lei · Sen Wang · Di He · Bernt Schiele · Liwei Wang

WeatherStream: Light Transport Automation of Single Image Deweathering

Howard Zhang · Yunhao Ba · Ethan Yang · Varan Mehra · Blake Gella · Akira Suzuki · Arnold Pfahnl · Chethan Chinder Chandrappa · Alex Wong · Achuta Kadambi

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs

Yukang Chen · Jianhui Liu · Xiangyu Zhang · XIAOJUAN QI · Jiaya Jia

PVT-SSD: Single-Stage 3D Object Detector with Point-Voxel Transformer

Honghui Yang · Wenxiao Wang · Minghao Chen · Binbin Lin · Tong He · Hua Chen · Xiaofei He · Wanli Ouyang

Unsupervised Intrinsic Image Decomposition with LiDAR Intensity

Shogo Sato · Yasuhiro Yao · Taiga Yoshida · Takuhiro Kaneko · Shingo Ando · Jun Shimamura

ALSO: Automotive Lidar Self-supervision by Occupancy estimation

Alexandre Boulch · Corentin Sautier · Björn Michele · Gilles Puy · Renaud Marlet

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

Runsen Xu · Tai Wang · Wenwei Zhang · Runjian Chen · Jinkun Cao · Jiangmiao Pang · Dahua Lin

Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images

bowei du · Yecheng Huang · JX Chen · Di Huang

Center Focusing Network for Real-Time LiDAR Panoptic Segmentation

Xiaoyan Li · Gang Zhang · Boyue Wang · Yongli Hu · Baocai Yin

Learning and Aggregating Lane Graphs for Urban Automated Driving

Martin Büchner · Jannik Zürn · Ion-George Todoran · Abhinav Valada · Wolfram Burgard

LiDAR-in-the-loop Hyperparameter Optimization

Félix Antoine Goudreault · Dominik Scheuble · Mario Bijelic · Nicolas Robidoux · Felix Heide

Bi-directional LiDAR-Radar Fusion for 3D Dynamic Object Detection

颖杰 王 · Jiajun Deng · Yao Li · Jinshui Hu · Cong Liu · Yu Zhang · Jianmin Ji · Wanli Ouyang · Yanyong Zhang

Toward RAW Object Detection: A New Benchmark and A New Model

Ruikang Xu · Chang Chen · Jingyang Peng · Cheng Li · Yibin Huang · Fenglong Song · Youliang Yan · Zhiwei Xiong

Resource-Efficient RGBD Aerial Tracking

Jinyu Yang · Shang Gao · Zhe Li · Feng Zheng · Ales Leonardis

Learned Two-Plane Perspective Prior based Image Resampling for Efficient Object Detection

Anurag Ghosh · Dinesh Reddy Narapureddy · Christoph Mertz · Srinivasa Narasimhan

Phase-Shifting Coder: Predicting Accurate Orientation in Oriented Object Detection

Yi Yu · Feipeng Da

PaCa-ViT: Learning Patch-to-Cluster Attention in Vision Transformers

Ryan Grainger · Thomas Paniagua · Xi Song · Naresh Cuntoor · MUN WAI LEE · Tianfu Wu

Global Vision Transformer Pruning with Hessian-Aware Saliency

Huanrui Yang · Hongxu Yin · Maying Shen · Pavlo Molchanov · Hai Li · Jan Kautz

Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

Ning Zhang · Francesco Nex · George Vosselman · Norman Kerle

CompletionFormer: Depth Completion with Convolutions and Vision Transformers

Youmin Zhang · Xianda Guo · Matteo Poggi · Zheng Zhu · Guan Huang · Stefano Mattoccia

TINC: Tree-structured Implicit Neural Compression

Runzhao Yang

WIRE: Wavelet Implicit Neural Representations

Vishwanath Saragadam · Daniel LeJeune · Jasper Tan · Guha Balakrishnan · Ashok Veeraraghavan · Richard Baraniuk

Video Compression with Entropy-Constrained Neural Representations

Carlos Gomes · Roberto Azevedo · Christopher Schroers

MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding

Bowen Liu · Yu Chen · Rakesh Chowdary Machineni · Shiyu Liu · Hun-Seok Kim

EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging

lishun wang · Miao Cao · Xin Yuan

Regularized Vector Quantization for Tokenized Image Synthesis

Jiahui Zhang · Fangneng Zhan · Christian Theobalt · Shijian Lu

Video Probabilistic Diffusion Models in Projected Latent Space

Sihyun Yu · Kihyuk Sohn · Subin Kim · Jinwoo Shin

Conditional Image-to-Video Generation with Latent Flow Diffusion Models

Haomiao Ni · Changhao Shi · Kai Li · Sharon Huang · Martin Min

Class-Balancing Diffusion Models

Yiming QIN · Huangjie Zheng · Jiangchao Yao · Mingyuan Zhou · Ya Zhang

HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images

Animesh Karnewar · Andrea Vedaldi · David Novotny · Niloy Mitra

Self-Guided Diffusion Models

Tao Hu · David Zhang · Yuki Asano · Gertjan Burghouts · Cees Snoek

LayoutFormer++: Conditional Graphic Layout Generation via Constraint Serialization and Decoding Space Restriction

Zhaoyun Jiang · Jiaqi Guo · Shizhao Sun · Huayu Deng · Zhongkai Wu · Vuksan Mijovic · Zijiang Yang · Jian-Guang Lou · Dongmei Zhang

InstructPix2Pix: Learning to Follow Image Editing Instructions

Tim Brooks · Aleksander Holynski · Alexei A. Efros

Paint by Example: Exemplar-based Image Editing with Diffusion Models

Binxin Yang · Shuyang Gu · Bo Zhang · Ting Zhang · Xuejin Chen · Xiaoyan Sun · Dong Chen · Fang Wen

SpaText: Spatio-Textual Representation for Controllable Image Generation

Omri Avrahami · Thomas Hayes · Oran Gafni · Sonal Gupta · Yaniv Taigman · Devi Parikh · Dani Lischinski · Ohad Fried · Xi Yin

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Su Wang · Chitwan Saharia · Ceslee Montgomery · Jordi Pont-Tuset · Shai Noy · Stefano Pellegrini · Yasumasa Onoe · Sarah Laszlo · David Fleet · Radu Soricut · Jason Baldridge · Mohammad Norouzi · Peter Anderson · William Chan

LayoutDM: Transformer-based Diffusion Model for Layout Generation

Shang Chai · Liansheng Zhuang · Fengying Yan

CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes from Natural Language

Aditya Sanghi · Rao Fu · Vivian Liu · Karl Willis · Hooman Shayani · Amir Khasahmadi · Srinath Sridhar · Daniel Ritchie

Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer

Hao Tang · Songhua Liu · Tianwei Lin · Shaoli Huang · Fu Li · Dongliang He · Xinchao Wang

DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality

Yuqing Wang · Yizhi Wang · Longhui Yu · Yuesheng Zhu · Zhouhui Lian

ObjectStitch: Object Compositing with Diffusion Model

Yizhi Song · Zhifei Zhang · Zhe Lin · Scott Cohen · Brian Price · Jianming Zhang · Soo Ye Kim · Daniel Aliaga

CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer

Linfeng Wen · Chengying Gao · Changqing Zou

LEMaRT: Label-Efficient Masked Region Transform for Image Harmonization

Sheng Liu · Cong Phuoc Huynh · Cong Chen · Maxim Arap · Raffay Hamid

Efficient and Explicit Modelling of Image Hierarchies for Image Restoration

Yawei Li · Yuchen Fan · Xiaoyu Xiang · Denis Demandolx · Rakesh Ranjan · Radu Timofte · Luc Van Gool

GamutMLP: A Lightweight MLP for Color Loss Recovery

Hoang Le · Brian Price · Scott Cohen · Michael Brown

Cascaded Local Implicit Transformer for Arbitrary-Scale Super-Resolution

Hao-Wei Chen · Yu-Syuan Xu · Min-Fong Hong · Yi-Min Tsai · Hsien-Kai Kuo · Chun-Yi Lee

Super-Resolution Neural Operator

Min Wei · Xuesong Zhang

Guided Depth Super-Resolution by Deep Anisotropic Diffusion

Nando Metzger · Rodrigo Daudt · Konrad Schindler

AutoFocusFormer: Image Segmentation off the Grid

Ziwen Chen · Kaushik Patnaik · Shuangfei Zhai · Alvin Wan · Zhile Ren · Alexander Schwing · R Colburn · Li Fuxin

AccelIR: Task-aware Image Compression for Accelerating Neural Restoration

Juncheol Ye · Hyunho Yeo · Jinwoo Park · Dongsu Han

Raw Image Reconstruction with Learned Compact Metadata

Yufei Wang · Yi Yu · Wenhan Yang · Lanqing Guo · Lap-Pui Chau · Alex Kot · Bihan Wen

Context-aware Pretraining for Efficient Blind Image Decomposition

Chao Wang · Zhedong Zheng · Ruijie Quan · Yifan Sun · Yi Yang

Deep Random Projector: Accelerated Deep Image Prior

Taihui Li · Hengkang Wang · Zhong Zhuang · Ju Sun

Spectral Bayesian Uncertainty for Image Super-resolution

Tao Liu · Jun Cheng · Shan Tan

Contrastive Semi-supervised Learning for Underwater Image Restoration via Reliable Bank

Shirui Huang · Keyan Wang · Huan Liu · Jun Chen · Yunsong Li

You Do Not Need Additional Priors or Regularizers in Retinex-based Low-light Image Enhancement

Huiyuan Fu · Wenkai Zheng · Xiangyu Meng · Xin Wang · Chuanming Wang · Huadong Ma

Decoupling-and-Aggregating for Image Exposure Correction

Yang Wang · Long Peng · Liang Li · Yang Cao · Zheng-Jun Zha

Self-supervised Non-uniform Kernel Estimation with Flow-based Motion Prior for Blind Image Deblurring

Zhenxuan Fang · Fangfang Wu · Weisheng Dong · Xin Li · Jinjian Wu · Guangming Shi

Neural Texture Synthesis with Guided Correspondence

Yang Zhou · Kaijian Chen · rongjun xiao · Hui Huang

GradICON: Approximate Diffeomorphisms via Gradient Inverse Consistency

Lin Tian · Thomas Greer · François-Xavier Vialard · Roland Kwitt · Raul San Jose Estepar · Richard Rushmore · Nikolaos Makris · Sylvain Bouix · Marc Niethammer

TransFlow: Transformer as Flow Learner

Yawen Lu · Qifan Wang · Siqi Ma · Tong Geng · Yingjie Victor Chen · Huaijin Chen · Dongfang Liu

Video Dehazing via a Multi-Range Temporal Alignment Network with Physical Prior

Jiaqi Xu · Xiaowei Hu · Lei Zhu · DOU QI · Jifeng Dai · Yu Qiao · Pheng-Ann Heng

Event-Based Frame Interpolation with Ad-hoc Deblurring

Lei Sun · Christos Sakaridis · Jingyun Liang · Peng Sun · Jiezhang Cao · Kai Zhang · Qi Jiang · Kaiwei Wang · Luc Van Gool

Event-based Video Frame Interpolation with Cross-Modal Asymmetric Bidirectional Motion Fields

Taewoo Kim · Yujeong Chae · Hyun-Kurl Jang · Kuk-Jin YOON

“Seeing” Electric Network Frequency from Events

Lexuan Xu · Guang Hua · Haijian Zhang · Lei Yu · Ning Qiao

Executing your Commands via Motion Diffusion in Latent Space

Xin Chen · Biao Jiang · Wen Liu · Zilong Huang · BIN FU · Tao Chen · Gang Yu

Event-guided Person Re-Identification via Sparse-Dense Complementary Learning

Chengzhi Cao · Xueyang Fu · Hongjian Liu · Yukun Huang · Kunyu Wang · Jiebo Luo · Zheng-Jun Zha

Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis

Duomin Wang · Yu Deng · Zixin Yin · Heung-Yeung Shum · Baoyuan Wang

One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field

Weichuang Li · Longhao Zhang · Dong Wang · Bin Zhao · Zhigang Wang · Mulin Chen · Bang Zhang · Zhongjian Wang · Liefeng Bo · Xuelong Li

Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition

Hanyang Wang · Bo Li · Shuang Wu · Siyuan Shen · Feng Liu · Shouhong Ding · Aimin Zhou

Multi-modal Gait Recognition via Effective Spatial-Temporal Feature Fusion

Yufeng Cui · Yimei Kang

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Zheng Qin · Sanping Zhou · Le Wang · Jinghai Duan · Gang Hua · Wei Tang

Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking

Ziqi Pang · Jie Li · Pavel Tokmakov · Dian Chen · Sergey Zagoruyko · Yu-Xiong Wang

Camouflaged Instance Segmentation via Explicit De-camouflaging

Naisong Luo · Yuwen Pan · Rui Sun · Tianzhu Zhang · Zhiwei Xiong · Feng Wu

NeRF in the Palm of Your Hand: Corrective Robot Augmentation via Novel-View Synthesis

Allan Zhou · Moo J Kim · Lirui Wang · Pete Florence · Chelsea Finn

PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav

Ram Ramrakhya · Dhruv Batra · Erik Wijmans · Abhishek Das

AdamsFormer for Spatial Action Localization in the Future

Hyung-gun Chi · Kwonjoon Lee · Nakul Agarwal · Yi Xu · Karthik Ramani · Chiho Choi

Unsupervised Sampling Promoting for Stochastic Human Trajectory Prediction

Guangyi Chen · Zhenhao Chen · Shunxing Fan · Kun Zhang

Query-Centric Trajectory Prediction

Zikang Zhou · Jianping Wang · Yung-Hui Li · Yu-Kai Huang

Planning-oriented Autonomous Driving

Yihan Hu · Jiazhi Yang · Li Chen · Keyu Li · Chonghao Sima · Xizhou Zhu · Siqi Chai · Senyao Du · Tianwei Lin · Wenhai Wang · Lewei Lu · Xiaosong Jia · Qiang Liu · Jifeng Dai · Yu Qiao · Hongyang Li

UniHCP: A Unified Model for Human-Centric Perceptions

Yuanzheng Ci · Yizhou Wang · Meilin Chen · SHIXIANG TANG · LEI BAI · Feng Zhu · Rui Zhao · Fengwei Yu · Donglian Qi · Wanli Ouyang

You Only Segment Once: Towards Real-Time Panoptic Segmentation

Jie Hu · Linyan Huang · Tianhe Ren · Shengchuan Zhang · Rongrong Ji · Liujuan Cao

On the Convergence of IRLS and Its Variants in Outlier-Robust Estimation

Liangzu Peng · Christian Kümmerle · Rene Vidal

Learning Adaptive Dense Event Stereo from the Image Domain

Hoonhee Cho · Jegyeong Cho · Kuk-Jin YOON

Correspondence Transformers with Asymmetric Feature Learning and Matching Flow Super-Resolution

Yixuan Sun · Dongyang Zhao · Zhangyue Yin · Yiwen Huang · Tao Gui · Wenqiang Zhang · Weifeng Ge

DKM: Dense Kernelized Feature Matching for Geometry Estimation

Johan Edstedt · Ioannis Athanasiadis · Mårten Wadenbäck · Michael Felsberg

3D Registration with Maximal Cliques

Xiyu Zhang · Jiaqi Yang · Shikun Zhang · Yanning Zhang

Self-Supervised Learning for Multimodal Non-Rigid 3D Shape Matching

Dongliang Cao · Florian Bernard

Towards Better Gradient Consistency for Neural Signed Distance Functions via Level Set Alignment

Baorui Ma · Junsheng Zhou · Yushen Liu · Zhizhong Han

Unsupervised Inference of Signed Distance Functions from Single Sparse Point Clouds without Learning Priors

Chao Chen · Yushen Liu · Zhizhong Han

PEAL: Prior-embedded Explicit Attention Learning for low-overlap Point Cloud Registration

Junle Yu · Luwei Ren · Yu Zhang · Wenhui Zhou · Lili Lin · Guojun Dai

PointListNet: Deep Learning on 3D Point Lists

Hehe Fan · Linchao Zhu · Yi Yang · Mohan Kankanhalli

Meta Architecture for Point Cloud Analysis

Haojia Lin · Xiawu Zheng · Lijiang Li · Fei Chao · Shanshan Wang · Yan Wang · Yonghong Tian · Rongrong Ji

Learnable Skeleton-Aware 3D Point Cloud Sampling

Cheng Wen · Baosheng Yu · Dacheng Tao

Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning

Zhuoyang Zhang · Yuhao Dong · Yunze Liu · Li Yi

ViewNet: A Novel Projection-Based Backbone with View Pooling for Few-shot Point Cloud Classification

Jiajing Chen · Minmin Yang · Senem Velipasalar

SCPNet: Semantic Scene Completion on Point Cloud

Zhaoyang Xia · Youquan Liu · Xin Li · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao

SCoDA: Domain Adaptive Shape Completion for Real Scans

Yushuang Wu · Zizheng Yan · Ce Chen · Lai Wei · Xiao Li · Guanbin Li · Yihao Li · Shuguang Cui · Xiaoguang Han

GrowSP: Unsupervised Semantic Segmentation of 3D Point Clouds

Zihui Zhang · Bo Yang · Bing Wang · Bo Li

MethaneMapper: Spectral Absorption aware Hyperspectral Transformer for Methane Detection

Satish Kumar · Ivan Arevalo · A S M Iftekhar · B.S. Manjunath

Weakly Supervised Class-agnostic Motion Prediction for Autonomous Driving

Ruibo Li · Hanyu Shi · Ziang Fu · Zhe Wang · Guosheng Lin

Single Domain Generalization for LiDAR Semantic Segmentation

Hyeonseong Kim · Yoonsu Kang · Changgyoon Oh · Kuk-Jin YOON

PeakConv: Learning Peak Receptive Field for Radar Semantic Segmentation

Liwen Zhang · Xinyan Zhang · Youcheng Zhang · Yufei Guo · Yuanpei Chen · Xuhui Huang · Zhe Ma

PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds

Jinyu Li · Chenxu Luo · Xiaodong Yang

Density-Insensitive Unsupervised Domain Adaption on 3D Object Detection

Qianjiang Hu · Daizong Liu · Wei Hu

Spherical Transformer for LiDAR-based 3D Recognition

Xin Lai · Yukang Chen · Fanbin Lu · Jianhui Liu · Jiaya Jia

Neural Map Prior for Autonomous Driving

Xuan Xiong · Yicheng Liu · Tianyuan Yuan · Yue Wang · Yilun Wang · Hang Zhao

LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion

Xin Li · Tao MA · Yuenan Hou · Botian Shi · Yuchen Yang · Youquan Liu · Xingjiao Wu · Qin Chen · Yikang LI · Yu Qiao · Liang He

Pix2map: Cross-modal Retrieval for Inferring Street Maps From Images

Xindi Wu · Kwun Fung Lau · Francesco Ferroni · Aljosa Osep · Deva Ramanan

Azimuth Super-Resolution for FMCW Radar in Autonomous Driving

Yu-Jhe Li · Shawn Hunt · Jinhyung Park · Matthew O’Toole · Kris Kitani

MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer

Yunsong Zhou · Hongzi Zhu · Quan Liu · Shan Chang · Minyi Guo

Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency

Runzhou Tao · Wencheng Han · Zhongying Qiu · Cheng-zhong Xu · Jianbing Shen

Semi-Supervised Stereo-based 3D Object Detection via Cross-View Consensus

Wenhao Wu · Hau-San Wong · Si Wu

BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks

Xiaowei Chi · Jiaming Liu · Ming Lu · Rongyu Zhang · Zhaoqing Wang · Yandong Guo · Shanghang Zhang

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

Shaofei Huang · Zhenwei Shen · Zehao Huang · Zi-han Ding · Jiao Dai · Jizhong Han · Naiyan Wang · Si Liu

Learning Transformations To Reduce the Geometric Shift in Object Detection

Vidit Vidit · Martin Engilberge · Mathieu Salzmann

Look, Radiate, and Learn: Self-Supervised Localisation via Radio-Visual Correspondence

Mo Alloulah · Maximilian Arnold

Non-line-of-sight Imaging with Signal Superresolution Network

Jianyu Wang · Xintong Liu · Leping Xiao · Zuoqiang Shi · Lingyun Qiu · Xing Fu

ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields

Seyed Mohammad Mahdi Johari · Camilla Carta · François Fleuret

OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images

Weijia Li · Yawen Lai · Linning Xu · Yuanbo Xiangli · Yu Jinhua · Conghui He · Gui-Song Xia · Dahua Lin

Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention

Fangfu Liu · Chubin Zhang · Yu Zheng · Yueqi Duan

Multi-View Stereo Representation Revisit: Region-Aware MVSNet

Yisu Zhang · Jianke Zhu · Lixiang Lin

All-in-focus Imaging from Event Focal Stack

Hanyue Lou · Minggui Teng · Yixin Yang · Boxin Shi

Wide-angle Rectification via Content-aware Conformal Mapping

Qi Zhang · Hongdong Li · Qing Wang

Single Image Depth Prediction Made Better: A Multivariate Gaussian Take

Ce Liu · Suryansh Kumar · Shuhang Gu · Radu Timofte · Luc Van Gool

DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients

Rémi Pautrat · Daniel Barath · Viktor Larsson · Martin Oswald · Marc Pollefeys

VisFusion: Visibility-aware Online 3D Scene Reconstruction from Videos

Huiyu Gao · Wei Mao · Miaomiao Liu

Perspective Fields for Single Image Camera Calibration

Linyi Jin · Jianming Zhang · Yannick Hold-Geoffroy · Oliver Wang · Kevin Blackburn-Matzen · Matthew Sticha · David Fouhey

RUST: Latent Neural Scene Representations from Unposed Imagery

Mehdi S. M. Sajjadi · Aravindh Mahendran · Thomas Kipf · Etienne Pot · Daniel Duckworth · Mario Lucic · Klaus Greff

Learning Accurate 3D Shape Based on Stereo Polarimetric Imaging

Tianyu Huang · Haoang Li · Kejing He · Congying SUI · Bin Li · Yun-Hui Liu

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects

Ruohan Gao · Yiming Dou · Hao Li · Tanmay Agarwal · Jeannette Bohg · Yunzhu Li · Li Fei-Fei · Jiajun Wu

Paired-Point Lifting for Enhanced Privacy-Preserving Visual Localization

Chunghwan Lee · Jaihoon Kim · Chanhyuk Yun · Je Hyeong Hong

Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data

Nilesh Kulkarni · Linyi Jin · Justin Johnson · David Fouhey

Long-term Visual Localization with Mobile Sensors

Shen Yan · Yu Liu · Long Wang · Zehong Shen · Zhen Peng · Haomin Liu · Maojun Zhang · Guofeng Zhang · Xiaowei Zhou

Learning the Distribution of Errors in Stereo Matching for Joint Disparity and Uncertainty Estimation

Liyan Chen · Weihan Wang · Philippos Mordohai

Revisiting Rotation Averaging: Uncertainties and Robust Losses

Ganlin Zhang · Viktor Larsson · Daniel Barath

Level-S$^2$fM: Structure from Motion on Neural Level Set of Implicit Surfaces

Yuxi Xiao · Nan Xue · Tianfu Wu · Gui-Song Xia

Linking Garment with Person via Semantically Associated Landmarks for Virtual Try-On

Keyu Yan · Tingwei Gao · Hui Zhang · Chengjun Xie

Cross-domain 3D Hand Pose Estimation with Dual Modalities

Qiuxia Lin · Linlin Yang · Angela Yao

ScarceNet: Animal Pose Estimation with Scarce Annotations

Chen Li · Gim Lee

HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose Estimation

Linfang Zheng · Chen Wang · Yinghan Sun · Esha Dasgupta · Hua Chen · Ales Leonardis · Wei Zhang · Hyung Jin Chang

ViPLO: Vision Transformer based Pose-Conditioned Self-Loop Graph for Human-Object Interaction Detection

Jeeseung Park · Jin-Woo Park · Jong-Seok Lee

Ego-Body Pose Estimation via Ego-Head Pose Estimation

Jiaman Li · Karen Liu · Jiajun Wu

Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video

Runyang Feng · Yixing Gao · Xueqing Ma · Tze Ho Elden Tse · Hyung Jin Chang

Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting

Xiaogang Peng · Siyuan Mao · Zizhao Wu

What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging

Zitian Tang · Wenjie Ye · Wei-Chiu Ma · Hang Zhao

Detecting Human-Object Contact in Images

Yixin Chen · Sai Kumar Dwivedi · Michael Black · Dimitrios Tzionas

In-Hand 3D Object Scanning from an RGB Sequence

Shreyas Hampali · Tomas Hodan · LUAN TRAN · Lingni Ma · Cem Keskin · Vincent Lepetit

Autonomous Manipulation Learning for Similar Deformable Objects via Only One Demonstration

Yu Ren · Ronghan Chen · Yang Cong

What You Can Reconstruct from a Shadow

Ruoshi Liu · Sachit Menon · Chengzhi Mao · Dennis Park · Simon Stent · Carl Vondrick

H2ONet: Hand-Occlusion-and-Orientation-aware Network for Real-time 3D Hand Mesh Reconstruction

Hao Xu · Tianyu Wang · Xiao Tang · Chi-Wing Fu

Learning Human Mesh Recovery in 3D Scenes

Zehong Shen · Zhi Cen · Sida Peng · Qing Shuai · Hujun Bao · Xiaowei Zhou

Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild

Gyeongsik Moon

Hi4D: 4D Instance Segmentation of Close Human Interaction

Yifei Yin · Chen Guo · Manuel Kaufmann · Juan Zarate · Jie Song · Otmar Hilliges

Deformable Mesh Transformer for 3D Human Mesh Recovery

Yusuke Yoshiyasu

Reconstructing Animatable 3D Categories from Videos

Gengshan Yang · Chaoyang Wang · Dinesh Reddy Narapureddy · Deva Ramanan

Learning Semantic-Aware Disentangled Representation for 3D Human Body Editing

Xiaokun Sun · Qiao Feng · Xiongzheng Li · Jinsong Zhang · Yu-Kun Lai · Jingyu Yang · Kun Li

Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

Zhanhao Hu · Wenda Chu · Xiaopei Zhu · Hui Zhang · Bo Zhang · Xiaolin Hu

Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View

Shuo Wang · Xinhai Zhao · Haiming Xu · Zehui Chen · Dameng Yu · Jiahao Chang · Zhen Yang · Feng Zhao

Listening Human Behavior: 3D Human Pose Estimation with Acoustic Signals

Yuto Shibata · Yutaka Kawashima · Mariko Isogawa · Go Irie · Akisato Kimura · Yoshimitsu Aoki

NLOST: Non-Line-of-Sight Imaging with Transformer

Yue Li · Jiayong Peng · Juntian Ye · Yueyi Zhang · Feihu Xu · Zhiwei Xiong

Few-shot Non-line-of-sight Imaging with Signal-surface Collaborative Regularization

Xintong Liu · Jianyu Wang · Leping Xiao · Xing Fu · Lingyun Qiu · Zuoqiang Shi

Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM

Hengyi Wang · Jingwen Wang · Lourdes Agapito

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer

Fanghua Yu · Xintao Wang · Mingdeng Cao · Gen Li · Ying Shan · Chao Dong

HRDFuse: Monocular 360° Depth Estimation by Collaboratively Learning Holistic-with-Regional Depth Distributions

Hao Ai · Zidong Cao · Yan-Pei Cao · Ying Shan · Lin Wang

K3DN: Disparity-aware Kernel Estimation for Dual-Pixel Defocus Deblurring

Yan Yang · Liyuan Pan · Liu Liu · Miaomiao Liu

Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography

Ilya Chugunov · Yuxuan Zhang · Felix Heide

DynamicStereo: Consistent Dynamic Depth from Stereo Videos

Nikita Karaev · Ignacio Rocco · Benjamin Graham · Natalia Neverova · Andrea Vedaldi · Christian Rupprecht

End-to-End Vectorized HD-map Construction with Piecewise Bezier Curve

Limeng Qiao · Wenjie Ding · Xi Qiu · Chi Zhang

Enhanced Stable View Synthesis

Nishant Jain · Suryansh Kumar · Luc Van Gool

Scalable, Detailed and Mask-Free Universal Photometric Stereo

Satoshi Ikehata

PRISE: Demystifying Deep Lucas-Kanade with Strongly Star-Convex Constraints for Multimodel Image Alignment

Yiqing Zhang · Xinming Huang · Ziming Zhang

Visual Localization using Imperfect 3D Models from the Internet

Vojtech Panek · Zuzana Kukelova · Torsten Sattler

HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization

Zhihao Liang · Zhangjin Huang · Changxing Ding · Kui Jia

Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

Garrick Brazil · Abhinav Kumar · Julian Straub · Nikhila Ravi · Justin Johnson · Georgia Gkioxari

Objaverse: A Universe of Annotated 3D Objects

Matt Deitke · Dustin Schwenk · Jordi Salvador Marcos · Luca Weihs · Oscar Michel · Eli VanderBilt · Ludwig Schmidt · Kiana Ehsani · Aniruddha Kembhavi · Ali Farhadi

Privacy-Preserving Representations are not Enough: Recovering Scene Content from Camera Poses

Kunal Chelani · Torsten Sattler · Fredrik Kahl · Zuzana Kukelova

Learning a Depth Covariance Function

Eric Dexheimer · Andrew Davison

Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning

Ajinkya Tejankar · Maziar Sanjabi · Qifan Wang · Sinong Wang · Hamed Firooz · Hamed Pirsiavash · Liang Tan

Backdoor Defense via Deconfounded Representation Learning

Zaixi Zhang · Qi Liu · Zhicai Wang · Zepu Lu · Qingyong Hu

Backdoor Cleansing with Unlabeled Data

Lu Pang · Tao Sun · Haibin Ling · Chao Chen

Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack

Hideaki Takahashi · Jingjing Liu · Yang Liu

Elastic Aggregation for Federated Optimization

Chen Dengsheng · Jie Hu · Vince Tan · Xiaoming Wei · Enhua Wu

DynaFed: Tackling Client Data Heterogeneity with Global Dynamics

Renjie PI · WEIZHONG ZHANG · Yueqi Xie · Jiahui Gao · Xiaoyu Wang · Sunghun Kim · Qifeng Chen

How to Prevent the Poor Performance Clients for Personalized Federated Learning?

Zhe Qu · Xingyu Li · Xiao Han · Rui Duan · Chengchao Shen · Lixing Chen

Cloud-Device Collaborative Adaptation to Continual Changing Environments in the Real-world

Yulu Gan · Mingjie Pan · Rongyu Zhang · Zijian Ling · Lingran Zhao · Jiaming Liu · Shanghang Zhang

Diversity-Measurable Anomaly Detection

Wenrui Liu · Hong Chang · Bingpeng Ma · Shiguang Shan · Xilin CHEN

Look Around for Anomalies: Weakly-supervised Anomaly Detection via Context-Motion Relational Learning

MyeongAh Cho · Minjung Kim · Sangwon Hwang · Chaewon Park · Kyungjae Lee · Sangyoun Lee

Semi-supervised Hand Appearance Recovery via Structure Disentanglement and Dual Adversarial Discrimination

Zimeng Zhao · Binghui Zuo · Zhiyu Long · Yangang Wang

Adversarial Normalization: I Can visualize Everything (ICE)

Hoyoung Choi · Seungwan Jin · Kyungsik Han

Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection

Chuangchuang Tan · Yao Zhao · Shikui Wei · Guanghua Gu · Yunchao Wei

GLeaD: Improving GANs with A Generator-Leading Task

Qingyan Bai · Ceyuan Yang · Yinghao Xu · Xihui Liu · Yujiu Yang · Yujun Shen

Data-Free Sketch-Based Image Retrieval

Abhra Chaudhuri · Ayan Kumar Bhunia · Yi-Zhe Song · Anjan Dutta

OpenMix: Exploring Outlier Samples for Misclassification Detection

Fei Zhu · Zhen Cheng · Xu-yao Zhang · Cheng-lin Liu

Genie: Show Me the Data for Quantization

Yongkweon Jeon · Chungman Lee · Ho-young Kim

How to Prevent the Continuous Damage of Noises to Model training?

Xiaotian Yu · Yang Jiang · Tianqi Shi · Zunlei Feng · Yuexuan Wang · Mingli Song · Li Sun

Gradient-based Uncertainty Attribution for Explainable Bayesian Deep Learning

Hanjing Wang · Dhiraj Joshi · Shiqiang Wang · Qiang Ji

FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits

Polina Karpikova · Ekaterina Radionova · Anastasia Yaschenko · Andrei Spiridonov · Leonid Kostyushko · Riccardo Fabbricatore · Aleksei Ivakhnenko

Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks

Jierun Chen · Shiu-hong Kao · Hao He · Weipeng Zhuo · Song Wen · Chul-Ho Lee · S.-H. Chan

FFCV: Accelerating Training by Removing Data Bottlenecks

Guillaume Leclerc · Andrew Ilyas · Logan Engstrom · Sung Min Park · Hadi Salman · Aleksander Madry

Disentangled Representation Learning for Unsupervised Neural Quantization

Haechan Noh · Sangeek Hyun · Woojin Jeong · Hanshin Lim · Jae-Pil Heo

HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search

Jiechao Yang · Yong Liu · Hongteng Xu

Solving relaxations of MAP-MRF problems: Combinatorial in-face Frank-Wolfe directions

Vladimir Kolmogorov

Transformer-Based Learned Optimization

Erik Gärtner · Luke Metz · Misha Andriluka · C. Freeman · Cristian Sminchisescu

Multi-Agent Automated Machine Learning

Zhaozhi Wang · Kefan Su · Jian Zhang · Huizhu Jia · Qixiang Ye · Xiaodong Xie · Zongqing Lu

Accelerating Dataset Distillation via Model Augmentation

Lei Zhang · Jie Zhang · Bowen Lei · Subhabrata Mukherjee · Xiang Pan · Bo Zhao · Caiwen Ding · Yao Li · Dongkuan Xu

PA&DA: Jointly Sampling Path and Data for Consistent NAS

Shun Lu · Yu Hu · Longxing Yang · Zihao Sun · Jilin Mei · Jianchao Tan · Chengru Song

Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning

Sanghwan Kim · Lorenzo Noci · Antonio Orvieto · Thomas Hofmann

EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization

Junha Song · Jungsoo Lee · In So Kweon · Sungha Choi

CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning

James Smith · Leonid Karlinsky · Vyshnavi Gutta · Paola Cascante-Bonilla · Donghyun Kim · Assaf Arbelle · Rameswar Panda · Rogerio Feris · Zsolt Kira

DisWOT: Student Architecture Search for Distillation WithOut Training

Peijie Dong · Lujun Li · Zimian Wei

Real-Time Evaluation in Online Continual Learning: A New Hope

Yasir Ghunaim · Adel Bibi · Kumail Alhamoud · Motasem Alfarra · Hasan Hammoud Hammoud · Ameya Prabhu · Philip Torr · Bernard Ghanem

Dealing with Cross-Task Class Discrimination in Online Continual Learning

Yiduo Guo · Bing Liu · Dongyan Zhao

Class Attention Transfer Based Knowledge Distillation

Ziyao Guo · Haonan Yan · HUI LI · Xiaodong Lin

Dense Network Expansion for Class Incremental Learning

Zhiyuan Hu · Yunsheng Li · Jiancheng Lyu · Dashan Gao · Nuno Vasconcelos

Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning

Kaiyou Song · Jin Xie · Shan Zhang · Zimeng Luo

Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation

Linglan Zhao · Jing Lu · Yunlu Xu · Zhanzhan Cheng · Dashan Guo · Yi Niu · Xiangzhong Fang

Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners

Zitian Chen · Yikang Shen · Mingyu Ding · Zhenfang Chen · Hengshuang Zhao · Erik Learned-Miller · Chuang Gan

Train-Once-for-All Personalization

Hong-You Chen · YANDONG LI · Yin Cui · Mingda Zhang · Wei-Lun Chao · Li Zhang

Generalizable Implicit Neural Representations with Instance Pattern Composers

Chiheon Kim · Doyup Lee · Saehoon Kim · Minsu Cho · Wook-Shin Han

Deep Frequency Filtering for Domain Generalization

Shiqi Lin · Zhizheng Zhang · Zhipeng Huang · Yan Lu · Cuiling Lan · Peng Chu · Quanzeng You · Jiang Wang · Zicheng Liu · Viraj Navkal · Amey Parulkar · Zhibo Chen

Back to the Source: Diffusion-Driven Adaptation to Test-Time Corruption

Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang

Decompose, Adjust, Compose: Effective Normalization by Playing with Frequency for Domain Generalization

Sangrok Lee · Jongseong Bae · Ha Kim Kim

Enhanced Multimodal Representation Learning with Cross-modal KD

Mengxi Chen · Linyu Xing · Yu Wang · Ya Zhang

Equiangular Basis Vectors

Yang Shen · Xu-Hao Sun · Xiu-Shen Wei

DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices

Ismail Nejjar · Qin Wang · Olga Fink

Towards Better Stability and Adaptability: Improve Online Self-Training for Model Adaptation in Semantic Segmentation

Dong Zhao · Shuang Wang · Qi Zang · Dou Quan · XIUTIAO YE · Licheng Jiao

MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

Lukas Hoyer · Dengxin Dai · Haoran Wang · Luc Van Gool

Neural Dependencies Emerging from Learning Massive Categories

Ruili Feng · Kecheng Zheng · Kai Zhu · Yujun Shen · Jian Zhao · Yukun Huang · Deli Zhao · Jingren Zhou · Michael Jordan · Zheng-Jun Zha

Co-training $2^L$ submodels for image recognition

Hugo Touvron · Matthieu CORD · Maxime Oquab · Piotr Bojanowski · Jakob Verbeek · Herve Jegou

On-the-fly Category Discovery

Ruoyi Du · Dongliang Chang · Kongming Liang · Timothy Hospedales · Yi-Zhe Song · Zhanyu Ma

Generative Bias for Robust Visual Question Answering

Jae Won Cho · Dong-Jin Kim · Hyeonggon Ryu · In So Kweon

RMLVQA: A Margin Loss Approach For Visual Question Answering with Language Biases

Abhipsa Basu · Sravanti Addepalli · Venkatesh Babu Radhakrishnan

Twin Contrastive Learning with Noisy Labels

Zhizhong Huang · Junping Zhang · Hongming Shan

Fine-Grained Classification with Noisy Labels

Qi Wei · Lei Feng · Haoliang Sun · Ren Wang · Chenhui Guo · Yilong Yin

ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning

Islam Nassar · Munawar Hayat · Ehsan Abbasnejad · Hamid Rezatofighi · Gholamreza Haffari

Zero-shot Model Diagnosis

Jinqi Luo · Zhaoning Wang · Chen Henry Wu · Dong Huang · Fernando de la Torre

Mind the Label Shift of Augmentation-based Graph OOD Generalization

Junchi Yu · Jian Liang · Ran He

RONO: Robust Discriminative Learning with Noisy Labels for 2D-3D Cross-Modal Retrieval

Yanglin Feng · Hongyuan Zhu · Dezhong Peng · Xi Peng · Peng Hu

Deep Incomplete Multi-view Clustering with Cross-view Partial Sample and Prototype Alignment

Jiaqi Jin · Siwei Wang · Zhibin Dong · Xinwang Liu · En Zhu

MetaViewer: Towards A Unified Multi-View Representation

Ren Wang · Haoliang Sun · Yuling Ma · Xiaoming Xi · Yilong Yin

Rethinking Out-of-Distribution Detection: Masked Image Modeling is All You Need

Jingyao Li · Pengguang Chen · Zexin He · Shaozuo Yu · Shu Liu · Jiaya Jia

Towards Trustable Skin Cancer Diagnosis via Rewriting Model’s Decision

Siyuan Yan · Zhen Yu · Xuelin Zhang · Dwarikanath Mahapatra · Shekhar Chandra · Monika Janda · H. Peter Soyer · Zongyuan Ge

METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens

Zhanyu Wang · Lingqiao Liu · Lei Wang · Luping Zhou

Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images

Ramin Nakhli · Puria Azadi Moghadam · Haoyang Mi · Hossein Farahani · Alexander Baras · Blake Gilks · Ali Bashashati

Ambiguous Medical Image Segmentation using Diffusion Models

AIMON RAHMAN · Jeya Maria Jose Valanarasu · Ilker Hacihaliloglu · Vishal Patel

Directional Connectivity-based Segmentation of Medical Images

Ziyun Yang · Sina Farsiu

Bidirectional Copy-Paste for Semi-Supervised Medical Image Segmentation

Yunhao Bai · Duowen Chen · Qingli Li · Wei Shen · Yan Wang

AutoLabel: CLIP-based framework for Open-set Video Domain Adaptation

Giacomo Zara · Subhankar Roy · Paolo Rota · Elisa Ricci

Zero-shot Generative Model Adaptation via Image-specific Prompt Learning

Jiayi Guo · Chaofei Wang · You Wu · Eric Zhang · Kai Wang · Xingqian Xu · Shiji Song · Humphrey Shi · Gao Huang

2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection

Mikhail Kennerley · Jian-Gang Wang · Bharadwaj Veeravalli · Robby Tan

Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection

Muhammad Akhtar Munir · Muhammad Khan Khan · Salman Khan · Fahad Khan

Learning Transformation-Predictive Representations for Detection and Description of Local Features

Zihao Wang · Chunxu Wu · Yifei Yang · Zhen Li

Annealing-based Label-Transfer Learning for Open World Object Detection

Yuqing Ma · Hainan Li · Zhange Zhang · Jinyang Guo · Shanghang Zhang · Ruihao Gong · Xianglong Liu

PROB: Probabilistic Objectness for Open World Object Detection

Orr Zohar · Kuan-Chieh Wang · Serena Yeung

Detecting Everything in the Open World: Towards Universal Object Detection

Zhenyu Wang · Ya-Li Li · Xi Chen · Ser-Nam Lim · Antonio Torralba · Hengshuang Zhao · Shengjin Wang

DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection

Zongheng Tang · Yifan Sun · Si Liu · Yi Yang

Self-supervised AutoFlow

Hsin-Ping Huang · Charles Herrmann · Junhwa Hur · Erika Lu · Kyle Sargent · Austin Stone · Ming-Hsuan Yang · Deqing Sun

Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding

Lingchen Meng · Xiyang Dai · Yinpeng Chen · Pengchuan Zhang · Dongdong Chen · Mengchen Liu · Jianfeng Wang · Zuxuan Wu · Lu Yuan · Yu-Gang Jiang

Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems

Yangyang Shu · Anton Hengel · Lingqiao Liu

Full or weak annotations? An adaptive strategy for budget-constrained annotation campaigns

Javier Gamazo Tejero · Martin Zinkernagel · Sebastian Wolf · Raphael Sznitman · Pablo Márquez Neila

Class-Incremental Exemplar Compression for Class-Incremental Learning

Zilin Luo · Yaoyao Liu · Bernt Schiele · Qianru Sun

The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation

Beomyoung Kim · Joonhyun Jeong · Dongyoon Han · Sung Ju Hwang

Augmentation Matters: A Simple-yet-Effective Approach to Semi-supervised Semantic Segmentation

Zhen Zhao · Lihe Yang · Sifan Long · Jimin Pi · Luping Zhou · Jingdong Wang

Weakly Supervised Semantic Segmentation via Adversarial Learning of Classifier and Reconstructor

Hyeokjun Kweon · Sung-Hoon Yoon · Kuk-Jin YOON

Learning Orthogonal Prototypes for Generalized Few-shot Semantic Segmentation

Sun-Ao Liu · Yiheng Zhang · Zhaofan Qiu · Hongtao Xie · Yongdong Zhang · Ting Yao

Beyond mAP: Towards better evaluation of instance segmentation

Rohit Kumar Jena · Lukas Zhornyak · Nehal Doiphode · Pratik Chaudhari · Vivek Buch · James Gee · Jianbo Shi

Dynamic Focus-aware Positional Queries for Semantic Segmentation

Haoyu He · Jianfei Cai · Zizheng Pan · Jing Liu · Jing Zhang · Dacheng Tao · Bohan Zhuang

Focus On Details: Online Multi-object Tracking with Diverse Fine-grained Representation

Hao Ren · Shoudong Han · Huilin Ding · Ziwen Zhang · Hongwei Wang · Faquan Wang

DynaMask: Dynamic Mask Selection for Instance Segmentation

Ruihuang Li · Chenhang HE · Shuai Li · Yabin Zhang · Lei Zhang

A Strong Baseline for Generalized Few-Shot Semantic Segmentation

Seyed Mohammadsina Hajimiri · Malik Boudiaf · Ismail Ayed · Jose Dolz

Compositor: Bottom-up Clustering and Compositing for Robust Part and Object Segmentation

Ju He · Jieneng Chen · Ming-Xian Lin · Qihang Yu · Alan Yuille

Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis

Yuxiang Wei · Zhilong Ji · Xiaohe Wu · Jinfeng Bai · Lei Zhang · Wangmeng Zuo

Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation

SHUTING HE · Henghui Ding · Wei Jiang

UniDAformer: Unified Domain Adaptive Panoptic Segmentation Transformer via Hierarchical Mask Calibration

Jingyi Zhang · Jiaxing Huang · Xiaoqin Zhang · Shijian Lu

StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition

Yanqing Shen · Sanping Zhou · Jingwen Fu · Ruotong Wang · Shitao Chen · Nanning Zheng

CLIP-S$^4$: Language-Guided Self-Supervised Semantic Segmentation

Wenbin He · Suphanut Jamonnak · Liang Gou · Liu Ren

Learning Conditional Attributes for Compositional Zero-Shot Learning

Qingsheng Wang · Lingqiao Liu · Chenchen Jing · Hao Chen · Guoqiang Liang · PENG WANG · Chunhua Shen

Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection

Luting Wang · Yi Liu · Penghui Du · Zihan Ding · Yue Liao · Qiaosong Qi · Biaolong Chen · Si Liu

ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation

Ziqin Zhou · Yinjie Lei · Bowen Zhang · Lingqiao Liu · Yifan Liu

Learning to Generate Text-grounded Mask for Open-world Semantic Segmentation from Only Image-Text Pairs

Junbum Cha · Jonghwan Mun · Byungseok Roh

Mobile User Interface Element Detection Via Adaptively Prompt Tuning

Weiqiang Wang · Zhuoer Xu · Haoxing Chen · Jun Lan · Changhua Meng · Weiqiang Wang

Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers

Dahun Kim · Anelia Angelova · Weicheng Kuo

Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling

Yongshuai Huang · Ning Lu · Dapeng Chen · Yibo Li · Zecheng Xie · Shenggao Zhu · Liangcai Gao · Wei Peng

End-to-End 3D Dense Captioning with Vote2Cap-DETR

Sijin Chen · Hongyuan Zhu · Xin Chen · Yinjie Lei · Gang Yu · Tao Chen

Visual DNA: Representing and Comparing Images using Distributions of Neuron Activations

Benjamin Ramtoula · Matthew Gadd · Paul Newman · Daniele De Martini

Hint-Aug: Drawing Hints from Foundation Vision Transformers towards Boosted Few-shot Parameter-Efficient Tuning

Zhongzhi Yu · Shang Wu · Shunyao Zhang · Yonggan Fu · Yingyan Lin

Improving Zero-shot Generalization and Robustness of Multi-modal Models

Yunhao Ge · Jie Ren · Andrew Gallagher · Yuxiao Wang · Ming-Hsuan Yang · Hartwig Adam · Laurent Itti · Balaji Lakshminarayanan · Jiaping Zhao

Asymmetric Feature Fusion for Image Retrieval

Hui Wu · Min Wang · Wengang Zhou · Zhenbo Lu · Houqiang Li

Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning

Dmytro Kotovenko · Pingchuan Ma · Timo Milbich · Björn Ommer

Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-commerce

Yang Jin · Yongzhi Li · Zehuan Yuan · Yadong MU

Learning Attribute and Class Specific Representation Duet for Fine-grained Fashion Analysis

Yang Jiao · Yan Gao · Jingjing Meng · Jin Shang · Yi Sun

HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning

Chia-Wen Kuo · Zsolt Kira

Non-Contrastive Learning Meets Language-Image Pre-Training

Jinghao Zhou · Li Dong · Zhe Gan · Lijuan Wang · Furu Wei

ViLEM: Visual-Language Error Modeling for Image-Text Retrieval

Yuxin Chen · Zongyang Ma · ziqi zhang · Zhongang Qi · Chunfeng Yuan · Ying Shan · Bing Li · Weiming Hu · Xiaohu Qie · Jianping WU

CLIPPO: Image-and-Language Understanding from Pixels Only

Michael Tschannen · Basil Mustafa · Neil Houlsby

MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining

Xiaoyi Dong · Jianmin Bao · Yinglin Zheng · Ting Zhang · Dongdong Chen · Hao Yang · Ming Zeng · Weiming Zhang · Lu Yuan · Dong Chen · Fang Wen · Nenghai Yu

Context-aware Alignment and Mutual Masking for 3D-Language Pre-training

Zhao Jin · Munawar Hayat · Yuwei Yang · Yulan Guo · Yinjie Lei

SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text

Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song

Learning Bottleneck Concepts in Image Classification

Bowen Wang · Liangzhi Li · Yuta Nakashima · Hajime Nagahara

GIVL: Improving Geographical Inclusivity of Vision-and-Language Models with Pre-Training Methods

Da Yin · Feng Gao · Govind Thattai · Michael Johnston · Kai-Wei Chang

Grounding Counterfactual Explanation of Image Classifiers to Textual Concept Space

Siwon Kim · Jinoh Oh · SUNGJIN LEE · Seunghak Yu · Jaeyoung Do · Tara Taghavi

Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability

Vikram V. Ramaswamy · Sunnie S. Y. Kim · Ruth Fong · Olga Russakovsky

LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding

Gen Li · Varun Jampani · Deqing Sun · Laura Sevilla-Lara

Task Residual for Tuning Vision-Language Models

Tao Yu · Zhihe Lu · Xin Jin · Zhibo Chen · Xinchao Wang

Hierarchical Prompt Learning for Multi-Task Learning

Yajing Liu · Yuning Lu · Hao Liu · Yaozu An · Zhuoran Xu · Yao Zhuokun · Zhang Baofeng · Zhiwei Xiong · Chenguang Gui

Diversity-Aware Meta Visual Prompting

Qidong Huang · Xiaoyi Dong · Dongdong Chen · Weiming Zhang · Feifei Wang · Gang Hua · Nenghai Yu

From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models

Jiaxian Guo · Junnan Li · Dongxu Li · Anthony Tiong · Boyang Li · Dacheng Tao · Steven Hoi

Language Adaptive Weight Generation for Multi-task Visual Grounding

Wei Su · Peihan Miao · Huanzhang Dou · Gaoang Wang · Liang Qiao · Zheyang Li · Xi Li

Fusing Pre-trained Language Models with Multimodal Prompts through Reinforcement Learning

Youngjae Yu · Jiwan Chung · Heeseung Yun · Jack Hessel · Jae Sung Park · Ximing Lu · Rowan Zellers · Prithviraj Ammanabrolu · Ronan Le Bras · Gunhee Kim · Yejin Choi

Are Deep Neural Networks SMARTer than Second Graders?

Anoop Cherian · Kuan-Chuan Peng · Suhas Lohit · Kevin Smith · Joshua Tenenbaum

A-CAP: Anticipation Captioning with Commonsense Knowledge

MINH DUC VO · An Luong · Akihiro Sugimoto · Hideki Nakayama

A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning

Aishwarya Kamath · Peter Anderson · Su Wang · Jing Yu Koh · Alexander Ku · Austin Waters · Yinfei Yang · Jason Baldridge · Zarana Parekh

Improving Vision-and-Language Navigation by Generating Future-View Image Semantics

Jialu Li · Mohit Bansal

Layout-based Causal Inference for Object Navigation

Sixian Zhang · Xinhang Song · Weijie Li · Yubing Bai · Xinyao Yu · Shuqiang Jiang

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

Shengkun Tang · Yaqing Wang · Zhenglun Kong · Tianchi Zhang · Yao Li · Caiwen Ding · Yanzhi Wang · Yi Liang · Dongkuan Xu

Distilling Cross-Temporal Contexts for Continuous Sign Language Recognition

Leming Guo · Wanli Xue · Qing Guo · Bo Liu · Kaihua Zhang · Tiantian Yuan · Shengyong Chen

Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation

Feiyu Chen · Jie Shao · Shuyuan Zhu · Heng Tao Shen

Modular Memorability: Tiered Representations for Video Memorability Prediction

Théo Dumont · Juan Hevia · Camilo Fosco

VindLU: A Recipe for Effective Video-and-Language Pretraining

Feng Cheng · Xizi Wang · Jie Lei · David Crandall · Mohit Bansal · Gediminas Bertasius

Procedure-Aware Pretraining for Instructional Video Understanding

Honglu Zhou · Roberto Martín-Martín · Mubbasir Kapadia · Silvio Savarese · Juan Carlos Niebles

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning

Antoine Yang · Arsha Nagrani · Paul Hongsuck Seo · Antoine Miech · Jordi Pont-Tuset · Ivan Laptev · Josef Sivic · Cordelia Schmid

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Wenhao Wu · Haipeng Luo · Bo Fang · Jingdong Wang · Wanli Ouyang

Leveraging Temporal Context in Low Representational Power Regimes

Camilo Fosco · SouYoung Jin · Emilie Josephs · Aude Oliva

Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

Tsu-Jui Fu · Licheng Yu · Ning Zhang · Cheng-Yang Fu · Jong-Chyi Su · William Yang Wang · Sean Bell

NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation

Haoqian Wu · Keyu Chen · Haozhe Liu · Mingchen Zhuge · Bing Li · Ruizhi Qiao · Xiujun Shu · Bei Gan · Liangsheng Xu · Bo Ren · Mengmeng Xu · Wentian Zhang · Raghavendra Ramachandra · Chia-Wen Lin · Bernard Ghanem

Perception and Semantic Aware Regularization for Sequential Confidence Calibration

Zhenghua Peng · Yu Luo · Tianshui Chen · Keke Xu · Shuangping Huang

Boosting Weakly-Supervised Temporal Action Localization with Text Information

Guozhang Li · De Cheng · Xinpeng Ding · Nannan Wang · Xiaoyu Wang · Xinbo Gao

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

Chen Zhao · Shuming Liu · Karttikeya Mangalam · Bernard Ghanem

Search-Map-Search: A Frame Selection Paradigm for Action Recognition

Mingjun Zhao · Yakun Yu · Xiaoli Wang · Lei Yang · Di Niu

Therbligs In Action: Video Understanding through Motion Primitives

Eadom Dessalene · Michael Maynord · Cornelia Fermuller · Yiannis Aloimonos

Learning Discriminative Representations for Skeleton Based Action Recognition

Huanyu Zhou · Qingjie Liu · Yunhong Wang

MOSO: Decomposing MOtion, Scene and Object for Video Prediction

Mingzhen Sun · Weining Wang · Xinxin Zhu · Jing Liu

EVAL: Explainable Video Anomaly Localization

Ashish Singh · Michael Jones · Erik Learned-Miller

Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation

Liulei Li · Wenguan Wang · Tianfei Zhou · Jianwu Li · Yi Yang

Representation Learning for Visual Object Tracking by Masked Appearance Transfer

Haojie Zhao · Dong Wang · Huchuan Lu

Generalized Relation Modeling for Transformer Tracking

Shenyuan Gao · Chunluan Zhou · Jun Zhang

Panoptic Video Scene Graph Generation

Jingkang Yang · Wenxuan Peng · Xiangtai Li · ZUJIN GUO · Liangyu Chen · Bo Li · Zheng Ma · Wayne Zhang · Kaiyang Zhou · CHEN CHANGE LOY · Ziwei Liu

Devil’s on the Edges: Selective Quad Attention for Scene Graph Generation

Deunsol Jung · Sanghyun Kim · Won Hwa Kim · Minsu Cho

Focused and Collaborative Feedback Integration for Interactive Image Segmentation

Qiaoqiao Wei · Hui Zhang · Jun-Hai Yong

Knowledge Distillation for 6D Pose Estimation by Aligning Distributions of Local Predictions

Shuxuan Guo · Yinlin Hu · Jose Alvarez · Mathieu Salzmann

PartMix: Regularization Strategy to Learn Part Discovery for Visible-Infrared Person Re-identification

Minsu Kim · Seungryong Kim · Jungin Park · Seongheon Park · Kwanghoon Sohn

Integrally Pre-Trained Transformer Pyramid Networks

Yunjie Tian · Lingxi Xie · Zhaozhi Wang · Longhui Wei · XIAOPENG ZHANG · Jianbin Jiao · Yaowei Wang · Qi Tian · Qixiang Ye

Explaining Image Classifiers with Multiscale Directional Image Representation

Stefan Kolek · Robert Windesheim · Hector Andrade Loarca · Gitta Kutyniok · Ron Levie

Neuron Structure Modeling for Generalized Remote Physiological Measurement

Hao LU · Zitong Yu · Xuesong Niu · Ying-Cong Chen

Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves

Sora Takashima · Ryo Hayamizu · Nakamasa Inoue · Hirokatsu Kataoka · Rio Yokota

Model-Agnostic Gender Debiased Image Captioning

Yusuke Hirota · Yuta Nakashima · Noa Garcia

ImageBind: One Embedding Space To Bind Them All

Rohit Girdhar · Alaaeldin El-Nouby · Zhuang Liu · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

Muhammad Naeem Naeem · Gul Zain Khan · Yongqin Xian · Muhammad Zeshan Afzal · Didier Stricker · Luc Van Gool · Federico Tombari

Learning Semantic Relationship among Instances for Image-Text Matching

Zheren Fu · Zhendong Mao · Yan Song · Yongdong Zhang

Learning Customized Visual Models with Retrieval-Augmented Knowledge

Haotian Liu · Kilho Son · Jianwei Yang · Ce Liu · Jianfeng Gao · Yong Jae Lee · Chunyuan Li

M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis

Hiuyi Cheng · Peirong Zhang · Sihang Wu · Jiaxin Zhang · Qiyuan · Zecheng Xie · Jing Li · Kai Ding · Lianwen Jin

Towards Modality-Agnostic Person Re-identification with Descriptive Query

Cuiqun Chen · Mang Ye · Ding Jiang

Generalized Decoding for Pixel, Image, and Language

Xueyan Zou · Zi-Yi Dou · Jianwei Yang · Zhe Gan · Linjie Li · Chunyuan Li · Xiyang Dai · Harkirat Behl · Jianfeng Wang · Lu Yuan · Nanyun Peng · Lijuan Wang · Yong Jae Lee · Jianfeng Gao

Correlational Image Modeling for Self-Supervised Visual Pre-Training

Wei Li · Jiahao Xie · CHEN CHANGE LOY

Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token embeddings to Finite Discrete Tokens

Yuxiao Chen · Jianbo Yuan · Yu Tian · Shijie Geng · Xinyu Li · Ding Zhou · Dimitris Metaxas · Hongxia Yang

What Can Human Sketches Do for Object Detection?

Pinaki Nath Chowdhury · Ayan Kumar Bhunia · Aneeshan Sain · Subhadeep Koley · Tao Xiang · Yi-Zhe Song

Local-guided Global: Paired Similarity Representation for Visual Reinforcement Learning

Hyesong Choi · Hunsang Lee · Wonil Song · Sangryul Jeon · Kwanghoon Sohn · Dongbo Min

OCTET: Object-Aware Counterfactual Explanations

Mehdi Zemni · Mickael Chen · Eloi Zablocki · Hedi Ben younes · Patrick Perez · Matthieu CORD

Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks

Weihua Chen · Xianzhe Xu · Jian Jia · Hao Luo · Yaohua Wang · Fan Wang · Rong Jin · Xiuyu Sun

Advancing Visual Grounding with Scene Knowledge: Benchmark and Method

Zhihong Chen · Ruifei Zhang · Yibing Song · Xiang Wan · Guanbin Li

FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training

Yunpeng Han · Lisai Zhang · Qingcai Chen · chen zhijian · Zhonghua Li · Jianxin Yang · Zhao Cao

Learning to Exploit Temporal Structure for Biomedical Vision–Language Processing

Shruthi Bannur · Stephanie Hyland · Qianchu Liu · Fernando Pérez-García · Maximilian Ilse · Daniel Castro · Benedikt Boecking · Harshita Sharma · Kenza Bouzid · Anja Thieme · Anton Schwaighofer · Maria Teodora Wetscherek · Matthew Lungren · Aditya Nori · Javier Alvarez Valle · Ozan Oktay

Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition

Xinghan Wang · Xin Xu · Yadong MU

Fine-grained Audible Video Description

Xuyang Shen · Dong Li · Jinxing Zhou · Zhen Qin · Bowen He · Xiaodong Han · Aixuan Li · Yuchao Dai · Lingpeng Kong · Meng Wang · Yu Qiao · Yiran Zhong

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

Reuben Tan · Arijit Ray · Andrea Burns · Bryan Plummer · Justin Salamon · Oriol Nieto · Bryan Russell · Kate Saenko

Audio-Visual Grouping Network for Sound Localization from Mixtures

Shentong Mo · Yapeng Tian

Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations

Sagnik Majumder · Hao Jiang · Pierre Moulon · Ethan Henderson · Paul Calamia · Kristen Grauman · Vamsi Krishna Ithapu

Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation

Lingting Zhu · Xian Liu · Xuanyu Liu · Rui Qian · Ziwei Liu · Lequan Yu

Spatio-Temporal Pixel-Level Contrastive Learning-based Source-Free Domain Adaptation for Video Semantic Segmentation

Shao-Yuan Lo · Poojan Oza · Sumanth Chennupati · Patricio Galindo · Vishal Patel

MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos

Minghan Li · Shuai Li · Wangmeng Xiang · Lei Zhang

System-status-aware Adaptive Network for Online Streaming Video Understanding

Lin Geng Foo · GONG JIA · Zhipeng Fan · Jun Liu

Frame Flexible Network

Yitian Zhang · Yue Bai · Chang Liu · Huan Wang · Sheng Li · Yun Fu

Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

Chao Feng · Ziyang Chen · Andrew Owens

MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation

ROY MILES · Mehmet Kerim Yucel · Bruno Manganelli · Albert Saa-Garriga

Improving Robustness of Semantic Segmentation to Motion-Blur using Class-Centric Augmentation

Aakanksha Aakanksha · Rajagopalan Ambasamduram

MAGVIT: Masked Generative Video Transformer

Lijun Yu · Yong Cheng · Kihyuk Sohn · Jose Lezama · Han Zhang · Huiwen Chang · Alexander Hauptmann · Ming-Hsuan Yang · Yuan Hao · Irfan Essa · Lu Jiang

SCOTCH and SODA: A Transformer Video Shadow Detection Framework

Lihao Liu · Jean Prost · Lei Zhu · Nicolas Papadakis · Pietro Lio · Carola-Bibiane Schönlieb · Angelica Aviles-Rivero

Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Chenyang Lei · Xuanchi Ren · Zhaoxiang Zhang · Qifeng Chen

Probabilistic Debiasing of Scene Graphs

Bashirul Biswas Biswas · Qiang Ji

ViTs for SITS: Vision Transformers for Satellite Image Time Series

Michail Tarasiou · Erik Chavez · Stefanos Zafeiriou

OmniMAE: Single Model Masked Pretraining on Images and Videos

Rohit Girdhar · Alaaeldin El-Nouby · Mannat Singh · Kalyan Vasudev Alwala · Armand Joulin · Ishan Misra

BASiS: Batch Aligned Spectral Embedding Space

Or Streicher · Ido Cohen · Guy Gilboa

Evolved Part Masking for Self-Supervised Learning

Zhanzhou FENG · Shiliang Shiliang

Hard Patches Mining for Masked Image Modeling

Haochen Wang · Kaiyou Song · Junsong Fan · Yuxi Wang · Jin Xie · Zhaoxiang Zhang

Pose-disentangled Contrastive Learning for Self-supervised Facial Representation

Yuanyuan Liu · Wenbin Wang · Yibing Zhan · Shaoze Feng · Kejun Liu · Zhe Chen

OpenGait: Revisiting Gait Recognition Towards Better Practicality

Chao Fan · Junhao Liang · Chuanfu Shen · Saihui Hou · Yongzhen Huang · Shiqi Yu

Autoregressive Visual Tracking

Xing Wei · Yifan Bai · Yongchao Zheng · Dahu Shi · Yihong Gong

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking

Jinkun Cao · Jiangmiao Pang · Xinshuo Weng · Rawal Khirodkar · Kris Kitani

GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields

Alessandro Ruzzi · Xiangwei Shi · Xi Wang · Gengyan Li · Shalini De Mello · Hyung Jin Chang · Xucong Zhang · Otmar Hilliges

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

Matt Deitke · Rose Hendrix · Ali Farhadi · Kiana Ehsani · Aniruddha Kembhavi

Learning Human-to-Robot Handovers from Point Clouds

Sammy Christen · Wei Yang · Claudia Pérez-D’Arpino · Otmar Hilliges · Dieter Fox · Yu-Wei Chao

MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion

Chiyu Jiang · Andre Cornman · Cheolho Park · Benjamin Sapp · Yin Zhou · Dragomir Anguelov

Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction

Yi Xu · Armin Bazarjani · Hyung-gun Chi · Chiho Choi · Yun Fu

MixSim: A Hierarchical Framework for Mixed Reality Traffic Simulation

Simon Suo · Kelvin Wong · Justin Xu · James Tu · Alexander Cui · Sergio Casas · Raquel Urtasun

Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving

Xiwen Liang · Minzhe Niu · Jianhua Han · Hang Xu · Chunjing Xu · Xiaodan Liang

Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark

Xiaofeng Wang · Zheng Zhu · Yunpeng Zhang · Guan Huang · Yun Ye · Wenbo Xu · Ziwei Chen · Xingang Wang

BAEFormer: Bi-directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation

Cong Pan · Yonghao He · Junran Peng · Qian Zhang · Wei Sui · Zhaoxiang Zhang

PVO: Panoptic Visual Odometry

Weicai Ye · Xinyue Lan · SHUO CHEN · Yuhang Ming · Xingyuan Yu · Hujun Bao · Zhaopeng Cui · Guofeng Zhang

Unsupervised Cumulative Domain Adaptation for Foggy Scene Optical Flow

Zhou Hanyu · Yi Chang · YAN WENDING · Luxin Yan

Domain Generalized Stereo Matching via Hierarchical Visual Transformation

Tianyu Chang · Xun Yang · Tianzhu Zhang · Meng Wang

Unsupervised Visible-Infrared Person Re-Identification via Progressive Graph Matching and Alternate Learning

Wu Zesen · Mang Ye

Geometric Visual Similarity Learning in 3D Medical Image Self-Supervised Pre-training

Yuting He · Guanyu Yang · Rongjun Ge · Yang Chen · Jean-louis Coatrieux · Boyu Wang · Shuo Li

Progressive Neighbor Consistency Mining for Correspondence Pruning

Xin Liu · Jufeng Yang

Visual Prompt Multi-Modal Tracking

Jiawen Zhu · Simiao Lai · Xin Chen · Dong Wang · Huchuan Lu

Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting

Haiping Wang · Yuan Liu · Zhen Dong · Yulan Guo · Yushen Liu · Wenping Wang · Bisheng Yang

PointCert: Point Cloud Classification with Deterministic Certified Robustness Guarantees

Jinghuai Zhang · Jinyuan Jia · Hongbin Liu · Neil Gong

Rethinking the Approximation Error in 3D Surface Fitting for Point Cloud Normal Estimation

Hang Du · Xuejun Yan · Jingjing Wang · Di Xie · Shiliang Pu

FAC: 3D Representation Learning via Foreground Aware Feature Contrast

Kangcheng Liu · Aoran Xiao · Xiaoqin Zhang · Shijian Lu · Ling Shao

ProxyFormer: Proxy Alignment Assisted Point Cloud Completion with Missing Part Sensitive Transformer

Shanshan Li · Pan Gao · Xiaoyang Tan · Mingqiang Wei

PointVector: A Vector Representation In Point Cloud Analysis

Xin Deng · wenyu Zhang · Qing Ding · Xinming Zhang

Fast Point Cloud Generation with Straight Flows

Lemeng Wu · Dilin Wang · Chengyue Gong · Xingchao Liu · Yunyang Xiong · Rakesh Ranjan · Raghuraman Krishnamoorthi · Vikas Chandra · qiang liu

ACL-SPC: Adaptive Closed-Loop system for Self-Supervised Point Cloud Completion

Sangmin Hong · Mohsen Yavartanoo · Reyhaneh Neshatavar Haghighi Shiraz · Kyoung Mu Lee

Open-set Semantic Segmentation for Point Clouds via Adversarial Prototype Framework

Jianan Li · Qiulei Dong

GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds

Honghui Yang · Tong He · Jiaheng Liu · Hua Chen · Boxi Wu · Binbin Lin · Xiaofei He · Wanli Ouyang

Novel Class Discovery for 3D Point Cloud Semantic Segmentation

Luigi Riz · Cristiano Saltori · Elisa Ricci · Fabio Poiesi

3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds

Aoran Xiao · Jiaxing Huang · Weihao Xuan · Ruijie Ren · Kangcheng Liu · Dayan Guan · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing

Less is More: Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation

Li Li · Hubert P. H. Shum · Toby Breckon

Instant Domain Augmentation for LiDAR Semantic Segmentation

Kwonyoung Ryu · Soonmin Hwang · Jaesik Park

Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

Fangqiang Ding · Andras Palffy · Dariu Gavrila · Xiaoxuan Lu

MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

Yingwei Li · Charles R. Qi · Yin Zhou · Chenxi Liu · Dragomir Anguelov

Towards Unsupervised Object Detection from LiDAR Point Clouds

Lunjun Zhang · Anqi Joyce Yang · Yuwen Xiong · Sergio Casas · Bin Yang · Mengye Ren · Raquel Urtasun

DeepMapping2: Self-supervised Large-scale LiDAR Map Optimization

Chao Chen · Xinhao Liu · Yiming Li · Li Ding · Chen Feng

ConQueR: Query Contrast Voxel-DETR for 3D Object Detection

Benjin ZHU · Zhe Wang · Shaoshuai Shi · Hang Xu · Lanqing HONG · Hongsheng Li

SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization

Wen Li · Shangshu Yu · Cheng Wang · Guosheng Hu · Siqi Shen · Chenglu Wen

Depth Estimation from Camera Image and mmWave Radar Point Cloud

Akash Deep Singh · Yunhao Ba · Ankur Sarker · Howard Zhang · Achuta Kadambi · Stefano Soatto · Mani Srivastava · Alex Wong

Towards Building Self-Aware Object Detectors via Reliable Uncertainty Quantification and Calibration

Kemal Oksuz · Tom Joy · Puneet Dokania

Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection

Bo Zhang · Jiakang Yuan · Botian Shi · Tao Chen · Yikang LI · Yu Qiao

Collaboration Helps Camera Overtake LiDAR in 3D Detection

Yue Hu · Yifan Lu · Runsheng Xu · Weidi Xie · Siheng Chen · Yanfeng Wang

BEV@DC: Bird’s-Eye View Assisted Training for Depth Completion

Wending Zhou · Xu Yan · Yinghong Liao · Yuankai Lin · Jin Huang · Gangming Zhao · Shuguang Cui · Zhen Li

Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

Yuanhui Huang · Wenzhao Zheng · Yunpeng Zhang · Jie Zhou · Jiwen Lu

Viewpoint Equivariance for Multi-View 3D Object Detection

Dian Chen · Jie Li · Vitor Guizilini · Rareș Ambruș · Adrien Gaidon

3D Concept Learning and Reasoning from Multi-View Images

Yining Hong · Chunru Lin · Yilun Du · Zhenfang Chen · Joshua Tenenbaum · Chuang Gan

Role of Transients in Two-Bounce Non-Line-of-Sight Imaging

Siddharth Somasundaram · Akshat Dave · Connor Henley · Ashok Veeraraghavan · Ramesh Raskar

3D Spatial Multimodal Knowledge Accumulation for Scene Graph Prediction in Point Cloud

Mingtao Feng · Haoran Hou · Liang Zhang · Zijie Wu · Yulan Guo · Ajmal Mian

Revisiting the Stack-Based Inverse Tone Mapping

Ning Zhang · Yuyao Ye · Yang Zhao · Ronggang Wang

MVImgNet: A Large-scale Dataset of Multi-view Images

Xianggang Yu · Mutian Xu · Yidan Zhang · Haolin Liu · Chongjie Ye · Yushuang Wu · Zizheng Yan · Chenming Zhu · Zhangyang Xiong · Tianyou Liang · Guanying Chen · Shuguang Cui · Xiaoguang Han

Fully Self-Supervised Depth Estimation from Defocus Clue

Haozhe Si · Bin Zhao · Dong Wang · Yunpeng Gao · Mulin Chen · Zhigang Wang · Xuelong Li

Zero-Shot Dual-Lens Super-Resolution

Ruikang Xu · Mingde Yao · Zhiwei Xiong

Temporally Consistent Online Depth Estimation Using Point-Based Fusion

Numair Khan · Eric Penner · Douglas Lanman · Lei Xiao

Learning to Detect Mirrors from Videos via Dual Correspondences

Jiaying Lin · Xin Tan · Rynson Lau

Renderable Neural Radiance Map for Visual Navigation

obin kwon · Jeongho Park · Songhwai Oh

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Yiming Li · Zhiding Yu · Chris Choy · Chaowei Xiao · Jose Alvarez · Sanja Fidler · Chen Feng · Anima Anandkumar

Behind the Scenes: Density Fields for Single View Reconstruction

Felix Wimbauer · Nan Yang · Christian Rupprecht · Daniel Cremers

Multiview Compressive Coding for 3D Reconstruction

Chao-Yuan Wu · Justin Johnson · Jitendra Malik · Christoph Feichtenhofer · Georgia Gkioxari

Virtual Occlusions Through Implicit Depth

Jamie Watson · Mohamed Sayed · Zawar Imam Qureshi · Gabriel Brostow · Sara Vicente · Oisin Aodha · Michael Firman

Panoptic Lifting for 3D Scene Understanding with Neural Fields

Yawar Siddiqui · Lorenzo Porzi · Samuel Rota Bulò · Norman Müller · Matthias Niessner · Angela Dai · Peter Kontschieder

Neural Part Priors: Learning to Optimize Part-Based Object Completion in RGB-D Scans

Alexey Bokhovkin · Angela Dai

BAAM: Monocular 3D pose and shape reconstruction with bi-contextual attention module and attention-guided modeling

Hyo-Jun Lee · Hanul Kim · Su-Min Choi · Seong-Gyun Jeong · Yeong Jun Koh

BKinD-3D: Self-Supervised 3D Keypoint Discovery from Multi-View Videos

Jennifer J. Sun · Lili Karashchuk · Amil Dravid · Serim Ryou · Sonia Fereidooni · John Tuthill · Aggelos Katsaggelos · Bingni Brunton · Georgia Gkioxari · Ann Kennedy · Yisong Yue · Pietro Perona

Four-view geometry with unknown radial distortion

Petr Hrubý · Viktor Korotynskiy · Timothy Duff · Luke Oeding · Marc Pollefeys · Tomas Pajdla · Viktor Larsson

Two-view Geometry Scoring Without Correspondences

Axel Barroso-Laguna · Eric Brachmann · Victor Prisacariu · Gabriel Brostow · Daniyar Turmukhambetov

Neural Voting Field for Camera-Space 3D Hand Pose Estimation

Lin Huang · Chung-Ching Lin · Kevin Lin · Lin Liang · Lijuan Wang · Junsong Yuan · Zicheng Liu

expOSE: Accurate Initialization-Free Projective Factorization using Exponential Regularization

José Iglesias Iglesias · Amanda Nilsson · Carl Olsson

Object Pose Estimation with Statistical Guarantees: Conformal Keypoint Detection and Geometric Uncertainty Propagation

Heng Yang · Marco Pavone

Crowd3D: Towards Hundreds of People Reconstruction from a Single Image

Hao Wen · Jing Huang · Huili Cui · Haozhe Lin · Yu-Kun Lai · LU FANG · Kun Li

Rigidity-Aware Detection for 6D Object Pose Estimation

Hai Yang · Rui Song · Jiaojiao Li · Mathieu Salzmann · Yinlin Hu

Robot Structure Prior Guided Temporal Attention for Camera-to-Robot Pose Estimation from Image Sequence

Yang Tian · Jiyao Zhang · Zekai Yin · Hao Dong

GFIE: A Dataset and Baseline for Gaze-Following from 2D to 3D in Indoor Environments

Zhengxi Hu · Yuxue Yang · Xiaolin Zhai · Dingye Yang · Bohan Zhou · Jingtai Liu

TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers

Cheng Zhang · Hai Liu · Yongjian Deng · Bochen Xie · Youfu Li

Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation

Xiaolong Shen · Zongxin Yang · Xiaohan Wang · Jianxin Ma · Chang Zhou · Yi Yang

PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation

Qitao Zhao · Ce Zheng · Mengyuan Liu · Pichao WANG · Chen Chen

BITE: Beyond Priors for Improved Three-D Dog Pose Estimation

Nadine Rueegg · Shashank Tripathi · Konrad Schindler · Michael Black · Silvia Zuffi

TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments

Yu Sun · Qian Bao · Wu Liu · Tao Mei · Michael Black

NeuralDome: A Neural Modeling Pipeline on Multi-View Human-Object Interactions

Juze Zhang · Haimin Luo · Hongdi Yang · Xinru Xu · Qianyang Wu · Ye Shi · Jingyi Yu · Lan Xu · Jingya Wang

Target-referenced Reactive Grasping for Dynamic Objects

Jirong Liu · Ruo Zhang · Hao-Shu Fang · Minghao Gou · Hongjie Fang · Chenxi Wang · Sheng Xu · Hengxu Yan · Cewu Lu

Command-driven Articulated Object Understanding and Manipulation

Ruihang Chu · Zhengzhe Liu · Xiaoqing Ye · Xiao Tan · XIAOJUAN QI · Chi-Wing Fu · Jiaya Jia

Visual-Tactile Sensing for In-Hand Object Reconstruction

Wenqiang Xu · Zhenjun Yu · Han Xue · Ruolin Ye · Siqiong Yao · Cewu Lu

MagicPony: Learning Articulated 3D Animals in the Wild

Shangzhe Wu · Ruining Li · Tomas Jakab · Christian Rupprecht · Andrea Vedaldi

Learning Analytical Posterior Probability for Human Mesh Recovery

Qi Fang · Kang Chen · Yinghui Fan · Qing Shuai · Jiefeng Li · Weidong Zhang

Marching-Primitives: Shape Abstraction from Signed Distance Function

Weixiao Liu · Yuwei Wu · Sipu Ruan · Gregory Chirikjian

Learning Neural Volumetric Representations of Dynamic Humans in Minutes

Chen Geng · Sida Peng · Zhen Xu · Hujun Bao · Xiaowei Zhou

Complete 3D Human Reconstruction from a Single Incomplete Image

Junying Wang · Jae Shin Yoon · Tuanfeng Wang · Krishna Kumar Singh · Ulrich Neumann

DIFu: Depth-guided Implicit Function for Clothed Human Reconstruction

Dae-Young Song · HeeKyung Lee · Jeongil Seo · Donghyeon Cho

BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion

Michael Black · Priyanka Patel · Joachim Tesch · Jinlong Yang

Invertible Neural Skinning

Yash Kant · Aliaksandr Siarohin · Riza Alp Guler · Menglei Chai · Jian Ren · Sergey Tulyakov · Igor Gilitschenski

Zero-shot Pose Transfer for Unrigged Stylized 3D Characters

Jiashun Wang · Xueting Li · Sifei Liu · Shalini De Mello · Orazio Gallo · Xiaolong Wang · Jan Kautz

Biomechanics-guided Facial Action Unit Detection through Force Modeling

Zijun Cui · Chenyi Kuang · Tian Gao · Kartik Talamadupula · Qiang Ji

Hand Avatar: Free-Pose Hand Animation and Rendering from Monocular Video

Xingyu Chen · Baoyuan Wang · Heung-Yeung Shum

High-fidelity Clothed Avatar Reconstruction from a Single Image

Tingting Liao · Xiaomei Zhang · Yuliang Xiu · Hongwei Yi · Xudong Liu · Guo-Jun Qi · Yong Zhang · Xuan Wang · Xiangyu Zhu · Zhen Lei

NeuWigs: A Neural Dynamic model for Volumetric Hair Capture and Animation

Ziyan Wang · Giljoo Nam · Tuur Stuyck · Stephen Lombardi · Chen Cao · Jason Saragih · Michael Zollhöfer · Jessica Hodgins · Christoph Lassner

FitMe: Deep Photorealistic 3D Morphable Model Avatars

Alexandros Lattas · Stylianos Moschoglou · Stylianos Ploumpis · Baris Gecer · Jiankang Deng · Stefanos Zafeiriou

FaceLit: Neural 3D Relightable Faces

Anurag Ranjan · Kwang Moo Yi · Jen-Hao Chang · Oncel Tuzel

Learning a Morphable Face Reflectance Model from Low-cost Data

Yuxuan Han · Zhibo Wang · Feng Xu

Fine-Grained Face Swapping via Regional GAN Inversion

Zhian Liu · Maomao Li · Yong Zhang · Cairong Wang · Qi Zhang · Jue Wang · Yongwei Nie

DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion

Wenliang Zhao · Yongming Rao · Weikang Shi · Zuyan Liu · Jie Zhou · Jiwen Lu

Unsupervised 3D Shape Reconstruction by Part Retrieval and Assembly

Xianghao Xu · Paul Guerrero · Matthew Fisher · Siddhartha Chaudhuri · Daniel Ritchie

PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image

Jianhui Li · Jianmin Li · Haoji Zhang · Shilong Liu · Zhengyi Wang · Zihao Xiao · Kaiwen Zheng · Jun Zhu

NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation

Yu Yin · Kamran Ghasedi · HsiangTao Wu · Jiaolong Yang · Xin Tong · Yun Fu

Quantitative Manipulation of Custom Attributes on 3D-Aware Image Synthesis

Hoseok Do · EunKyung Yoo · Taehyeong Kim · Chul Lee · Jin Choi

SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene

Minjung Son · Jeong Joon Park · Leonidas Guibas · Gordon Wetzstein

NeuralField-LDM: Scene Generation with Hierarchical Latent Diffusion Models

Seung Wook Kim · Bradley Brown · Kangxue Yin · Karsten Kreis · Katja Schwarz · Daiqing Li · Robin Rombach · Antonio Torralba · Sanja Fidler

NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images

Yunfan Ye · Renjiao Yi · Zhirui Gao · Chenyang Zhu · Zhiping Cai · Kai Xu

NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction

Bowen Cai · Jinchi Huang · Rongfei Jia · chengfei lv · Huan Fu

PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices

Radu Alexandru Rosu · Sven Behnke

Neuralangelo: High-Fidelity Neural Surface Reconstruction

Zhaoshuo Li · Thomas Müller · Alex Evans · Russ Taylor · Mathias Unberath · Ming-Yu Liu · Chen-Hsuan Lin

RealFusion: 360° Reconstruction of Any Object from a Single Image

Luke Melas-Kyriazi · Iro Laina · Christian Rupprecht · Andrea Vedaldi

Neural Lens Modeling

Wenqi Xian · Aljaz Bozic · Noah Snavely · Christoph Lassner

RGBD2: Generative Scene Synthesis via Incremental View Inpainting using RGBD Diffusion Models

Jiabao Lei · Jiapeng Tang · Kui Jia

Controllable Light Diffusion for Portraits

David Futschik · Kelvin Ritland · James Vecore · Sean Fanello · Sergio Orts-Escolano · Brian Curless · Daniel Sýkora · Rohit Pandey

Weakly-supervised Single-view Image Relighting

Renjiao Yi · Chenyang Zhu · Kai Xu

MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation

JunYong Choi · SeokYeong Lee · Haesol Park · Seung-Won Jung · Ig-Jae Kim · Junghyun Cho

DANI-Net: Uncalibrated Photometric Stereo by Differentiable Shadow Handling, Anisotropic Reflectance Modeling, and Neural Inverse Rendering

Zongrui Li · Qian Zheng · Boxin Shi · Gang Pan · Xudong Jiang

Neural Fields meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes

Zian Wang · Tianchang Shen · Jun Gao · SHENGYU HUANG · Jacob Munkberg · Jon Hasselgren · Zan Gojcic · Wenzheng Chen · Sanja Fidler

Pointersect: Neural Rendering with Cloud-Ray Intersection

Jen-Hao Chang · Wei-Yu Chen · Anurag Ranjan · Kwang Moo Yi · Oncel Tuzel

Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields

Tao Hu · Xiaogang Xu · Shu Liu · Jiaya Jia

StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields

Kunhao Liu · Fangneng Zhan · Yiwen Chen · Jiahui Zhang · Yingchen Yu · Abdulmotaleb El Saddik · Shijian Lu · Eric Xing

EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points

Chengwei Zheng · Wenbin Lin · Feng Xu

Learning Neural Duplex Radiance Fields for Real-Time View Synthesis

Ziyu Wan · Christian Richardt · Aljaz Bozic · Chao Li · Vijay Rengarajan · Seonghyeon Nam · Xiaoyu Xiang · Tuotuo Li · Bo Zhu · Rakesh Ranjan · Jing Liao

Grid-guided Neural Radiance Fields for Large Urban Scenes

Linning Xu · Yuanbo Xiangli · Sida Peng · Xingang Pan · Nanxuan Zhao · Christian Theobalt · Bo Dai · Dahua Lin

NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects

Zhiwen Yan · Chen Li · Gim Lee

Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision

Xiaoshuai Zhang · Abhijit Kundu · Thomas Funkhouser · Leonidas Guibas · Hao Su · Kyle Genova

Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields

Yue Chen · Xingyu Chen · Xuan Wang · Qi Zhang · Yu Guo · Ying Shan · Fei Wang

FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization

Jiawei Yang · Marco Pavone · Yue Wang

RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis

Xudong Huang · Wei Li · Jie Hu · Hanting Chen · Yunhe Wang

Swept-Angle Synthetic Wavelength Interferometry

Alankar Kotwal · Anat Levin · Ioannis Gkioulekas

Edge-aware Regional Message Passing Controller for Image Forgery Localization

Dong Li · Jiaying Zhu · Menglu Wang · Jiawei Liu · Xueyang Fu · Zheng-Jun Zha

Revisiting Residual Networks for Adversarial Robustness

Shihua Huang · Zhichao Lu · Kalyanmoy Deb · Vishnu Naresh Boddeti

CFA: Class-wise Calibrated Fair Adversarial Training

Zeming Wei · Yifei Wang · Yiwen Guo · Yisen Wang

Feature Separation and Recalibration for Adversarial Robustness

Woo Jae Kim · Yoonki Cho · Junsik Jung · Sung-eui Yoon

Improving the Transferability of Adversarial Samples by Path-Augmented Method

Jianping Zhang · Jen-tse Huang · Wenxuan Wang · Yichen LI · Weibin Wu · Xiaosen Wang · Yuxin Su · Michael Lyu

StyLess: Boosting the Transferability of Adversarial Examples

Kaisheng Liang · Bin Xiao

Minimizing Maximum Model Discrepancy for Transferable Black-box Targeted Attacks

Anqi Zhao · Tong Chu · Yahao Liu · Wen Li · Jingjing Li · Lixin Duan

Adversarially Robust Neural Architecture Search for Graph Neural Networks

Beini Xie · Heng Chang · Ziwei Zhang · Xin Wang · Daixin Wang · Zhiqiang Zhang · Rex Ying · Wenwu Zhu

Color Backdoor: A Robust Poisoning Attack in Color Space

Wenbo Jiang · Hongwei Li · Guowen Xu · Tianwei Zhang

Effective Ambiguity Attack Against Passport-based DNN Intellectual Property Protection Schemes through Fully Connected Layer Substitution

Yiming Chen · Jinyu Tian · Xiangyu Chen · Jiantao Zhou

Single Image Backdoor Inversion via Robust Smoothed Classifiers

Mingjie Sun · J Kolter

Multi-view Adversarial Discriminator: Mine the Non-causal Factors for Object Detection in Unseen Domains

Mingjun Xu · Lingyun Qin · Weijie Chen · Shiliang Pu · Lei Zhang

RiDDLE: Reversible and Diversified De-identification with Latent Encryptor

Dongze Li · Wei Wang · Kang Zhao · Jing Dong · Tieniu Tan

CaPriDe Learning: Confidential and Private Decentralized Learning based on Encryption-friendly Distillation Loss

Nurbek Tastan · Karthik Nandakumar

Federated Learning with Data-Agnostic Distribution Fusion

Jian-hui Duan · Wenzhong Li · Derun Zou · Ruichen Li · Sanglu Lu

Learning Federated Visual Prompt in Null Space for MRI Reconstruction

Chun-Mei Feng · Bangjun Li · Xinxing Xu · Yong Liu · Huazhu Fu · Wangmeng Zuo

Decentralized Learning with Multi-Headed Distillation

Andrey Zhmoginov · Mark Sandler · Nolan Miller · Gus Kristiansen · Max Vladymyrov

Efficient Second-Order Plane Adjustment

Lipu Zhou

Learning Correspondence Uncertainty via Differentiable Nonlinear Least Squares

Dominik Muhle · Lukas Koestler · Krishna Murthy Jatavallabhula · Daniel Cremers

Learning Articulated Shape with Keypoint Pseudo-labels from Web Images

Anastasis Stathopoulos · Georgios Pavlakos · Ligong Han · Dimitris Metaxas

ObjectMatch: Robust Registration using Canonical Object Correspondences

Can Gümeli · Angela Dai · Matthias Niessner

Pose Synchronization under Multiple Pair-wise Relative Poses

Yifan Sun · Qixing Huang

MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding

Jun Chen · Ming Hu · Darren Coker · Michael L. Berumen · Blair Costelloe · Sara Beery · Anna Rohrbach · Mohamed Elhoseiny

DiffPose: Toward More Reliable 3D Pose Estimation

GONG JIA · Lin Geng Foo · Zhipeng Fan · Qiuhong Ke · Hossein Rahmani · Jun Liu

Scene-aware Egocentric 3D Human Pose Estimation

Jian Wang · Diogo Luvizon · Weipeng Xu · Lingjie Liu · Kripasindhu Sarkar · Christian Theobalt

Unified Pose Sequence Modeling

Lin Geng Foo · Tianjiao Li · Hossein Rahmani · Qiuhong Ke · Jun Liu

A Characteristic Function-based Method for Bottom-up Human Pose Estimation

Haoxuan Qu · Yujun Cai · Lin Geng Foo · Ajay Kumar · Jun Liu

AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation

Takehiko Ohkawa · Kun He · Fadime Sener · Tomas Hodan · LUAN TRAN · Cem Keskin

Harmonious Feature Learning for Interactive Hand-Object Pose Estimation

Zhifeng Lin · Changxing Ding · Huan Yao · Zengsheng Kuang · Shaoli Huang

CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions

Ming Yan · Xin Wang · Yudi Dai · Siqi Shen · Chenglu Wen · Lan Xu · Yuexin Ma · Cheng Wang

MIME: Human-Aware 3D Scene Generation

Hongwei Yi · Chun-Hao Huang · Shashank Tripathi · Lea Hering · Justus Thies · Michael Black

ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction

Zhengdi Yu · Shaoli Huang · Chen Fang · Toby Breckon · Jue Wang

ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation

Zicong Fan · Omid Taheri · Dimitrios Tzionas · Muhammed Kocabas · Manuel Kaufmann · Michael Black · Otmar Hilliges

NIKI: Neural Inverse Kinematics with Invertible Neural Networks for 3D Human Pose and Shape Estimation

Jiefeng Li · Siyuan Bian · Qi Liu · Jiasheng Tang · Fan Wang · Cewu Lu

PC²: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction

Luke Melas-Kyriazi · Christian Rupprecht · Andrea Vedaldi

ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency

Zixuan Huang · Varun Jampani · Ngoc Anh Thai · Yuanzhen Li · Stefan Stojanov · James Rehg

Human Body Shape Completion with Implicit Shape and Flow Learning

Boyao Zhou · Di Meng · Jean-Sébastien Franco · Edmond Boyer

gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction

Zerui Chen · Shizhe Chen · Cordelia Schmid · Ivan Laptev

Sampling is Matter: Point-guided 3D Human Mesh Reconstruction

Jeong Hwan Kim · Mi-Gyeong Gwon · Hyunwoo Park · Hyukmin Kwon · Gi-Mun Um · Wonjun Kim

High-fidelity 3D Human Digitization from Single 2K Resolution Images

Sang-Hun Han · Min-Gyu Park · Ju Yoon · Ju-Mi Kang · YOUNG-JAE PARK · Hae-Gon Jeon

Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition

Chen Guo · Tianjian Jiang · Xu Chen · Jie Song · Otmar Hilliges

CLOTH4D: A Dataset for Clothed Human Reconstruction

XINGXING ZOU · Xintong Han · Waikeung Wong

RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset

Zhongjin Luo · Shengcai Cai · Jinguo Dong · Ruibo Ming · Liangdong Qiu · Xiaohang Zhan · Xiaoguang Han

OmniAvatar: Geometry-Guided Controllable 3D Head Synthesis

Hongyi Xu · Guoxian Song · Zihang Jiang · Jianfeng Zhang · Yichun Shi · Jing Liu · Wanchun Ma · Jiashi Feng · Linjie Luo

HARP: Personalized Hand Reconstruction from Monocular RGB Videos

Korrawe Karunratanakul · Sergey Prokudin · Otmar Hilliges · Siyu Tang

Reconstructing Signing Avatars From Video Using Linguistic Priors

Maria-Paola Forte · Peter Kulits · Chun-Hao Huang · Vasileios Choutas · Dimitrios Tzionas · Katherine J. Kuchenbecker · Michael Black

CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

Jinbo Xing · Menghan Xia · Yuechen ZHANG · Xiaodong Cun · Jue Wang · Tien-Tsin Wong

MEGANE: Morphable Eyeglass and Avatar Network

Junxuan Li · Shunsuke Saito · Tomas Simon · Stephen Lombardi · Hongdong Li · Jason Saragih

Parametric Implicit Face Representation for Audio-Driven Facial Reenactment

Ricong Huang · Peiwen Lai · Yipeng Qin · Guanbin Li

3D-aware Facial Landmark Detection via Multi-view Consistent Training on Synthetic Data

Libing Zeng · Lele Chen · Wentao Bao · Zhong Li · Yi Xu · Junsong Yuan · Nima Kalantari

DiffusionRig: Learning Personalized Priors for Facial Appearance Editing

Zheng Ding · Cecilia Zhang · Zhihao Xia · Lars Jebe · Zhuowen Tu · Xiuming Zhang

HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

Yujian Zheng · Zi-Rong Jin · Moran Li · Haibin Huang · Chongyang Ma · Shuguang Cui · Xiaoguang Han

DCFace: Synthetic Face Generation with Dual Condition Diffusion Model

Minchul Kim · Feng Liu · Anil Jain · Xiaoming Liu

3D-Aware Face Swapping

Yixuan Li · Chao Ma · Yichao Yan · Wenhan Zhu · Xiaokang Yang

CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing

Ambareesh Revanur · Debraj Basu · Shradha Agrawal · Dhwanit Agarwal · Deepak Pai

Local 3D Editing via 3D Distillation of CLIP Knowledge

Junha Hyung · Sungwon Hwang · Daejin Kim · Hyunji Lee · Jaegul Choo

Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

Gal Metzer · Elad Richardson · Or Patashnik · Raja Giryes · Daniel Cohen-Or

3D-aware multi-class image-to-image translation with NeRFs

Senmao Li · Joost van de Weijer · Yaxing Wang · Fahad Khan · Meiqin Liu · jian Yang

Diffusion-SDF: Text-to-Shape via Voxelized Diffusion

Muheng Li · Yueqi Duan · Jie Zhou · Jiwen Lu

Infinite Photorealistic Worlds using Procedural Generation

Alexander Raistrick · Lahav Lipson · Zeyu Ma · Lingjie Mei · Mingzhe Wang · Yiming Zuo · Karhan Kayan · Hongyu Wen · Beining Han · Yihan Wang · Alejandro Newell · Hei Law · Ankit Goyal · Kaiyu Yang · Jia Deng

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation

Haochen Wang · Xiaodan Du · Jiahao Li · Raymond A. Yeh · Greg Shakhnarovich

RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation

Titas Anciukevicius · Zexiang Xu · Matthew Fisher · Paul Henderson · Hakan Bilen · Niloy Mitra · Paul Guerrero

PET-NeuS: Positional Encoding Tri-planes for Neural Surfaces

Yiqun Wang · Ivan Skorokhodov · Peter Wonka

SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction

Zhizhuo Zhou · Shubham Tulsiani

Dionysus: Recovering Scene Structures by Dividing into Semantic Pieces

Likang Wang · Lei Chen

3D shape reconstruction of semi-transparent worms

Thomas Ilett · Omer Yuval · Thomas Ranner · Netta Cohen · David Hogg

Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container

Jinguang Tong · Sundaram Muthu · Fahira Afzal Maken · Chuong Nguyen · Hongdong Li

HumanGen: Generating Human Radiance Fields with Explicit Priors

Suyi Jiang · Haoran Jiang · Ziyu Wang · Haimin Luo · Wenzheng Chen · Lan Xu

Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection

Ruoshi Liu · Carl Vondrick

Accidental Light Probes

Hong-Xing Yu · Samir Agarwala · Charles Herrmann · Richard Szeliski · Noah Snavely · Jiajun Wu · Deqing Sun

Inverse Rendering of Translucent Objects using Physical and Neural Renderers

Chenhao Li · Trung Ngo · Hajime Nagahara

Multi-view Inverse Rendering for Large-scale Real-world Indoor Scenes

Zhen Li · Lingli Wang · Mofang Cheng · Cihui Pan · Jiaqi Yang

K-Planes: Explicit Radiance Fields in Space, Time, and Appearance

Sara Fridovich-Keil · Giacomo Meanti · Frederik Warburg · Benjamin Recht · Angjoo Kanazawa

Efficient Map Sparsification Based on 2D and 3D Discretized Grids

Xiaoyu Zhang · Yun-Hui Liu

Modernizing Old Photos Using Multiple References via Photorealistic Style Transfer

Agus Gunawan · Soo Ye Kim · Hyeonjun Sim · Jae-Ho Lee · Munchurl Kim

DINER: Depth-aware Image-based NEural Radiance fields

Malte Prinzler · Otmar Hilliges · Justus Thies

Cross-Guided Optimization of Radiance Fields with Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis

Youngho Yoon · Kuk-Jin YOON

NeRFLight: Fast and Light Neural Radiance Fields using a Shared Feature Grid

Fernando Rivas-Manzaneque · Jorge Sierra-Acosta · Adrian Penate-Sanchez · Francesc Moreno-Noguer · Angela Ribeiro

Multi-Space Neural Radiance Fields

Ze-Xin Yin · Jiaxiong Qiu · Ming-Ming Cheng · Bo Ren

DyLiN: Making Light Field Networks Dynamic

Heng Yu · Joel Julin · Zoltan Milacski · Koichiro Niinuma · Laszlo Jeni

DP-NeRF: Deblurred Neural Radiance Field with Physical Scene Priors

Do-Gyoon Lee · Minhyeok Lee · Chajin Shin · Sangyoun Lee

SUDS: Scalable Urban Dynamic Scenes

Haithem Turki · Jason Zhang · Francesco Ferroni · Deva Ramanan

NeRFLix: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

Kun Zhou · Wenbo Li · Yi Wang · Tao Hu · Nianjuan Jiang · Xiaoguang Han · Jiangbo Lu

Polarimetric iToF: Measuring High-Fidelity Depth through Scattering Media

Daniel Jeon · Andreas Meuleman · Seung-Hwan Baek · Min H. Kim

MaLP: Manipulation Localization Using a Proactive Scheme

Vishal Asnani · Xi Yin · Tal Hassner · Xiaoming Liu

Physically Adversarial Infrared Patches with Learnable Shapes and Locations

Xingxing Wei · Jie Yu · Yao Huang

Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks

Simin Li · Shuning Zhang · Gujun Chen · dong wang · Pu Feng · Jiakai Wang · Aishan Liu · Xin Yi · Xianglong Liu

Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts

Francesco Croce · Sylvestre-Alvise Rebuffi · Evan Shelhamer · Sven Gowal

Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression

Junho Kim · Byung-Kwan Lee · Yong Man Ro

Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation

Phoenix Williams · Ke Li

Enhancing the Self-Universality for Transferable Targeted Attacks

Zhipeng Wei · Jingjing Chen · Zuxuan Wu · Yu-Gang Jiang

Evading DeepFake Detectors via Adversarial Statistical Consistency

Hou Yang · Qing Guo · Yihao Huang · Xiaofei Xie · Lei Ma · Jianjun Zhao

CAP: Robust Point Cloud Classification via Semantic and Structural Modeling

Daizong Ding · Erling Jiang · Yuanmin Huang · Mi Zhang · Wenxuan Li · Min Yang

Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

Yi Yu · Yufei Wang · Wenhan Yang · Shijian Lu · Yap-peng Tan · Alex Kot

FedSeg: Class-Heterogeneous Federated Learning for Semantic Segmentation

Jiaxu Miao · Zongxin Yang · Leilei Fan · Yi Yang

Multimodal Industrial Anomaly Detection via Hybrid Fusion

Yue Wang · Jinlong Peng · Jiangning Zhang · Ran Yi · Yabiao Wang · Chengjie Wang

Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection

HUI LYU · Zhongqi Yue · Qianru Sun · Bin Luo · Zhen Cui · Hanwang Zhang

Attribute-preserving Face Dataset Anonymization via Latent Code Optimization

Simone Barattin · Christos Tzelepis · Ioannis Patras · Nicu Sebe

HandsOff: Labeled Dataset Generation with No Additional Human Annotations

Austin Xu · Mariya Vasileva · Achal Dave · Arjun Seshadri

Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models

Matthew Olson · Shusen Liu · Rushil Anirudh · Jayaraman J. Thiagarajan · Peer-timo Bremer · Weng-Keen Wong

Learning to Generate Image Embeddings with User-level Differential Privacy

Zheng Xu · Maxwell Collins · Yuxiao Wang · Liviu Panait · Sewoong Oh · Sean Augenstein · Ting Liu · Florian Schroff · Hugh McMahan

Adaptive Data-Free Quantization

Biao Qian · Yang Wang · Richang Hong · Meng Wang

Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective

Yuexiao Ma · Huixia Li · Xiawu Zheng · Xuefeng Xiao · Rui Wang · Shilei Wen · Xin Pan · Fei Chao · Rongrong Ji

One-Shot Model for Mixed-Precision Quantization

Ivan Koryakovskiy · Alexandra Yakovleva · Valentin Buchnev · Temur Isaev · Gleb Odinokikh

Training debiased subnetworks with contrastive weight pruning

Geon Yeong Park · Sangmin Lee · Sang Wan Lee · Jong Ye

Understanding Masked Autoencoders via Hierarchical Latent Variable Models

Lingjing Kong · Martin Q. Ma · Guangyi Chen · Eric Xing · Yuejie Chi · Louis-Philippe Morency · Kun Zhang

MobileOne: An Improved One Millisecond Mobile Backbone

Pavan Kumar Anasosalu Vasu · James Gabriel · Jeff Zhu · Oncel Tuzel · Anurag Ranjan

Rate Gradient Approximation Attack Threats Deep Spiking Neural Networks

Tong Bu · Jianhao Ding · Zecheng Hao · Zhaofei Yu

Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation

Qi Xu · Yaxin Li · Jiangrong Shen · Jian Liu · Huajin Tang · Gang Pan

From Node Interaction to Hop Interaction: New Effective and Scalable Graph Learning Paradigm

Jie Chen · Zilong Li · Zhu Yin · Junping Zhang · Jian Pu

A General Regret Bound of Preconditioned Gradient Method for DNN Training

Hongwei Yong · Ying Sun · Lei Zhang

Improved Distribution Matching for Dataset Condensation

Ganlong Zhao · Guanbin Li · Yipeng Qin · Yizhou Yu

Imitation Learning as State Matching via Differentiable Physics

Siwei Chen · Xiao Ma · Zhongwen Xu

Trainable Projected Gradient Method for Robust Fine-tuning

Junjiao Tian · Xiaoliang Dai · Chih-Yao Ma · Zecheng He · Yen-Cheng Liu · Zsolt Kira

Improving Generalization of Meta Learning with Inverted Regularization at Inner-level

Lianzhe Wang · Shiji Zhou · Shanghang Zhang · Xu Chu · Heng Chang · Wenwu Zhu

SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation

Ruihuang Li · Chenhang HE · Yabin Zhang · Shuai Li · Liyi Chen · Lei Zhang

Rethinking the Correlation in Few-Shot Segmentation: A Buoys View

Yuan Wang · Rui Sun · Tianzhu Zhang

Reliability in Semantic Segmentation: Are We on the Right Track?

Pau de Jorge Aranda · Riccardo Volpi · Philip Torr · Grégory Rogez

ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation

Kehan Li · Zhennan Wang · Zesen Cheng · Runyi Yu · Yian Zhao · Guoli Song · Chang Liu · Li Yuan · Jie Chen

PartDistillation: Learning Parts from Instance Segmentation

Jang Hyun Cho · Philipp Kraehenbuehl · Vignesh Ramanathan

PACO: Parts and Attributes of Common Objects

Vignesh Ramanathan · Anmol Kalia · Vladan Petrovic · Yi Wen · Baixue Zheng · Baishan Guo · Rui Wang · Aaron Marquez · Rama Kovvuri · Abhishek Kadian · Amir Mousavi · Yiwen Song · Abhimanyu Dubey · Dhruv Mahajan

MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation

Yong Yang · Qiong Chen · Yuan Feng · Tianlin Huang

Generative Semantic Segmentation

Jiaqi Chen · Jiachen Lu · Xiatian Zhu · Li Zhang

GeoLayoutLM: Geometric Pre-training for Visual Information Extraction

Chuwei Luo · Changxu Cheng · Qi Zheng · Cong Yao

GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts

Haoran Geng · Helin Xu · Chengyang Zhao · Chao Xu · Li Yi · Siyuan Huang · He Wang

A Simple Framework for Text-Supervised Semantic Segmentation

Muyang Yi · Quan Cui · Hao Wu · Cheng Yang · Osamu Yoshie · Hongtao Lu

Learning to Detect and Segment for Open Vocabulary Object Detection

tao wang

Open-vocabulary Attribute Detection

Maria Bravo · Sudhanshu Mittal · Simon Ging · Thomas Brox

CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching

Xiaoshi Wu · Feng Zhu · Rui Zhao · Hongsheng Li

CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP

Runnan Chen · Youquan Liu · Lingdong Kong · Xinge ZHU · Yuexin Ma · Yikang LI · Yuenan Hou · Yu Qiao · Wenping Wang

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

Runyu Ding · Jihan Yang · Chuhui Xue · Wenqing Zhang · Song Bai · XIAOJUAN QI

CrOC: Cross-View Online Clustering for Dense Visual Representation Learning

Thomas Stegmüller · Tim Lebailly · Behzad Bozorgtabar · Tinne Tuytelaars · Jean-Philippe Thiran

ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images

Xiangjie Sui · Yuming Fang · Hanwei Zhu · Shiqi Wang · Zhou Wang

Turning a CLIP Model into a Scene Text Detector

Wenwen Yu · Yuliang Liu · Wei Hua · Deqiang Jiang · Bo Ren · Xiang Bai

Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

Filip Radenovic · Abhimanyu Dubey · Abhishek Kadian · Todor Mihaylov · Simon Vandenhende · Yash Patel · Yi Wen · Vignesh Ramanathan · Dhruv Mahajan

Uncurated Image-Text Datasets: Shedding Light on Demographic Bias

Noa Garcia · Yusuke Hirota · YANKUN WU · Yuta Nakashima

EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata

Chenhao Zheng · Ayush Shrivastava · Andrew Owens

Cross-Domain Image Captioning with Discriminative Finetuning

Roberto Dessi · Michele Bevilacqua · Eleonora Gualdoni · Nathanaël Rakotonirina · Francesca Franzon · Marco Baroni

Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding

Tal Shaharabany · Lior Wolf

Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation

Sara Sarto · Manuele Barraco · Marcella Cornia · Lorenzo Baraldi · Rita Cucchiara

Detecting and Grounding Multi-Modal Media Manipulation

Rui Shao · Tianxing Wu · Ziwei Liu

DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation

Yueming Lyu · Tianwei Lin · Fu Li · Dongliang He · Jing Dong · Tieniu Tan

Parts2Words: Learning Joint Embedding of Point Clouds and Texts by Bidirectional Matching between Parts and Words

Chuan Tang · Xi Yang · Bojian Wu · Zhizhong Han · Yi Chang

Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR

Aneeshan Sain · Ayan Kumar Bhunia · Subhadeep Koley · Pinaki Nath Chowdhury · Soumitri Chattopadhyay · Tao Xiang · Yi-Zhe Song

GeneCIS: A Benchmark for General Conditional Image Similarity

Sagar Vaze · Nicolas Carion · Ishan Misra

Picture that Sketch: Photorealistic Image Generation from Abstract Sketches

Subhadeep Koley · Ayan Kumar Bhunia · Aneeshan Sain · Pinaki Nath Chowdhury · Tao Xiang · Yi-Zhe Song

Hyperbolic Contrastive Learning for Visual Representations beyond Objects

Songwei Ge · Shlok Mishra · Simon Kornblith · Chun-Liang Li · David Jacobs

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

Xinlong Wang · Wen Wang · Yue Cao · Chunhua Shen · Tiejun Huang

DeAR: Debiasing Vision-Language Models with Additive Residuals

Ashish Seth · Mayur Hemani · Chirag Agarwal

Leverage Interactive Affinity for Affordance Learning

Hongchen Luo · Wei Zhai · Jing Zhang · Yang Cao · Dacheng Tao

Affordance Grounding from Demonstration Video to Target Image

Joya Chen · Difei Gao · Kevin Qinghong Lin · Mike Zheng Shou

Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning

Yatai Ji · Rong-Cheng Tu · jie jiang · Weijie Kong · Chengfei Cai · Wenzhe Zhao · WANG HongFa · Yujiu Yang · Wei Liu

Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding

Morris Alper · Michael Fiman · Hadar Averbuch-Elor

Probabilistic Prompt Learning for Dense Prediction

Hyeongjun Kwon · Taeyong Song · Somi Jeong · Jin Kim · Jinhyun Jang · Kwanghoon Sohn

Visual-Language Prompt Tuning with Knowledge-guided Context Optimization

Hantao Yao · Rui Zhang · Changsheng Xu

The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training

Gi-Cheon Kang · Sungdong Kim · Jinhwa Kim · Donghyun Kwak · Byoung-Tak Zhang

Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasoning

Shi Chen · Qi Zhao

Logical Implications for Visual Question Answering Consistency

Sergio Tascon Morales · Pablo Márquez Neila · Raphael Sznitman

Abstract Visual Reasoning: An Algebraic Approach for Solving Raven’s Progressive Matrices

Jingyi Xu · Tushar Vaidya · Yufei Wu · Saket Chandra · Zhangsheng Lai · Kai Fong Ernest Chong

NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory

Santhosh Kumar Ramakrishnan · Ziad Al-Halah · Kristen Grauman

Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding

Minyoung Hwang · Jaeyeon Jeong · Minsoo Kim · Yoonseon Oh · Songhwai Oh

3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification

Jiazhao Zhang · Liu Dai · Fanpeng Meng · Qingnan Fan · Xuelin Chen · Kai Xu · He Wang

VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

Mengyin Liu · jie jiang · Chao Zhu · Xu-Cheng Yin

An Actor-Centric Causality Graph for Asynchronous Temporal Inference in Group Activity

Zhao Xie · Tian Gao · Kewei Wu · Jiao Chang

Affection: Learning Affective Explanations for Real-World Visual Data

Panos Achlioptas · Maks Ovsjanikov · Leonidas Guibas · Sergey Tulyakov

Decoupled Multimodal Distilling for Emotion Recognition

Yong Li · Yuanzhi Wang · Zhen Cui

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models

Wenhao Wu · Xiaohan Wang · Haipeng Luo · Jingdong Wang · Yi Yang · Wanli Ouyang

Learning Video Representations from Large Language Models

Yue Zhao · Ishan Misra · Philipp Kraehenbuehl · Rohit Girdhar

ProTéGé: Untrimmed Pretraining for Video Temporal Grounding by Video Temporal Grounding

Lan Wang · Gaurav Mittal · Sandra Sajeev · Ye Yu · Matthew Hall · Vishnu Naresh Boddeti · Mei Chen

Fine-tuned CLIP Models are Efficient Video Learners

Hanoona Bangalath · Muhammad Uzair Khattak · Muhammad Maaz · Salman Khan · Fahad Khan

Movies2Scenes: Using Movie Metadata to Learn Scene Representation

Shixing Chen · Chun-Hao Liu · Xiang Hao · Xiaohan Nie · Maxim Arap · Raffay Hamid

Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks

Hyolim Kang · Hanjung Kim · Joungbin An · Minsu Cho · Seon Joo Kim

Reducing the Label Bias for Timestamp Supervised Temporal Action Segmentation

Kaiyuan Liu · Yunheng Li · Shenglan Liu · Tan · Zihang Shao

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition

Yuyang Wanyan · Xiaoshan Yang · Chaofan Chen · Changsheng Xu

MMG-Ego4D: Multimodal Generalization in Egocentric Action Recognition

Xinyu Gong · Sreyas Mohan · Naina Dhingra · Jean-Charles Bazin · YILEI LI · Zhangyang Wang · Rakesh Ranjan

Prompt-Guided Zero-Shot Anomaly Action Recognition using Pretrained Deep Skeleton Features

Fumiaki Sato · Ryo Hachiuma · Taiki Sekii

TempSAL – Uncovering Temporal Information for Deep Saliency Prediction

Bahar Aydemir · Ludo Hoffstetter · Tong Zhang · Mathieu Salzmann · Sabine Süsstrunk

Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction

Xuehao Gao · Shaoyi Du · Yang Wu · Yang Yang

CASP-Net: Rethinking Video Saliency Prediction from an Audio-Visual Consistency Perceptual Perspective

Junwen Xiong · Ganglai Wang · Peng Zhang · Wei Huang · Yufei Zha · Guangtao Zhai

Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment

Sungbin Kim · Arda Senocak · Hyunwoo Ha · Andrew Owens · Tae-Hyun Oh

Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning

Weixuan Sun · Jiayi Zhang · Jianyuan Wang · Zheyuan Liu · Yiran Zhong · Tianpeng Feng · Yandong Guo · Yanhao Zhang · Nick Barnes

Novel-view Acoustic Synthesis

Changan Chen · Alexander Richard · Roman Shapovalov · Vamsi Krishna Ithapu · Natalia Neverova · Kristen Grauman · Andrea Vedaldi

Relational Space-Time Query in Long-Form Videos

Xitong Yang · FU-JEN CHU · Raghav Goyal · Matt Feiszli · Lorenzo Torresani · Du Tran

Selec
