Gang YU (俞刚)

I am a Research Director at Tencent. My research interests are in computer vision and artificial intelligence, specifically generative AI, object detection, segmentation, human keypoint detection, human action recognition, and 3D reconstruction. I obtained my PhD from NTU in 2014, supervised by Prof. Junsong Yuan. Before joining Tencent, I worked for five years as a research leader at Megvii (Face++) under the supervision of Dr. Jian Sun.

Google Scholar / CV / Zhihu



News

We organized a tutorial, Mobile Visual Analytics, at CVPR 2021.

Our team obtained first place in the Depth Estimation Challenge of the Mobile AI Challenge (CVPR 2021).

We organized a tutorial, Human Pose Estimation and Action Recognition [Skeleton, Action] (ICIP 2019).

We organized a tutorial, Object Detection in Recent Three Years [Detection, AutoML, Fine-Grained] (ICME 2019).

We organized the Detection In the Wild (DIW2019) Challenge at CVPR 2019.

Our team obtained first place in nuScenes 3D Detection and in BDD100K & D²-City Detection Domain Adaptation at the Workshop on Autonomous Driving (CVPR 2019).

Our team obtained first place in COCO Detection, COCO Keypoint Detection, COCO Panoptic Segmentation, and Mapillary Panoptic Segmentation (four championships) in the COCO + Mapillary Joint Challenge [ChinaMedia 1][ChinaMedia 2] (ECCV 2018).

Our team obtained first place in WiderFace Detection in the Wider Challenge (ECCV 2018).

Our team obtained first place in Video Instance Segmentation [Report][Slides] in the WAD2018 (Workshop on Autonomous Driving) Challenge (CVPR 2018).

Our team obtained first place in AVA [Report][Slides] and second place in Moments in Time [Report] in the ActivityNet2018 Challenge (CVPR 2018).

Presentation: Beyond RetinaNet and Mask R-CNN, Jiangmen (将门), 2018

Presentation: Introduction to Object Detection, PKU & CAS, 2018

Our team obtained first place in the COCO 2017 Challenge (Detection Track & Keypoint Track) (ICCV 2017).





Conference

Executing your Commands via Motion Diffusion in Latent Space
Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Gang Yu
CVPR, 2023

STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
Zhenglin Zhou, Huaxia Li, Hong Liu, Nanyang Wang, Gang Yu, Rongrong Ji
CVPR, 2023

End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Tao Chen, Gang Yu
CVPR, 2023

Capturing the motion of every joint: 3D human pose and shape estimation with independent tokens
Sen Yang, Wen Heng, Gang Liu, Guozhong Luo, Wankou Yang, Gang Yu
ICLR, 2023

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang
ICLR, 2023

Hierarchical Normalization for Robust Monocular Depth Estimation
Chi Zhang, Wei Yin, Zhibin Wang, Gang Yu, Bin Fu, Chunhua Shen
NeurIPS, 2022

Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D Representations
Fukun Yin, Wen Liu, Zilong Huang, Pei Cheng, Tao Chen, Gang Yu
NeurIPS, 2022

D&D: Learning Human Dynamics from Dynamic Camera
Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
ECCV, 2022 (ORAL)

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen
CVPR, 2022

Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification
Yike Wu, Bo Zhang, Gang Yu, Weixi Zhang, Bin Wang, Tao Chen, Jiayuan Fan
ACM MM, 2021

Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation
Rui Wang, Jian Chen, Gang Yu, Li Sun, Changqian Yu, Changxin Gao, Nong Sang
ACM MM, 2021

State-Aware Tracker for Real-Time Video Object Segmentation
Xi Chen, Zuoxin Li, Ye Yuan, Gang Yu, Jian-Xin Shen, Donglian Qi
CVPR, 2020

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
Guan'an Wang, Shuo Yang, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Erjin Zhou, Jian Sun
CVPR, 2020

Context Prior for Scene Segmentation
Changqian Yu, Jingbo Wang, Changxin Gao, Gang Yu, Chunhua Shen, Nong Sang
CVPR, 2020

SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines
Yinda Xu, Zeyu Wang, Zuoxin Li, Ye Yuan, Gang Yu
AAAI, 2020

Learnable Tree Filter for Structure-preserving Feature Transform
Lin Song, Yanwei Li, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng
NeurIPS, 2019

ThunderNet: Towards Real-time Generic Object Detection
Zheng Qin, Zeming Li, Zhaoning Zhang, Yiping Bao, Gang Yu, Yuxing Peng, Jian Sun
ICCV, 2019

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen
ICCV, 2019

Objects365: A Large-scale, High-quality Dataset for Object Detection
Shuai Shao, Zeming Li, Tianyuan Zhang, Chao Peng, Gang Yu, Jing Li, Xiangyu Zhang, Jian Sun
ICCV, 2019

An End-to-end Network for Panoptic Segmentation
Huanyu Liu, Chao Peng, Changqian Yu, Jingbo Wang, Xu Liu, Gang Yu, Wei Jiang
CVPR, 2019

TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
Lin Song, Shiwei Zhang, Gang Yu, Hongbin Sun
CVPR, 2019

Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
Shiyi Lan, Ruichi Yu, Gang Yu, Larry Davis
CVPR, 2019

Shape Robust Text Detection with Progressive Scale Expansion Network
Wenhai Wang, Xiang Li, Enze Xie, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao
CVPR, 2019

Scene Text Detection with Supervised Pyramid Context Network
Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li
AAAI, 2019

Attention-based Multi-Context Guiding for Few-Shot Semantic Segmentation
Tao Hu, Pengwan Yang, Chiliang Zhang, Gang Yu, Yadong Mu, Cees Snoek
AAAI, 2019

DetNet: A Backbone network for Object Detection
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
ECCV, 2018

BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang
ECCV, 2018

Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation
Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu
ECCV, 2018

MegDet: A Large Mini-Batch Object Detector
Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun
CVPR, 2018

Cascaded Pyramid Network for Multi-Person Pose Estimation [Code]
Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, Jian Sun
CVPR, 2018

Learning a Discriminative Feature Network for Semantic Segmentation
Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang
CVPR, 2018

R-FCN++: Towards Accurate Region-based Fully Convolutional Networks for Object Detection
Zeming Li, Yilun Chen, Gang Yu, Xiangyu Zhang, Jian Sun
AAAI, 2018

Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network
Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun
CVPR, 2017

Fast Action Proposals for Human Action Detection and Search
Gang Yu, Junsong Yuan
CVPR, 2015

Discriminative Orderlet Mining For Real-time Recognition of Human-Object Interaction [Project]
Gang Yu, Zicheng Liu, Junsong Yuan
ACCV, 2014

Scalable Forest Hashing for Fast Similarity Search
Gang Yu, Junsong Yuan
ICME, 2014

Propagative Hough Voting for Human Activity Recognition
Gang Yu, Junsong Yuan, Zicheng Liu
ECCV, 2012

Randomized Spatial Partition for Scene Recognition
Yuning Jiang, Junsong Yuan, Gang Yu
ECCV, 2012

Predicting Human Activities using Spatio-Temporal Structure of Interest Points
Gang Yu, Junsong Yuan, Zicheng Liu
ACM MM, 2012

Unsupervised Random Forest Indexing for Fast Action Search
Gang Yu, Junsong Yuan, Zicheng Liu
CVPR, 2011

Real-time Human Action Search using Random Forest based Hough Voting
Gang Yu, Junsong Yuan, Zicheng Liu
ACM MM, 2011



Journal

BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation
Changqian Yu, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, Nong Sang
International Journal of Computer Vision, 2021

Propagative Hough Voting for Human Activity Detection and Recognition
Gang Yu, Junsong Yuan, Zicheng Liu
IEEE Trans. on Circuits and Systems for Video Technology, Vol.25, Issue 1, pp.87-98, 2014

Action Search by Example using Randomized Visual Vocabularies
Gang Yu, Junsong Yuan, Zicheng Liu
IEEE Trans. on Image Processing, Vol.22, Issue 1, pp. 377-390, 2013

Fast Action Detection via Discriminative Random Forest Voting and Top-K Subvolume Search
Gang Yu, Norberto A., Junsong Yuan, Zicheng Liu
IEEE Trans. on Multimedia, Vol.13, Issue 3, pp. 507-517, 2011

Book

Human Action Analysis with Randomized Trees
Gang Yu, Junsong Yuan, Zicheng Liu
SpringerBriefs, Springer, 2014

Pre-prints

Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu
arXiv, 2021

Rethinking on Multi-Stage Networks for Human Pose Estimation
Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun
arXiv, 2019

Light-Head R-CNN: In Defense of Two-Stage Object Detector [Code]
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
arXiv, 2017

CrowdHuman: A Benchmark for Detecting Human in a Crowd [Project]
Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun
arXiv, 2018

Links

Team members: Chao Peng (彭超), Zeming Li (黎泽明), Changqian Yu (余昌黔), Shuai Shao (邵帅)