Gang YU (俞刚)
I am a Research Director at Tencent. My research interests are in computer vision and artificial intelligence, specifically generative AI, object detection, segmentation, human keypoint detection, human action recognition, and 3D reconstruction. I obtained my PhD from NTU in 2014, supervised by Prof. Junsong Yuan. Before joining Tencent, I worked for five years as a research leader at Megvii (Face++) under the supervision of Dr. Sun Jian.
Google Scholar / CV / Zhihu
|
|
News
We organized the tutorial Mobile Visual Analytics at CVPR 2021.
Our team won first place in the Depth Estimation Challenge of the Mobile AI Challenge (CVPR 2021).
We organized the tutorial Human Pose Estimation and Action Recognition [Skeleton, Action] (ICIP 2019).
We organized the tutorial Object Detection in Recent Three Years [Detection, AutoML, Fine-Grained] (ICME 2019).
We organized the Detection In the Wild (DIW2019) Challenge at CVPR 2019.
Our team won first place in nuScenes 3D Detection and in BDD100K & D²-City Detection Domain Adaptation at the Workshop on Autonomous Driving (CVPR 2019).
Our team won first place in COCO Detection, COCO Keypoint Detection, COCO Panoptic Segmentation, and Mapillary Panoptic Segmentation (four championships) in the COCO + Mapillary Joint Challenge (ECCV 2018). [ChinaMedia 1][ChinaMedia 2]
Our team won first place in WiderFace Detection in the Wider Challenge (ECCV 2018).
Our team won first place in Video Instance Segmentation [Report][Slides] in the WAD2018 (Workshop on Autonomous Driving) Challenge (CVPR 2018).
Our team won first place in AVA [Report][Slides] and second place in Moments in Time [Report] in the ActivityNet2018 Challenge (CVPR 2018).
Presentation: Beyond RetinaNet and Mask R-CNN, Jiangmen (将门), 2018
Presentation: Introduction to Object Detection, PKU & CAS, 2018
Our team won first place in the COCO 2017 Challenge (Detection Track & Keypoint Track) (ICCV 2017).
|
Conference
|
Executing your Commands via Motion Diffusion in Latent Space
Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Gang Yu
CVPR, 2023
|
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
Zhenglin Zhou, Huaxia Li, Hong Liu, Nanyang Wang, Gang Yu, Rongrong Ji
CVPR, 2023
|
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Tao Chen, Gang Yu
CVPR, 2023
|
Capturing the motion of every joint: 3D human pose and shape estimation with independent tokens
Sen Yang, Wen Heng, Gang Liu, Guozhong Luo, Wankou Yang, Gang Yu
ICLR, 2023
|
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
Qiang Wan, Zilong Huang, Jiachen Lu, Gang Yu, Li Zhang
ICLR, 2023
|
Hierarchical Normalization for Robust Monocular Depth Estimation
Chi Zhang, Wei Yin, Zhibin Wang, Gang Yu, Bin Fu, Chunhua Shen
NeurIPS, 2022
|
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D Representations
Fukun Yin, Wen Liu, Zilong Huang, Pei Cheng, Tao Chen, Gang Yu
NeurIPS, 2022
|
D&D: Learning Human Dynamics from Dynamic Camera
Jiefeng Li, Siyuan Bian, Chao Xu, Gang Liu, Gang Yu, Cewu Lu
ECCV, 2022 (ORAL)
|
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Wenqiang Zhang, Zilong Huang, Guozhong Luo, Tao Chen, Xinggang Wang, Wenyu Liu, Gang Yu, Chunhua Shen
CVPR, 2022
|
Object-aware Long-short-range Spatial Alignment for Few-Shot Fine-Grained Image Classification
Yike Wu, Bo Zhang, Gang Yu, Weixi Zhang, Bin Wang, Tao Chen, Jiayuan Fan
ACM MM, 2021
|
Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation
Rui Wang, Jian Chen, Gang Yu, Li Sun, Changqian Yu, Changxin Gao, Nong Sang
ACM MM, 2021
|
State-Aware Tracker for Real-Time Video Object Segmentation
Xi Chen, Zuoxin Li, Ye Yuan, Gang Yu, Jian-Xin Shen, Donglian Qi
CVPR, 2020
|
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
Guan'an Wang, Shuo Yang, Huanyu Liu, Zhicheng Wang, Yang Yang, Shuliang Wang, Gang Yu, Erjin Zhou, Jian Sun
CVPR, 2020
|
Context Prior for Scene Segmentation
Changqian Yu, Jingbo Wang, Changxin Gao, Gang Yu, Chunhua Shen, Nong Sang
CVPR, 2020
|
SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines
Yinda Xu, Zeyu Wang, Zuoxin Li, Ye Yuan, Gang Yu
AAAI, 2020
|
Learnable Tree Filter for Structure-preserving Feature Transform
Lin Song, Yanwei Li, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng
NeurIPS, 2019
|
ThunderNet: Towards Real-time Generic Object Detection
Zheng Qin, Zeming Li, Zhaoning Zhang, Yiping Bao, Gang Yu, Yuxing Peng, Jian Sun
ICCV, 2019
|
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen
ICCV, 2019
|
Objects365: A Large-scale, High-quality Dataset for Object Detection
Shuai Shao, Zeming Li, Tianyuan Zhang, Chao Peng, Gang Yu, Jing Li, Xiangyu Zhang, Jian Sun
ICCV, 2019
|
An End-to-end Network for Panoptic Segmentation
Huanyu Liu, Chao Peng, Changqian Yu, Jingbo Wang, Xu Liu, Gang Yu, Wei Jiang
CVPR, 2019
|
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
Lin Song, Shiwei Zhang, Gang Yu, Hongbin Sun
CVPR, 2019
|
Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN
Shiyi Lan, Ruichi Yu, Gang Yu, Larry Davis
CVPR, 2019
|
Shape Robust Text Detection with Progressive Scale Expansion Network
Wenhai Wang, Xiang Li, Enze Xie, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao
CVPR, 2019
|
Scene Text Detection with Supervised Pyramid Context Network
Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li
AAAI, 2019
|
Attention-based Multi-Context Guiding for Few-Shot Semantic Segmentation
Tao Hu, Pengwan Yang, Chiliang Zhang, Gang Yu, Yadong Mu, Cees Snoek
AAAI, 2019
|
DetNet: A Backbone network for Object Detection
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
ECCV, 2018
|
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang
ECCV, 2018
|
Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation
Ruochen Fan, Qibin Hou, Ming-Ming Cheng, Gang Yu, Ralph R. Martin, Shi-Min Hu
ECCV, 2018
|
MegDet: A Large Mini-Batch Object Detector
Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun
CVPR, 2018
|
Cascaded Pyramid Network for Multi-Person Pose Estimation [Code]
Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, Jian Sun
CVPR, 2018
|
Learning a Discriminative Feature Network for Semantic Segmentation
Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, Nong Sang
CVPR, 2018
|
R-FCN++: Towards Accurate Region-based Fully Convolutional Networks for Object Detection
Zeming Li, Yilun Chen, Gang Yu, Xiangyu Zhang, Jian Sun
AAAI, 2018
|
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network
Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun
CVPR, 2017
|
Fast Action Proposals for Human Action Detection and Search
Gang Yu, Junsong Yuan
CVPR, 2015
|
Discriminative Orderlet Mining For Real-time Recognition of Human-Object Interaction [Project]
Gang Yu, Zicheng Liu, Junsong Yuan
ACCV, 2014
|
Scalable Forest Hashing for Fast Similarity Search
Gang Yu, Junsong Yuan
ICME, 2014
|
Propagative Hough Voting for Human Activity Recognition
Gang Yu, Junsong Yuan, Zicheng Liu
ECCV, 2012
|
Randomized Spatial Partition for Scene Recognition
Yuning Jiang, Junsong Yuan, Gang Yu
ECCV, 2012
|
Predicting Human Activities using Spatio-Temporal Structure of Interest Points
Gang Yu, Junsong Yuan, Zicheng Liu
ACM MM, 2012
|
Unsupervised Random Forest Indexing for Fast Action Search
Gang Yu, Junsong Yuan, Zicheng Liu
CVPR, 2011
|
Real-time Human Action Search using Random Forest based Hough Voting
Gang Yu, Junsong Yuan, Zicheng Liu
ACM MM, 2011
|
Journal
|
BiSeNet V2: Bilateral Network with Guided Aggregation for Real-time Semantic Segmentation
Changqian Yu, Changxin Gao, Jingbo Wang, Gang Yu, Chunhua Shen, Nong Sang
International Journal of Computer Vision, 2021
|
Propagative Hough Voting for Human Activity Detection and Recognition
Gang Yu, Junsong Yuan, Zicheng Liu
IEEE Trans. on Circuits and Systems for Video Technology, Vol. 25, Issue 1, pp. 87-98, 2014
|
Action Search by Example using Randomized Visual Vocabularies
Gang Yu, Junsong Yuan, Zicheng Liu
IEEE Trans. on Image Processing, Vol. 22, Issue 1, pp. 377-390, 2013
|
Fast Action Detection via Discriminative Random Forest Voting and Top-K Subvolume Search
Gang Yu, Norberto A., Junsong Yuan, Zicheng Liu
IEEE Trans. on Multimedia, Vol. 13, Issue 3, pp. 507-517, 2011
|
Pre-prints
|
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang, Youcheng Ben, Guozhong Luo, Pei Cheng, Gang Yu, Bin Fu
arXiv, 2021
|
Rethinking on Multi-Stage Networks for Human Pose Estimation
Wenbo Li, Zhicheng Wang, Binyi Yin, Qixiang Peng, Yuming Du, Tianzi Xiao, Gang Yu, Hongtao Lu, Yichen Wei, Jian Sun
arXiv, 2019
|
Light-Head R-CNN: In Defense of Two-Stage Object Detector [Code]
Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
arXiv, 2017
|
CrowdHuman: A Benchmark for Detecting Human in a Crowd [Project]
Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun
arXiv, 2018
|
|