Tete Xiao
Email : jasonhsiao97 [AT] gmail [DOT] com

| CV | Google Scholar | Github |

I'm a 3rd-year Ph.D. student at the Berkeley Artificial Intelligence Research (BAIR) Lab at University of California, Berkeley, advised by Prof. Trevor Darrell. My research interests lie in the field of Computer Vision and Machine Learning, particularly in enabling intelligent agents to perceive, comprehend and reason with less direct supervision. I'm also affiliated with Facebook AI Research (FAIR) in collaboration with Piotr Dollár and Ross Girshick. Prior to UC Berkeley, I collaborated with Bolei Zhou on a series of works, and was mentored by Yuning Jiang and supervised by Jian Sun.

I graduated summa cum laude from Peking University (PKU), the most progressive university in China, in 2019 with a Bachelor's degree. I started computer programming in primary school and participated in programming contests throughout high school and college. I received the China National Scholarship in 2016 and was selected as a 2019 Snap Research Scholar for my research and curriculum.

I have a great fondness for the arts besides my academic work. I'm deeply passionate about musical works, especially those in classical music (both instrumental & vocal) and jazz. I'm also into law & public policy. I'm a political activist in both the U.S. and my home country.

  • [Oct. 2021] One paper accepted to NeurIPS21!
  • [July 2021] Two papers accepted to ICCV21!
  • [Jan. 2021] Two papers (including one selected for oral presentation) accepted to ICLR21!
  • [Mar. 2020] Our work on compositional action recognition is accepted to CVPR20! Check out the project page with the new dataset "Something-else"!
  • [July 2019] One paper on explainable human-object interaction is accepted to ICCV19!
  • [May 2019] I will join the wonderful Berkeley Artificial Intelligence Research (BAIR) Lab as a Ph.D. student at the lovely UC Berkeley in August 2019. Go Bears!

Early Convolutions Help Transformers See Better
Tete Xiao, Mannat Singh, Eric Mintun, Trevor Darrell, Piotr Dollár*, Ross Girshick*
Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS), 2021
*: equal contribution
| arXiv |

We analyze the substandard optimization behavior of ViT and propose a simple fix that dramatically increases optimization stability and also improves peak performance.

Region Similarity Representation Learning
Tete Xiao*, Colorado Reed*, Xiaolong Wang, Kurt Keutzer, Trevor Darrell
International Conference on Computer Vision (ICCV), 2021
*: equal contribution
| arXiv | code |

An approach to self-supervised representation learning for localization-based tasks such as object detection and segmentation.

What Should Not Be Contrastive in Contrastive Learning
Tete Xiao, Xiaolong Wang, Alexei A. Efros, Trevor Darrell
International Conference on Learning Representations (ICLR), 2021
| arXiv | video |

To contrast, or not to contrast, that is the question.

Learning Cross-domain Correspondence for Control with Dynamics Cycle-consistency
Qiang Zhang, Tete Xiao, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang
International Conference on Learning Representations (ICLR), 2021
Oral presentation
| project page | arXiv | video |

Learning correspondence across domains differing in representation (vision vs. internal state), physics parameters, and morphology.

Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks
Joanna Materzynska, Tete Xiao, Roei Herzig, Huijuan Xu, Xiaolong Wang, Trevor Darrell
: equal advising
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
| project page | arXiv | dataset |

Using Spatial-Temporal Interaction Networks (STIN) for compositional action recognition plus a new annotated dataset Something-else.

Reasoning About Human-Object Interactions Through Dual Attention Networks
Tete Xiao, Quanfu Fan, Dan Gutfreund, Mathew Monfort, Aude Oliva, Bolei Zhou
International Conference on Computer Vision (ICCV), 2019
| project page | arXiv |

A Dual Attention Network model for reasoning about human-object interactions.

Semantic Understanding of Scenes through the ADE20K Dataset
Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba
International Journal of Computer Vision (IJCV) 127, 302–321 (2019)
| project page | pdf | arXiv | pytorch model | demo |

ADE20K dataset with comprehensive analysis and applications.

Unified Perceptual Parsing for Scene Understanding
Tete Xiao*, Yingcheng Liu*, Bolei Zhou*, Yuning Jiang, Jian Sun
European Conference on Computer Vision (ECCV), 2018
*: equal contribution
| arXiv | code |

UPerNet, a pyramid-like parser for the Unified Perceptual Parsing task, which aims to recognize as many visual concepts as possible from a given image.

Acquisition of Localization Confidence for Accurate Object Detection
Borui Jiang*, Ruixuan Luo*, Jiayuan Mao*, Tete Xiao, Yuning Jiang
European Conference on Computer Vision (ECCV), 2018
Oral presentation
*: equal contribution
| arXiv | code |

Dissecting object localization through IoU-Net and Precise RoI Pooling.

Learning Visually-grounded Semantics from Contrastive Adversarial Samples
Haoyue Shi*, Jiayuan Mao*, Tete Xiao*, Yuning Jiang, Jian Sun
International Conference on Computational Linguistics (COLING), 2018
*: equal contribution
| arXiv | code |

Constructing contrastive image-caption pairs for learning visually-grounded semantics.

MegDet: A Large Mini-Batch Object Detector
Chao Peng*, Tete Xiao*, Zeming Li*, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun
Conference on Computer Vision and Pattern Recognition (CVPR), 2018 (Spotlight)
*: equal contribution
| arXiv |

Scaling-up training of object detectors; winner of MSCOCO Challenge 2017.

Repulsion Loss: Detecting Pedestrians in a Crowd
Xinlong Wang, Tete Xiao, Yuning Jiang, Shuai Shao, Jian Sun, Chunhua Shen
Conference on Computer Vision and Pattern Recognition (CVPR), 2018
| arXiv |

A pedestrian detector that works better under crowd occlusion.

What Can Help Pedestrian Detection?
Jiayuan Mao*, Tete Xiao*, Yuning Jiang, Zhimin Cao
Conference on Computer Vision and Pattern Recognition (CVPR), 2017
| arXiv |

  • MSCOCO Challenge, 2017
  • Snap Research Scholarship, 2019
  • China National Scholarship, Peking University
  • Scholarship for the Outstanding Talented, Peking University
  • Schlumberger Scholarship, Peking University
  • Founder Group Scholarship, Peking University
  • Gold Medals, ACM International Collegiate Programming Contest (ACM-ICPC) Asia Regional, 2016 & 2017
  • Bronze Medal, National Olympiad in Informatics (NOI), 2014
  • Champion, Shandong Province Team Selection Contest for NOI, 2014

Teaching Faculty, Practice in Programming (17-18 spring)

Teaching Faculty, Artificial Intelligence and Computer Vision (18-19 spring)

Berkeley Artificial Intelligence Research Lab
Berkeley Way West, 2121 Berkeley Way
Berkeley, CA 94704

Website design:
Avatar photo: taken in Jerusalem in July 2019 by my good friend Yingcheng Liu.