Yuxin Wu

I’m building large multimodal models at a startup.

Prior to this, I worked at Google Brain on foundation models, and at Facebook AI Research on computer vision. I have expertise in research, libraries and infrastructure for deep learning and computer vision.

My previous work at FAIR received a Best Paper Honorable Mention at ECCV 2018, a Best Paper Nomination at CVPR 2020, and the Mark Everingham Prize at ICCV 2021. I also created detectron2, one of the most popular Facebook AI projects.

  • Master in Computer Vision, 2016

    Carnegie Mellon University

  • Bachelor in Computer Science, 2015

    Tsinghua University

Selected Publications

Momentum Contrast for Unsupervised Visual Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2020 (Oral)
Best Paper Nomination

PointRend: Image Segmentation as Rendering
Computer Vision and Pattern Recognition (CVPR), 2020 (Oral)

Feature Denoising for Improving Adversarial Robustness
The first ImageNet classifier that survives strong white-box adversarial attacks.
Computer Vision and Pattern Recognition (CVPR), 2019

Group Normalization
European Conference on Computer Vision (ECCV), 2018 (Oral)
Best Paper Honorable Mention (top 3)

House3D: A Rich and Realistic 3D Environment
International Conference on Learning Representations (ICLR), 2018

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Platform that powers large-scale RL projects such as OpenGO.
Neural Information Processing Systems (NeurIPS), 2017 (Oral)

Open Source Projects

Full list of OSS contributions can be found here.


detectron2
A computer vision library with a focus on detection-related tasks.

  • Widely used in the research community.
  • Supports training and deployment for dozens of Meta’s products and services.
tensorpack
A neural net training interface on TensorFlow, with a focus on speed and flexibility. Includes highly optimized trainers, data pipelines, and solid paper reproductions.
Adversarial Attack on Face Recognition
Black-box adversarial attacks on AWS/Azure’s public face recognition APIs, the first successful attack of its kind.
Panorama stitching written from scratch in C++. Includes SIFT feature detection, RANSAC, bundle adjustment (w/ analytical gradients), straightening and warping.
Cracking encrypted WeChat message history from Android phones through reverse engineering.

Awards

  • Mark Everingham Prize at ICCV 2021 for the detectron2 project.

  • Best Paper Nominee at CVPR 2020 for the paper “Momentum Contrast”.

  • Winner of the defense track in the Competition on Adversarial Attacks and Defenses (CAAD) 2018.

    We trained an ImageNet classifier with state-of-the-art robustness against adversarial attacks.

  • Winner of the Competition on Adversarial Attacks and Defenses (CAAD) CTF 2018.

    We performed successful adversarial attacks and defenses against other teams during the live competition.

  • Best Paper Honorable Mention at ECCV 2018 for the paper “Group Normalization”.

  • Google Open Source Peer Bonus in 2017 for the tensorpack project.

  • Winner of the VizDoom AI Competition at CIG 2016.

    Our Doom bot, “F1”, beat competitors in death matches by a large margin.

  • Champion of the Student Cluster Competition at both ISC 2015 and ASC 2015.

    We optimized software and hardware for low-power, high-performance computation.

  • Finalist in the SIGMOD 2014 Programming Contest, as team leader of “blxlrsmb”.

    We designed and implemented a large social network database to support efficient queries.

  • Capture the Flag (CTF, a security competition): 11th in SECCON CTF 2014, 8th in Codegate CTF 2015, and 5th in DEFCON CTF 2015, as a team member of “blue-lotus”.