Brian Chen

I’m a visiting researcher at Meta Reality Labs/FAIR Embodied AI. I received my Ph.D. from the Department of Computer Science at Columbia University, where I was a member of the DVMM lab advised by Prof. Shih-Fu Chang.

My research interests lie in Computer Vision, Multimodal Learning, and Self-supervised Learning. In particular, I am interested in learning representations from videos.

At Meta, I work on a project to develop egocentric agents on Aria that understand everyday tasks specified in natural language and generate goal plans for further assistance. We leverage the large language model (LLM) LLaMA2 together with video training for zero-shot goal planning; our model generates future plans conditioned on a given video and input goals.

During my Ph.D., I worked on the DARPA AIDA project, which focused mainly on incorporating cross-domain knowledge (images, videos, text, and audio) for knowledge graph construction.

Since 2020, I have been working closely with IBM Research and MIT CSAIL on the Sight and Sound Project, which aims to learn representations from video and audio.

Prior to joining Columbia University, I received my Bachelor’s and Master’s degrees from the Department of Computer Science and Information Engineering at National Taiwan University in 2015 and 2017, respectively, advised by Prof. Shou-De Lin.

News

May 9, 2023 Obtained my Ph.D. from the Department of Computer Science, Columbia University.
Advised by Prof. Shih-Fu Chang.
Oct 3, 2022 Started as a Visiting Researcher at Meta Reality Labs/FAIR Embodied AI.
Collaborating with Ruta Desai, Rishi Hazra, and Tushar Nagarajan.
Jun 1, 2021 Started as a Research Intern at Salesforce Research.
Under the guidance of Ramprasaath Selvaraju, Juan Carlos Niebles, and Nikhil Naik.
Jun 1, 2020 Started as a Research Intern at IBM Research.
Under the guidance of Samuel Thomas, Brian Kingsbury, and Hilde Kuehne.
Aug 1, 2019 Started as a Research Intern at NTU.
Under the guidance of Hanwang Zhang.

Selected Publications

  1. ICCV
    EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
    Rishi Hazra, Brian Chen, Akshara Rai, Nitin Kamra, and Ruta Desai
    In International Conference on Computer Vision (ICCV) 2023
  2. WACV
    PreViTS: Contrastive Pretraining with Video Tracking Supervision
    Brian Chen, Ramprasaath R Selvaraju, Shih-Fu Chang, Juan Carlos Niebles, and Nikhil Naik
    In Winter Conference on Applications of Computer Vision (WACV) 2023
  3. CVPR
    Everything at Once - Multi-modal Fusion Transformer for Video Retrieval
    Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio S Feris, David Harwath, James Glass, and Hilde Kuehne
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
  4. EMNLP
    Joint Multimedia Event Extraction from Video and Article
    Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, and Shih-Fu Chang
    In Findings of Empirical Methods in Natural Language Processing (EMNLP) 2021
  5. ICCV
    Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
    Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, and others
    In International Conference on Computer Vision (ICCV) 2021
  6. AAAI
    General Partial Label Learning via Dual Bipartite Graph Autoencoder
    Brian Chen, Bo Wu, Alireza Zareian, Hanwang Zhang, and Shih-Fu Chang
    In AAAI Conference on Artificial Intelligence (AAAI) 2020