Brian Chen
I’m a visiting researcher at Meta Reality Lab/FAIR Embodied AI. I received my Ph.D. from the Dept. of Computer Science, Columbia University, where I worked in the DVMM Lab, advised by Prof. Shih-Fu Chang.
My research interests focus on Computer Vision, Multimodal Learning, and Self-supervised Learning. In particular, I am interested in learning representations from videos.
At Meta, I worked on a project to develop egocentric agents on Aria that understand everyday tasks specified in natural language and generate goal plans for further assistance. We leverage the large language model (LLM) LLaMA2 and video training for zero-shot goal planning. Our model generates future plans conditioned on an input video and goal.
During my Ph.D., I worked on the DARPA AIDA project, which focused on incorporating cross-domain knowledge (images, videos, text, and audio) into knowledge graph construction.
Since 2020, I have been working closely with IBM Research and MIT CSAIL on the Sight and Sound Project, which aims to learn representations from video and audio.
Prior to joining Columbia University, I completed my Bachelor's and Master's degrees at the Dept. of Computer Science and Information Engineering, National Taiwan University, in 2015 and 2017 respectively, advised by Prof. Shou-De Lin.
News
| Date | Update |
|---|---|
| May 9, 2023 | Obtained my Ph.D. from the Dept. of Computer Science, Columbia University, advised by Prof. Shih-Fu Chang. |
| Oct 3, 2022 | Started as a Visiting Researcher at Meta Reality Lab/FAIR Embodied AI, collaborating with Ruta Desai, Rishi Hazra, and Tushar Nagarajan. |
| Jun 1, 2021 | Started as a Research Intern at Salesforce Research, under the guidance of Ramprasaath Selvaraju, Juan Carlos Niebles, and Nikhil Naik. |
| Jun 1, 2020 | Started as a Research Intern at IBM Research, under the guidance of Samuel Thomas, Brian Kingsbury, and Hilde Kuehne. |
| Aug 1, 2019 | Started as a Research Intern at NTU, under the guidance of Hanwang Zhang. |
Selected Publications
- ICCV. EgoTV: Egocentric Task Verification from Natural Language Task Descriptions. In International Conference on Computer Vision (ICCV), 2023.
- WACV. PreViTS: Contrastive Pretraining with Video Tracking Supervision. In Winter Conference on Applications of Computer Vision (WACV), 2023.
- CVPR. Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- EMNLP. Joint Multimedia Event Extraction from Video and Article. In Findings of Empirical Methods in Natural Language Processing (EMNLP), 2021.
- ICCV. Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. In International Conference on Computer Vision (ICCV), 2021.
- AAAI. General Partial Label Learning via Dual Bipartite Graph Autoencoder. In AAAI Conference on Artificial Intelligence (AAAI), 2020.