Brian Chen
I am a Senior Researcher at Samsung Research America (SRA), working on video summarization and text-to-image editing/style transfer. Before joining SRA, I was a Visiting Researcher at Meta Reality Labs/FAIR Embodied AI. I received my Ph.D. from the Dept. of Computer Science, Columbia University, in the DVMM Lab, advised by Prof. Shih-Fu Chang.
My research interests focus on Computer Vision, Multimodal Learning, and Self-supervised Learning. In particular, I am interested in learning representations from videos.
At Meta, I worked on a project developing egocentric agents on Aria that understand everyday tasks specified in natural language and generate goal plans for further assistance. We leveraged the large language model (LLM) LLaMA2 together with video training for zero-shot goal planning; our model generates future plans conditioned on a given video and input goals.
During my Ph.D., I worked on the DARPA AIDA project, which focused on incorporating cross-domain knowledge (images, videos, text, and audio) into knowledge graph construction.
Since 2020, I have been working closely with IBM Research and MIT CSAIL on the Sight and Sound Project, which aims to learn representations from video and audio.
Prior to joining Columbia University, I earned my Bachelor's and Master's degrees from the Dept. of Computer Science and Information Engineering, National Taiwan University, in 2015 and 2017 respectively, advised by Prof. Shou-De Lin.
news
Oct 03, 2023 | Joined Samsung Research America as a Senior Researcher. |
---|---|
May 09, 2023 | Obtained Ph.D. from the Dept. of Computer Science, Columbia University, advised by Prof. Shih-Fu Chang. |
Oct 03, 2022 | Started as a Visiting Researcher at Meta Reality Labs/FAIR Embodied AI, collaborating with Ruta Desai, Rishi Hazra, and Tushar Nagarajan. |
Jun 01, 2021 | Started as a Research Intern at Salesforce Research under the guidance of Ramprasaath Selvaraju, Juan Carlos Niebles, and Nikhil Naik. |
Jun 01, 2020 | Started as a Research Intern at IBM Research under the guidance of Samuel Thomas, Brian Kingsbury, and Hilde Kuehne. |
Aug 01, 2019 | Started as a Research Intern at NTU under the guidance of Hanwang Zhang. |
selected publications
- CVPR: What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
- ICCV: EgoTV: Egocentric Task Verification from Natural Language Task Descriptions. In International Conference on Computer Vision (ICCV), 2023.
- WACV: PreViTS: Contrastive Pretraining with Video Tracking Supervision. In Winter Conference on Applications of Computer Vision (WACV), 2023.
- CVPR: Everything at Once: Multi-modal Fusion Transformer for Video Retrieval. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- EMNLP: Joint Multimedia Event Extraction from Video and Article. In Findings of Empirical Methods in Natural Language Processing (EMNLP), 2021.
- ICCV: Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. In International Conference on Computer Vision (ICCV), 2021.
- AAAI: General Partial Label Learning via Dual Bipartite Graph Autoencoder. In AAAI Conference on Artificial Intelligence (AAAI), 2020.