Xudong Lin 0003
Person information
- unicode name: 蔺旭东
- affiliation: Columbia University, New York, NY, USA
- affiliation (former): Tsinghua University, Department of Automation, Beijing, China
Other persons with the same name
- Xudong Lin 0001 (aka: Xu-Dong Lin 0001) — South China Agricultural University, College of Informatics, Guangzhou, China
- Xudong Lin 0002 — Hebei University of Environmental Engineering, Department of Information Engineering, Qinhuangdao, China (and 2 more)
- Xudong Lin 0004 — Yanshan University, School of Information Science and Engineering, China
- Xudong Lin 0005 (aka: Xu-dong Lin 0005) — Shenzhen University, College of Management, China
- Xudong Lin 0006 — Sichuan University of Science and Engineering, School of Mathematics and Statistics, Zigong, China
- Xudong Lin 0007 — ByteDance Inc.
- Xudong Lin 0008 — Google DeepMind
- Xudong Lin 0009 — Taras Shevchenko National University of Kyiv, Ukraine
- Xudong Lin 0010 — Sun Yat-sen University (Zhuhai Campus), Gravitational Wave Research Center of CNSA, MOE Key Laboratory of TianQin Mission, Zhuhai, China
2020 – today
- 2024
- [c33] Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Long Chen, Yulei Niu, Xudong Lin, Xuande Feng, Jaywon Koo, Sounak Ray, Shih-Fu Chang: Beyond Grounding: Extracting Fine-Grained Event Hierarchies across Modalities. AAAI 2024: 17664-17672
- [c32] Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna: BLINK: Multimodal Large Language Models Can See but Not Perceive. ECCV (23) 2024: 148-166
- [c31] Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu: Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses. EMNLP (Findings) 2024: 14839-14854
- [c30] Xudong Lin, Ali Zare, Shiyuan Huang, Ming-Hsuan Yang, Shih-Fu Chang, Li Zhang: Personalized Video Comment Generation. EMNLP (Findings) 2024: 16806-16820
- [c29] Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Xuande Feng, Kevin Zhang, Jialu Liu, Shih-Fu Chang: VIEWS: Entity-Aware News Video Captioning. EMNLP 2024: 20220-20239
- [c28] Xudong Lin, Manling Li, Richard S. Zemel, Heng Ji, Shih-Fu Chang: Training-free Deep Concept Injection Enables Language Models for Video Question Answering. EMNLP 2024: 22399-22416
- [c27] Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang: SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos. ICLR 2024
- [i31] Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang: SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos. CoRR abs/2403.01599 (2024)
- [i30] Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna: BLINK: Multimodal Large Language Models Can See but Not Perceive. CoRR abs/2404.12390 (2024)
- [i29] Hung-Ting Su, Chun-Tong Chao, Ya-Ching Hsu, Xudong Lin, Yulei Niu, Hung-Yi Lee, Winston H. Hsu: Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies. CoRR abs/2406.10923 (2024)
- [i28] Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu: Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses. CoRR abs/2409.14324 (2024)
- 2023
- [c26] Rui Yan, Mike Zheng Shou, Yixiao Ge, Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang: Video-Text Pre-training with Learned Regions for Retrieval. AAAI 2023: 3100-3108
- [c25] Guang Yang, Manling Li, Jiajie Zhang, Xudong Lin, Heng Ji, Shih-Fu Chang: Video Event Extraction via Tracking Visual States of Arguments. AAAI 2023: 3136-3144
- [c24] Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji: Non-Sequential Graph Script Induction via Multimedia Grounding. ACL (1) 2023: 5529-5545
- [c23] Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang: In Defense of Structural Symbolic Representation for Video Event-Relation Prediction. CVPR Workshops 2023: 4940-4950
- [c22] Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang: Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering. CVPR Workshops 2023: 4951-4960
- [c21] Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Kevin Qinghong Lin, Satoshi Tsutsui, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou: All in One: Exploring Unified Video-Language Pre-Training. CVPR 2023: 6598-6608
- [c20] Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang: Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CVPR 2023: 14846-14855
- [c19] Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang: Supervised Masked Knowledge Distillation for Few-Shot Transformers. CVPR 2023: 19649-19659
- [c18] Feng Wang, Manling Li, Xudong Lin, Hairong Lv, Alexander G. Schwing, Heng Ji: Learning to Decompose Visual Features with Latent Textual Prompts. ICLR 2023
- [c17] Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang: TempCLR: Temporal Alignment Representation with Contrastive Learning. ICLR 2023
- [i27] Andrew Lu, Xudong Lin, Yulei Niu, Shih-Fu Chang: In Defense of Structural Symbolic Representation for Video Event-Relation Prediction. CoRR abs/2301.03410 (2023)
- [i26] Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang: Supervised Masked Knowledge Distillation for Few-Shot Transformers. CoRR abs/2303.15466 (2023)
- [i25] Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang: Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering. CoRR abs/2304.03754 (2023)
- [i24] Yu Zhou, Sha Li, Manling Li, Xudong Lin, Shih-Fu Chang, Mohit Bansal, Heng Ji: Non-Sequential Graph Script Induction via Multimedia Grounding. CoRR abs/2305.17542 (2023)
- [i23] Hammad A. Ayyubi, Tianqi Liu, Arsha Nagrani, Xudong Lin, Mingda Zhang, Anurag Arnab, Feng Han, Yukun Zhu, Jialu Liu, Shih-Fu Chang: Video Summarization: Towards Entity-Aware Captions. CoRR abs/2312.02188 (2023)
- 2022
- [c16] Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander G. Schwing, Heng Ji: MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding. AAAI 2022: 11200-11208
- [c15] Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou: Object-aware Video-language Pre-training for Retrieval. CVPR 2022: 3303-3312
- [c14] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CVPR 2022: 13843-13853
- [c13] Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang: CLIP-Event: Connecting Text and Images with Event Structures. CVPR 2022: 16399-16408
- [c12] Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad A. Ayyubi, Heng Ji, Shih-Fu Chang: Weakly-Supervised Temporal Article Grounding. EMNLP 2022: 9402-9413
- [c11] Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji: Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners. NeurIPS 2022
- [i22] Manling Li, Ruochen Xu, Shuohang Wang, Luowei Zhou, Xudong Lin, Chenguang Zhu, Michael Zeng, Heng Ji, Shih-Fu Chang: CLIP-Event: Connecting Text and Images with Event Structures. CoRR abs/2201.05078 (2022)
- [i21] Xudong Lin, Fabio Petroni, Gedas Bertasius, Marcus Rohrbach, Shih-Fu Chang, Lorenzo Torresani: Learning To Recognize Procedural Activities with Distant Supervision. CoRR abs/2201.10990 (2022)
- [i20] Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou: All in One: Exploring Unified Video-Language Pre-training. CoRR abs/2203.07303 (2022)
- [i19] Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou: Revitalize Region Feature for Democratizing Video-Language Pre-training. CoRR abs/2203.07720 (2022)
- [i18] Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji: Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners. CoRR abs/2205.10747 (2022)
- [i17] Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang: Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CoRR abs/2206.02082 (2022)
- [i16] Hammad A. Ayyubi, Christopher Thomas, Lovish Chum, Rahul Lokesh, Yulei Niu, Xudong Lin, Long Chen, Jaywon Koo, Sounak Ray, Shih-Fu Chang: Multimodal Event Graphs: Towards Event Centric Understanding of Multimodal World. CoRR abs/2206.07207 (2022)
- [i15] Feng Wang, Manling Li, Xudong Lin, Hairong Lv, Alexander G. Schwing, Heng Ji: Learning to Decompose Visual Features with Latent Textual Prompts. CoRR abs/2210.04287 (2022)
- [i14] Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad A. Ayyubi, Heng Ji, Shih-Fu Chang: Weakly-Supervised Temporal Article Grounding. CoRR abs/2210.12444 (2022)
- [i13] Guang Yang, Manling Li, Jiajie Zhang, Xudong Lin, Shih-Fu Chang, Heng Ji: Video Event Extraction via Tracking Visual States of Arguments. CoRR abs/2211.01781 (2022)
- [i12] Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang: TempCLR: Temporal Alignment Representation with Contrastive Learning. CoRR abs/2212.13738 (2022)
- 2021
- [c10] Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang: Co-Grounding Networks With Semantic Attention for Referring Expression Comprehension in Videos. CVPR 2021: 1346-1355
- [c9] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CVPR 2021: 7005-7015
- [c8] Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang: Joint Multimedia Event Extraction from Video and Article. EMNLP (Findings) 2021: 74-88
- [c7] Haoyang Wen, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Ren Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Heng Ji: RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System. NAACL-HLT (Demonstrations) 2021: 133-143
- [i11] Xudong Lin, Gedas Bertasius, Jue Wang, Shih-Fu Chang, Devi Parikh, Lorenzo Torresani: VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs. CoRR abs/2101.12059 (2021)
- [i10] Sijie Song, Xudong Lin, Jiaying Liu, Zongming Guo, Shih-Fu Chang: Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos. CoRR abs/2103.12346 (2021)
- [i9] Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang: Joint Multimedia Event Extraction from Video and Article. CoRR abs/2109.12776 (2021)
- [i8] Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou: Object-aware Video-language Pre-training for Retrieval. CoRR abs/2112.00656 (2021)
- [i7] Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang: Video-Text Pre-training with Learned Regions. CoRR abs/2112.01194 (2021)
- [i6] Revanth Gangi Reddy, Xilin Rui, Manling Li, Xudong Lin, Haoyang Wen, Jaemin Cho, Lifu Huang, Mohit Bansal, Avirup Sil, Shih-Fu Chang, Alexander G. Schwing, Heng Ji: MuMuQA: Multimedia Multi-Hop News Question Answering via Cross-Media Knowledge Extraction and Grounding. CoRR abs/2112.10728 (2021)
- 2020
- [c6] Xudong Lin, Lin Ma, Wei Liu, Shih-Fu Chang: Context-Gated Convolution. ECCV (18) 2020: 701-718
2010 – 2019
- 2019
- [c5] Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan: DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CVPR 2019: 1268-1277
- [c4] Svebor Karaman, Xudong Lin, Xuefeng Hu, Shih-Fu Chang: Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval. ICMR 2019: 192-196
- [i5] Zheng Shou, Zhicheng Yan, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Xudong Lin, Shih-Fu Chang: DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CoRR abs/1901.03460 (2019)
- [i4] Svebor Karaman, Xudong Lin, Xuefeng Hu, Shih-Fu Chang: Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval. CoRR abs/1903.01545 (2019)
- [i3] Xudong Lin, Lin Ma, Wei Liu, Shih-Fu Chang: Context-Gated Convolution. CoRR abs/1910.05577 (2019)
- [i2] Xudong Lin, Zheng Shou, Shih-Fu Chang: LPAT: Learning to Predict Adaptive Threshold for Weakly-supervised Temporal Action Localization. CoRR abs/1910.11285 (2019)
- [i1] Shiyuan Huang, Xudong Lin, Svebor Karaman, Shih-Fu Chang: Flow-Distilled IP Two-Stream Networks for Compressed Video Action Recognition. CoRR abs/1912.04462 (2019)
- 2018
- [c3] Yueqi Duan, Wenzhao Zheng, Xudong Lin, Jiwen Lu, Jie Zhou: Deep Adversarial Metric Learning. CVPR 2018: 2780-2789
- [c2] Yueqi Duan, Ziwei Wang, Jiwen Lu, Xudong Lin, Jie Zhou: GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning. CVPR 2018: 8270-8279
- [c1] Xudong Lin, Yueqi Duan, Qiyuan Dong, Jiwen Lu, Jie Zhou: Deep Variational Metric Learning. ECCV (15) 2018: 714-729