Research Scientist
with Sony, Tokyo 2023.10 ~
I have been working as a research scientist at Sony, Tokyo, since 2023.10. I was a Ph.D. student in the Learning and Machine Perception (LAMP) team (2019.10 ~ 2023.7), advised by Dr. Joost van de Weijer at the Computer Vision Center, Autonomous University of Barcelona, Spain.
#Audio-Visual Generation #Multi-Modal Learning #Transfer Learning
- During my PhD, I focused on how to efficiently adapt pretrained models to real-world environments under domain and category shift in an unsupervised manner. Related research topics include zero-shot learning and source-free/continual/open-set domain adaptation.
- Currently, I am working on multi-modal (especially audio-visual) generation, and I am also interested in model adaptation.
CV (updated Oct. 2024)
News
[2024.10] Visited Prof. Andrew Bagdanov (MICC Lab) at the University of Florence and Prof. Nicu Sebe (MHUG Lab) at the University of Trento.
[2024.9] Hosted our workshop AVGenL: Audio-Visual Generation and Learning at ECCV 2024.
[2024.9] Our paper Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models is accepted by NeurIPS 2024.
[2024.6] Our paper SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond is accepted by ISMIR 2024.
[2024.4] We are organizing an ECCV 2024 workshop on Audio-Visual Generation and Learning; please check the site for the CfP and speakers.
[2023.12] My doctoral thesis received the Pioneer Awards 2023 - CERCA.
[2023.9] Our paper 'Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing' is accepted by NeurIPS 2023.
[2023.8] The extended version of NRC is accepted by IEEE TPAMI.
[2023.6] 'Casting a BAIT for Offline and Online Source-free Domain Adaptation' is finally accepted by CVIU.
[2023.1] Gave a visiting talk in Prof. Maria Brbic's group at EPFL.
[2022.11] I will present our work on model adaptation under domain and category shift at the TrustML Young Scientist Seminars (hosted by RIKEN AIP) on Dec. 7.
[2022.9] 'Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation' is accepted by NeurIPS 2022 as a Spotlight, and our paper 'Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification' is accepted by BMVC 2022.
[2021.9] 'Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation' is accepted by NeurIPS 2021.
[2021.7] 'Generalized Source-free Domain Adaptation' is accepted by ICCV 2021.
Full Publications
Preprint
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass
preprint, 2024 [arxiv]
Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models
Saurav Jha, Shiqi Yang*, Masato Ishii, Mengjie Zhao, Christian Simon, Muhammad Jehanzeb Mirza, Dong Gong, Lina Yao, Shusuke Takahashi, Yuki Mitsufuji
preprint, 2024 [arxiv]
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
Shiqi Yang, Zhi Zhong, Mengjie Zhao, Shusuke Takahashi, Masato Ishii, Takashi Shibuya, Yuki Mitsufuji
MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing
Kangneng Zhou, Daiheng Gao, Xuan Wang, Jie Zhang, Peng Zhang, Xusen Sun, Longhao Zhang, Shiqi Yang, Bang Zhang, Liefeng Bo, Yaxing Wang
preprint, 2023 [arxiv]
A Critical Look at the Current Usage of Foundation Model for Dense Recognition Task
Shiqi Yang, Atsushi Hashimoto, Yoshitaka Ushiku
OneRing: A Simple Method for Source-free Open-partial Domain Adaptation
Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost van de Weijer
preprint, 2022 [project][arXiv][code]
International Conference
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Senmao Li, Taihang Hu, Fahad Shahbaz Khan, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang
Advances in Neural Information Processing Systems (NeurIPS) 2024 [project][arxiv][code]
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji
International Society for Music Information Retrieval (ISMIR) 2024 [arxiv]
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing
Kai Wang, Fei Yang, Shiqi Yang, Muhammad Atif Butt, Joost van de Weijer
Advances in Neural Information Processing Systems (NeurIPS) 2023 [paper][arxiv][code]
Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification
Kai Wang, Chenshen Wu, Andrew D. Bagdanov, Xialei Liu, Shiqi Yang, Shangling Jui, Joost van de Weijer
British Machine Vision Conference (BMVC) 2022 [arxiv][code]
Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation
Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost van de Weijer
Advances in Neural Information Processing Systems (NeurIPS) 2022 Spotlight [project][paper][arXiv][code]
Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation
Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui
Advances in Neural Information Processing Systems (NeurIPS) 2021 [project][paper][arXiv][code]
Generalized Source-free Domain Adaptation
Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui
International Conference on Computer Vision (ICCV) 2021 [project][paper][arXiv][code][video]
Parallel Convolutional Networks for Image Recognition via a Discriminator
Shiqi Yang, Gang Peng
Asian Conference on Computer Vision (ACCV) 2018 [paper][arXiv]
Attention to Refine Through Multi Scales for Semantic Segmentation
Shiqi Yang, Gang Peng
Pacific-Rim Conference on Multimedia (PCM) 2018 [paper][arXiv]
Journal
Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering
Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui, Jian Yang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023 [paper][arxiv]
Casting a BAIT for Offline and Online Source-free Domain Adaptation
Shiqi Yang, Yaxing Wang, Luis Herranz, Shangling Jui, Joost van de Weijer
Computer Vision and Image Understanding (CVIU), 2023 [paper][arxiv][code]
On Implicit Attribute Localization for Generalized Zero-Shot Learning
Shiqi Yang, Kai Wang, Luis Herranz, Joost van de Weijer
Invited Talks, Awards and Activities
Pioneer Awards 2023, CERCA Research Center of Catalonia, Spain, 2023.12
Visiting talk in Prof. Maria Brbic's group at EPFL, Switzerland, 2023.1
Invited talk at the TrustML Young Scientist Seminars, RIKEN AIP, Japan, 2022.12
ICVSS summer school, Sicily, Italy, 2022.7
Invited talk at the AI Time Seminar on NeurIPS 2021 (Virtual), China, 2022.2
Academic Service
Organizer: ECCV 2024 Audio-Visual Generation and Learning workshop (initiator)
Conference Reviewer: ICLR, ICCV, NeurIPS, ECCV, ICML, CVPR, WACV
Journal Reviewer: IEEE TKDE, TPAMI, TAI, IJCV