AIGC 2024 Conference Overview
The 2nd International Conference on AI-generated Content (AIGC 2024) was successfully held on December 21–22, 2024, in Beijing, China. The event brought together over 600 participants and fostered dynamic discussions and meaningful collaborations. It served as a vital platform for the exchange of cutting-edge research and forward-looking insights, helping to shape the future of AI-generated content. The success of AIGC 2024 highlighted the conference's growing importance in the global AI landscape. By bridging the gap between academia and industry, the conference catalyzed collaborations that continue to push the boundaries of generative AI. As the field rapidly evolves, AIGC remains at the forefront—driving innovation and shaping the next era of AI-generated content.
Photo gallery of AIGC 2024
Plenary Speakers
Prof. Jian Sun
Xi’an Jiaotong University, China
Biography: Jian Sun is a Professor at Xi'an Jiaotong
University, where he
completed his Ph.D. in Applied Mathematics. His career includes roles as a visiting
student at Microsoft Research Asia (Nov. 2005 - March 2008), a postdoctoral researcher
at the University of Central Florida (Aug. 2009 - April 2010), and with the Willow team
at École Normale Supérieure de Paris / INRIA (Sept. 2012 - Aug. 2014). He serves on the
editorial board of the International Journal of Computer Vision (IJCV) and has been an
area chair for major conferences such as ICCV, ECCV, and MICCAI. Dr. Sun is a recipient
of the National Science Fund for Distinguished Young Scholars in China. His current
research focuses on machine learning methods, including generalizable and explainable
machine learning, optimal transport, AI applications in mathematics, as well as computer
vision and medical image analysis.
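Speech Title: Mathematical and Statistical Foundations of Generative Artificial Intelligence
Abstract: Generative AI is an important direction in the current development of artificial general intelligence. It designs AI algorithms that learn the distributions of multimodal, high-dimensional, complex samples and generate new samples, and it is the methodological basis for current AI applications in automatic question answering, cross-modal generation, AI for science, and related problems. The underlying foundation of generative AI is mathematics and statistics. This talk introduces the background of generative AI, its basic mathematical and statistical principles, and the main challenges it faces. It then presents controllable/conditional generative AI methods built on optimal transport theory and methods, together with their applications to natural images, medical imaging, and other fields. It concludes with a summary and an outlook on the future development of generative AI.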
Prof. Guoyin Wang
President of Chongqing Normal University, China
IRSS/I2CICC/CAAI/CCF Fellow, IEEE SM
Vice-President of CAAI
Biography: Guoyin Wang received the B.S., M.S., and Ph.D. degrees from
Xi’an Jiaotong
University, Xi’an, China, in 1992, 1994, and 1996, respectively. He worked at the
University of North Texas and the University of Regina, Canada, as a visiting scholar
during 1998-1999. He worked at the Chongqing University of Posts and
Telecommunications during 1996-2024, where he was a professor, the Vice-President of the
University, the director of the Chongqing Key Laboratory of Computational Intelligence,
the director of the Key Laboratory of Cyberspace Big Data Intelligent Security of the
Ministry of Education, the director of Tourism Multi-source Data Perception and Decision
Technology of the Ministry of Culture and Tourism, and the director of the
Sichuan-Chongqing Joint Key Laboratory of Digital Economy Intelligence and Security. He
was the director of the Institute of Electronic Information Technology, Chongqing
Institute of Green and Intelligent Technology, CAS, China, 2011-2017. He has been
serving as the President of Chongqing Normal University since June 2024. He is the
author of over 10 books, the editor of dozens of proceedings of international and
national conferences and has more than 300 reviewed research publications. His research
interests include rough sets, granular computing, machine learning, knowledge
technology, data mining, neural networks, cognitive computing, etc. Dr. Wang was the
President of International Rough Set Society (IRSS) 2014-2017, and a council member of
the China Computer Federation (CCF) 2008-2023. He is currently a Vice-President of the
Chinese Association for Artificial Intelligence (CAAI), and the President of Chongqing
Association for Artificial Intelligence (CQAAI). He is a Fellow of IRSS, I2CICC, CAAI
and CCF.
Speech Title: Brain Cognition Inspired Artificial Intelligence
Abstract: With the synergy of big data, big computing power and large models, artificial
intelligence (AI) has made breakthrough progress in recent years, surpassing some key human
intelligence abilities such as visual intelligence, auditory intelligence, decision
intelligence, and language intelligence. However, AI systems surpass certain human
intelligence abilities only in a statistical sense, taken as a whole. They are not a true
realization of these human intelligence abilities and behaviors. This talk reviews the
role of cognitive science in inspiring the development of the three mainstream academic
branches of AI based on Marr’s three-layer framework, and explores and analyses the
limitations of the current development of AI. It then proposes future research directions
for brain-inspired AI and the scientific issues that need attention.
Prof. Xingwei Wang
Vice President of Northeastern University, China
CCF Fellow
Biography: Xingwei Wang is a distinguished professor and doctoral supervisor, a Fellow and director of CCF, and Vice President of Northeastern University. He has received national grants and honors including the National Outstanding Youth Science Foundation of China, the Special Government Allowance from the State Council, and the Program for New Century Excellent Talents of the Ministry of Education. He is a member of the National Graduate Education Steering Committee for Professional Engineering Degree, and Deputy Director of both the Network and Data Communications Committee and the Technical Committee on Internet at the China Computer Federation. He is also a Director and Fellow of the China Institute of Communications and a member of its fellow selection committee, an expert committee member of the China Education and Research Network (CERNET), and Vice Chairman of the Liaoning Internet Society. He is an editorial board member of the Chinese Journal of Computers, the Journal of Software, and the Journal of Computer Research and Development, and is listed among Elsevier's Highly Cited Chinese Researchers. He leads a Liaoning Provincial Innovation Team and is Director of the Liaoning Provincial Key Laboratory of Intelligent Internet Theory and Applications. His primary research interests are the Internet, cloud computing, and cyberspace security.
To date, he has received two second prizes for national scientific and technological progress, two first prizes for scientific and technological progress from the Ministry of Education, one first prize for scientific and technological progress from the China Institute of Communications, one second prize for technical invention from the Ministry of Education, one second prize for technical invention from Liaoning Province, and one second prize for natural science from Hunan Province. He has published over a hundred papers in academic journals such as IEEE Transactions and presented his research at conferences such as IEEE ICDCS. Over a hundred of his papers are indexed in SCI, and he has published nine monographs. He has also been granted twenty-seven national invention patents and has received twenty national and provincial awards for talent cultivation.
Prof. Yue Zhang
Westlake University, China
Biography: Yue Zhang is a tenured Professor at Westlake University. His
research
interests include NLP and its underlying machine learning algorithms. His major
contributions to the field include psycholinguistically motivated machine learning
algorithms, learning-guided beam search for structured prediction, pioneering neural NLP
models including graph LSTM, and OOD generalization for NLP. He authored the Cambridge
University Press book "Natural Language Processing: A Machine Learning Perspective".
He served as PC co-chair for CCL 2020 and EMNLP 2022, and is an action editor for
Transactions of ACL. He also served as associate editor for IEEE/ACM Transactions on
Audio, Speech, and Language Processing (TASLP), ACM Transactions on Asian and
Low-Resource Language Information Processing (TALLIP), IEEE Transactions on Big Data
(TBD) and Computer Speech and Language (CSL). He won the best paper awards of IALP 2017
and COLING 2018, a best paper honorable mention at SemEval 2020, and best paper
nominations for ACL 2018 and ACL 2023.
Speech Title: LLM reasoning and generalization
Abstract: In this talk, I will discuss linguistic reasoning, and the
capabilities of
formal logic reasoning for large language models (LLMs). I will discuss the difficulty
of learning formal reasoning from empirical risk minimization, and discuss a perspective
on this problem from causal learning theory. I will discuss causal features and
confounders, and show how learning confounders can lead to low out-of-distribution
generalization performance. Then I will discuss two general methods to address the
issue, including a data-centric method and a model-centric method, introducing several
recent works using both methods.
Keynote Speakers & Invited Speakers
Asst. Prof. Bo Han
Hong Kong Baptist University
Biography: Bo Han is currently an Assistant Professor in Machine Learning and Director
of the Trustworthy Machine Learning and Reasoning Group at Hong Kong Baptist University,
and a BAIHO Visiting Scientist with the Imperfect Information Learning Team at the RIKEN
Center for Advanced Intelligence Project (RIKEN AIP), where his research focuses on
machine learning, deep learning, foundation models, and their applications. He was a
Visiting Research Scholar at MBZUAI MLD (2024), a Visiting Faculty Researcher at
Microsoft Research (2022) and Alibaba DAMO Academy (2021), and a Postdoc Fellow at RIKEN
AIP (2019-2020). He received his Ph.D. degree in Computer Science from the University of
Technology Sydney (2015-2019). He has served as Senior Area Chair of NeurIPS and as Area
Chair of NeurIPS, ICML and ICLR. He has also served as Associate Editor of IEEE TPAMI,
MLJ and JAIR, and as Editorial Board Member of JMLR and MLJ. He received the Outstanding
Paper Award at NeurIPS, Most Influential Paper at NeurIPS, the Outstanding Student Paper
Award at a NeurIPS Workshop, Notable Area Chair at NeurIPS, Outstanding Area Chair at
ICLR, and Outstanding Associate Editor at IEEE TNNLS.
Speech Title: Exploring Trustworthy Foundation Models under Imperfect Data
Abstract: In the current landscape of machine learning, it is crucial to build
trustworthy foundation models that can operate under imperfect conditions, since most
real-world data are prone to noise, for example unexpected inputs, image artifacts, and
adversarial inputs. These models need to possess human-like capabilities to learn and
reason under uncertainty. In this talk, I will focus on three recent research
advancements, each shedding light on reliability, robustness, and safety in this field.
Specifically, reliability will be explored through the enhancement of vision-language
models by introducing negative labels, which effectively detect out-of-distribution
samples. Meanwhile, robustness will be explored through our investigation into image
interpolation using diffusion models, addressing the challenge of information loss to
ensure consistency and quality of generated content. Then, safety will be highlighted by
our study on hypnotizing large language models, DeepInception, which leverages the
creation of a novel nested scenario to induce adaptive jailbreak behaviors, revealing
vulnerabilities during interactive model engagement.
Prof. Songlin Hu
Institute of Information Engineering (IIE), the Chinese Academy of Sciences,
China
Biography: Songlin Hu is a full professor at the Institute of
Information
Engineering (IIE), the Chinese Academy of Sciences. He is also a joint professor at
the University of Chinese Academy of Sciences. His research areas include big data,
natural language processing, knowledge graphs, etc. He has published more than 100
papers in reputed conferences and journals such as ACL, AAAI, IJCAI, EMNLP,
SIGMOD, VLDB, ICDE, and ACM/IEEE Transactions.
Speech Title: Safety Governance of Large Models
Assoc. Prof. Gao Huang
Tsinghua University, China
Biography: Gao Huang is an Associate Professor affiliated with the
Department of
Automation at Tsinghua University. He obtained the PhD degree in machine learning
from Tsinghua in 2015, and spent three years at Cornell University as a postdoc. His
research interests lie in machine learning and computer vision. In particular, he is
actively working on efficient deep learning, dynamic neural networks, learning with
limited data and reinforcement learning. His work on DenseNet won the Best Paper
Award of CVPR (2017). He has collected more than 70,000 citations according to
Google Scholar.
Speech Title: Transformer Foundation Architectures for Long Sequences
Prof. Noor Zaman Jhanjhi
Taylor's University, Malaysia
Biography: Professor Dr. Noor Zaman Jhanjhi, often referred to as
N.Z. Jhanjhi,
holds the esteemed position of Professor in Computer Science with specializations in
Cybersecurity and Artificial Intelligence. He currently serves as the Program
Director for Postgraduate Research Degree Programmes in Computer Science and
Director of the Center for Smart Society (CSS5) at Taylor’s University, Malaysia.
Recognized as one of the world’s top 2% research scientists for consecutive years in
2022 and 2023, he is esteemed as one of Malaysia's top three computer science
researchers. Notably, he was honoured as an Outstanding Faculty Member by MDEC
Malaysia in 2022.
Prof. Jhanjhi boasts a prolific publication record with numerous highly indexed
works in WoS/ISI/SCI/SCIE/Scopus, accumulating a collective research impact factor
exceeding 1000 points. His Google Scholar h-index stands at 65, with
an i10-index approaching 291, and a Scopus h-index of 47. With over 600
publications to his credit, including several international patents in Australia,
Germany, the UK, and Japan, Prof. Jhanjhi has significantly contributed to the
academic discourse.
An accomplished editor and author, he has curated over 50 research books published
by esteemed publishers such as Springer, IGI Global USA, Taylor & Francis, IET,
Elsevier, Wiley, Bentham, and Intech Open. Prof. Jhanjhi excels in mentoring
postgraduate scholars, with over 38 scholars graduating under his tutelage. He also
serves as Associate Editor and Editorial Assistant Board member for reputable
journals and has received accolades such as the Outstanding Associate Editor award
for IEEE ACCESS.
Renowned as a top-tier reviewer by Publons (Web of Science), Prof. Jhanjhi has
evaluated over 60 theses as an external Ph.D./Master thesis examiner for
universities worldwide. His extensive academic qualifications span 10 years and
encompass accreditation bodies such as ABET, NCAAA, and NCEAC. Prof. Jhanjhi's
diverse research interests encompass Cybersecurity, AI, IoT Security, Wireless
Security, Data Science, Software Engineering, and Unmanned Aerial Vehicles (UAVs).
Additionally, he has been invited as a keynote speaker for over 60 international
conferences and has chaired numerous international conference sessions.
Speech Title: Cybersecurity Issues and Challenges in the Era of Generative AI
Asst. Prof. Hongyang Li
The University of Hong Kong, China
Biography: Hongyang Li is an Assistant Professor at the HKU Musketeers Foundation
Institute of Data Science and a Research Scientist at OpenDriveLab, Shanghai AI Lab.
His research focuses on autonomous driving and embodied AI. He proposed the
bird’s-eye-view perception work, BEVFormer, which was named among the Top 100 AI Papers
in 2022 and was explicitly recognized by Jensen Huang, CEO of NVIDIA, and Prof. Shashua,
CEO of Mobileye, at public keynotes. He served as Area Chair for CVPR 2023 and 2024,
NeurIPS 2023 (Notable AC) and 2024, ACM MM 2024, and ICLR 2025, and as a referee for
Nature Communications. He will serve as Workshop Chair for CVPR 2026. He is the Working
Group Chair for IEEE Standards under the Vehicular Technology Society and a Senior
Member of IEEE.
Speech Title: Achilles' Heel in Manipulation: Key Recipe and Missing Pieces towards Intelligent Embodied AI
Abstract: The increasing demand for versatile robotic systems to operate in diverse
and dynamic environments has emphasized the importance of a generalist policy, which
leverages a large cross-embodiment data corpus to facilitate broad adaptability and
high-level reasoning. However, the generalist struggles with inefficient
inference and costly training. The specialist policy, instead, is curated
for specific domain data and excels at task-level precision with efficiency. Yet, it
lacks the generalization capacity for a wide range of applications. Inspired by
these observations, we introduce RoboDual, a synergistic dual-system that
combines the merits of both generalist and specialist policies. A diffusion
transformer-based specialist is devised for multi-step action rollouts, carefully
conditioned on the high-level task understanding and discretized action output of a
vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual
achieves a 12% improvement on CALVIN and 26.7% in real-world experiments by adapting the
specialist policy with only 20M trainable parameters. It maintains strong
performance with merely 5% of demonstration data, and enables a 3.8× higher control
frequency in real-world deployment. Code and models will be made publicly
available.
Prof. Bing Liu
University of Illinois Chicago (UIC)
ACM/AAAI/IEEE Fellow
Biography: Bing Liu is a Distinguished Professor and the Peter L.
and Deborah K.
Wexler Professor of Computing at the University of Illinois Chicago. He earned his
Ph.D. in Artificial Intelligence from the University of Edinburgh. His research
interests span continual/lifelong learning, lifelong learning dialogue systems,
machine learning, and natural language processing. Professor Liu has published
extensively in top conferences and journals and authored five books, including two
focused on lifelong/continual learning. He has received three Test-of-Time paper
awards, one Test-of-Time honorable mention, and some of his work has been widely
featured in international media and tech press. He served as Chair of ACM SIGKDD
from 2013 to 2017 and as a program chair for numerous leading data mining
conferences. Currently, he serves as a program co-chair for the 2025 Conference on
Lifelong Learning Agents (CoLLAs-2025). Among his many honors, Professor Liu is the
2018 recipient of the ACM SIGKDD Innovation Award and is a Fellow of ACM, AAAI, and
IEEE.
Speech Title: Continual Learning Using Large Language Models
Abstract: The ability to continually learn and accumulate knowledge
over a lifetime
is a hallmark of human intelligence. It is also essential for AI agents. However,
the prevailing machine learning paradigm lacks this crucial capability. This talk
introduces the concept of continual learning, outlining its different settings, and
then delves into using large language models (LLMs) for continual learning, which
notably boosts accuracy. Following this, it presents some recent work on using
in-context learning as a strategy for continual learning, which further enhances
accuracy and adaptability.
Assoc. Prof. Liang Pang
CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese
Academy of Sciences, China
Biography: Liang Pang, an associate researcher at the CAS Key
Laboratory of AI
Safety, Institute of Computing Technology, Chinese Academy of Sciences, and a
visiting scholar at the National University of Singapore, specializes in research
areas of natural language generation and information retrieval. He has published
over 60 papers at international conferences and has accumulated more than 3000
citations on Google Scholar. Pang serves as a program committee member for
international conferences, a reviewer for academic journals, a standing committee
member of the Information Retrieval Special Committee of the Chinese Information
Processing Society, the deputy director of the Youth Working Committee of the
Chinese Information Processing Society, and a member of the Youth Innovation
Promotion Association of the Chinese Academy of Sciences. He has been honored with
the Outstanding Doctoral Dissertation Award from the Chinese Information Processing
Society, the Best Paper Runner-up Award at CIKM, and the Best Paper Honorable
Mention Award at SIGIR. His proposed deep text matching model achieved a global
ranking of fourth in the Kaggle QQP Text Matching competition. He was the global
champion in reinforcement learning at the NeurIPS 2018 Multi-Agent Challenge. His
team topped the global leaderboard in the multi-hop open-domain question answering
challenge HotpotQA.
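Speech Title: Retrieval-Augmented Large Models: Frontier Techniques and Societal Impact
Abstract: In recent years, the retrieval-augmented paradigm has effectively improved the accuracy and trustworthiness of content generated by large language models. Following the retrieval-augmented pipeline, we can discuss it from four perspectives. From the perspective of the information retrieval module, the question is how to build a retrieval module suited to large models, which helps them select information useful for generation more efficiently. From the perspective of the large language model module, the question is how to teach large models to use external information, which helps avoid the effect of noisy retrieved information on generation. From the perspective of interaction between modules, the question is how to design mechanisms for the retrieval module and the large language model module to cooperate, which helps fully integrate internal parametric knowledge with external corpus knowledge. Finally, from the perspective of the information loop, the talk discusses the potential impact of AI-generated content on the content ecosystem of information retrieval.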
Dr. Hoifung Poon
General Manager, Health Futures
Microsoft Research
Biography: Hoifung Poon is the General Manager at Health Futures in
Microsoft
Research and an affiliated faculty member at the University of Washington Medical School.
He leads biomedical AI research and incubation, with the overarching goal of
structuring medical data to optimize delivery and accelerate discovery for precision
health. His team and collaborators are among the first to explore large language
models (LLMs) and multimodal generative AI in health applications, producing popular
open-source foundation models such as PubMedBERT, BioGPT, BiomedCLIP, LLaVA-Med, and
BiomedParse. His latest publication in Nature features GigaPath, the first
whole-slide digital pathology foundation model pretrained on over 1 billion
pathology image tiles. He has led successful research partnerships with large health
providers and life science companies, creating AI systems in daily use for
applications such as molecular tumor board and clinical trial matching. He has given
tutorials on these topics at top AI conferences such as ACL, AAAI, and KDD, and his
prior work has been recognized with Best Paper Awards from premier AI venues such as
NAACL, EMNLP, and UAI. He received his PhD in Computer Science and Engineering from
the University of Washington, specializing in machine learning and NLP.
Speech Title: Advancing Health at the Speed of AI
Abstract: The dream of precision health is to develop a
data-driven, continuous
learning system where new health information is instantly incorporated to optimize
care delivery and accelerate biomedical discovery. The confluence of technological
advances and social policies has led to rapid digitization of multimodal,
longitudinal patient journeys, such as electronic medical records (EMRs), imaging,
and multiomics. Our overarching research agenda lies in advancing multimodal
generative AI for precision health, where we harness real-world data to pretrain
powerful multimodal patient embeddings, which can serve as digital twins for
patients. This enables us to synthesize multimodal, longitudinal information for
millions of cancer patients, and apply the population-scale real-world evidence to
advancing precision oncology in deep partnerships with real-world stakeholders such
as large health systems and pharmaceutical companies.
Dr. Xian Wu
Director of Tencent Youtu Lab Jarvis Research Center
Biography: Xian Wu received the PhD degree from Shanghai Jiao Tong
University. He is
now a principal researcher with Tencent. Before joining Tencent, he worked as a
senior scientist manager and a staff researcher with Microsoft and IBM Research. His
research interests include medical AI, natural language processing and multi-modal
modeling. He has published papers in CVPR, NeurIPS, ACL, WWW, AAAI, IJCAI etc. He
also served as PC member of IEEE Transactions on Knowledge and Data Engineering, ACM
Transactions on Knowledge Discovery from Data, ACM Transactions on Information
Systems, ACM Transactions on Intelligent Systems and Technology, CVPR, ICCV, AAAI
etc.
Speech Title: From Deep Learning to Large Models: Some Explorations in Medical AI
Dr. Xin Xia
Chief Expert of Software Engineering Application Technology at Huawei, China
Biography: Xin Xia is the Chief Expert of Software Engineering
Application Technology at Huawei, China. Before joining Huawei, he was an ARC DECRA
Fellow and a lecturer (equivalent to a U.S. assistant professor) at the Faculty of
Information Technology, Monash University, Australia. He earned his Ph.D. in June
2014 from the College of Computer Science and Technology, Zhejiang University,
China, under the supervision of Prof. Xiaohu Yang and Prof. Jianling Sun. From July
2012 to January 2014, he was a visiting student with Prof. David Lo at Singapore
Management University. In 2022, he received the ACM SIGSOFT Early Career Researcher
Award.
Xin Xia's current research aims to assist developers and testers in improving their
productivity by focusing on data science for software engineering. Specifically, he
works on mining and analyzing data from software repositories to uncover valuable
and actionable insights. His work employs and customizes a variety of structured and
unstructured data analytics techniques, such as data mining, information retrieval,
natural language processing, search-based algorithms, and program analysis,
transforming passive software engineering data into automated tools and novel
insights.
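Speech Title: Software Engineering in the Era of Large Models: Progress and Challenges
Abstract: Large models for software engineering have been widely adopted, and they also bring new challenges: for example, how to help large models better understand software engineering business and knowledge, how to better enable large models to output safe and trustworthy code, and how to evaluate the performance of large models across software engineering capabilities. This calls for us to rethink the future direction of software engineering in the era of large models. From a practical perspective, this talk reviews the current challenges of large models for software engineering and discusses possible future directions.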
Prof. Jungang Xu
University of Chinese Academy of Sciences, China
Director of Cloud Computing and Intelligent Information Processing Laboratory
Biography: Jungang Xu is a Professor and doctoral supervisor at the University of Chinese
Academy of Sciences, Director of the Cloud Computing and Intelligent Information
Processing Laboratory, and chief professor of the Deep Learning course at the University of
Chinese Academy of Sciences. His research interests include multimodal intelligence,
intelligent decision and optimization, embodied intelligence, etc. He is an expert
in the National Science and Technology Expert Database, an expert of the Ministry
of Industry and Information Technology of China, and an expert of the Beijing Municipal
Science and Technology Commission and Administrative Commission of Zhongguancun
Science Park. He is an executive member of the Special Committee on Artificial
Intelligence and Pattern Recognition, executive member of the Special Committee on
Natural Language Processing, executive member of the Special Committee on Database
in China Computer Federation, and standing member of the Special Committee on
Intelligent Service of the Chinese Association for Artificial Intelligence. He has
presided over a number of scientific research projects, such as National Key
Technology Research and Development Program, National Natural Science Foundation,
Beijing Science and Technology Plan, and Beijing Natural Science Foundation, and
published more than 100 articles. He won the second prize of China Geographic
Information Technology Progress Award in 2022.
Speech Title: Development Trends and Applications of Large Models
Assoc. Prof. Cheng Yang
Beijing University of Posts and Telecommunications, China
Biography: Cheng Yang received the BE and PhD degrees from Tsinghua
University, in
2014 and 2019, respectively. He is currently an associate professor with the Beijing
University of Posts and Telecommunications. His research interests include natural
language processing and network representation learning.
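Speech Title: An Efficient Collaboration Framework for Large Language Model Agents
Abstract: Large language models (LLMs) have demonstrated many human-like capabilities such as reasoning, planning, and tool use, and can serve as the brain of an agent to handle a variety of complex tasks automatically. However, whether these LLM agents can learn to communicate and divide labor as humans do, and so collaborate on tasks faster and better, remains a pressing open question. This talk introduces the latest progress in research on LLM agent collaboration and analyzes the various emergent cooperative behaviors observed in experiments.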
Prof. Shuanghua Yang
University of Reading, UK
IET Fellow, IEEE Senior Member
Biography: Shuang-Hua Yang is currently a professor and the Head of the Department of
Computer Science at the University of Reading, UK, and the Director of the Shenzhen
Key Laboratory of Safety and Security for Next Generation of Industrial Internet,
China. He was selected as a member of the European Academy of Sciences and Arts in 2024,
and was awarded a DSc by Loughborough University in 2014 in recognition of his academic
contribution to wireless monitoring research. He is a Fellow of IET and a Fellow of
InstMC, U.K. His current research interests include cyber-physical system safety and
security, and industrial Internet of Things.
Speech Title: Comprehensive Knowledge Integration for Multivariate Time Series Anomaly Detection with Multi-view Learning
Abstract: Anomaly detection in the Industrial Internet of Things (IIoT) is a
challenging task that hinges on the effective learning of multivariate time series
representations. Despite the intricate spatial and temporal relationships inherent
in IIoT systems, existing methods primarily extract features from a single
domain—either temporal or spatial (sensor-wise)—or simply combine the two
sequentially, limiting their anomaly detection capabilities. To address these
limitations, this talk introduces the Spatial-Temporal Association Discrepancy
(STAD) component, which leverages the discrepancies between spatial and temporal
features to enhance latent representation learning. Specifically, we propose the
Skip-Patching Spatial-Temporal Anomaly Detection (SSAD) framework, which integrates
spatial and temporal features in a diverse and comprehensive manner, significantly
improving learning processes. Furthermore, we present a novel framework called
Two-Views Pre-train Anomaly Detection (2ViewsAD), designed to enhance both the
generalization and robustness of learned representations. The SSAD framework
demonstrates superior performance, validating the effectiveness of combining
skip-patching techniques with spatial-temporal features to improve anomaly detection
in IIoT systems. Meanwhile, 2ViewsAD utilizes self-supervised learning during
pre-training, effectively capturing both temporal and spatial (sensor-wise)
features. This dual-view strategy enables the model to seamlessly integrate insights
from both perspectives, further boosting detection capabilities. Experimental
results confirm that 2ViewsAD achieves state-of-the-art anomaly detection
performance.
Asst. Prof. Quanming Yao
Tsinghua University, China
Biography: Dr. Quanming Yao is currently a tenure-track assistant professor at the
Department of Electronic Engineering, Tsinghua University. He previously worked at
4Paradigm Inc., progressing from researcher to senior scientist, where he set up and led
the company's machine learning research team. He obtained his Ph.D. degree at the
Department of Computer Science and Engineering of Hong Kong University of Science and
Technology (HKUST). He has published 80+ top conference and journal papers, with more
than 10000 citations. He regularly serves as an area chair for ICML, NeurIPS and ICLR.
He is also a recipient of the National Youth Talent Plan (China), inaugural winner of
the Ant Intech Prize, Forbes 30 Under 30 (China), Young Scientist Awards (Hong Kong
Institution of Science), and Google Fellowship (in machine learning).
Speech Title: Parsimony Learning from Deep Networks
Abstract: The scaling law, which involves the brute-force expansion
of training
datasets and learnable parameters, has become a prevalent strategy for developing
more robust learning models. However, due to bottlenecks in data, computation, and
trust, the sustainability of the scaling law is a serious concern for the future of
deep learning. In this paper, we address this issue by developing next-generation
models in a parsimonious manner (i.e., achieving greater potential with simpler
models). The key is to drive models using domain-specific knowledge, such as
symbols, logic, and formulas, instead of relying on the scaling law. This approach
allows us to build a framework that uses this knowledge as “building blocks” to
achieve parsimony in model design, training, and interpretation. Empirical results
show that our methods surpass those that typically follow the scaling law. We also
demonstrate the application of our framework in AI for science, specifically in the
problem of drug-drug interaction prediction. We hope our research can foster more
diverse technical roadmaps in the era of foundation models.
Prof. Xindong You
Beijing Information Science & Technology University, China
Biography: Xindong You is a Professor at Beijing Information Science and Technology
University. She is a member of both the Natural Language Processing Professional
Committee and the Information Storage Committee of the China Computer
Federation (CCF). She has presided over nearly 20 research projects, including the National
Natural Science Foundation of China, the National Defense Basic Strengthening
Research Program, the Beijing Natural Science Foundation General Program, the
Equipment Pre-research Key Laboratory Fund Project, industry-commissioned horizontal
projects, the Zhejiang Provincial Natural Science Foundation, and the China
Postdoctoral Science Foundation General Program. She has also participated as a key
member in over 10 research projects, such as the 973 National Key Research and
Development Program, the Ministry of Science and Technology's Support Program, the
National Natural Science Foundation of China, Zhejiang Provincial Major Special
Projects, Zhejiang Provincial Natural Science Foundation Projects, and the
Humanities and Social Sciences Research Program funded by the Ministry of Education.
She has published more than 30 papers in domestic and international journals as the
first author or corresponding author, with three papers included in the TOP journals
of the first quartile in the Chinese Academy of Sciences journal ranking. Additionally,
she has authored one academic monograph independently, which was published with
support from the China Postdoctoral Excellent Academic Monograph Publication Fund by
Science Press.
Speech Title: Exploration of Key Technologies and Field Applications of Knowledge Graphs
Abstract: The technical system of symbolic knowledge graphs serves as an effective
complement to large models, providing support for accurate domain knowledge and
complex reasoning capabilities for the industrial implementation of large models.
The combination of domain-specific large models and domain-specific knowledge graphs
can become an important means for the application of artificial intelligence in
various fields. This report mainly discusses the past, present, and future
development trends of knowledge graphs, the key technologies for constructing
knowledge graphs and their main application scenarios, as well as the research
group's applications in the fields of weaponry and equipment, coal mine
electromechanical equipment, interpretability of image classification, and the
entire Mini/Micro LED industry chain. It closes with future prospects for the
application of knowledge graphs.
Prof. Zhongfei (Mark) Zhang
State University of New York (SUNY) at Binghamton, USA
IEEE Fellow, IAPR Fellow, AAIA Fellow
Biography: Zhongfei (Mark) Zhang is a professor at the School of Computing,
Binghamton University, State University of New York (SUNY), USA. He received a B.S.
in Electronics Engineering (with Honors), an M.S. in Information Sciences, both from
Zhejiang University, China, and a PhD in Computer Science from the University of
Massachusetts at Amherst, USA. His research interests are in the broad areas of
machine learning, data mining, computer vision, and pattern recognition, and
specifically focus on multimedia/multimodal data understanding and mining. He was on
the faculty of Computer Science and Engineering at the University at Buffalo, SUNY,
before he joined the faculty of the School of Computing at Binghamton University,
SUNY. He is the author or co-author of the very first monographs on multimedia data
mining and on relational data clustering, respectively. He has published over 200
papers in the premier venues in his areas. He holds more than thirty inventions and has
served as a member of the organizing committees of several premier international
conferences in his areas, including as general co-chair and lead program chair, and as
an editorial board member for several international journals. He served as a French
CNRS Chair Professor of Computer Science at the University of Lille 1 in France, a
JSPS Fellow and visiting professor at Waseda University and Chuo University,
Japan, and a QiuShi Chair Professor at Zhejiang University, China, and held visiting
professorships at many universities and research labs around the world while he was on
leave from Binghamton University years ago. He has received many honors, including the
SUNY Chancellor’s Award for Scholarship and Creative Activities, the SUNY Chancellor’s
Promising Inventor Award, and best paper awards from several premier conferences in
his areas. He is a Fellow of IEEE, IAPR, and AAIA.
Speech Title: Uncertainty Analysis for Out-of-distribution Detection
Abstract: One significant obstacle to deploying deep neural network (DNN) models in
real-world applications is that deep learning systems often break down in novel
situations which were never seen during the training of the system. This is related
to the out-of-distribution detection problem in the literature. Specifically, DNNs
tend to yield unreliable predictive estimates and make high-confidence yet incorrect
predictions when exposed to inputs drawn from unfamiliar distributions.
Consequently, accurate predictive uncertainty analysis of DNNs is critical in many
high-stakes applications such as medical diagnosis, self-driving vehicles, and
financial decision-making, where silent mistakes can lead to catastrophic
consequences. In this talk, I will first introduce the uncertainty analysis issue
through a novel uncertainty factorization model as the theoretical foundation for
this study. Based on this model, I will then introduce a general and flexible
framework for predictive uncertainty estimation with promising evaluation results in
several out-of-distribution detection tasks on both vision and language datasets.
Prof. Dongyan Zhao
Wangxuan Institute of Computer Technology
Peking University, China
Biography: Dongyan Zhao is a professor with the Wangxuan Institute of Computer
Technology (WICT), Peking University (PKU), China. He received the BS, MS, and PhD
degrees in computer science from the Department of Computer Science and Technology,
PKU. His major research interests include natural language processing, semantic
data management, and knowledge-based intelligent systems.
Speech Title: Intelligent Question Answering Based on Large-Scale Language Models
Asst. Prof. Lei Lu
King’s College London & University of Oxford, UK
Biography: Dr. Lei Lu is an Assistant Professor at King’s College London, and a
Visiting Research Fellow at University of Oxford. Prior to this, he was a Senior
Research Associate at the Institute of Biomedical Engineering, University of Oxford.
Dr. Lu’s work focuses on clinical machine learning and computational informatics for
healthcare applications. This involves developing multimodal AI and generative models
for medical diagnosis, patient phenotyping, health prediction, and biomarker
identification. He contributes to the academic community by serving as a conference
session chair and workshop committee member for IJCAI, CIKM, and ICRA. His papers have
been published in IEEE TPAMI, TCYB, JBHI, TBME, and EHJ-DH. He received the IET J.A.
Lodge Award in 2021, which is presented annually to one early-career researcher of
distinction in the UK and abroad.
Speech Title: Deep Learning for Advancing Cardiovascular Healthcare
Abstract: Electrocardiogram (ECG) is widely considered the primary test for
evaluating cardiovascular diseases. However, the use of AI models to advance these
medical practices and learn new clinical insights from ECGs remains largely
unexplored. Utilising a data set of 2.3 million ECGs collected from patients with 7
years of follow-up, we developed a DNN model with state-of-the-art granularity for the
interpretable diagnosis of cardiac abnormalities, gender identification, and
hypertension screening solely from ECGs, which are then used to stratify the risk of
mortality. Our model demonstrated cardiologist-level accuracy in interpretable
cardiac diagnosis, and the potential to facilitate clinical knowledge discovery for
gender and hypertension detection, which are not readily available. In addition, we
explored the design of optimal DNN models through a novel Neural Architecture
Search (NAS) approach, which was able to find networks that outperformed the
state-of-the-art models with fewer than 5% of the parameters.
Asst. Prof. Liangqiong Qu
The University of Hong Kong, Hong Kong S.A.R., China
Biography: Dr. Liangqiong Qu is an Assistant Professor in the Department of
Statistics and Actuarial Science and the Institute of Data Science, The University
of Hong Kong. Previously, she was a postdoctoral research fellow at Stanford
University, working with Prof. Daniel Rubin. Before joining Stanford, she was a
postdoctoral research fellow at The University of North Carolina at Chapel Hill,
working with Prof. Dinggang Shen. She obtained her joint Ph.D. degree from the University
of Chinese Academy of Sciences and City University of Hong Kong under the
supervision of Prof. Yandong Tang, Prof. Qingxiong Yang, and Prof. Rynson W.H. Lau.
Her research interests span the areas of artificial intelligence, computer vision, and
medical image processing. More information about Dr. Qu can be found at her
personal website: https://liangqiong.github.io/.
Speech Title: Advancing Federated Learning via Heterogeneity Evaluation, Optimization, and Privacy Preservation
Abstract: Federated Learning (FL) offers a promising solution for training robust
deep learning models on large and representative data without sharing it across
institutions. Nonetheless, the widespread adoption of FL in healthcare is hindered
by two key challenges: (1) The lack of federated learning methods robust to data,
device, and state variabilities across sites. Existing approaches for addressing
device and state heterogeneities are often evaluated in simulated FL environments,
raising concerns about their real-world performance. Additionally, assessing a new
FL device/state optimization method’s ability to adapt to varying degrees of such
heterogeneity is challenging due to the lack of diverse real-world datasets and
quantification metrics. (2) Potential privacy leakage risks through shared model
weights and the absence of intuitive tools for securely executing FL algorithms.
While advanced privacy-preserving FL techniques exist, they usually involve
considerable trade-offs between accuracy and utility.
In this talk, we will illustrate how we address the foregoing challenges by
establishing a practical and versatile FL platform that integrates real-world
evaluation benchmarks, heterogeneous optimization methods, and privacy protection
strategies.
Assoc. Prof. Yanan Sui
Tsinghua University, China
Biography: Yanan Sui (YananSui.com), Associate Professor at Tsinghua University, is
dedicated to the research of human neuro-musculo-skeletal modeling and control, with
applications in embodied intelligence and brain-machine interaction. He received his
B.S. from Tsinghua University, his Ph.D. from Caltech, and did postdoctoral work at
Caltech and Stanford University. His work on safe optimization has been included in
textbooks at Stanford and other universities. He co-won the Best Conference Paper
Award and the Best Paper Award on Human-Robot Interaction at the 2020 International
Conference on Robotics and Automation. His work has been successfully applied to the
clinical treatment of neural injuries in China and the United States. He has served
as area chair for leading AI conferences. For his contribution to the
interdisciplinary field of artificial intelligence and neural engineering, he was
selected as one of MIT Technology Review's Innovators Under 35 in China.
Speech Title: Self Model for Embodied Intelligence
Abstract: Modeling and control of the human musculoskeletal system is important for
understanding human motor function, developing embodied intelligence, and optimizing
human-robot interaction systems. However, current models are restricted to a limited
range of body parts and often include a reduced number of muscles. There is also a lack
of algorithms capable of controlling over 600 muscles to generate reasonable human
movements. To fill this gap, we build a musculoskeletal model with 90 body segments,
206 joints, and ~700 muscle-tendon units, allowing simulation of whole-body dynamics
and interaction with various devices. We develop a new algorithm using
low-dimensional representation and hierarchical deep reinforcement learning to
achieve state-of-the-art whole-body control. We validate the effectiveness of our
model and algorithm in simulations using real human locomotion data. This work
promotes a deeper understanding of human motion control and better design of
interactive robots.