AIGC 2024 Conference Overview

The 2nd International Conference on AI-Generated Content (AIGC 2024) was held on December 21–22, 2024, in Beijing, China. The event brought together over 600 participants and fostered dynamic discussions and meaningful collaborations. It served as a platform for exchanging cutting-edge research and forward-looking insights, helping to shape the future of AI-generated content. The success of AIGC 2024 highlighted the conference's growing importance in the global AI landscape. By bridging the gap between academia and industry, the conference catalyzed collaborations that continue to push the boundaries of generative AI. As the field rapidly evolves, AIGC remains at the forefront, driving innovation and shaping the next era of AI-generated content.

Photo gallery of AIGC 2024

AIGC 2024 Publication History

Following the conference, the proceedings of AIGC 2024 were published by SPIE (Proceedings Volume 13649). Indexed by prestigious databases such as EI and Scopus, this volume serves as a valuable collection of the latest research on AI-generated content. By offering a platform for scholars and practitioners to share their work, AIGC 2024 made a significant contribution to advancing academic discussions in artificial intelligence and content creation.

Proceedings Volume 13649

International Conference on AI-Generated Content (AIGC 2024)

Feng Zhao, Duoqian Miao

View the digital version of this volume at SPIE Digital Library.


Plenary Speakers

Prof. Jian Sun
Xi’an Jiaotong University, China

Biography: Jian Sun is a Professor at Xi'an Jiaotong University, where he completed his Ph.D. in Applied Mathematics. His career includes roles as a visiting student at Microsoft Research Asia (Nov. 2005 - March 2008), a postdoctoral researcher at the University of Central Florida (Aug. 2009 - April 2010), and with the Willow team at École Normale Supérieure de Paris / INRIA (Sept. 2012 - Aug. 2014). He serves on the editorial board of the International Journal of Computer Vision (IJCV) and has been an area chair for major conferences such as ICCV, ECCV, and MICCAI. Dr. Sun is a recipient of the National Science Fund for Distinguished Young Scholars in China. His current research focuses on machine learning methods, including generalizable and explainable machine learning, optimal transport, AI applications in mathematics, as well as computer vision and medical image analysis.
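Speech Title: Mathematical and Statistical Foundations of Generative Artificial Intelligence
Abstract: Generative artificial intelligence is an important direction in the current development of artificial general intelligence. It designs AI algorithms that learn the distributions of multimodal, high-dimensional, complex samples and generate new samples, and it forms the methodological basis for current AI applications such as automatic question answering, cross-modal generation, and AI for science. The underlying foundations of generative AI are mathematics and statistics. This talk introduces the background of generative AI, its basic mathematical and statistical principles, and the main challenges it faces. It then presents controllable/conditional generative AI methods built on optimal transport theory and methods, along with their applications to natural images, medical imaging, and other domains. Finally, it summarizes and looks ahead to the future development of generative AI.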

Prof. Guoyin Wang
President of Chongqing Normal University, China
IRSS/I2CICC/CAAI/CCF Fellow, IEEE SM
Vice-President of CAAI

Biography: Guoyin Wang received the B.S., M.S., and Ph.D. degrees from Xi’an Jiaotong University, Xi’an, China, in 1992, 1994, and 1996, respectively. He was a visiting scholar at the University of North Texas and the University of Regina, Canada, during 1998-1999. He worked at the Chongqing University of Posts and Telecommunications during 1996-2024, where he was a professor, the Vice-President of the University, the director of the Chongqing Key Laboratory of Computational Intelligence, the director of the Key Laboratory of Cyberspace Big Data Intelligent Security of the Ministry of Education, the director of Tourism Multi-source Data Perception and Decision Technology of the Ministry of Culture and Tourism, and the director of the Sichuan-Chongqing Joint Key Laboratory of Digital Economy Intelligence and Security. He was the director of the Institute of Electronic Information Technology, Chongqing Institute of Green and Intelligent Technology, CAS, China, during 2011-2017. He has been serving as the President of Chongqing Normal University since June 2024. He is the author of over 10 books, the editor of dozens of proceedings of international and national conferences, and has more than 300 reviewed research publications. His research interests include rough sets, granular computing, machine learning, knowledge technology, data mining, neural networks, and cognitive computing. Dr. Wang was the President of the International Rough Set Society (IRSS) during 2014-2017 and a council member of the China Computer Federation (CCF) during 2008-2023. He is currently a Vice-President of the Chinese Association for Artificial Intelligence (CAAI) and the President of the Chongqing Association for Artificial Intelligence (CQAAI). He is a Fellow of IRSS, I2CICC, CAAI, and CCF.
Speech Title: Brain Cognition Inspired Artificial Intelligence
Abstract: With the synergy of big data, big computing power, and large models, artificial intelligence (AI) has made breakthrough progress in recent years, surpassing some key human intelligence abilities such as visual, auditory, decision, and language intelligence. However, AI systems surpass certain human intelligence abilities only in an overall statistical sense; they are not a true realization of these human abilities and behaviors. This talk reviews the role of cognitive science in inspiring the development of the three mainstream academic branches of AI based on Marr’s three-layer framework, and explores and analyzes the limitations of current AI development. It then proposes future research directions in brain-inspired AI and the scientific issues they need to address.

Prof. Xingwei Wang
Vice President of Northeastern University, China
CCF Fellow

Biography: Xingwei Wang is a distinguished professor and doctoral supervisor, a Fellow and director of CCF, and vice president of Northeastern University. He has received numerous national grants and honors, including the National Outstanding Youth Science Foundation of China, the Special Government Allowance from the State Council, and the Program for New Century Excellent Talents of the Ministry of Education. Prof. Wang is a member of the National Graduate Education Steering Committee for Professional Engineering Degree and Deputy Director of both the Network and Data Communications Committee and the Technical Committee on Internet at the China Computer Federation. He is also a Director and Fellow of the China Institute of Communications and a member of its fellow selection committee, an expert committee member of the China Education and Research Network (CERNET), and Vice Chairman of the Liaoning Internet Society. He is an editorial board member of the Chinese Journal of Computers, the Journal of Software, and the Journal of Computer Research and Development, and he is listed in the Elsevier Highly Cited Chinese Researchers ranking. He leads a Liaoning Provincial Innovation Team and is Director of the Liaoning Provincial Key Laboratory of Intelligent Internet Theory and Applications. His primary research interests are the Internet, cloud computing, and cyberspace security.
To date, he has received 2 second prizes for national scientific and technological progress, 2 first prizes for scientific and technological progress from the Ministry of Education, 1 first prize for scientific and technological progress from the China Institute of Communications, 1 second prize for technical invention from the Ministry of Education, 1 second prize for technical invention from Liaoning Province, and 1 second prize for natural science from Hunan Province. He has published over a hundred papers in academic journals such as IEEE Transactions and at academic conferences such as IEEE ICDCS, including over a hundred papers indexed in SCI, as well as nine monographs. He has also been granted twenty-seven national invention patents and received twenty awards at national and provincial levels for talent cultivation.

Prof. Yue Zhang
Westlake University, China

Biography: Yue Zhang is a tenured Professor at Westlake University. His research interests include NLP and its underlying machine learning algorithms. His major contributions to the field include psycholinguistically motivated machine learning algorithms, learning-guided beam search for structured prediction, pioneering neural NLP models including graph LSTM, and OOD generalization for NLP. He authored the Cambridge University Press book "Natural Language Processing: A Machine Learning Perspective". He was the PC co-chair for CCL 2020 and EMNLP 2022, and is an action editor for Transactions of the ACL. He also served as associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), IEEE Transactions on Big Data (TBD), and Computer Speech and Language (CSL). He won the best paper awards of IALP 2017 and COLING 2018, a best paper honorable mention at SemEval 2020, and best paper nominations for ACL 2018 and ACL 2023.
Speech Title: LLM reasoning and generalization
Abstract: In this talk, I will discuss linguistic reasoning and the formal logic reasoning capabilities of large language models (LLMs). I will discuss the difficulty of learning formal reasoning from empirical risk minimization and offer a perspective on this problem from causal learning theory. I will discuss causal features and confounders, and show how learning confounders can lead to low out-of-distribution generalization performance. Then I will discuss two general methods to address the issue, a data-centric method and a model-centric method, introducing several recent works using both methods.

Keynote Speakers & Invited Speakers

Asst. Prof. Bo Han
Hong Kong Baptist University

Biography: Bo Han is currently an Assistant Professor in Machine Learning and a Director of the Trustworthy Machine Learning and Reasoning Group at Hong Kong Baptist University, and a BAIHO Visiting Scientist with the Imperfect Information Learning Team at the RIKEN Center for Advanced Intelligence Project (RIKEN AIP), where his research focuses on machine learning, deep learning, foundation models, and their applications. He was a Visiting Research Scholar at MBZUAI MLD (2024), a Visiting Faculty Researcher at Microsoft Research (2022) and Alibaba DAMO Academy (2021), and a Postdoc Fellow at RIKEN AIP (2019-2020). He received his Ph.D. degree in Computer Science from the University of Technology Sydney (2015-2019). He has served as a Senior Area Chair of NeurIPS and an Area Chair of NeurIPS, ICML, and ICLR. He has also served as an Associate Editor of IEEE TPAMI, MLJ, and JAIR, and an Editorial Board Member of JMLR and MLJ. He received the Outstanding Paper Award at NeurIPS, Most Influential Paper at NeurIPS, Outstanding Student Paper Award at a NeurIPS Workshop, Notable Area Chair at NeurIPS, Outstanding Area Chair at ICLR, and Outstanding Associate Editor at IEEE TNNLS.
Speech Title: Exploring Trustworthy Foundation Models under Imperfect Data
Abstract: In the current landscape of machine learning, it is crucial to build trustworthy foundation models that can operate under imperfect conditions, since most real-world data are noisy, containing unexpected inputs, image artifacts, and adversarial inputs. These models need human-like capabilities to learn and reason under uncertainty. In this talk, I will focus on three recent research advancements, each shedding light on reliability, robustness, and safety in this field. Specifically, reliability will be explored through the enhancement of vision-language models by introducing negative labels, which effectively detect out-of-distribution samples. Robustness will be explored through our investigation into image interpolation using diffusion models, addressing the challenge of information loss to ensure the consistency and quality of generated content. Finally, safety will be highlighted by our study on hypnotizing large language models, DeepInception, which creates a novel nested scenario to induce adaptive jailbreak behaviors, revealing vulnerabilities during interactive model engagement.

Prof. Songlin Hu
Institute of Information Engineering (IIE), the Chinese Academy of Sciences, China

Biography: Songlin Hu is a full professor at the Institute of Information Engineering (IIE), the Chinese Academy of Sciences. He is also a joint professor at the University of Chinese Academy of Sciences. His research areas include big data, natural language processing, and knowledge graphs. He has published more than 100 papers in reputed conferences and journals such as ACL, AAAI, IJCAI, EMNLP, SIGMOD, VLDB, ICDE, and ACM/IEEE Transactions.
Speech Title: Security Governance of Large Models

Assoc. Prof. Gao Huang
Tsinghua University, China

Biography: Gao Huang is an Associate Professor affiliated with the Department of Automation at Tsinghua University. He obtained the PhD degree in machine learning from Tsinghua in 2015, and spent three years at Cornell University as a postdoc. His research interests lie in machine learning and computer vision. In particular, he is actively working on efficient deep learning, dynamic neural networks, learning with limited data and reinforcement learning. His work on DenseNet won the Best Paper Award of CVPR (2017). He has collected more than 70,000 citations according to Google Scholar.
Speech Title: Transformer Foundation Architectures for Long Sequences

Prof. Noor Zaman Jhanjhi
Taylor's University, Malaysia

Biography: Professor Dr. Noor Zaman Jhanjhi, often referred to as N.Z. Jhanjhi, is a Professor of Computer Science specializing in Cybersecurity and Artificial Intelligence. He currently serves as the Program Director for Postgraduate Research Degree Programmes in Computer Science and Director of the Center for Smart Society (CSS5) at Taylor’s University, Malaysia. Recognized as one of the world’s top 2% research scientists in consecutive years, 2022 and 2023, he is regarded as one of Malaysia's top three computer science researchers. He was honoured as an Outstanding Faculty Member by MDEC Malaysia in 2022. Prof. Jhanjhi has a prolific publication record with numerous highly indexed works in WoS/ISI/SCI/SCIE/Scopus, accumulating a collective research impact factor exceeding 1000 points. His Google Scholar H-index stands at 65, with an i10-index approaching 291, and his Scopus H-index is 47. He has over 600 publications to his credit, including several international patents in Australia, Germany, the UK, and Japan. An accomplished editor and author, he has curated over 50 research books published by Springer, IGI Global USA, Taylor & Francis, IET, Elsevier, Wiley, Bentham, and Intech Open. Prof. Jhanjhi has mentored postgraduate scholars extensively, with over 38 scholars graduating under his supervision. He also serves as Associate Editor and Editorial Assistant Board member for reputable journals and has received accolades such as the Outstanding Associate Editor award for IEEE ACCESS. Recognized as a top-tier reviewer by Publons (Web of Science), Prof. Jhanjhi has evaluated over 60 theses as an external Ph.D./Master thesis examiner for universities worldwide. His academic accreditation experience spans 10 years and includes accreditation bodies such as ABET, NCAAA, and NCEAC. Prof. Jhanjhi's research interests include Cybersecurity, AI, IoT Security, Wireless Security, Data Science, Software Engineering, and Unmanned Aerial Vehicles (UAVs). He has been invited as a keynote speaker at over 60 international conferences and has chaired numerous international conference sessions.
Speech Title: Cybersecurity Issues and Challenges in the Era of Generative AI

Asst. Prof. Hongyang Li
The University of Hong Kong, China

Biography: Professor Li is an Assistant Professor in HKU Musketeers Foundation Institute of Data Science and Research Scientist at OpenDriveLab, Shanghai AI Lab. His research focus is on autonomous driving and embodied AI. He proposed the bird’s-eye-view perception work, BEVFormer, that won Top 100 AI Papers in 2022 and was explicitly recognized by Jensen Huang, CEO of NVIDIA and Prof. Shashua, CEO of Mobileye at public keynotes. He served as Area Chair for CVPR 2023, 2024, NeurIPS 2023 (Notable AC), 2024, ACM MM 2024, ICLR 2025, referee for Nature Communications. He will serve as Workshop Chair for CVPR 2026. He is the Working Group Chair for IEEE Standards under Vehicular Technology Society and Senior Member of IEEE.
Speech Title: Achilles' Heel in Manipulation: Key Recipe and Missing Pieces towards Intelligent Embodied AI
Abstract: The increasing demand for versatile robotic systems to operate in diverse and dynamic environments has emphasized the importance of a generalist policy, which leverages a large cross-embodiment data corpus to facilitate broad adaptability and high-level reasoning. However, the generalist struggles with inefficient inference and costly training. The specialist policy, instead, is curated for specific domain data and excels at task-level precision with efficiency. Yet, it lacks the generalization capacity for a wide range of applications. Inspired by these observations, we introduce RoboDual, a synergistic dual-system that combines the merits of both generalist and specialist policies. A diffusion transformer-based specialist is devised for multi-step action rollouts, conditioned on the high-level task understanding and discretized action output of a vision-language-action (VLA) based generalist. Compared to OpenVLA, RoboDual achieves a 12% improvement on CALVIN and 26.7% in real-world settings by adapting the specialist policy with only 20M trainable parameters. It maintains strong performance with merely 5% of demonstration data, and enables a 3.8× higher control frequency in real-world deployment. Code and models will be made publicly available.

Prof. Bing Liu
University of Illinois Chicago (UIC)
ACM/AAAI/IEEE Fellow

Biography: Bing Liu is a Distinguished Professor and the Peter L. and Deborah K. Wexler Professor of Computing at the University of Illinois Chicago. He earned his Ph.D. in Artificial Intelligence from the University of Edinburgh. His research interests span continual/lifelong learning, lifelong learning dialogue systems, machine learning, and natural language processing. Professor Liu has published extensively in top conferences and journals and authored five books, including two focused on lifelong/continual learning. He has received three Test-of-Time paper awards, one Test-of-Time honorable mention, and some of his work has been widely featured in international media and tech press. He served as Chair of ACM SIGKDD from 2013 to 2017 and as a program chair for numerous leading data mining conferences. Currently, he serves as a program co-chair for the 2025 Conference on Lifelong Learning Agents (CoLLAs-2025). Among his many honors, Professor Liu is the 2018 recipient of the ACM SIGKDD Innovation Award and is a Fellow of ACM, AAAI, and IEEE.
Speech Title: Continual Learning Using Large Language Models
Abstract: The ability to continually learn and accumulate knowledge over a lifetime is a hallmark of human intelligence. It is also essential for AI agents. However, the prevailing machine learning paradigm lacks this crucial capability. This talk introduces the concept of continual learning, outlining its different settings, and then delves into using large language models (LLMs) for continual learning, which notably boosts accuracy. Following this, it presents some recent work on using in-context learning as a strategy for continual learning, which further enhances accuracy and adaptability.

Assoc. Prof. Rui Gao
Shanghai Jiao Tong University, China

Biography: Rui Gao is an Associate Professor in the Department of Naval Architecture and Ocean Engineering at Shanghai Jiao Tong University. She earned her Doctor of Science (Tech.) degree in Automation, Systems and Control Engineering from Aalto University, Finland, in 2020, and conducted postdoctoral research at the University of Cambridge, UK, in 2021. She has been selected for the Shanghai Overseas High-Level Leading Talents Program and the Pujiang Talents Program. Her research focuses on the innovative application of artificial intelligence algorithms in marine unmanned systems.
Speech Title: Large Language Model Driven Adaptive Path Planning Method for Unmanned Surface Vehicle Swarm
Abstract: To address the issues of poor adaptability and insufficient robustness in Unmanned Surface Vehicle (USV) swarm path planning methods within complex environments, we propose an LLM-driven Adaptive Path Planning with Tool-function Chains (APPT). The proposed method utilizes a planning encoder to assist the Large Language Model (LLM) in parsing the features of environmental obstacles. Combined with prompt engineering, it constructs an agent for USV swarm path planning, enabling the dynamic combination and optimization of classical path planning algorithms such as A*, RRT, APF, and DWA. Additionally, a similarity calculation strategy is employed to achieve intelligent matching of tool chains, allowing flexible adaptation to diverse task requirements and complex obstacle environments, while also supporting user-guided adaptive iterative optimization. Experimental results show that the APPT method achieves an average accuracy of 89.7% in effective tool selection across multiple scenarios. It also possesses the capability of iterative optimization based on demands, reducing the total path length by 14.55%. The APPT method fully leverages the reasoning and analytical advantages of LLMs, significantly enhancing the intelligent decision-making ability of USV swarms. It provides a solution for path planning in complex environments that integrates theoretical innovation and engineering practicality.

Assoc. Prof. Liang Pang
CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, China

Biography: Liang Pang is an associate researcher at the CAS Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, and a visiting scholar at the National University of Singapore. He specializes in natural language generation and information retrieval. He has published over 60 papers at international conferences and has accumulated more than 3000 citations on Google Scholar. Pang serves as a program committee member for international conferences, a reviewer for academic journals, a standing committee member of the Information Retrieval Special Committee of the Chinese Information Processing Society, the deputy director of the Youth Working Committee of the Chinese Information Processing Society, and a member of the Youth Innovation Promotion Association of the Chinese Academy of Sciences. He has been honored with the Outstanding Doctoral Dissertation Award from the Chinese Information Processing Society, the Best Paper Runner-up Award at CIKM, and the Best Paper Honorable Mention Award at SIGIR. His proposed deep text matching model achieved a global ranking of fourth in the Kaggle QQP Text Matching competition. He was the global champion in reinforcement learning at the NeurIPS 2018 Multi-Agent Challenge. His team topped the global leaderboard in the multi-hop open-domain question answering challenge HotpotQA.
Speech Title: Retrieval-Augmented Large Models: Frontier Techniques and Societal Impact
Abstract: In recent years, the retrieval-augmented large model paradigm has effectively improved the accuracy and trustworthiness of content generated by large language models. Following the retrieval-augmented pipeline, we can discuss it from four perspectives. From the perspective of the information retrieval module, building a retrieval module suited to large models helps the model more efficiently select information that is useful for generation. From the perspective of the large language model module, teaching the model to use external information helps avoid the influence of noisy retrieved information on generation. From the perspective of inter-module interaction, designing mechanisms by which the retrieval module and the language model module cooperate helps fully integrate internal parametric knowledge with external corpus knowledge. Finally, from the perspective of the information loop, we discuss the potential impact of AI-generated content on the content ecosystem of information retrieval.

Dr. Hoifung Poon
General Manager, Health Futures
Microsoft Research

Biography: Hoifung Poon is the General Manager at Health Futures in Microsoft Research and an affiliated faculty at the University of Washington Medical School. He leads biomedical AI research and incubation, with the overarching goal of structuring medical data to optimize delivery and accelerate discovery for precision health. His team and collaborators are among the first to explore large language models (LLMs) and multimodal generative AI in health applications, producing popular open-source foundation models such as PubMedBERT, BioGPT, BiomedCLIP, LLaVA-Med, BiomedParse. His latest publication in Nature features GigaPath, the first whole-slide digital pathology foundation model pretrained on over 1 billion pathology image tiles. He has led successful research partnerships with large health providers and life science companies, creating AI systems in daily use for applications such as molecular tumor board and clinical trial matching. He has given tutorials on these topics at top AI conferences such as ACL, AAAI, and KDD, and his prior work has been recognized with Best Paper Awards from premier AI venues such as NAACL, EMNLP, and UAI. He received his PhD in Computer Science and Engineering from the University of Washington, specializing in machine learning and NLP.
Speech Title: Advancing Health at the Speed of AI
Abstract: The dream of precision health is to develop a data-driven, continuous learning system where new health information is instantly incorporated to optimize care delivery and accelerate biomedical discovery. The confluence of technological advances and social policies has led to rapid digitization of multimodal, longitudinal patient journeys, such as electronic medical records (EMRs), imaging, and multiomics. Our overarching research agenda lies in advancing multimodal generative AI for precision health, where we harness real-world data to pretrain powerful multimodal patient embedding, which can serve as digital twins for patients. This enables us to synthesize multimodal, longitudinal information for millions of cancer patients, and apply the population-scale real-world evidence to advancing precision oncology in deep partnerships with real-world stakeholders such as large health systems and pharmaceutical companies.

Dr. Xian Wu
Director of Tencent Youtu Lab Jarvis Research Center

Biography: Xian Wu received the PhD degree from Shanghai Jiao Tong University. He is now a principal researcher with Tencent. Before joining Tencent, he worked as a senior scientist manager and a staff researcher with Microsoft and IBM Research. His research interests include medical AI, natural language processing and multi-modal modeling. He has published papers in CVPR, NeurIPS, ACL, WWW, AAAI, IJCAI etc. He also served as PC member of IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, ACM Transactions on Information Systems, ACM Transactions on Intelligent Systems and Technology, CVPR, ICCV, AAAI etc.
Speech Title: From Deep Learning to Large Models: Some Explorations in Medical AI

Dr. Xin Xia
Chief Expert of Software Engineering Application Technology at Huawei, China

Biography: Xin Xia is the Chief Expert of Software Engineering Application Technology at Huawei, China. Before joining Huawei, he was an ARC DECRA Fellow and a lecturer (equivalent to a U.S. assistant professor) at the Faculty of Information Technology, Monash University, Australia. He earned his Ph.D. in June 2014 from the College of Computer Science and Technology, Zhejiang University, China, under the supervision of Prof. Xiaohu Yang and Prof. Jianling Sun. From July 2012 to January 2014, he was a visiting student with Prof. David Lo at Singapore Management University. In 2022, he received the ACM SIGSOFT Early Career Researcher Award. Xin Xia's current research aims to assist developers and testers in improving their productivity by focusing on data science for software engineering. Specifically, he works on mining and analyzing data from software repositories to uncover valuable and actionable insights. His work employs and customizes a variety of structured and unstructured data analytics techniques, such as data mining, information retrieval, natural language processing, search-based algorithms, and program analysis, transforming passive software engineering data into automated tools and novel insights.
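Speech Title: Software Engineering in the Era of Large Models: Progress and Challenges
Abstract: Large models for software engineering have been widely adopted, but they also bring new challenges: how to help large models better understand software engineering practice and knowledge, how to better enable them to produce secure and trustworthy code, and how to evaluate their performance across software engineering capabilities. These challenges call for rethinking the future direction of software engineering in the era of large models. From a practical standpoint, this talk reviews the current challenges of large models for software engineering and explores possible future directions.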

Prof. Jungang Xu
University of Chinese Academy of Sciences, China
Director of Cloud Computing and Intelligent Information Processing Laboratory

Biography: Jungang Xu is a Professor and doctoral supervisor at the University of Chinese Academy of Sciences, Director of the Cloud Computing and Intelligent Information Processing Laboratory, and chief Professor of the Deep Learning Course at the University of Chinese Academy of Sciences. His research interests include multimodal intelligence, intelligent decision and optimization, and embodied intelligence. He is an expert in the National Science and Technology Expert Database, an expert of the Ministry of Industry and Information Technology of China, and an expert of the Beijing Municipal Science and Technology Commission and Administrative Commission of Zhongguancun Science Park. He is an executive member of the Special Committee on Artificial Intelligence and Pattern Recognition, the Special Committee on Natural Language Processing, and the Special Committee on Database of the China Computer Federation, and a standing member of the Special Committee on Intelligent Service of the Chinese Association for Artificial Intelligence. He has presided over a number of scientific research projects funded by the National Key Technology Research and Development Program, the National Natural Science Foundation, the Beijing Science and Technology Plan, and the Beijing Natural Science Foundation, and has published more than 100 articles. He won the second prize of the China Geographic Information Technology Progress Award in 2022.
Speech Title: Development Trends and Applications of Large Models

Assoc. Prof. Cheng Yang
Beijing University of Posts and Telecommunications, China

Biography: Cheng Yang received the BE and PhD degrees from Tsinghua University, in 2014 and 2019, respectively. He is currently an associate professor with the Beijing University of Posts and Telecommunications. His research interests include natural language processing and network representation learning.
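Speech Title: An Efficient Collaboration Framework for Large Language Model Agents
Abstract: Large language models (LLMs) have demonstrated many human-like abilities, such as reasoning, planning, and tool use, and can serve as the brain of an agent to automatically handle a variety of complex tasks. However, whether these LLM agents can learn to communicate and divide labor as humans do, collaborating on tasks faster and better, remains an open question in urgent need of exploration. This talk introduces recent progress in research on LLM agent collaboration and analyzes the emergent cooperative behaviors among agents observed in experiments.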

Prof. Shuanghua Yang
University of Reading, UK
IET Fellow, IEEE Senior Member

Biography: Shuang-Hua Yang is currently a professor and the Head of Department of Computer Science at the University of Reading, the UK and the Director of Shenzhen Key Laboratory of Safety and Security for Next Generation of Industrial Internet, China. He was selected as a member of European Academy of Sciences and Arts in 2024, and awarded DSc from Loughborough University in 2014 to recognize his academic contribution to wireless monitoring research. He is a Fellow of IET and a Fellow of InstMC, U.K. His current research interests include cyber-physical system safety and security, and industrial Internet of Things.
Speech Title: Comprehensive Knowledge Integration for Multivariate Time Series Anomaly Detection with Multi-view learning
Abstract: Anomaly detection in the Industrial Internet of Things (IIoT) is a challenging task that hinges on the effective learning of multivariate time series representations. Despite the intricate spatial and temporal relationships inherent in IIoT systems, existing methods primarily extract features from a single domain—either temporal or spatial (sensor-wise)—or simply combine the two sequentially, limiting their anomaly detection capabilities. To address these limitations, this talk introduces the Spatial-Temporal Association Discrepancy (STAD) component, which leverages the discrepancies between spatial and temporal features to enhance latent representation learning. Specifically, we propose the Skip-Patching Spatial-Temporal Anomaly Detection (SSAD) framework, which integrates spatial and temporal features in a diverse and comprehensive manner, significantly improving learning processes. Furthermore, we present a novel framework called Two-Views Pre-train Anomaly Detection (2ViewsAD), designed to enhance both the generalization and robustness of learned representations. The SSAD framework demonstrates superior performance, validating the effectiveness of combining skip-patching techniques with spatial-temporal features to improve anomaly detection in IIoT systems. Meanwhile, 2ViewsAD utilizes self-supervised learning during pre-training, effectively capturing both temporal and spatial (sensor-wise) features. This dual-view strategy enables the model to seamlessly integrate insights from both perspectives, further boosting detection capabilities. Experimental results confirm that 2ViewsAD achieves state-of-the-art anomaly detection performance.

Asst. Prof. Quanming Yao
Tsinghua University, China

Biography: Dr. Quanming Yao is currently a tenure-track assistant professor in the Department of Electronic Engineering, Tsinghua University. He previously worked at 4Paradigm Inc., first as a researcher and later as a senior scientist, where he set up and led the company's machine learning research team. He obtained his Ph.D. degree from the Department of Computer Science and Engineering of the Hong Kong University of Science and Technology (HKUST). He has published 80+ papers in top conferences and journals, with more than 10,000 citations. He regularly serves as an area chair for ICML, NeurIPS, and ICLR. He is also a recipient of the National Youth Talent Plan (China), an inaugural winner of the Ant Intech Prize, and a recipient of Forbes 30 Under 30 (China), the Young Scientist Awards (Hong Kong Institution of Science), and the Google Fellowship (in machine learning).
Speech Title: Parsimony Learning from Deep Networks
Abstract: The scaling law, which involves the brute-force expansion of training datasets and learnable parameters, has become a prevalent strategy for developing more robust learning models. However, due to bottlenecks in data, computation, and trust, the sustainability of the scaling law is a serious concern for the future of deep learning. In this work, we address this issue by developing next-generation models in a parsimonious manner (i.e., achieving greater potential with simpler models). The key is to drive models using domain-specific knowledge, such as symbols, logic, and formulas, instead of relying on the scaling law. This approach allows us to build a framework that uses this knowledge as “building blocks” to achieve parsimony in model design, training, and interpretation. Empirical results show that our methods surpass those that typically follow the scaling law. We also demonstrate the application of our framework in AI for science, specifically in the problem of drug-drug interaction prediction. We hope our research can foster more diverse technical roadmaps in the era of foundation models.

Prof. Xindong You
Beijing Information Science & Technology University, China

Biography: Xindong You is a Professor at Beijing Information Science and Technology University and a member of both the Natural Language Processing Professional Committee and the Information Storage Committee of the China Computer Federation (CCF). She has presided over nearly 20 research projects, including the National Natural Science Foundation of China, the National Defense Basic Strengthening Research Program, the Beijing Natural Science Foundation General Program, the Equipment Pre-research Key Laboratory Fund Project, industry-commissioned horizontal projects, the Zhejiang Provincial Natural Science Foundation, and the China Postdoctoral Science Foundation General Program. She has also participated as a key member in over 10 research projects, such as the 973 National Key Research and Development Program, the Ministry of Science and Technology's Support Program, the National Natural Science Foundation of China, Zhejiang Provincial Major Special Projects, Zhejiang Provincial Natural Science Foundation Projects, and the Humanities and Social Sciences Research Program funded by the Ministry of Education. She has published more than 30 papers in domestic and international journals as first author or corresponding author, three of which appeared in top journals ranked in the first quartile of the Chinese Academy of Sciences journal ranking. Additionally, she has independently authored one academic monograph, which was published by Science Press with support from the China Postdoctoral Excellent Academic Monograph Publication Fund.
Speech Title: Exploration of Key Technologies and Field Applications of Knowledge Graphs
Abstract: The technical system of symbolic knowledge graphs serves as an effective complement to large models, supplying accurate domain knowledge and complex reasoning capabilities for the industrial implementation of large models. The combination of domain-specific large models and domain-specific knowledge graphs can become an important means for applying artificial intelligence in various fields. This report mainly discusses the past, present, and future development trends of knowledge graphs, the key technologies for constructing knowledge graphs and their main application scenarios, as well as the research group's applications in the fields of weaponry and equipment, coal mine electromechanical equipment, interpretability of image classification, and the entire Mini/Micro LED industry chain. It closes with an outlook on future applications of knowledge graphs.

Prof. Zhongfei (Mark) Zhang
State University of New York (SUNY) at Binghamton, USA
IEEE Fellow, IAPR Fellow, AAIA Fellow

Biography: Zhongfei (Mark) Zhang is a professor at the School of Computing, Binghamton University, State University of New York (SUNY), USA. He received a B.S. in Electronics Engineering (with Honors) and an M.S. in Information Sciences, both from Zhejiang University, China, and a PhD in Computer Science from the University of Massachusetts at Amherst, USA. His research interests are in the broad areas of machine learning, data mining, computer vision, and pattern recognition, with a specific focus on multimedia/multimodal data understanding and mining. He was on the faculty of Computer Science and Engineering at the University at Buffalo, SUNY, before he joined the faculty of the School of Computing at Binghamton University, SUNY. He is the author or co-author of the very first monographs on multimedia data mining and on relational data clustering, respectively. He has published over 200 papers in the premier venues in his areas. He holds more than thirty inventions, has served as a member of the organizing committees of several premier international conferences in his areas, including as general co-chair and lead program chair, and has served as an editorial board member for several international journals. While on leave from Binghamton University, he served as a French CNRS Chair Professor of Computer Science at the University of Lille 1 in France, a JSPS Fellow and visiting professor at Waseda University and Chuo University, Japan, and a QiuShi Chair Professor at Zhejiang University, China, and held visiting professorships at many other universities and research labs around the world. He has received many honors, including the SUNY Chancellor’s Award for Scholarship and Creative Activities, the SUNY Chancellor’s Promising Inventor Award, and best paper awards from several premier conferences in his areas. He is a Fellow of IEEE, IAPR, and AAIA.
Speech Title: Uncertainty Analysis for Out-of-distribution Detection
Abstract: One significant obstacle to deploying deep neural network (DNN) models in real-world applications is that deep learning systems often break down in novel situations that were never seen during training. This is known in the literature as the out-of-distribution detection problem. Specifically, DNNs tend to yield unreliable predictive estimates and make highly confident yet incorrect predictions when exposed to inputs drawn from unfamiliar distributions. Consequently, accurate predictive uncertainty analysis of DNNs is critical in many high-stakes applications such as medical diagnosis, self-driving vehicles, and financial decision-making, where silent mistakes can lead to catastrophic consequences. In this talk, I will first introduce the uncertainty analysis problem through a novel uncertainty factorization model that serves as the theoretical foundation for this study. Based on this model, I will then introduce a general and flexible framework for predictive uncertainty estimation, with promising evaluation results on several out-of-distribution detection tasks across both vision and language datasets.

Prof. Dongyan Zhao
Wangxuan Institute of Computer Technology
Peking University, China

Biography: Dongyan Zhao is a professor with the Wangxuan Institute of Computer Technology (WICT), Peking University (PKU), China. He received the BS, MS, and PhD degrees in computer science from the Department of Computer Science and Technology, PKU. His major research interests include natural language processing, semantic data management, and knowledge-based intelligent systems.
Speech Title: Intelligent Question Answering Based on Large Language Models

Asst. Prof. Lei Lu
King’s College London & University of Oxford, UK

Biography: Dr. Lei Lu is an Assistant Professor at King’s College London and a Visiting Research Fellow at the University of Oxford. Prior to this, he was a Senior Research Associate at the Institute of Biomedical Engineering, University of Oxford. Dr. Lu’s work focuses on clinical machine learning and computational informatics for healthcare applications. This involves developing multimodal AI and generative models for medical diagnosis, patient phenotyping, health prediction, and biomarker identification. He contributes to the academic community by serving as a conference session chair and workshop committee member for IJCAI, CIKM, and ICRA. His papers have been published in IEEE TPAMI, TCYB, JBHI, TBME, and EHJ-DH. He received the IET J.A. Lodge Award in 2021, which is presented annually to one early-career researcher of distinction in the UK and abroad.
Speech Title: Deep Learning for Advancing Cardiovascular Healthcare
Abstract: The electrocardiogram (ECG) is widely considered the primary test for evaluating cardiovascular diseases. However, the use of AI models to advance these medical practices and learn new clinical insights from ECGs remains largely unexplored. Utilising a data set of 2.3 million ECGs collected from patients with 7 years of follow-up, we developed a DNN model with state-of-the-art granularity for the interpretable diagnosis of cardiac abnormalities, gender identification, and hypertension screening solely from ECGs, which are then used to stratify the risk of mortality. Our model demonstrated cardiologist-level accuracy in interpretable cardiac diagnosis, and the potential to facilitate clinical knowledge discovery for gender and hypertension detection, which are not readily available. In addition, we explored the design of optimal DNN models through a novel Neural Architecture Search (NAS) approach, which was able to find networks that outperformed state-of-the-art models with fewer than 5% of the parameters.

Asst. Prof. Liangqiong Qu
The University of Hong Kong, Hong Kong S.A.R., China

Biography: Dr. Liangqiong Qu is an Assistant Professor in the Department of Statistics and Actuarial Science and the Institute of Data Science, The University of Hong Kong. Previously, she was a postdoctoral research fellow at Stanford University, working with Prof. Daniel Rubin. Before joining Stanford, she was a postdoctoral research fellow at The University of North Carolina at Chapel Hill, working with Prof. Dinggang Shen. She obtained her joint Ph.D. degree from the University of Chinese Academy of Sciences and City University of Hong Kong under the supervision of Prof. Yandong Tang, Prof. Qingxiong Yang, and Prof. Rynson W.H. Lau. Her research interests span artificial intelligence, computer vision, and medical image processing. More information about Dr. Qu can be found at her personal website: https://liangqiong.github.io/.
Speech Title: Advancing Federated Learning via Heterogeneity Evaluation, Optimization, and Privacy Preservation
Abstract: Federated Learning (FL) offers a promising solution for training robust deep learning models on large and representative data without sharing it across institutions. Nonetheless, the widespread adoption of FL in healthcare is hindered by two key challenges: (1) The lack of federated learning methods robust to data, device, and state variabilities across sites. Existing approaches for addressing device and state heterogeneities are often evaluated in simulated FL environments, raising concerns about their real-world performance. Additionally, assessing a new FL device/state optimization method’s ability to adapt to varying degrees of such heterogeneity is challenging due to the lack of diverse real-world datasets and quantification metrics. (2) Potential privacy leakage risks through shared model weights and the absence of intuitive tools for securely executing FL algorithms. While advanced privacy-preserving FL techniques exist, they usually involve considerable trade-offs between accuracy and utility. In this talk, we will illustrate how we address the foregoing challenges by establishing a practical and versatile FL platform that integrates real-world evaluation benchmarks, heterogeneous optimization methods, and privacy protection strategies.

Assoc. Prof. Yanan Sui
Tsinghua University, China

Biography: Yanan Sui (YananSui.com), Associate Professor at Tsinghua University, is dedicated to the research of human neuro-musculo-skeletal modeling and control, with applications in embodied intelligence and brain-machine interaction. He received his B.S. from Tsinghua University, his Ph.D. from Caltech, and did postdoctoral work at Caltech and Stanford University. His work on safe optimization has been included in textbooks at Stanford and other universities. He co-won the Best Conference Paper Award and the Best Paper Award on Human-Robot Interaction at the 2020 International Conference on Robotics and Automation. His work has been successfully applied to the clinical treatment of neural injuries in China and the United States. He has served as area chair for leading AI conferences. For his contribution to the interdisciplinary field of artificial intelligence and neural engineering, he was selected as one of MIT Technology Review's Innovators Under 35 in China.
Speech Title: Self Model for Embodied Intelligence
Abstract: Modeling and control of the human musculoskeletal system is important for understanding human motor function, developing embodied intelligence, and optimizing human-robot interaction systems. However, current models are restricted to a limited range of body parts and often with a reduced number of muscles. There is also a lack of algorithms capable of controlling over 600 muscles to generate reasonable human movements. To fill this gap, we build a musculoskeletal model with 90 body segments, 206 joints, and ~700 muscle-tendon units, allowing simulation of whole-body dynamics and interaction with various devices. We develop a new algorithm using low-dimensional representation and hierarchical deep reinforcement learning to achieve state-of-the-art whole-body control. We validate the effectiveness of our model and algorithm in simulations using real human locomotion data. This work promotes a deeper understanding of human motion control and better design of interactive robots.

History of AIGC 2024

International Conference on AI-Generated Content (AIGC 2024)