Keynotes

Representation Learning and Information Retrieval

Yiming Yang
Carnegie Mellon University
Livestream: https://youtube.com/live/1ftmVDwdeO0?feature=share

Abstract: How to best represent words, documents, queries, entities, relations, and other variables in information retrieval (IR) and related applications has been a fundamental research question for decades. Early IR systems relied on independence assumptions about words and documents for simplicity and scalability, which were clearly sub-optimal from a semantic point of view. The rapid development of deep neural networks over the past decade has revolutionized representation learning, producing contextualized word embeddings and graph-enhanced document embeddings and ushering in the new era of dense IR. This talk highlights these impactful shifts in representation learning for IR and related areas, along with the new challenges they raise and possible remedies, including our recent work on large-scale dense IR, graph-based reasoning for knowledge-enhanced prediction, self-refinement of large language models (LLMs) with retrieval-augmented generation (RAG) and iterative feedback, and principle-driven self-alignment of LLMs with minimal human supervision. More generally, the power of such deep learning extends beyond IR enhancements, e.g., to significantly improving state-of-the-art solvers for NP-complete problems in classical computer science.

Bio: Yiming Yang is a professor with a joint appointment at the Language Technologies Institute (LTI) and the Machine Learning Department (MLD) in the School of Computer Science, Carnegie Mellon University (CMU). She joined CMU as a faculty member in 1996, and her research has focused on machine learning paradigms, algorithms, and applications across a broad range, including her influential early work on large-scale text classification and information retrieval, and more recently on cutting-edge technologies for large language models (e.g., XLNet), neural architecture search (e.g., DARTS), reasoning with graph neural networks, reinforcement learning and diffusion models for solving NP-complete problems (e.g., DIMES and DIFUSCO), AI-enhanced self-alignment of LLMs, knowledge-enhanced information retrieval, LLMs with retrieval-augmented generation (RAG), large foundation models for scientific domains, and more. She became a member of the SIGIR Academy in 2023, in recognition of her contributions at the intersection of machine learning and information retrieval.


Towards Steerable AI Systems

Thorsten Joachims
Cornell University
Livestream: https://youtube.com/live/aqGK2LIxVKI?feature=share

Abstract: AI systems ranging from personal assistants to large internet platforms are taking an ever-increasing role in acting on our behalf. This requires that we be able to steer these systems to take actions that match our expectations and that are beneficial in the long run. This talk will explore machine learning methods for steerable control from the perspective of different stakeholders, including end-users as well as the platform. I argue that we need models that can reason about goals at multiple scales, enabling well-founded approaches for trading off short-term and long-term metrics. The talk will illustrate such methods and models using applications ranging from search ranking to writing assistance.

Bio: Thorsten Joachims is a Professor at Cornell University in the Departments of Computer Science and Information Science, and he is an Amazon Scholar. His research interests center on a synthesis of theory and system building in machine learning from human interaction, with applications in information access, language technology, and recommendation. His research focuses on counterfactual and causal inference, policy learning, learning to rank, structured output prediction, and learning from implicit feedback. He is an ACM Fellow, AAAI Fellow, KDD Innovations Award recipient, and member of the ACM SIGIR Academy.


The Trajectory of Information Retrieval

Michael L. Littman
National Science Foundation (NSF)
Livestream: https://youtube.com/live/Wqmx-CA8ADo?feature=share

Abstract: The talk will survey the past and future of information retrieval from my own idiosyncratic perspective. I will draw on my current role at the National Science Foundation, a long-time funder of this line of research, as well as my prior experience working in the group that developed the earliest word-embedding approach. In many ways, and perhaps not shockingly, the trajectory of information retrieval has tracked the trajectory of information technology more generally. The current historical moment, in which we are trying to absorb and channel the lessons of large language models, offers a profound opportunity to envision the future of the field, and of computing and information more generally.

Bio: Michael L. Littman is currently serving as Division Director for Information and Intelligent Systems at the National Science Foundation. The division is home to the programs and program officers that support researchers in artificial intelligence, human-centered computing, data management, and assistive technologies, as well as those exploring the impact of intelligent information systems on society. Littman is also University Professor of Computer Science at Brown University, where he studies machine learning and decision-making under uncertainty. He has earned multiple university-level awards for teaching and his research has been recognized with three best-paper awards and three influential paper awards. Littman is a Fellow of the Association for the Advancement of Artificial Intelligence and the Association for Computing Machinery.