Keynotes
Dr Andrew Putnam
Partner GM, Cloud & AI Hardware Engineering -- Microsoft Azure
USA
Title: Redefining the FPGA Landscape in the Age of Cloud Computing and AI
Date: July 24 (Wed)
Time: 09:30-10:30
Session Chair: Wayne Luk
Abstract: FPGAs have carved out a niche in the computing industry by addressing unique problem domains and pioneering technology adoption. While the industry focused on mainframes, PCs, and mobile, FPGAs excelled in ASIC prototyping, telecom, networking, and specialized computing. With the pivot to Cloud and AI, FPGAs have played key roles, but they now face a critical juncture. As major computing players converge on the areas traditionally dominated by FPGA researchers and developers, the pressing question is whether FPGAs can maintain their innovative edge or will be overshadowed by larger, better-funded competitors.
This presentation will critically examine the FPGA industry's influence on Cloud and AI, evaluate its triumphs and setbacks, and identify pivotal research areas that could propel FPGAs toward groundbreaking innovation in specialized hardware and accelerated computing. We invite you to join us in a timely discussion on how the FPGA community must swiftly capitalize on early successes in Cloud and AI to catalyze rapid, significant advancements and secure a lasting impact before the window of opportunity closes.
Bio: Dr. Andrew Putnam is a leading figure in Cloud & AI Hardware Engineering, serving as a Partner General Manager within Microsoft's Azure Hardware Systems & Infrastructure group. He co-founded the Microsoft Catapult project, where his pioneering work integrated Field Programmable Gate Arrays (FPGAs) into production hyperscale data centers, doubling server capacity for web search. His efforts have been instrumental in developing Azure Accelerated Networking, Project Brainwave, and Azure Boost, contributing to some of the fastest network, storage, and AI solutions in the cloud and setting new industry benchmarks.
Dr. Putnam's academic record includes a triple major in Physics, Computer Science, and Electrical Engineering from the University of San Diego, and a Master's and Ph.D. in Computer Science and Engineering from the University of Washington. Starting as a researcher at Microsoft in 2009, he quickly made a name for himself through innovative contributions in FPGAs and computer architecture. In 2016, he moved his research into practical application within Azure Networking, and since 2020 he has held his current leadership role in Azure Hardware, where he continues to drive the evolution of specialized hardware and accelerated computing at hyperscale.
Dr Damien Querlioz
Bioinspired Nanoelectronics, CNRS Research Director at the Centre de Nanosciences et de Nanotechnologies
France
Title: In-Memory Computing: A Path to Energy-Efficient and Trustworthy Embedded AI
Date: July 24 (Wed)
Time: 13:30-14:30
Session Chair: Wei Zhang
Abstract: Artificial intelligence (AI) has immense potential for edge applications, but its energy demands are a critical barrier to widespread adoption in fields such as medical implants and advanced brain-machine interfaces. Traditional computing systems face substantial inefficiencies due to the high energy costs of memory access. In contrast, the human brain achieves remarkable energy efficiency by performing computation directly within memory structures. This talk explores innovative approaches to embedded AI through in-memory and near-memory computing, focusing on integrating logic and memory to drastically reduce energy consumption. We highlight the development and application of emerging non-volatile memory technologies like memristors, magnetic memory, and phase-change memory. These technologies emulate the brain's energy-efficient architecture and have recently reached maturity, enabling the demonstration of fully functional in-memory computing systems, which we will showcase throughout this talk. These advanced memory technologies facilitate the creation of digital, low-precision neural networks, offering robust, low-power solutions for AI inference. They also enable analog in-memory computing, which naturally performs neural network operations through fundamental electrical laws. Despite their potential, these technologies come with significant challenges due to their variability. We demonstrate how Bayesian techniques can not only tolerate these imperfections but sometimes even leverage them. The presentation will also cover recent advances in local learning algorithms, such as Equilibrium Propagation, which promise efficient on-chip learning capabilities using in-memory computing. Finally, we will discuss the current challenges and future directions for incorporating such technologies at the architectural level.
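To make the "fundamental electrical laws" point concrete: in an idealized crossbar, each weight is stored as a conductance, Ohm's law multiplies it by the input voltage, and Kirchhoff's current law sums the cell currents along each column wire, so the array computes a matrix-vector product in place. The short Python sketch below models this idealized behavior, including the device variability the abstract mentions; all values and names are illustrative, not from the talk.

```python
import numpy as np

# Idealized memristor crossbar performing a matrix-vector multiply.
# Weight W[i, j] is stored as a conductance G[i, j]; applying row voltages V
# makes each column current I[j] = sum_i G[i, j] * V[i]
# (Ohm's law per cell, Kirchhoff's current law at each column wire).

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # cell conductances (siemens)
V = rng.uniform(0.0, 0.2, size=4)          # input voltages (volts)

I = G.T @ V  # column currents = the neural-network dot products

# Device variability, as discussed in the talk: each programmed conductance
# deviates from its target value, perturbing the computed currents.
G_noisy = G * rng.normal(1.0, 0.05, size=G.shape)
I_noisy = G_noisy.T @ V
print(I, I_noisy)
```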
Bio: Damien Querlioz is a CNRS Research Director at the Centre de Nanosciences et de Nanotechnologies of Université Paris-Saclay and CNRS. His research focuses on novel uses of emerging non-volatile memory and other nanodevices, drawing in particular on inspirations from biology and machine learning. He received his predoctoral education at Ecole Normale Supérieure, Paris, and his PhD from Université Paris-Sud in 2009. Before his appointment at CNRS, he was a Postdoctoral Scholar at Stanford University and at the Commissariat à l'Energie Atomique. Damien Querlioz coordinates the interdisciplinary INTEGNANO research group, whose members work on all aspects of nanodevice physics and technology, from materials to systems. He has co-authored one book, nine book chapters, and more than 150 journal and conference articles, and has given more than 80 invited talks at national and international workshops and conferences. In 2016, he received an ERC Starting Grant to develop the concept of natively intelligent memory. In 2017, he received the CNRS Bronze Medal. He is also a co-recipient of the 2017 IEEE Guillemin-Cauer Best Paper Award and the 2018 IEEE Biomedical Circuits and Systems Best Paper Award.
Prof Kentaro Sano
RIKEN Center for Computational Science
Processor Research Team
Japan
Title: Challenges for Reconfigurable HPC: FPGA or Customized Architectures such as CGRA?
Date: July 25 (Thu)
Time: 09:00-10:00
Session Chair: Jason Anderson
Abstract: At the RIKEN Center for Computational Science (R-CCS), we have been researching future architectures for HPC (high-performance computing) and AI. In particular, the Processor Research Team is focusing on reconfigurable computing architectures such as coarse-grained reconfigurable arrays (CGRAs), which can limit data movement and thereby lower power consumption. In this talk, I introduce our previous research on FPGA-based reconfigurable HPC and share lessons learned, and then present the RIKEN CGRA project for HPC and AI, including architectural exploration toward more efficient reconfigurable computing.
Bio: Kentaro Sano has led the Processor Research Team and the Next-Generation AI Device R&D Unit at the RIKEN Center for Computational Science (R-CCS) since 2017, responsible for research and development of future processors and systems for HPC and AI. He is also a visiting professor with the advanced computing systems laboratory at Tohoku University. He received his Ph.D. from the Graduate School of Information Sciences, Tohoku University, in 2000. From 2000 until 2018, he was a Research Associate and then an Associate Professor at Tohoku University. In 2006 and 2007, he was a visiting researcher at the Department of Computing, Imperial College London, and at Maxeler Technologies. He currently leads the architecture research group in the feasibility study project for next-generation supercomputer development in Japan. His research interests include data-driven and spatial-parallel processor architectures such as coarse-grained reconfigurable arrays (CGRAs), FPGA-based high-performance reconfigurable computing, high-level synthesis compilers and tools for reconfigurable custom computing machines, and system architectures for next-generation supercomputing based on the data-flow computing model.
Prof Yu Wang
Professor and Department Head, EE Dept, Tsinghua University
China
Date: July 26 (Fri)
Time: 09:30-10:30
Abstract: Based on the transformer architecture, Large Language Models (LLMs) and other AIGC algorithms have achieved outstanding performance across various applications, marking the advent of the AI 2.0 era. The exponential growth in model parameters has led to a corresponding increase in computational, storage, and memory access overheads, escalating by four to five orders of magnitude compared to traditional deep learning models. This has opened a significant gap between algorithmic requirements and the capabilities of existing hardware platforms.
This keynote will first review the energy-efficient circuit and system design methodology of the AI 1.0 era, then focus on the new key challenges brought by large models, and then introduce an energy-efficient circuit and system design methodology for AI 2.0. Focusing on AIGC algorithms such as LLMs and diffusion models, the talk will delve into software and hardware co-optimization methods for GPU and FPGA-based heterogeneous systems, including efficient AIGC algorithm models with quantization and sparsity, inference framework and kernel optimizations, and energy-efficient hardware IP. Our optimization methods can reduce the total cost of large model inference by four orders of magnitude. Finally, we will provide insights into future development trends of the AI 2.0 era.
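As a concrete illustration of one technique named above, the sketch below shows symmetric per-tensor INT8 weight quantization in Python. It is a generic, minimal example, not the speaker's method; the function names and values are hypothetical.

```python
import numpy as np

# Minimal sketch of symmetric per-tensor INT8 weight quantization, one of
# the model-compression techniques the abstract names (illustrative only).

def quantize_int8(w: np.ndarray):
    """Map float weights to int8 with a single per-tensor scale."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"mean abs quantization error: {err:.4f}")  # small relative to |w|
```

Storing weights at 8 bits instead of 32 cuts weight memory traffic by roughly 4x, which is why quantization (often combined with sparsity) is central to reducing LLM inference cost on heterogeneous hardware.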