Program
Paper Session: Machine Learning 1
Date: July 24 (Wed)
Session: 11:00-11:40
Session Chair: Tao Wei
11:00 | Yannick Braatz, Taha Soliman, Shubham Rai, Dennis Sebastian Rieber and Oliver Bringmann | CoNAX: Towards Comprehensive Co-Design Neural Architecture Search Using HW Abstractions |
11:20 | Hanqing Liu, Xiaole Cui, Sunrui Zhang, Mingqi Yin, Yuanyuan Jiang and Xiaoxin Cui | A Convolutional Spiking Neural Network Accelerator with the Sparsity-aware Memory and Compressed Weights |
Paper Session: Arithmetic
Date: July 24 (Wed)
Session: 15:00-16:30
Session Chair: Chester Sungchung Park
15:00 | Andreas Boettcher and Martin Kumm | Multiplier Design Addressing Area-Delay Trade-offs by using DSP and Logic resources on FPGAs |
15:20 | Georg Rutishauser, Joan Mihali, Moritz Scherer and Luca Benini | xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems |
15:40 | Jens Karrenbauer, Sven Schönewald, Simon Klein and Holger Blume | Enhancing a Hearing Aid Processor with ISA Extensions Supporting Flexible Fixed-Point Formats |
16:00 | Yasong Cao, Mei Wen, Junzhong Shen and Zhongxing Li | BitShare: An Efficient Precision-Scalable Accelerator with Combining-Like-Terms GEMM |
Paper Session: Architectures and Applications
Date: July 25 (Thu)
Session: 11:00-12:00
Session Chair: Qiang Liu
11:00 | Dhruv Gajaria, Tosiron Adegbija and Kevin Gomez | CHIME: Energy-Efficient STT-RAM-based Concurrent Hierarchical In-Memory Processing |
11:20 | Songqiao Cui and Josep Balasch | Configurable Loop Shuffling via Instruction Set Extensions |
11:40 | Salim Khemira, Xinyuan Wang, Anh Nguyen, Yutaka Tamiya, Makoto Taiji, Takahide Yoshikawa and Jason Anderson | Raising Compute Density of Molecular Dynamics Simulation Through Approximate Memoization |
Paper Session: Tools
Date: July 25 (Thu)
Session: 13:30-14:30
Session Chair: Matthew Tang
13:30 | Yuanhai Zhang, Shuai Zhao, Gang Chen and Kai Huang | Fault-tolerant DAG Scheduling with Runtime Reconfiguration on Multicore Real-Time Systems |
13:50 | Tianyi Yu, Omar Ragheb, Stephen Wicklund and Jason Anderson | MLIR-to-CGRA: A Versatile MLIR-Based Compiler Framework for CGRAs |
14:10 | Yunfeng Deng and Haifeng Sun | A DRL-based multi-priority task division scheduling strategy in IIoT |
Paper Session: Crypto & Security
Date: July 25 (Thu)
Session: 15:00-15:40
Session Chair: Minoru Watanabe
15:00 | Zhuoheng Ran, Muhammad A.A. Abdelgawad, Zekai Zhang, Ray C.C. Cheung and Hong Yan | RO-SVD: A Reconfigurable Digital Copyright Protection Framework for AIGC Applications |
15:20 | Yang Yang, Rajgopal Kannan and Viktor Prasanna | A Low Latency FPGA Framework for Homomorphic Encryption Operations |
Paper Session: Machine Learning 2
Date: July 26 (Fri)
Session: 11:00-12:00
Session Chair: Martin Kumm
11:00 | Haoran Su and Nan Wu | Deoxys: Defensive Approximate Computing for Secure Graph Neural Networks |
11:20 | Zehuan Zhang, Matej Genci, Hongxiang Fan, Andreas Wetscherek and Wayne Luk | Accelerating MRI Uncertainty Estimation with Mask-based Bayesian Neural Network |
11:40 | Zhenyu Xu, Miaoxiang Yu, Jillian Cai, Qing Yang and Tao Wei | TwinStep Network (TwNet): a Neuron-Centric Architecture Achieving Rapid Training |
Paper Session: GPUs
Date: July 26 (Fri)
Session: 13:30-14:10
13:30 | Paul Delestrac, Jonathan Miquel, Debjyoti Bhattacharjee, Diksha Moolchandani, Francky Catthoor, Lionel Torres and David Novo | Analyzing GPU Energy Consumption in Data Movement and Storage |
13:50 | Shinya Miura, Qiong Chang and Jun Miyazaki | K-way In-place Merge by CPU-GPU Cooperative Processing |
Paper Session: Linear Algebra Acceleration
Date: July 26 (Fri)
Session: 14:40-15:40
Session Chair: Jun Miyazaki
14:40 | Valentin Isaac-Chassande, Adrian Evans, Yves Durand and Frédéric Rousseau | SpDCache: Region-Based Reduction Cache for Outer-Product Sparse Matrix Kernels |
15:00 | Dominik Walter, Thomas Adamtschuk, Frank Hannig and Jürgen Teich | Analysis and Optimization of Block LU Decomposition for Execution on Tightly Coupled Processor Arrays |
15:20 | Shengbai Luo, Sheng Ma, Bo Wang, Yihao Shi, Qingshan Xue and Xueyi Zhang | Sparm: A Sparse Matrix Multiplication Accelerator Supporting Multiple Dataflows |
Poster Presentations 1
Date: July 24 (Wed)
Session: 11:40-12:00
Frederik Kautz, Sven Gesper, Gia Bao Thieu, Hans-Martin Bluethgen, Holger Blume and Guillermo Payá-Vayá | Multi-Level Prototyping of a Vertical Vector AI Processing System |
Yazhou Yan, Jiangnan Li, Guowei Zhu, Wenbo Yin and Lingli Wang | An End-to-End Agile Design Framework to Improve Energy Efficiency on CGRAs |
Oliver Renke, Christoph Riggers, Jens Karrenbauer and Holger Blume | Design space exploration of semantic segmentation CNN SalsaNext for constrained architectures |
Hongbin Wang, Zou Yi, Guohua Wen and Junfeng Hu | Memory Access Acceleration Through Architecture Design for Edge SoCs |
Tenghao Zhao and Zhaohui Ye | ZeroVex: A Scalable and High-performance RISC-V Vector Processor Core for Embedded Systems |
Yingchang Mao and Qiang Liu | MSCA: A Multi-grained Sparse Convolution Accelerator for DNN Training |
Yao Liu, Shiyang Chen, Guolong Yang, Long Ma and Kun Wan | Real-time Order Book Building and Snapshot Generating for High Frequency Trading on FPGA |
Yuchen Gui, Qizhe Wu, Wei Yuan, Huawen Liang, Xiaotian Wang and Xi Jin | A FPGA-HBM-based Hardware Streaming Accelerator for GNN Sampling |
Shipeng Yue, Honghao Liang, Xinpeng Xing and Haigang Feng | SLICE Matrix: A Memory Access Scheduling Policy for Multicore Network Processors |
Poster Presentations 2
Date: July 25 (Thu)
Session: 10:00-10:30
Yude Fang, Junhui Wang, Libo Huang, Yongwen Wang and Weixia Xu | Out-of-Order and Recursive RAS: A Return Address Stack Design On High Performance Processor |
Riadh Benabdelhamid, Vladislav Valek and Dirk Koch | SPARKLE: A 1024-Core/16,384-Thread single FPGA many-core RISC-V barrel processor Overlay |
Uyong Lee, Yeji Park, Junsu Heo, Sungkyung Park and Chester Sungchung Park | Design Space Exploration of FFT Accelerators for IEEE 802.11ax Using High-Level Synthesis |
Junzhe Huang, Qiang Dou and Li Shen | Extending the RISC-V Instruction Set for High Performance Data Compression Hardware Acceleration |
Shaoyang Sun, Boyin Jin, Jiahang Lou, Jiangnan Li, Yuhang Cao, Jingyuan Li, Chen Shen, Yuan Dai, Wenbo Yin and Lingli Wang | MDCRA: A Reconfigurable Accelerator Framework for Multiple Dataflow Lanes |
Yiyang Lin, Yi Zou and Yanfeng Yang | CSIFA: A Configurable SRAM-based In-memory FFT Accelerator |
Matteo Perotti, Michele Raeber, Mattia Sinigaglia, Matheus Cavalcante, Davide Rossi and Luca Benini | Spatzformer: An Efficient Reconfigurable Dual-Core RISC-V V Cluster for Mixed Scalar-Vector Workloads |
Dong-eon Won, Yeeun Kim, Janghwan Lee, Minjae Lee, Jonghyun Bae, Jongjoo Park, Jeongyong Song and Jungwook Choi | ISP2DLA: Automated Deep Learning Accelerator Design for On-Sensor Image Signal Processing |
Tao Cai, Jianfei Dai, Dejiao Niu, Lei Li, Zeyu Huang and Qiangqiang Ni | A LLC-Friendly LSM-tree |
PhD Forum
Mohamed Bouaziz and Suhaib A. Fahmy | Leveraging MLIR for efficient irregular-shaped CGRA overlay design |
Qingchen Zhai, Zhiwei Zhang and Ruozhou Xiao | LLM Based End-to-end Branch Predictor Optimization Generator |
Annina Gutermann and Juergen Becker | A Full-System Approach to Multi-Valued Logic Design |
Mateusz Wygrzywalski and Robert Szczygieł | Lightweight Extension of RISC-V Core for NTT-like Algorithms |
Xiqin Tang and Delong Shang | Design of High-performance while Energy-efficient Microprocessor with Novel Asynchronous Techniques |
Zekai Zhang and Ray Chak Chung Cheung | Design of Light-weight Encryption Algorithm Based on RISC-V Platform |
Yuki Shimamura, Minoru Watanabe and Nobuya Watanabe | Voltage range evaluation of an optically reconfigurable gate array VLSI |
Kangli Zhao, Anping He and Di Zhao | Research on High-Efficiency Asynchronous Superscalar Processors |