Seventh International Workshop on
Domain Specific System Architecture (DOSSA-7)


Seoul, Korea, October 19, 2025
http://prism.sejong.ac.kr/dossa-7

CALL FOR PAPERS

In conjunction with the 58th ACM International Symposium on Microarchitecture (MICRO-58)


Workshop Schedule

1:00 pm - 1:05 pm (KST)
Workshop Introduction

1:05 pm - 1:55 pm (KST) Invited Talk I
Amir Yazdanbakhsh (Research Scientist, Google DeepMind)
"Generative Optimization at Scale: Practical Techniques for Sustainable Hyperscale Operations"


1:55 pm - 2:20 pm (KST) Paper I
Huiwon Choi, Myoungsoo Jung (KAIST)
"Memoization: Accelerating MCD BNN by Attribution-Based Dynamic Precision Scaling"


2:20 pm - 2:50 pm (KST) Student Invited Talk
Jiho Kim, Georgia Institute of Technology
"From Domain-Specific Architectures to Domain-Specific Systems in the Chiplet Era: Packaging-Aware Co-Design Across Compute and Interconnect"


2:50 pm - 3:10 pm (KST) Break

3:10 pm - 4:00 pm (KST) Invited Talk II
Dongsoo Lee (Executive Vice President, Naver Cloud)
"The Era of Reasoning-Driven AI Models: Challenges and Opportunities in AI Computing"


4:00 pm - 4:25 pm (KST) Paper II
Narcís Rodas, Max Doblas, Víctor Soria-Pardos, Santiago Marco-Sola, Miquel Moretó (Barcelona Supercomputing Center)
"Sargantana: An Open RISC-V Processor for HW/SW Co-Design of Domain-Specific Accelerators"


4:25 pm - 5:15 pm (KST) Invited Talk III
Eui-cheol Lim (Research Fellow and leader of the Solution Advanced Technology team, SK Hynix)
"Breaking the $/token barrier in LLM service: attention offloading with a PIM–GPU heterogeneous system"


5:15 pm - 5:20 pm (KST) Closing

CALL FOR PAPERS

Domain-specific systems are an increasingly important computing environment for many people and businesses. As information technologies are deployed in real-world applications such as autonomous driving, the Internet of Things (IoT), cyber-physical systems (CPS), and health care in the era of the fourth industrial revolution, interest in specialized domain-specific computing systems is growing significantly. Beyond conventional computing platforms, domain-specific computing systems pose many design challenges, including specialized hardware components such as hardware accelerators, optimized libraries, and domain-specific languages. This workshop focuses on domain-specific system design in both its hardware and software aspects, and on their interaction, in order to improve availability and efficiency in emerging real-world applications. The main theme of this year's workshop is HW/SW components for domain-specific systems. Topics of particular interest include, but are not limited to:

Application analysis and workload characterization for designing domain-specific systems for emerging applications, such as autonomous driving, IoT, and health care;
Domain-specific processor/system architectures and hardware features for domain-specific systems;
Low-power, energy-efficient domain-specific accelerator/system architectures for on-device AI systems;
Hardware accelerators for domain-specific systems;
Storage architectures for domain-specific systems;
Experiences in domain-specific system development;
Novel techniques to improve responsiveness by exploiting domain-specific systems;
Novel techniques to improve performance/energy for domain-specific systems;
Performance evaluation methodologies for domain-specific systems;
Application benchmarks for domain-specific systems;
Enabling technologies for domain-specific systems (smart edge devices, smart sensors, energy harvesting, sensor networks, sensor fusion, etc.).

The workshop aims to provide a forum for researchers, engineers, and students from academia and industry to discuss their latest research on designing domain-specific systems for emerging application areas in the era of the fourth industrial revolution, to bring their ideas and research problems to the attention of others, and to obtain valuable and immediate feedback from fellow researchers. One goal of the workshop is to facilitate lively and rigorous, yet friendly, discussion of research problems in architecture, implementation, networking, and programming, and thus pave the way to novel solutions that improve both the hardware and software of future domain-specific systems.


Invited Talk I

- Speaker : Amir Yazdanbakhsh (Research Scientist, Google DeepMind)

- Talk Title : Generative Optimization at Scale: Practical Techniques for Sustainable Hyperscale Operations

- Abstract :
   As the era of Moore’s Law wanes, the focus of performance optimization is shifting from low-level tweaks to complex, high-level code transformations traditionally handled by human experts. This talk shows how generative AI can automate this challenging domain, paving the way for more efficient and sustainable software optimization at scale. I begin by introducing PIE (pie4perf.com), a new dataset derived from competitive programming that teaches LLMs to generate high-performance code. I will detail the key techniques that enable these models to generate optimizations that often surpass human-level performance. I will conclude by shifting from research to real-world application, summarizing our work on a fully autonomous, LLM-driven optimizer deployed at Google. I will highlight the unique challenges, key learnings, and results from implementing such a system at scale.

- Bio :
    Amir Yazdanbakhsh is a Research Scientist at Google DeepMind, working at the intersection of machine learning and computer architecture. His primary focus is on applying machine learning to design efficient and sustainable computing systems, from leading the development of large-scale distributed training systems on TPUs to shaping the next generation of Google's ML accelerators. His work has been recognized by the ISCA Hall of Fame. Notably, his research on using AI to solve performance challenges in hyperscale systems received an IEEE Micro Top Picks award, and his work on a new system for AI won the IEEE Computer Society Best Paper Award. Amir received his Ph.D. from the Georgia Institute of Technology, where he was a recipient of the Microsoft and Qualcomm fellowships.

Invited Talk II

- Speaker : Dongsoo Lee (Executive Vice President, Naver Cloud)

- Talk Title : The Era of Reasoning-Driven AI Models: Challenges and Opportunities in AI Computing

- Abstract :
   Recent advances in reasoning-driven AI models have demonstrated remarkable capabilities across highly complex workloads. Yet, these breakthroughs bring both new challenges and opportunities for AI computing systems. For instance, the cost of tokens has risen sharply due to the surge in output token generation, while workloads are showing increasingly memory-bound characteristics. This talk will explore how emerging AI service requirements are reshaping computational demands, how optimization algorithms are adapting in response, and what critical issues are unfolding in AI semiconductor design.

- Bio :
    Dongsoo Lee is a member of the Presidential Committee on AI Strategy and currently Executive Vice President at NAVER Cloud, where he has led the AI Computing Solutions group since 2021. Previously, he was a Principal Engineer at Samsung Research (2017–2021) and a Research Staff Member at the IBM T.J. Watson Research Center, Yorktown Heights, NY (2013–2017). His research interests include high-performance computing, AI model optimization, and model compression. He received his Ph.D. in Electrical and Computer Engineering from Purdue University in 2013, and his B.S. (2002) and M.S. (2004) in Electrical Engineering from KAIST.

Invited Talk III

- Speaker : Eui-cheol Lim (Research Fellow and leader of the Solution Advanced Technology team, SK Hynix)

- Talk Title : Breaking the $/token barrier in LLM service: attention offloading with a PIM–GPU heterogeneous system

- Abstract :
    While demand for LLM-based AI services is rising rapidly, inference cost—especially dollar-per-token—remains a key barrier. A major bottleneck is attention processing, which requires storing and computing large token caches alongside model weights within constrained GPU memory. This causes resource contention and limits batch scalability. We propose a heterogeneous architecture that offloads attention operations and token storage to a PIM-based accelerator, while GPUs focus solely on model weight execution. This separation reduces memory pressure, improves parallelism, and enables larger batches. The result is higher throughput and a dramatic reduction in dollar-per-token cost. This breakthrough makes LLM-based AI services more scalable and economically viable, accelerating real-world deployment across industries.

- Bio :
    Eui-cheol Lim is a Research Fellow and leader of the Solution Advanced Technology team at SK Hynix. He received his B.S. and M.S. degrees from Yonsei University, Seoul, Korea, in 1993 and 1995, and his Ph.D. degree from Sungkyunkwan University, Suwon, Korea, in 2006. Dr. Lim joined SK Hynix in 2016 as a system architect in memory system R&D. Before joining SK Hynix, he worked as an SoC architect at Samsung Electronics, leading the architecture of most Exynos mobile SoCs. His recent interests include memory and storage system architectures using new media memory, as well as new memory solutions such as CXL memory and processing-in-memory (PIM). In particular, he is proposing a new PIM-based computing architecture, more efficient and flexible than existing AI accelerators, for processing generative AI and large language models (LLMs).

Student Invited Talk

- Speaker : Jiho Kim (Georgia Institute of Technology)

- Talk Title : From Domain-Specific Architectures to Domain-Specific Systems in the Chiplet Era: Packaging-Aware Co-Design Across Compute and Interconnect

- Abstract :
    Domain-specific architectures have traditionally been defined by specialized compute units, optimized memory hierarchies, and tailored programming models. However, in the chiplet era, this notion is expanding as packaging technologies play a central role. The way chiplets are integrated through interposers, bonding, and vertical interconnects directly determines the bandwidth, latency, and energy efficiency of the interconnect fabric that binds domain-specific systems together. For emerging workloads such as AI, HPC, and autonomous systems, architectural design without packaging awareness risks unrealistic projections and suboptimal implementations. This talk demonstrates that domain-specific system design must expand beyond processors and accelerators to explicitly include packaging as a co-equal design parameter. We introduce a methodology that integrates physical interconnect characteristics into cycle-level architectural simulations, enabling quantitative exploration of trade-offs across chiplet partitioning, interconnect hierarchy, and workload mapping. Using chipletization as a case study, we demonstrate how packaging-aware modeling reveals fundamental insights into performance scaling, latency, and energy efficiency, and how these findings inform the construction of an open chiplet ecosystem. By elevating packaging to a first-class design parameter in domain-specific architecture research, we outline a path toward truly holistic domain-specific systems for the next generation of computing.

- Bio :
    Jiho Kim is a third-year Ph.D. student in Electrical and Computer Engineering at the Georgia Institute of Technology, advised by Prof. Cong (Callie) Hao in the Sharc Lab. Her research focuses on high-level synthesis (HLS), 2.5D/3D chiplet-based systems, and architecture–packaging co-design. In particular, she develops technology-aware simulation frameworks to enable accurate design-space exploration for emerging heterogeneous chiplet systems. She was a research intern at IBM Research in 2024 and a visiting scholar at ETH Zürich in the summer of 2025. She is a recipient of the Qualcomm Innovation Fellowship 2025 and the SK Hynix Global Fellowship. Her recent work on an FPGA profiling tool was nominated for Best Paper at FCCM 2025.



SUBMISSION GUIDELINE

Submit a 2-page presentation abstract through the web-based submission system (https://cmt3.research.microsoft.com/DoSSA2025) by September 8, 2025. Notification of acceptance will be sent out by September 15, 2025. The final paper and presentation material (to be posted on the workshop web site) are due October 6, 2025. For additional information regarding paper submissions, please contact the organizers.

IMPORTANT DATES

Abstract submission : September 8, 2025 (extended from August 26, 2025)
Author notification : September 15, 2025
Final camera-ready paper : October 6, 2025
Workshop : October 19, 2025

Workshop Organizers

Hyesoon Kim, Georgia Tech (hyesoon@cc.gatech.edu)
Gi-Ho Park, Sejong Univ. (ghpark@sejong.ac.kr)
Jaewoong Sim, Seoul National Univ. (jaewoong@snu.ac.kr)

Web Chair

Chiwon Han, Sejong Univ. (hc930104@sju.ac.kr)
Yewon Choi, Sejong Univ. (choiyewon@sju.ac.kr)

Prior DOSSA

DOSSA-1 (http://prism.sejong.ac.kr/dossa-1)
DOSSA-2 (http://prism.sejong.ac.kr/dossa-2)
DOSSA-3 (http://prism.sejong.ac.kr/dossa-3)
DOSSA-4 (http://prism.sejong.ac.kr/dossa-4)
DOSSA-5 (http://prism.sejong.ac.kr/dossa-5)
DOSSA-6 (http://prism.sejong.ac.kr/dossa-6)


The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.