Sixth International Workshop on
Domain Specific System Architecture (DOSSA-6)


Buenos Aires, Argentina, June 30, 2024
http://prism.sejong.ac.kr/dossa-6

CALL FOR PAPERS

In conjunction with the 51st IEEE International Symposium on Computer Architecture (ISCA-51)


Workshop Schedule (Tentative)

2:00 pm - 2:05 pm (ART)
Workshop Introduction

2:05 pm - 2:45 pm (ART)
Invited Talk I

Angshuman Parashar (NVIDIA)
"Understanding the limits of Data Movement Energy"


2:45 pm - 3:05 pm (ART)
Paper I

Noah Kaplan (University of Michigan), Yufeng Gu (University of Michigan),
Reetuparna Das (University of Michigan)
"Understanding and Characterization of Pangenomics"

(slide) (paper)

3:05 pm - 3:20 pm (ART)
Coffee Break


3:20 pm - 4:00 pm (ART)
Invited Talk II

Ben Feinberg (Sandia National Lab)
"Analog MVM Accelerators: Architectural Challenges and Opportunity"


4:00 pm - 4:40 pm (ART)
Invited Talk III

Ramyad Hadidi (Rain AI)
"On-Device Computing: Rain AI’s Mission for Energy-Efficient AI Hardware"


4:40 pm - 5:00 pm (ART)
Paper II

Naorin Hossain (IBM Research), Alper Buyuktosunoglu (IBM Research),
John-David Wellman (IBM Research), Pradip Bose (IBM Research),
Margaret Martonosi (Princeton University)
"NoC-level Threat Monitoring in Domain-Specific Heterogeneous SoCs with SoCurity"

(slide) (paper)

5:00 pm - 5:20 pm (ART)
Paper III

Joseph Zuckerman (Columbia University), John-David Wellman (IBM Research),
Ajay Vanamali (Columbia University), Manish Shankar (Columbia University),
Gabriele Tombesi (Columbia University), Karthik Swaminathan (IBM Research),
Mohit Kapur (IBM Research), Robert Philhower (IBM Research),
Pradip Bose (IBM Research), Luca Carloni (Columbia University)
"Towards Generalized On-Chip Communication for Programmable Accelerators in Heterogeneous Architectures"


(slide) (paper)

5:20 pm - 6:00 pm (ART)
Invited Talk IV

Byeongho Kim (Samsung Electronics)
"Real-world Implementation and Future of AI Acceleration Systems based on Processing-in-Memory"

(slide)

6:00 pm - 6:05 pm (ART)
Closing


CALL FOR PAPERS

Domain-specific systems are an increasingly important computing environment for many people and businesses. As information technologies spread into real-world applications such as autonomous driving, IoT (Internet of Things), CPS (cyber-physical systems), and health care in the era of the fourth industrial revolution, interest in specialized domain-specific computing systems is growing significantly. Beyond conventional computing platforms, domain-specific computing systems pose many design challenges, including specialized hardware components such as hardware accelerators, optimized libraries, and domain-specific languages. This workshop focuses on domain-specific system design in both its hardware and software aspects, and on their interaction, in order to improve availability and efficiency in emerging real-world applications. The main theme of this year's workshop is HW/SW components for domain-specific systems. Topics of particular interest include, but are not limited to:

Application analysis and workload characterization for designing domain-specific systems for emerging applications such as autonomous driving, IoT, and health care;
Domain-specific processor/system architectures and hardware features for domain-specific systems;
Hardware accelerators for domain-specific systems;
Storage architectures for domain-specific systems;
Experiences in domain-specific system development;
Novel techniques to improve responsiveness by exploiting domain-specific systems;
Novel techniques to improve performance/energy for domain-specific systems;
Performance evaluation methodologies for domain-specific systems;
Application benchmarks for domain-specific systems;
Enabling technologies for domain-specific systems (smart edge devices, smart sensors, energy harvesting, sensor networks, sensor fusion, etc.).

Invited Talk I

- Speaker : Angshuman Parashar, NVIDIA

- Talk Title : Understanding the limits of Data Movement Energy

- Abstract :
   Tensor computations form the backbone of today's AI tasks, offering structured predictability and reuse that can be exploited to minimize energy consumption – a critical constraint for data centers.
However, optimizing hardware and software to fully exploit these properties involves navigating complex and enormous design and mapping spaces.
Despite extensive research, accurately gauging the proximity of locally-optimal solutions to the global optimum remains challenging.
In this talk, we discuss ongoing efforts to bridge this gap by understanding energy efficiency limits, focusing on data movement within tensor computations. These limits can provide architects with strong insights early in the design process, facilitating more precise evaluations of the success of search heuristics and/or manual designs by measuring how close they are to the limit.
We present our achievements thus far and discuss the unresolved research challenges that must be solved for a comprehensive understanding of energy efficiency limits in tensor computations.

- Bio :
    Dr. Angshuman Parashar is a Senior Research Scientist at NVIDIA.
His research interests are in building, evaluating and programming spatial and data-parallel architectures, with a present focus on mapping and modeling tensor-algebra accelerators. Prior to NVIDIA, he was a member of the VSSAD group at Intel, where he worked with a small team of experts in architecture, languages, workloads and implementation to design a new spatial architecture.
Dr. Parashar received his Ph.D. in Computer Science and Engineering from the Pennsylvania State University in 2007, and his B.Tech.
in Computer Science and Engineering from the Indian Institute of Technology, Delhi in 2002.

Invited Talk II

- Speaker : Ben Feinberg, Sandia National Lab

- Talk Title : Analog MVM Accelerators: Architectural Challenges and Opportunity

- Abstract :
With the end of Dennard Scaling and the rising cost of data movement, there has been renewed interest in processing-using-memory (PUM) or in situ computing approaches. These approaches seek to minimize data movement by performing computations not just near but within the memory arrays themselves. Among the potential PUM primitives, analog matrix-vector multiplication (MVM) and other linear algebra operations have garnered significant attention due to their wide applicability. After a decade of research on these accelerators, we have recently seen multiple demonstrations of large-scale prototypes capable of accelerating problems of practical scale. Furthermore, programs such as DARPA OPTIMA, DARPA ScAN, and AFRL INTREPID are providing resources to accelerate the transition of these concepts from prototypes to products. With the circuits and devices for these accelerators reaching a sufficient level of maturity, it is now an ideal time for computer architecture research. This talk will provide an overview of the current state of the art in MVM accelerators, present recent results showcasing their potential across various problem domains, and discuss key architectural research directions to enable the transition from circuit prototypes to large-scale systems.

- Bio :
Ben Feinberg is a Senior Member of Technical Staff in the Scalable Computer Architecture Group at Sandia National Laboratories. His research focuses on architectures for autonomous systems, with an emphasis on analog accelerators. Dr. Feinberg leads architecture modeling and system software research for Sandia's Rad-Edge project and is Sandia's lead architect for the DARPA OPTIMA program. He is one of the developers of CrossSim and leads a project with the DOE Vehicle Technology Office on compute requirements for autonomous vehicles. Prior to joining Sandia in 2019, Ben completed his PhD in Electrical Engineering at the University of Rochester.


Invited Talk III

- Speaker : Ramyad Hadidi, Senior Research Scientist, Rain AI

- Talk Title : On-Device Computing: Rain AI’s Mission for Energy-Efficient AI Hardware

- Abstract :
    Today's AI systems struggle to achieve peak performance on low-power devices, precisely where real-time processing is essential.
This disconnect is a significant hurdle in harnessing the full potential of AI, from autonomous systems to large language model (LLM) based agents.
The goal is the seamless operation of advanced AI models on local devices.
The prevalent separation of memory and computation, along with the prohibitive cost of information processing on current hardware, stands as a barrier to the future of AI.
Rain AI's mission is to break these chains by developing the most energy-efficient AI hardware in the industry.
In this talk, I will give an overview of our first product and some of its techniques that show our commitment to this mission.
I will focus on our state-of-the-art approach in hardware-software co-design, emphasizing our breakthroughs in in-memory computing and AI fine-tuning techniques designed to revolutionize efficient on-device computing.

- Bio :
    Ramyad Hadidi is a senior research scientist at Rain AI, where he develops sophisticated artificial intelligence systems for edge computing.
With a Ph.D. in Computer Science from Georgia Institute of Technology, Ramyad's expertise spans edge computing, computer architecture, and machine learning.
His doctoral thesis centered on deploying deep neural networks efficiently at the edge.
At Rain AI, Ramyad is advancing the field of hardware/software co-design for AI, concentrating on optimizing in-memory computing architectures and enhancing their hardware-software synergy for resource-constrained environments.


Invited Talk IV

- Speaker : Byeongho Kim, Samsung Electronics

- Talk Title :  Real-world Implementation and Future of AI Acceleration Systems based on Processing-in-Memory

- Abstract :
   Driven by the success of GPT, AI applications have become the mainstream of computing systems.
AI applications are increasingly memory-bound, requiring larger memory capacity and bandwidth than traditional applications.
To overcome these limitations, Processing-in-Memory (PIM) has been proposed as a promising solution.
This talk will introduce how Samsung's PIM-based system accelerates various AI applications and the future direction of memory architecture designs.

- Bio :
    Byeongho Kim is a hardware engineer currently working in the DRAM Design team at Samsung Electronics.
He specializes in processing-in-memory architecture and its associated systems, with a focus on HBM-PIM and next-gen PIM architecture.
Byeongho Kim holds a Ph.D. and a B.S. from Seoul National University, earned in 2022 and 2017, respectively.
His Ph.D. research focused on intelligent memory systems, particularly in-memory processing, high-performance computing, and AI acceleration.


SUBMISSION GUIDELINE

Submit a 2‐page presentation abstract through the web‐based submission system (https://cmt3.research.microsoft.com/DoSSA2024) by May 10, 2024. Notification of acceptance will be sent out by May 24, 2024. The final paper and presentation material (to be posted on the workshop web site) are due June 17, 2024. For additional information regarding paper submissions, please contact the organizers.

IMPORTANT DATES

Abstract submission : May 10, 2024
Author notification : May 24, 2024
Final camera-ready paper : June 17, 2024
Workshop : June 30, 2024

Workshop Organizers

Hyesoon Kim, Georgia Tech (hyesoon@cc.gatech.edu)
Giho Park, Sejong Univ. (ghpark@sejong.ac.kr)
Jaewoong Sim, Seoul National Univ. (jaewoong@snu.ac.kr)

Web Chair

Chiwon Han, Sejong Univ. (hc930104@sju.ac.kr)
Sungyun Bay, Sejong Univ. (bay1028@sju.ac.kr)

Prior DOSSA

DOSSA-1 (http://prism.sejong.ac.kr/dossa-1)
DOSSA-2 (http://prism.sejong.ac.kr/dossa-2)
DOSSA-3 (http://prism.sejong.ac.kr/dossa-3)
DOSSA-4 (http://prism.sejong.ac.kr/dossa-4)
DOSSA-5 (http://prism.sejong.ac.kr/dossa-5)