Selected Publications
is/are with the HPCP Lab.
¹ denotes co-first authors.
* denotes (co-)corresponding authors.
- SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs
Suhyun Lee, Chaemin Lim, Jinwoo Choi, Heelim Choi, Chan Lee, Yongjun Park, Kwanghyun Park, Hanjun Kim, and Youngsok Kim*
2025 ACM International Conference on Management of Data (SIGMOD), June 2025 (to appear)
System SW, PIM, Database
- GCStack: A GPU Cycle Accounting Mechanism for Providing Accurate Insight into GPU Performance
Hanna Cha, Sungchul Lee, Yeonan Ha, Hanhwi Jang, Joonsung Kim*, and Youngsok Kim*
IEEE Computer Architecture Letters (CAL), Oct. 2024
Architecture, GPU, Modeling
- GraNNDis: Fast Distributed Graph Neural Network Training Framework for Multi-Server Clusters
Jaeyong Song, Hongsun Jang, Hunseong Lim, Jaewon Jung, Youngsok Kim, and Jinho Lee
33rd International Conference on Parallel Architectures and Compilation Techniques (PACT), Oct. 2024
System SW, AI, GPU
- CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU
Shinnung Jeong, Sungjun Cho, Yongwoo Lee, Hyunjun Park, Seonyeong Heo, Gwangsun Kim, Youngsok Kim, and Hanjun Kim
53rd International Conference on Parallel Processing (ICPP), Aug. 2024
System SW, GPU
- PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices
Si Ung Noh¹, Junguk Hong¹, Chaemin Lim, Seongyeon Park, Jeehyun Kim, Hanjun Kim, Youngsok Kim*, and Jinho Lee*
51st IEEE/ACM International Symposium on Computer Architecture (ISCA), June–July 2024
System SW, PIM
- Orchestrating Multiple Mixed Precision Models on a Shared Precision-Scalable NPU
Kiung Jung, Seok Namkoong, Hongjun Um, Hyejun Kim, Youngsok Kim, and Yongjun Park
25th ACM International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), June 2024
System SW, AI
- Fully Harnessing the Performance Potential of DRAM-less Mobile Flash Storage
Jaesun No, Gyusun Lee, Youngsok Kim, and Jinkyu Jeong
38th International Conference on Massive Storage Systems and Technology (MSST), June 2024
System SW, Mobile
- MPC-Wrapper: Fully Harnessing the Potential of Samsung Aquabolt-XL HBM2-PIM on FPGAs
Jinwoo Choi¹, Yeonan Ha¹, Hanna Cha, Seil Lee, Sungchul Lee, Jounghoo Lee, Shin-haeng Kang, Bongjun Kim, Hanwoong Jung, Hanjun Kim, and Youngsok Kim*
32nd IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2024
Architecture, PIM, FPGA
- AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping
Seongyeon Park, Junguk Hong, Jaeyong Song, Hajin Kim, Youngsok Kim, and Jinho Lee
29th ACM Symposium on Principles and Practice of Parallel Programming (PPoPP), Mar. 2024
System SW, GPU
- Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, and Jinho Lee
30th IEEE International Symposium on High-Performance Computer Architecture (HPCA), Mar. 2024
Received the HPCA 2024 Best Paper Award (Honorable Mention)!
Architecture, SmartSSD, AI
- McCore: A Holistic Management of High-Performance Heterogeneous Multicores
Jaewon Kwon, Yongju Lee, Hongju Kal, Minjae Kim, Youngsok Kim, and Won Woo Ro
56th IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2023
Architecture
- Virtual PIM: Resource-aware Dynamic DPU Allocation and Workload Scheduling Framework for Multi-DPU PIM Architecture
Donghyeon Kim, Taehoon Kim, Inyong Hwang, Taehyeong Park, Hanjun Kim, Youngsok Kim, and Yongjun Park
32nd International Conference on Parallel Architectures and Compilation Techniques (PACT), Oct. 2023
System SW, PIM
- Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring
Jinwoo Choi, Yeonan Ha, Jounghoo Lee, Sangsu Lee, Jinho Lee, Hanhwi Jang*, and Youngsok Kim*
IEEE Transactions on Computers (TC), Aug. 2023
Architecture, AI, Modeling
- Occamy: Memory-efficient GPU Compiler for DNN Inference
Jaeho Lee, Shinnung Jeong, Seungbin Song, Kunwoo Kim, Heelim Choi, Youngsok Kim, and Hanjun Kim
60th ACM/IEEE Design Automation Conference (DAC), July 2023
System SW, GPU, AI
- Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs [GitHub]
Chaemin Lim, Suhyun Lee, Jinwoo Choi, Jounghoo Lee, Seongyeon Park, Hanjun Kim, Jinho Lee, and Youngsok Kim*
2023 ACM International Conference on Management of Data (SIGMOD), June 2023
System SW, PIM, Database
- Pipe-BD: Pipelined Parallel Blockwise Distillation
Hongsun Jang, Jaewon Jung, Jaeyong Song, Joonsang Yu, Youngsok Kim, and Jinho Lee
26th Design, Automation, and Test in Europe Conference (DATE), Apr. 2023
System SW, AI, GPU
- Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song¹, Jinkyu Yim¹, Jaewon Jung, Hongsun Jang, Hyung-Jin Kim, Youngsok Kim, and Jinho Lee
28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar. 2023
System SW, AI, GPU
- SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators
Mingi Yoo¹, Jaeyong Song¹, Jounghoo Lee, Namhyung Kim, Youngsok Kim*, and Jinho Lee*
29th IEEE International Symposium on High-Performance Computer Architecture (HPCA), Feb.–Mar. 2023
Architecture, AI
- GuardiaNN: Fast and Secure On-Device Inference in TrustZone Using Embedded SRAM and Cryptographic Hardware
Jinwoo Choi¹, Jaeyeon Kim¹, Chaemin Lim¹, Suhyun Lee, Jinho Lee, Dokyung Song, and Youngsok Kim*
23rd ACM/IFIP International Middleware Conference (Middleware), Nov. 2022
System SW, Mobile, AI, Security
- Decoupling Schedule, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing
Shinnung Jeong, Yongwoo Lee, Jaeho Lee, Heelim Choi, Seungbin Song, Jinho Lee, Youngsok Kim, and Hanjun Kim
31st International Conference on Parallel Architectures and Compilation Techniques (PACT), Oct. 2022
System SW, GPU
- Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators
Mingi Yoo¹, Jaeyong Song¹, Hyeyoon Lee, Jounghoo Lee, Namhyung Kim, Youngsok Kim, and Jinho Lee
31st International Conference on Parallel Architectures and Compilation Techniques (PACT), Oct. 2022
Received the PACT 2022 Best Paper Award!
Architecture, AI
- GCoM: A Detailed GPU Core Model for Accurate Analytical Modeling of Modern GPUs
Jounghoo Lee, Yeonan Ha, Suhyun Lee, Jinyoung Woo, Jinho Lee, Hanhwi Jang, and Youngsok Kim*
49th IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2022
Architecture, GPU, Modeling
- SALoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs
Seongyeon Park, Hajin Kim, Tanveer Ahmad, Nauman Ahmed, Zaid Al-Ars, Peter Hofstee, Youngsok Kim, and Jinho Lee
36th IEEE International Parallel and Distributed Processing Symposium (IPDPS), May–June 2022
System SW, GPU
- Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs
Jounghoo Lee¹, Jinwoo Choi¹, Jaeyeon Kim, Jinho Lee, and Youngsok Kim*
58th ACM/IEEE Design Automation Conference (DAC), Dec. 2021
Architecture, AI
- Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing
Mingi Yoo¹, Jaeyong Song¹, Jounghoo Lee, Namhyung Kim, Youngsok Kim, and Jinho Lee
IEEE Computer Architecture Letters (CAL), June 2021
Architecture, AI
- Thread-aware Area-efficient High-level Synthesis Compiler for Embedded Devices
Changsu Kim, Shinnung Jeong, Sungjun Cho, Yongwoo Lee, William Song, Youngsok Kim, and Hanjun Kim
19th ACM/IEEE International Symposium on Code Generation and Optimization (CGO), Mar. 2021
System SW, FPGA
- Real-Time Object Detection System with Multi-Path Neural Networks
Seonyeong Heo, Sungjun Cho, Youngsok Kim, and Hanjun Kim
26th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), Apr. 2020
System SW, GPU, AI
- FlexLearn: Fast and Highly Efficient Brain Simulations Using Flexible On-Chip Learning
Eunjin Baek¹, Hunjun Lee¹, Youngsok Kim, and Jangwoo Kim
52nd IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2019
Architecture, Neuromorphic
- μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization
Youngsok Kim, Joonsung Kim, Dongju Chae, Daehyun Kim, and Jangwoo Kim
14th ACM European Conference on Computer Systems (EuroSys), Mar. 2019
System SW, Mobile, AI
- Flexon: A Flexible Digital Neuron for Efficient Spiking Neural Network Simulations
Dayeol Lee¹, Gwangmu Lee¹, Dongup Kwon, Sunghwa Lee, Youngsok Kim, and Jangwoo Kim
45th ACM/IEEE International Symposium on Computer Architecture (ISCA), June 2018
Architecture, Neuromorphic
- DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture
Dongup Kwon¹, Jaehyung Ahn¹, Dongju Chae, Mohammadamin Ajdari, Jaewon Lee, Suheon Bae, Youngsok Kim, and Jangwoo Kim
45th ACM/IEEE International Symposium on Computer Architecture (ISCA), June 2018
Architecture, FPGA
- Google Workloads for Consumer Devices: Mitigating Data Movement Bottlenecks
Amirali Boroumand, Saugata Ghose, Youngsok Kim, Rachata Ausavarungnirun, Eric Shiu, Rahul Thakur, Daehyun Kim, Aki Kuusela, Allan Knies, Parthasarathy Ranganathan, and Onur Mutlu
23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar. 2018
Architecture, PIM, Mobile
- GPUpd: A Fast and Scalable Multi-GPU Architecture Using Cooperative Projection and Distribution
Youngsok Kim, Jae-Eon Jo, Hanhwi Jang, Minsoo Rhu, Hanjun Kim, and Jangwoo Kim
50th IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2017
Architecture, GPU
- CloudSwap: A Cloud-Assisted Swap Mechanism for Mobile Devices
Dongju Chae, Joonsung Kim, Youngsok Kim, Jangwoo Kim, Kyung-Ah Chang, Sang-Bum Suh, and Hyogun Lee
16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May 2016
System SW, Mobile
- Efficient Footprint Caching for Tagless DRAM Caches
Hakbeom Jang¹, Yongjun Lee¹, Jongwon Kim, Youngsok Kim, Jangwoo Kim, Jinkyu Jeong, and Jae W. Lee
22nd IEEE International Symposium on High Performance Computer Architecture (HPCA), Mar. 2016
Architecture
- DCS: A Fast and Scalable Device-Centric Server Architecture
Jaehyung Ahn¹, Dongup Kwon¹, Youngsok Kim, Mohammadamin Ajdari, Jaewon Lee, and Jangwoo Kim
48th IEEE/ACM International Symposium on Microarchitecture (MICRO), Dec. 2015
Architecture, FPGA
- Stealing Webpages Rendered on Your Browser by Exploiting GPU Vulnerabilities
Sangho Lee, Youngsok Kim, Jangwoo Kim, and Jong Kim
35th IEEE Symposium on Security and Privacy (S&P), May 2014
Security, GPU
- GPUdmm: A High-Performance and Memory-Oblivious GPU Architecture Using Dynamic Memory Management
Youngsok Kim, Jaewon Lee, Jae-Eon Jo, and Jangwoo Kim
20th IEEE International Symposium on High Performance Computer Architecture (HPCA), Feb. 2014
Architecture, GPU
- ScaleGPU: GPU Architecture for Memory-Unaware GPU Programming
Youngsok Kim, Jaewon Lee, Donggyu Kim, and Jangwoo Kim
IEEE Computer Architecture Letters (CAL), July 2013
Architecture, GPU