Research Topics
- Hardware-Accelerated Database Management Systems
Database management systems (DBMSs; e.g., MySQL, MonetDB) are among the most important server workloads, as they allow users to capture and analyze large amounts of data. Fast query processing is a key design goal of DBMSs, and we seek to achieve it through hardware acceleration. In this research, we analyze the key performance bottlenecks of existing DBMSs and propose hardware acceleration techniques that fully exploit the potential of modern hardware. Example outcomes of this topic are PID-Join [SIGMOD '23], a fast in-memory join algorithm that exploits the processing-in-memory capabilities of commodity DIMM devices, and SPID-Join [SIGMOD '25], a skew-resistant in-memory join algorithm that leverages the architectural characteristics of commodity processing-in-DIMM devices.
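To make the join-acceleration setting concrete, here is a minimal sketch of a radix-partitioned hash join, the classic in-memory join pattern that partitions both relations by a hash of the join key so that only matching partitions need to be probed. This is an illustrative CPU-side sketch, not PID-Join's actual PIM implementation; the relation contents and partition count are hypothetical.

```python
def partition(relation, num_partitions):
    """Scatter (key, payload) tuples into partitions by a hash of the key."""
    parts = [[] for _ in range(num_partitions)]
    for key, payload in relation:
        parts[hash(key) % num_partitions].append((key, payload))
    return parts

def radix_hash_join(r, s, num_partitions=4):
    """Join relations R and S on key; only co-hashed partitions are probed."""
    r_parts = partition(r, num_partitions)
    s_parts = partition(s, num_partitions)
    result = []
    for r_part, s_part in zip(r_parts, s_parts):
        # Build phase: hash table over the (smaller) R partition.
        table = {}
        for key, payload in r_part:
            table.setdefault(key, []).append(payload)
        # Probe phase: look up each S tuple in the matching partition only.
        for key, payload in s_part:
            for r_payload in table.get(key, []):
                result.append((key, r_payload, payload))
    return result

r = [(1, "a"), (2, "b"), (3, "c")]
s = [(2, "x"), (3, "y"), (3, "z")]
print(sorted(radix_hash_join(r, s)))
# [(2, 'b', 'x'), (3, 'c', 'y'), (3, 'c', 'z')]
```

In a PIM setting, the appeal of this structure is that each partition pair is independent, so the build/probe work can be distributed across per-bank compute units instead of CPU cores.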
- Processor Performance Evaluation & Modeling
Cycle-level simulation is currently the most widely used performance evaluation technique, but it is slow and will become much slower as processor microarchitectures grow more complex. Performance modeling, on the other hand, raises the level of abstraction in architectural simulation, allowing us to quickly predict an application's performance. Systems can also benefit greatly from performance modeling, as it predicts application performance with little overhead. In this research, we propose fast yet accurate performance evaluation methods that can serve as alternatives to slow cycle-level simulation. GCoM [ISCA '22], a fast and accurate analytical model for evaluating the performance of modern GPUs, is a representative outcome of this research topic.
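As a flavor of what "raising the level of abstraction" means, consider the well-known roofline model, which predicts attainable throughput from just two hardware parameters and one application parameter, with no cycle-level simulation at all. This is a generic textbook model used here for illustration (the numbers are hypothetical), not GCoM's actual formulation.

```python
def roofline_gflops(peak_gflops, bandwidth_gbps, arithmetic_intensity):
    """Attainable GFLOP/s = min(compute ceiling, bandwidth * FLOPs-per-byte)."""
    return min(peak_gflops, bandwidth_gbps * arithmetic_intensity)

# A memory-bound kernel (few FLOPs per byte) hits the bandwidth ceiling...
print(roofline_gflops(1000.0, 100.0, 0.5))   # 50.0
# ...while a compute-bound kernel hits the peak-compute ceiling.
print(roofline_gflops(1000.0, 100.0, 64.0))  # 1000.0
```

Analytical GPU models extend this idea with many more microarchitectural terms (warp scheduling, cache behavior, and so on), but the payoff is the same: a closed-form prediction in microseconds instead of a simulation in hours.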
- System Software & Hardware Accelerators for Artificial Intelligence
Artificial Intelligence (AI) is one of today's most important workloads; however, executing AI workloads efficiently remains a major challenge for both hardware and software. In this research, we investigate architectural and system-level support to maximize the performance and/or energy efficiency of AI workloads on various computing platforms. We propose new hardware accelerator designs, new AI frameworks that fully utilize the underlying hardware resources, and vertical integration methods spanning both hardware and software. The research outcomes of this topic include Dataflow Mirroring [IEEE TC '23, DAC '21], SGCN [HPCA '23], GuardiaNN [Middleware '22], and μLayer [EuroSys '19].
- FPGA-Based Hardware Acceleration
One of the most rewarding moments for a computer architect is realizing a hardware proposal in practice. Because manufacturing an actual chip is very difficult, we heavily utilize Field-Programmable Gate Arrays (FPGAs), whose functionality can be reconfigured using hardware description languages (e.g., Verilog, VHDL). By implementing our hardware proposals on FPGAs, we can apply them to real systems and demonstrate their real-world effectiveness. Still, building high-performance FPGA prototypes raises various research challenges, which we seek to address. Notable FPGA prototypes we have implemented are DCS [MICRO '15] and DCS-ctrl [ISCA '18].
- Next-Generation CPU & GPU Architectures for General-Purpose Computing
CPUs and GPUs are the two most widely deployed processors, found in most desktops, servers, and mobile devices. Still, their architectures continue to change as their primary target workloads change over time. In this research, we design and propose next-generation CPU and GPU architectures to accommodate the ever-increasing performance demands of emerging CPU and GPU workloads. We pursue both software- and hardware-oriented approaches: prototyping software-based performance optimizations on real CPUs and GPUs, and evaluating architectural extensions using cycle-level simulators such as gem5 and GPGPU-Sim. Examples of this research topic include GPUdmm [HPCA '14] and ScaleGPU [CAL '14].
- Next-Generation GPU Architectures for High-Performance Rasterization & Ray Tracing
Have you ever wondered how your GPU produces a frame that gets displayed on your monitor? Graphics workloads such as games are the primary acceleration target of GPUs, and GPU microarchitectures have evolved to maximize throughput when processing millions of vertices, primitives (e.g., triangles), fragments, and pixels. In this research, we seek to propose new GPU microarchitectures that improve the performance of real-time rasterization and ray tracing, the two most widely used graphics techniques for implementing rich graphics effects. We focus on both: rasterization has long been the dominant technique for real-time graphics, whereas ray tracing is becoming increasingly popular for its ability to produce photorealistic frames. GPUpd [MICRO '17] is an example outcome of this research topic.
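To illustrate the kind of work a rasterizer throughput-optimizes, here is a minimal sketch of triangle coverage testing with edge functions: a pixel is inside a triangle if it lies on the same side of all three edges, a test the GPU's fixed-function rasterizer evaluates for many pixels in parallel. The triangle coordinates and raster size below are hypothetical, and real rasterizers add tie-breaking, clipping, and hierarchical traversal on top of this core test.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area (cross product) of edge A->B with point P: >= 0 means
    P is on or to the left of the edge for counter-clockwise winding."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def rasterize(tri, width, height):
    """Return the set of pixel coordinates whose centers a CCW triangle covers."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    covered = set()
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            # Inside iff the center is on the inner side of all three edges.
            if (edge(x0, y0, x1, y1, px, py) >= 0 and
                edge(x1, y1, x2, y2, px, py) >= 0 and
                edge(x2, y2, x0, y0, px, py) >= 0):
                covered.add((x, y))
    return covered

pixels = rasterize(((0.0, 0.0), (8.0, 0.0), (0.0, 8.0)), 8, 8)
print(len(pixels))  # 36: the lower-left triangular half of the 8x8 raster
```

Because each pixel's test is independent, this loop nest maps naturally onto wide parallel hardware, which is exactly why rasterization throughput scales so well on GPUs.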