- Next-Generation Processor Architectures
CPUs and GPUs are the two most widely deployed processors, found in most desktops, servers, and mobile devices. Still, their architectures continue to change as their primary target workloads change over time. In this research, we design and propose next-generation CPU and GPU architectures to accommodate the performance demands of emerging workloads. We pursue both software- and hardware-oriented approaches by prototyping software-based performance optimizations on real systems and by evaluating architectural modifications built on top of existing CPU and GPU architectures. Examples of this research's outcomes include GPUdmm [HPCA'14] and ScaleGPU [CAL'14].
- Next-Generation GPU Architectures for High-Performance Rasterization and Ray Tracing
Have you ever wondered how your GPU produces the frames displayed on your monitor? Graphics workloads such as games are the primary acceleration target of GPUs, and GPU microarchitectures have evolved to maximize throughput when processing millions of vertices, primitives (e.g., triangles), fragments, and pixels. In this research, we propose new GPU microarchitectures that improve the performance of real-time rasterization and ray tracing, the two most widely used techniques for rendering rich graphics effects. We focus on both: rasterization has been the dominant technique for graphics, whereas ray tracing is becoming increasingly popular due to its ability to produce photorealistic frames. GPUpd [MICRO'17] is an example outcome of this research topic.
- Performance Modeling & Scheduling
Cycle-level simulation, the most widely used performance estimation technique, is slow and will become even slower as hardware complexity continues to increase. Performance modeling, on the other hand, allows us to quickly predict an application's performance by raising the level of abstraction in architectural simulation. Furthermore, systems can greatly benefit from performance modeling, as it lets them predict applications' performance with little overhead. In this research, we propose fast yet accurate performance modeling methods and exploit them to improve the schedulability achieved by existing schedulers. Our work on modeling the performance of DNN executions on GPUs [RTAS'20] is an example of this research topic.
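As a rough illustration of what raising the level of abstraction can look like, the sketch below estimates a kernel's execution time with a simple roofline-style analytical model instead of simulating it cycle by cycle. This is not the method of [RTAS'20]; the `Device` and `Kernel` types, their fields, and the numbers in the usage lines are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    # Hypothetical device description; both fields are assumptions.
    peak_flops: float      # peak compute throughput in FLOP/s
    peak_bandwidth: float  # peak memory bandwidth in bytes/s


@dataclass
class Kernel:
    # Hypothetical workload description; both fields are assumptions.
    flops: float        # floating-point operations the kernel performs
    bytes_moved: float  # bytes the kernel transfers to and from memory


def estimate_time(kernel: Kernel, device: Device) -> float:
    """Roofline-style estimate: the slower of compute and memory bounds the time."""
    compute_time = kernel.flops / device.peak_flops
    memory_time = kernel.bytes_moved / device.peak_bandwidth
    return max(compute_time, memory_time)


def bottleneck(kernel: Kernel, device: Device) -> str:
    """Report which resource limits the kernel under this simple model."""
    compute_time = kernel.flops / device.peak_flops
    memory_time = kernel.bytes_moved / device.peak_bandwidth
    return "compute" if compute_time >= memory_time else "memory"


# Usage with made-up numbers: a 1 TFLOP/s device with 100 GB/s of bandwidth.
device = Device(peak_flops=1e12, peak_bandwidth=1e11)
kernel = Kernel(flops=2e9, bytes_moved=1e9)
print(estimate_time(kernel, device), bottleneck(kernel, device))
```

A model like this takes microseconds to evaluate, which is what makes it usable inside a scheduler, but it ignores effects such as caching, latency hiding, and contention that more detailed models would need to capture.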
- System Software and Hardware Accelerators for Artificial Intelligence
Artificial Intelligence (AI) is one of the most important workloads today; however, efficient execution of AI workloads remains a major challenge for both hardware and software. In this research, we investigate architectural and system-level support to maximize the performance and/or energy efficiency of AI workload execution on various computing platforms. We propose new hardware accelerator designs, new AI frameworks that fully utilize the underlying hardware resources, and vertical integration methods spanning both hardware and software. The research outcomes of this topic include GuardiaNN [Middleware'22], Dataflow Mirroring [DAC'21], μLayer [EuroSys'19], FlexLearn [MICRO'19], and Flexon [ISCA'18].
- FPGA-Based Hardware Acceleration
One of the most rewarding moments as a computer architect is seeing our hardware proposals realized. Manufacturing an actual chip is very difficult, so we rely heavily on Field-Programmable Gate Arrays (FPGAs), whose functionality can be changed using hardware description languages (e.g., Verilog, VHDL). By implementing our proposals on FPGAs, we can apply them to real systems and demonstrate their real-world effectiveness. Still, implementing high-performance FPGA prototypes of these proposals poses various research challenges, which we seek to address. Notable FPGA prototypes we have implemented are DCS [MICRO'15] and DCS-ctrl [ISCA'18].