Advanced Computer Architecture
The course explores advanced and bleeding-edge techniques used for architecting CPUs, GPUs, and accelerators for ML and other computation domains. Exact topics vary year-to-year as research frontiers advance, but typically include: advanced instruction flow and data flow techniques (e.g., trace cache), high-performance memory hierarchy (e.g., advanced cache replacement and prefetching), special-purpose accelerator architectures (e.g., for graphics and ML), non-von-Neumann architectures (e.g., dataflow machines), hardware support for operating systems and programmability (e.g., transactional memory), as well as microarchitecture-level security exploits and mitigations (e.g., Spectre and friends). The workload consists of (i) studying and discussing recent papers from top-tier computer architecture venues such as ISCA, MICRO, ASPLOS, and HPCA (about 4 papers per week), and (ii) a course project in which students will typically develop a new technique to improve an existing computing architecture, then implement and evaluate it in a microarchitecture simulator (such as gem5).
Pre-requisites: Students must either (i) have taken CPEN 411, or (ii) have a prior in-depth understanding of the operation of modern CPUs, including register renaming, out-of-order execution, superscalar issue, speculation, and memory hierarchy. We will assume that you know these topics cold, so if you do not feel confident, take CPEN 411 beforehand to learn them (even as a graduate student).