Abstract: For a variety of ML applications, general matrix multiply (GEMM) based on dot products is the most computationally intensive operation. This paper presents a microarchitecture exploration of ...
Abstract: Mixed-precision computation, which uses multiple precisions within a single code, is being studied to increase computational speed and energy efficiency. It typically uses the IEEE ...