BIU learning club – Itay Hubara – Toward Fast and Efficient Deep Learning
December 11, 2022 @ 12:00 pm - 1:00 pm IST
Zoom link:
https://us02web.zoom.us/j/4685913265
Title:
Toward Fast and Efficient Deep Learning
Abstract:
Deep Neural Networks (DNNs) are now irreplaceable in a wide range of applications, but they demand vast computational resources. In most cases, training a complex DNN requires several machines working in parallel (most commonly via data parallelism), and deploying DNNs on devices with limited compute is challenging and often infeasible. The main computational effort stems from the massive number of multiply-accumulate operations needed to compute the weighted sums of the neurons’ inputs and the parameters’ gradients. Much work has been done to reduce network size or remove unnecessary computation; the dominant approaches are quantization and pruning. Quantization replaces a DNN’s floating-point tensors (weights, activations, and gradients) with approximate low-precision representations (e.g., INT8, INT4, FP8). Pruning takes a different path and zeroes out part of the network’s parameters or intermediate results (i.e., weights and activations). In this talk, we will go over methods for accelerating both inference and training, discussing the benefits and costs of each while focusing on some of my latest papers, including:
- Below 8-bit post-training quantization using a small (or no) calibration set (CVPR 2020, ICML 2021).
- Structured sparsity on weights, activations, and gradients for accelerating DNN training (NeurIPS 2021, under review).
- Overcoming the “generalization gap” phenomenon observed when large batches are used to exploit vast computational resources (NeurIPS 2017, CVPR 2020).
Finally, we will discuss future directions for DNN acceleration once matrix multiplication is no longer the bottleneck.
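
To make the two approaches in the abstract concrete, here is a minimal, illustrative sketch in PyTorch (my own, not code from the talk or the cited papers) of symmetric per-tensor INT8 quantization and a 2:4 structured-sparsity mask; the function names quantize_int8 and prune_2_of_4 are assumptions chosen for illustration.

```python
import torch

def quantize_int8(x: torch.Tensor):
    """Symmetric per-tensor quantization: one scale maps floats to [-127, 127]."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor):
    """Recover an approximate float tensor from the INT8 values and the scale."""
    return q.float() * scale

def prune_2_of_4(w: torch.Tensor) -> torch.Tensor:
    """Zero the two smallest-magnitude entries in every group of four weights
    (the N:M structured-sparsity pattern supported by recent accelerators).
    Assumes the number of weights is divisible by 4."""
    groups = w.reshape(-1, 4)
    keep = groups.abs().topk(2, dim=1).indices          # indices of the 2 largest
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(1, keep, True)
    return (groups * mask).reshape(w.shape)

w = torch.randn(8, 16)                                   # toy weight matrix
q, s = quantize_int8(w)
print("mean INT8 quantization error:", (dequantize(q, s) - w).abs().mean().item())
print("sparsity after 2:4 pruning:  ", (prune_2_of_4(w) == 0).float().mean().item())
```

Practical post-training quantization methods refine these scales (often per channel) using a small calibration set, which is what the sub-8-bit work listed above addresses.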
Short Bio:
Itay is a senior researcher and principal engineer at Habana Labs, where he has worked since 2016. His work focuses on accelerating neural network training and inference using low-bit representations, sparsity, and large-batch training. While working at Habana, Itay completed his Ph.D. at the Technion’s Department of Electrical Engineering, where he received several excellence awards. Itay is an active member of the MLPerf benchmarking community and has published his research in top-tier journals and conferences, among it his seminal work on binarized neural networks.