On Sunday 27.6.2021 at 12:00, we will host Sagie Benaim from Tel-Aviv University.
Please see the details below.
———————————–
Hey everyone,
On Sunday 30.5.2021, we will host Yedid Hoshen from the Hebrew University.
Please see the details below.
See you then,
Roni
——
Zoom:
https://us02web.zoom.us/j/
Meeting ID: 847 5832 0608
Passcode: 942440
Title:
Scaling-up Disentanglement
Abstract:
Disentangling data into attributes that are meaningful to humans is a key task in machine learning. Here, we tackle the case where supervision is available for some of the attributes but not for others. Most large-scale disentanglement methods use adversarial training, which typically yields imperfect results. We will first describe a simple, generic non-adversarial method, LORD, which outperforms the corresponding GAN-based methods. The secret sauce of this method is a latent-optimization-based information bottleneck. Unfortunately, LORD is unable to scale up to high-dimensional, multi-modal disentanglement tasks such as image translation. We therefore present an advanced framework, OverLORD, which overcomes these issues. We show that this flexible framework covers multiple image-translation settings, e.g. attribute manipulation, pose-appearance translation, segmentation-guided synthesis, and shape-texture transfer. In an extensive evaluation, we present significantly better disentanglement with higher translation quality and greater output diversity than state-of-the-art methods.
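To make the latent-optimization idea concrete, here is a minimal, hypothetical sketch in the spirit of LORD (not the authors' code): every training image gets its own learnable content code, images of the same class share a class code, and fixed-variance Gaussian noise on the content codes acts as the information bottleneck that discourages class information from leaking into them. The decoder, dimensions, and noise level `sigma` below are illustrative assumptions.

```python
# Hypothetical sketch of latent-optimization disentanglement (LORD-style).
import torch
import torch.nn as nn

n_images, n_classes = 10_000, 100
content_dim, class_dim = 128, 256

generator = nn.Sequential(          # stand-in for a convolutional decoder
    nn.Linear(content_dim + class_dim, 1024), nn.ReLU(),
    nn.Linear(1024, 64 * 64 * 3),
)
# Latent optimization: the codes themselves are learnable parameters.
content_codes = nn.Parameter(torch.randn(n_images, content_dim) * 0.01)
class_codes = nn.Parameter(torch.randn(n_classes, class_dim) * 0.01)

opt = torch.optim.Adam([content_codes, class_codes, *generator.parameters()], lr=1e-3)

def step(image_ids, class_ids, images, sigma=1.0):
    # Information bottleneck: perturb per-image codes with Gaussian noise.
    noisy_content = content_codes[image_ids] + sigma * torch.randn(len(image_ids), content_dim)
    z = torch.cat([noisy_content, class_codes[class_ids]], dim=1)
    recon = generator(z)
    loss = ((recon - images.view(len(images), -1)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```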
Hey everyone,
After the holiday, on Sunday 4.4.2021 at 12:00, we will host Ronen Basri from the Weizmann Institute.
Ronen will present his work on the connection between deep neural networks and kernel methods.
Meeting ID: 880 0879 7212
Passcode: 416934
Title:
On the Connection between Deep Neural Networks and Kernel Methods
Abstract:
Recent theoretical work has shown that massively overparameterized neural networks are equivalent to kernel regressors that use Neural Tangent Kernels (NTKs). Experiments indicate that these kernel methods perform similarly to real neural networks. My work on this subject aims to better understand the properties of NTK and relate them to properties of real neural networks. In particular, I will argue that for input data distributed uniformly on the sphere, NTK favors low-frequency predictions over high-frequency ones, potentially explaining why overparameterized networks do not overfit their training data. I will further discuss the behavior of NTK when data is distributed nonuniformly, and finally show that NTK is tightly related to the classical Laplace kernel, which has a simple closed form. Our results suggest that much insight about neural networks can be obtained from analysis of NTK.
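As a concrete illustration of the Laplace-kernel connection mentioned in the abstract, here is a small sketch (not the speaker's code) of kernel regression with the classical Laplace kernel on inputs normalized to the sphere; the data, target function, bandwidth `sigma`, and ridge term are arbitrary choices for illustration.

```python
# Illustrative kernel regression with the classical Laplace kernel.
import numpy as np

def laplace_kernel(X, Y, sigma=1.0):
    # k(x, y) = exp(-||x - y|| / sigma), the kernel's simple closed form.
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return np.exp(-d / sigma)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)       # data on the unit sphere
y = np.sin(3 * X[:, 0])                             # toy regression target

K = laplace_kernel(X, X)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(X)), y)   # near-"ridgeless" fit

X_test = rng.normal(size=(50, 5))
X_test /= np.linalg.norm(X_test, axis=1, keepdims=True)
y_pred = laplace_kernel(X_test, X) @ alpha          # kernel regression prediction
```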
Learning Club – Jonathan Berant
Learning Club – Roi Reichart
On Sunday 13.6.2021 at 12:00, we will host Roi Reichart from the Technion.
Zoom:
https://us02web.zoom.us/j/
Meeting ID: 838 6634 9257
Passcode: 754963
Title:
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Abstract:
Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all ML-based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem in estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest, and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data.
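A minimal, hypothetical sketch of the adversarial fine-tuning idea (not the authors' implementation): a gradient-reversal layer lets a probe learn to predict the treated concept while the reversed gradients push the encoder toward a representation that carries no information about it. The encoder, heads, and dimensions below are placeholders standing in for a BERT-style model.

```python
# Hypothetical gradient-reversal sketch of adversarial concept removal.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad  # reversed gradients make the encoder *remove* the concept

encoder = nn.Sequential(nn.Linear(768, 768), nn.Tanh())  # stand-in for BERT
task_head = nn.Linear(768, 2)                            # main task (e.g., sentiment)
concept_head = nn.Linear(768, 2)                         # adversarial concept probe

def loss_fn(x, y_task, y_concept):
    h = encoder(x)
    task_loss = nn.functional.cross_entropy(task_head(h), y_task)
    # The probe learns to predict the concept; through the reversal, the
    # encoder is trained to make that prediction impossible.
    adv_loss = nn.functional.cross_entropy(concept_head(GradReverse.apply(h)), y_concept)
    return task_loss + adv_loss
```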
Learning Club – Yair Carmon
Zoom:
https://us02web.zoom.us/j/
Meeting ID: 848 4708 6771
Passcode: 607548
Title:
Accuracy on the line: on the predictability of out-of-distribution generalization
Abstract:
To make machine learning reliable, we must understand generalization to out-of-distribution environments (unseen during training) in addition to the in-distribution generalization measured by standard test sets. We show empirically that, to a very good approximation, out-of-distribution performance is in fact a simple function of in-distribution performance; our experiments span a wide range of models, computer vision datasets, and types of distribution shifts. We also present a number of exceptions to this relationship and test hypotheses for their causes. Finally, we show how a simple Gaussian generative model exhibits several of the phenomena we observe, and discuss the scope and implications of our findings.
Joint work with John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang and Ludwig Schmidt.
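An illustrative sketch (not the paper's code) of how such a relationship can be used in practice: after a probit transform of the accuracies, fit a line from in-distribution to out-of-distribution accuracy across models, then predict OOD performance for a new model from its ID accuracy alone. The accuracy arrays below are hypothetical per-model evaluation results.

```python
# Illustrative "accuracy on the line" fit with a probit transform.
import numpy as np
from scipy.stats import norm

id_acc = np.array([0.70, 0.75, 0.82, 0.88, 0.93])   # hypothetical ID accuracies
ood_acc = np.array([0.45, 0.52, 0.61, 0.70, 0.79])  # matching OOD accuracies

# The probit transform tends to linearize the ID-vs-OOD relationship.
x, y = norm.ppf(id_acc), norm.ppf(ood_acc)
slope, intercept = np.polyfit(x, y, deg=1)

def predict_ood(new_id_acc):
    return norm.cdf(slope * norm.ppf(new_id_acc) + intercept)

print(predict_ood(0.90))  # predicted OOD accuracy for a model at 90% ID accuracy
```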
Learning Club – Yonathan Efroni
On Sunday 23.5.2021, we will host Yonathan Efroni from Microsoft Research (Israel/New York).
Zoom:
https://us02web.zoom.us/j/84891223876?pwd=UGcyZU8rZHY2NGpTMFNuUFIzajVRUT09
Meeting ID: 848 9122 3876
Passcode: 750812
Title:
Confidence-Budget Matching for Sequential Budgeted Learning
Abstract:
A core element in decision-making under uncertainty is the feedback on the quality of the performed actions. However, in many applications, such feedback is restricted. For example, in recommendation systems, repeatedly asking the user to provide feedback on the quality of recommendations will annoy them. In this work, we formalize decision-making problems with a querying budget, where there is a (possibly time-dependent) hard limit on the number of reward queries allowed. Specifically, we consider multi-armed bandits, linear bandits, and reinforcement learning problems. We start by analyzing the performance of 'greedy' algorithms that query a reward whenever they can. We show that in fully stochastic settings, doing so performs surprisingly well, but in the presence of any adversity it might lead to linear regret. To overcome this issue, we propose the Confidence-Budget Matching (CBM) principle, which queries rewards when the confidence intervals are wider than the inverse square root of the available budget. We analyze the performance of CBM-based algorithms in different settings and show that they perform well in the presence of adversity in the contexts, initial states, and budgets.
Joint work with Nadav Merlis, Aadirupa Saha and Shie Mannor (to appear at ICML 2021).
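A schematic sketch of the CBM rule for multi-armed bandits, following only the description in the abstract (not the authors' code): query the pulled arm's reward only when its confidence interval is wider than the inverse square root of the remaining query budget. The UCB arm selection and all constants here are illustrative assumptions; `pull_arm` is a placeholder for the environment.

```python
# Schematic Confidence-Budget Matching (CBM) for multi-armed bandits.
import numpy as np

def cbm_bandit(pull_arm, n_arms, horizon, budget, delta=0.05):
    counts = np.zeros(n_arms)   # number of *queried* pulls per arm
    means = np.zeros(n_arms)    # empirical means from queried rewards
    for t in range(horizon):
        # Optimistic (UCB-style) arm selection; unqueried pulls give no feedback.
        ucb = means + np.sqrt(2 * np.log(1 / delta) / np.maximum(counts, 1))
        ucb[counts == 0] = np.inf
        a = int(np.argmax(ucb))
        width = 2 * np.sqrt(2 * np.log(1 / delta) / max(counts[a], 1))
        if budget > 0 and width > 1 / np.sqrt(budget):  # the CBM condition
            reward = pull_arm(a)                        # spend one reward query
            budget -= 1
            counts[a] += 1
            means[a] += (reward - means[a]) / counts[a]
        else:
            pull_arm(a)                                 # pull without feedback
    return means
```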
On Sunday 2.5.2021 at 12:00, we will host Raja Giryes from Tel-Aviv University.
Zoom:
https://us02web.zoom.us/j/
Meeting ID: 850 9895 5112
Passcode: 109415
Title:
Learning Club – Asaf Noy
Title:
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Abstract:
Deep neural networks’ remarkable ability to correctly fit training data when optimized by gradient-based algorithms is yet to be fully understood. Recent theoretical results explain the convergence for ReLU networks that are wider than those used in practice by orders of magnitude. In this work, we take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time. We show that convergence to a global minimum is guaranteed for networks with widths quadratic in the sample size and linear in their depth at a time logarithmic in both. Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size. This construction can be viewed as a novel technique to accelerate training, while its tight finite-width equivalence to Neural Tangent Kernel (NTK) suggests it can be utilized to study generalization as well.
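For readers who want the claimed bounds at a glance, here is a schematic restatement in symbols of what the abstract says in words (constants and logarithmic factors omitted; the notation is ours, not the authors'): with n the sample size, L the depth, m the layer width, and T the number of gradient iterations to reach a global minimum,

```latex
\[
  m \;=\; \tilde{\Omega}\!\left(n^{2} L\right)
  \qquad\text{and}\qquad
  T \;=\; O\!\left(\log(n L)\right).
\]
```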