Differentially Private Ordinary Least Squares – Talk by Or Sheffet
March 8, 2020 @ 12:00 pm - 1:00 pm IST
Title: Differentially Private Ordinary Least Squares
Abstract:
Linear regression is one of the most prevalent techniques in machine learning; however, it is also commonly used for its explanatory capabilities rather than for label prediction. Ordinary Least Squares (OLS) is often used in statistics to establish a correlation between an attribute (e.g. gender) and a label (e.g. income) in the presence of other (potentially correlated) features. OLS assumes a particular model that randomly generates the data, and derives t-values, which represent the likelihood of each real value being the true correlation. Using t-values, OLS can release a confidence interval, which is an interval on the reals that is likely to contain the true correlation; when this interval does not contain zero, we can reject the null hypothesis, as it is likely that the true correlation is non-zero.
Our work aims at achieving similar guarantees on data under differentially private estimators, that is, estimators which one can publish while adhering to the rigorous mathematical notion of differential privacy. First, we show that for well-spread data, the Gaussian Johnson-Lindenstrauss Transform (JLT) gives a very good approximation of t-values; second, when JLT approximates Ridge regression (linear regression with l_2-regularization) we derive, under certain conditions, confidence intervals using the projected data; lastly, we derive, under different conditions, confidence intervals for the “Analyze Gauss” algorithm of Dwork et al. (STOC 2014).
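For readers new to the classical (non-private) setting described in the first paragraph of the abstract, the following is a minimal sketch of how OLS t-values and confidence intervals can be computed. It is not the speaker's method and involves no differential privacy. The function names `invert` and `ols_t_values`, the synthetic data, and the use of a normal quantile as a large-sample stand-in for the exact Student-t quantile are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_values(X, y, confidence=0.95):
    """Return (beta, t_values, intervals) for the model y = X beta + noise.

    X is a list of rows; include a column of 1s if an intercept is wanted.
    """
    n, p = len(X), len(X[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    # Unbiased noise-variance estimate from the residuals (n - p degrees of freedom).
    residuals = [yi - sum(b * x for b, x in zip(beta, row))
                 for row, yi in zip(X, y)]
    sigma2 = sum(r * r for r in residuals) / (n - p)
    std_err = [math.sqrt(sigma2 * inv[j][j]) for j in range(p)]
    # t-value for the null hypothesis "coefficient j is zero".
    t_values = [b / s for b, s in zip(beta, std_err)]
    # Assumption: normal quantile approximates the Student-t quantile for large n.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    intervals = [(b - z * s, b + z * s) for b, s in zip(beta, std_err)]
    return beta, t_values, intervals


# Synthetic data (hypothetical): the label depends on x1 but not on the
# correlated feature x2.
random.seed(0)
rows, labels = [], []
for _ in range(500):
    x1 = random.gauss(0, 1)
    x2 = 0.5 * x1 + random.gauss(0, 1)
    rows.append([1.0, x1, x2])
    labels.append(1.0 + 2.0 * x1 + random.gauss(0, 1))

beta, t_values, intervals = ols_t_values(rows, labels)
for name, b, t, (lo, hi) in zip(["intercept", "x1", "x2"], beta, t_values, intervals):
    verdict = "reject null" if lo > 0 or hi < 0 else "cannot reject null"
    print(f"{name}: beta={b:.3f} t={t:.2f} CI=({lo:.3f}, {hi:.3f}) {verdict}")
```

For small samples, the exact interval would use the Student-t quantile with n - p degrees of freedom instead of the normal quantile used here.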
The Talk:
This talk is self-contained and assumes no prior knowledge of either differential privacy or ordinary least squares. Moreover, a non-negligible portion of the talk focuses on open problems, so beginning MSc and PhD students from CS, Math or Statistics are especially encouraged to participate.