FOCS'23 Workshop: Exploring the Frontiers of Adaptive Robustness

Randomized algorithms are often designed and analyzed under the assumption that their internal randomness is independent of their fixed inputs. However, for complex and evolving datasets, repeated interactions with the dataset may affect the distributions of the future dataset updates or future interactions with the dataset. Thus, there has recently been substantial interest in understanding the performance of algorithms when the inputs are chosen adaptively, possibly as a function of their previous outputs, and in designing robust algorithms which provide utility guarantees even when their inputs are chosen adaptively.

The goal of this proposed workshop is to provide exposure to this emerging area in a unified manner accessible to a general theoretical audience. We will highlight new algorithmic and analytic techniques, new computational models that have been proposed and intricately analyzed, and new connections to other areas of theoretical computer science that have been observed, including:

Data structures for adaptive robustness
Adversarially-robust streaming model
Differential privacy as a tool against black-box adversaries
Cryptography as a tool against white-box time-bounded adversaries

Due to both the nascent field of study and the connections to multiple areas of TCS, we hope this workshop is of interest to both newcomers and researchers to the areas of adaptive data analysis, cryptography, differential privacy, and streaming algorithms.

The 5-hour proposed workshop will consist of tutorial talks on adaptive data analysis and the adversarially robust streaming model, followed by a number of research and survey talks on specific recent results in the area.

When	Monday-Thursday, November 6-9, 2023
Where	The workshop will be held at the Hotel Paradox, located at 611 Ocean St., Santa Cruz, CA 95060, as part of FOCS 2023.

Organizers

Jelani Nelson (UC Berkeley)
David P. Woodruff (Carnegie Mellon University)
Samson Zhou (Texas A&M University)

Schedule

Monday, November 6, 2023

9:00-9:05 am PT

Organizers

Opening Remarks

9:05-9:40 am PT

Rajesh Jayaram
Google Research

Adversarially Robust Streaming: A Survey of Recent Progress and Future Directions

Abstract

A streaming algorithm is adversarially robust if it retains its performance guarantees over a sequence of updates that are generated adaptively by an adversary that observes the previous outputs of the algorithm. Although deterministic algorithms are inherently robust, many central streaming problems provably require randomization to achieve sublinear space. Since randomized algorithms are not inherently robust to adaptive updates, this raises the question of whether sublinear space adversarially robust algorithms exist for many streaming tasks. In this talk, we will discuss recent results and central challenges for the adversarially robust streaming model, as well as directions for future work.

9:45-10:15 am PT

Thomas Steinke
Google DeepMind

Adaptive Data Analysis: An Introductory Survey

Abstract

Tuesday, November 7, 2023

9:00-9:35 am PT

Edith Cohen
Google Research and Tel Aviv University

On Robustness to Adaptive Inputs: A Case Study of CountSketch

Abstract

The performance of randomized algorithms can be considered in both adaptive and non-adaptive settings. Adaptive settings come into play when the algorithm is part of a system with feedback or when an adversary attempts to construct an input that causes the algorithm to fail. Recent research in the past decade has led to a black-box reduction, allowing k replicas of an algorithm designed for non-adaptive inputs to handle a quadratic number of adaptive inputs. However, it is important to understand how robust common practical sketch structures are to adaptive inputs and whether we can improve upon the black-box method. We investigate these questions for CountSketch, a widely-used sketching method.

(Joint work with Xin Lyu, Jelani Nelson, Tamas Sarlos, Moshe Shechner, and Uri Stemmer)

9:40-10:15 am PT

David P. Woodruff
Carnegie Mellon University

The White-Box Adversarial Data Stream Model

Abstract

There has been a flurry of recent literature studying streaming algorithms for which the input stream is chosen adaptively by a black-box adversary who observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm, rather than just the output of the algorithm. We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible.

Wednesday, November 8, 2023

9:00-9:35 am PT

Vladimir Braverman
Rice University

Adversarial Robustness of Streaming Algorithms through Importance Sampling

Abstract

In the adversarial streaming model, an adversary gives an algorithm a sequence of adaptively chosen updates u_1,...,u_n as a data stream. The goal of the algorithm is to compute or approximate some predetermined function for every prefix of the adversarial stream, but the adversary may generate future updates based on previous outputs of the algorithm. In particular, the adversary may gradually learn the random bits internally used by an algorithm to manipulate dependencies in the input. We introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace embedding, low-rank approximation, and coreset construction. Our results are based on a simple, but powerful, observation that sampling based algorithms give rise to adversarial robustness which is in contrast to sketching based algorithms, which are very prevalent in the streaming literature but suffer from adversarial attacks. In addition, we show that the well-known merge and reduce paradigm in streaming is adversarially robust. Since the merge and reduce paradigm defines coreset constructions, we thus obtain robust algorithms for k-means, k-median, k-center, Bregman clustering, projective clustering, principal component analysis (PCA) and non-negative matrix factorization.

9:40-10:15 am PT

Guy Blanc
Stanford University

Subsampling Suffices for Adaptive Data Analysis

Abstract

Ensuring that analyses performed on a dataset are representative of the entire population is one of the central problems in statistics. Most classical techniques assume that the dataset is independent of the analyst's query and break down in the common setting where a dataset is reused for multiple, adaptively chosen, queries. This problem of adaptive data analysis was formalized in the seminal works of Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014).

We identify a remarkably simple set of assumptions under which the queries will continue to be representative even when chosen adaptively: The only requirements are that each query takes as input a random subsample and outputs few bits. This result shows that the noise inherent in subsampling is sufficient to guarantee that query responses generalize. The simplicity of this subsampling-based framework allows it to model a variety of real-world scenarios not covered by prior work. We furthermore demonstrate the utility of this framework by designing a state of the art and extremely simple mechanism for answering statistical queries.

Thursday, November 9, 2023

9:00-9:35 am PT

Samson Zhou
Texas A&M University

Tight Bounds for Adversarially Robust Streams and Sliding Windows via Difference Estimators

Abstract

We introduce difference estimators for data stream computation, which provide approximations to F(v)-F(u) for frequency vectors v,u and a given function F. We show how to use such estimators to carefully trade error for memory in an iterative manner. We give the first difference estimators for the frequency moments F_p for p between 0 and 2, as well as for integers p>2. Using these, we resolve a number of central open questions in adversarial robust streaming and sliding window models.

For both models, we obtain algorithms for norm estimation whose dependence on epsilon is 1/epsilon^2, which shows, up to logarithmic factors, that there is no overhead over the standard insertion-only data stream model for these problems.

(Joint work with David P. Woodruff)

9:40-10:15 am PT

Sepehr Assadi
University of Waterloo and Rutgers University

Coloring in Graph Streams via Deterministic and Adversarially Robust Algorithms

Abstract

In recent years, there has been a growing interest in solving various graph coloring problems in the streaming model. The algorithms in this line of work are all crucially randomized, raising the natural question of how important a role randomization plays in streaming graph coloring. In this talk, we review the recent progress on this question, including fully settling the complexity of deterministic algorithms, and a three-way separation between power of randomized algorithms, adversarially robust algorithms, and deterministic algorithms.