TITLE:  Streaming Symmetric Norms via Measure Concentration

ABSTRACT: 

We characterize the streaming space complexity of every symmetric norm l (a norm on R^n invariant under sign-flips and coordinate-permutations), by relating this space complexity to the measure-concentration characteristics of l. Specifically, we provide nearly matching upper and lower bounds on the space complexity of calculating a (1±ϵ)-approximation to the norm of the stream, for every 0<ϵ≤1/2. (The bounds match up to poly(ϵ^−1logn) factors.) We further extend those bounds to any large approximation ratio D≥1.1, showing that the decrease in space complexity is proportional to D^2, and that this factor the best possible. All of the bounds depend on the median of l(x) when x is drawn uniformly from the l2 unit sphere. The same median governs many phenomena in high-dimensional spaces, such as large-deviation bounds and the critical dimension in Dvoretzky's Theorem. 

The family of symmetric norms contains several well-studied norms, such as all l_p norms, and indeed we provide a new explanation for the disparity in space complexity between p≤2 and p>2. In addition, we apply our general results to easily derive bounds for several norms that were not studied before in the streaming model, including the top-k norm and the k-support norm, which was recently employed for machine learning tasks.  Overall, these results make progress on two outstanding problems in the area of sublinear algorithms (Problems 5 and 30 in https://sublinear.info).  Based on paper: https://arxiv.org/abs/1511.01111. Joint work with: Jaroslaw Blasiok, Vladimir Braverman, Stephen R. Chestnut, Robert Krauthgamer.

Bio:  Lin Yang is currently a PhD student in the Computer Science Department and Physics Department of Johns Hopkins University. He will obtain two PhD degrees at the end of this year. He works on algorithms for streaming data analysis, studying the fundamental complexity of the streaming computation model as well as designing new and better algorithms for important problems. His research connects theoretical computer science with machine learning and computational cosmology in the big data regime.