The Space Complexity of Sampling

Christian Engels August 28, 2021 [TCS]

2021-07-22-CGZ-The Space Complexity of Sampling

Explicit boolean function that cannot be computed by a width $2^{\Omega(n)}$ read-once branching program but can be easily sampled by ROBPs.
An explicit boolean function such that any distribution by a width $2^{\Omega(n)}$ width ROBP has statistical distance $1/2-2^{-\Omega(n)}$ from the uniform distribution.
- Notice that sampling can be easier than computing.
- (x,XOR(x)) is hard to compute by AC0 circuits but (U,XOR(U)) where $U$ is the uniform distribution is easy to sample by AC0 circuits.
Same but with distance to a good code.
Asking questions similar to Viola20 answered but in the limited space model.

Two models:
Complex Samplers:
- ROBPs without input where every vertex can have an arbitrary number of outputs.
- Every vertex has a distribution.
- Every edge is also denoted by a label $0,1$.
- The sampled output is taking a random walk (according to the distributions) and the concatenated labels.
Complex Samplers and ROBP are equivalent up to small loss in parameters with same width restriction (for sampling).
- This is proven in Theorem 6
Simple Samplers:
- Simple Samplers are Complex Samplers where the outgoing degree of every vertex is at most 2.
Simple Samplers are the same as computing with ROBP (on uniform input distribution).

Address function: Take input $x||a$ with $a=\log |x|$ and size $n=k+\log k$ and output $x_{\text{bin}^{-2}(a)}$.
- This function separates read-once and read-twice branching programs.
- Also separates simple and complex samplers.
Simple samplers cannot the distribution $(U_n, address(U_n))$.
- Assume the width is $2^k-1$
- There is a single unique path for every output.
- At layer $V_k$ there must be two different sequences of bits $x,y\in {0,1}^k$ that lead to the same vertex $v\in V_k$.
  - Because of the width restriction.
- But $x,y$ differ at some coordinate which could be exactly the width it produces.
Complex Samplers can sample this distribution in width $O(n)$.
- It is a convex combination of $2k$ distributions.
- Fix such that $x_a\in {0,1}$.
- Now every distribution is easy, fix $a$ and the bit $x_a$ and draw everything else at random.
- Notice that the target distribution is uniformly drawn by first picking a distribution as above and then draw the distribution.
- Any collection of $2k$ distributions that can be sampled in width 1 can be sampled by width $2k$ in a complex sampler.
  - Just have a deciding vertex at the start that samples uniformly these.
  - As every vertex needs an output bit, contract the first two edges together.

Questions:
- What if the address function is $a||x$ instead?
  - Then the width is obviously maximal $n$.
- Notice that the following does not work.
  - Sample the first bit $a_1$. If this is zero, sample $x_1,\dots,x_{k-1}$ bits independently
  - Otherwise, sample $a_2$ and recurse.
  - At every point we can sample $x_1,\dots,x_{k-1}$ independently and hence save.
  - This gives us a skew binary tree which has still width $\Omega(2^d)$.
- It is unclear if a lower bound for any function exist that works for every order (not necessarily for address).
  - The lower bound works by using the Computation equal to Simple Samplers argument.
  - The Computation is actually a lower bound for oblivious read-k branching program.
  - Oblivious, as in the ABP sense, i.e., the program has in every layer only one variable it reads and this does not depend on values it read so far.
- Nissan-Widgerson Generator seems to be, however, independent of the order.
  - The Nissan-Widgerson Generator is about a Turing Machine that visits every cell at most $d$ times.
  - An argument about the sequences shows the lower bound.