Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Christian Engels, November 11, 2020 [TCS, Arithmetic Circuits]
Paper
- Chattopadhyay, Datta, Mukhopadhyay - Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
- While the paper uses probabilities that are scaled by their support to always be correct, we will ignore this scaling for clarity. For details please check the original source.
- This text also sometimes overloads $m$, using it both for a monomial and for a number.
Results
- There is a polynomial that has depth 3 arithmetic circuits of size $O(nm^4)$ and every polynomial of this type needs $2^{\Omega(\sqrt{n})}$ size monotone circuits.
- This is the first constant depth vs monotone circuit bound.
- All other known bounds have to go via depth reduction and hence will only give size $2^{\Omega(\sqrt{d}\log n)} = n^{\Omega(\sqrt{d})}$, which is polynomial for constant depth $d$.
- Both $F_{n,m} -\varepsilon P$ and $F_{n,m} + \varepsilon P$ have monotone circuit complexity $2^{\Omega(m)}$. Here $F_{n,m} = \prod_{i=1}^n (x_{i,1} + \dots + x_{i,m})$ and $P=P_{n,m}^{\bmod 3} = \sum_{\sigma:[n]\rightarrow [m],\ \oplus(\sigma) \equiv 0 \pmod 3} \prod_{i=1}^n x_{i,\sigma(i)}$.
- Hrubeš recently showed how one could prove hardness against general arithmetic circuits via monotone lower bounds.
  - He showed that if $f$ is computed efficiently by general arithmetic circuits, then $(1+\sum_i x_i)^d + \varepsilon f$ has efficient monotone circuits.
  - This theorem is a first step towards proving bounds similar to what we would need for Hrubeš' argument, but it is not quite the same.
- We transfer lower bounds on communication complexity (on a certain rectangle we cannot control) to polynomials that are given by multiplication of two polynomials with coefficients being at most one.
- This is similar to rank bounds: writing $f=\sum_{i} g_i h_i$, if the associated coefficient matrix has high rank, then the non-commutative complexity is high.
- Here, however, we deal with a more complex measure (corruption) and monotone computation.
- From this, Lemma 4.1 gives us a connection between the measure $W$ and any function $f$ fulfilling $W(\alpha \beta \cap \mathcal{K}(f^{-1}(z))) \leq \gamma W(\alpha\beta)$.
Definitions
- For a given set multilinear polynomial with variables $X = \cup_i X_i = \cup_i \lbrace x_{i,1},\dots, x_{i,m}\rbrace$ and a monomial $m$, we define $\sigma:[n]\rightarrow [m]$ by $\sigma(i)=j_i$, i.e., $m$ picks the $j_i$th variable from the set $X_i$. (The function should probably be called $\sigma_m$.)
- This forms a bijection between the space of all functions $[n]\rightarrow [m]$ ($\mathcal{F}_{n,m}$) and the set of all set multilinear monomials.
- Parity set vectors: Define the parity set vector of $m$, $\oplus(m)\in \lbrace 0,1\rbrace ^m$, where the $j$th entry is $|\sigma^{-1}(j)| \bmod 2$, i.e., the number of variables $x_{i,j}$ with this fixed $j$ occurring in $m$, taken $\bmod 2$.
- We define for a subset $S\subseteq \lbrace 0,1\rbrace^m$: $\mathcal{K}(S)=\lbrace m \mid \oplus(m)\in S\rbrace$. This just takes all monomials whose parity vector lies in $S$.
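A minimal Python sketch of these two definitions; the encoding of a monomial as the tuple $\sigma$ and the helper names are assumptions of this sketch:

```python
def parity_vector(sigma, m):
    """Parity set vector of the monomial encoded by sigma: [n] -> [m].

    Entry j records how often column j is picked, mod 2."""
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def K(S, monomials, m):
    """All monomials (given as sigma tuples) whose parity vector lies in S."""
    return [sigma for sigma in monomials if parity_vector(sigma, m) in S]
```

For example, $\sigma=(0,0,1)$ picks column $0$ twice (even) and column $1$ once, so its parity vector is $(0,1)$.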
- Rectangular corruption: We assume familiarity with rectangles and communication complexity. $\mathrm{Corr}_{\lambda,\varepsilon}^z(F) = \min_{R:\ \Pr_\lambda [R\cap F^{-1}(z)]\leq \varepsilon\Pr_\lambda [R]} \log(1/\Pr_\lambda [R])$, where the minimum ranges over rectangles $R$.
- Here $\Pr_\lambda [R]$ is the probability that an element under the distribution $\lambda$ is in $R$.
- The intuition behind this measure: $F$ is hard (has large communication complexity) if every rectangle that is approximately monochromatic has small measure.
- Small bias spaces: A multiset $\mathcal{B}\subseteq \lbrace -1,+1\rbrace ^m$ is called an $\varepsilon$-biased space if for every nonempty $S\subseteq [m]$: $\left\lvert \frac{1}{\lvert \mathcal{B}\rvert} \sum_{b\in \mathcal{B}}\prod_{i\in S} b_i \right\rvert \leq \varepsilon$.
  - I.e., no matter which $S$ we take, $\mathcal{B}$ still has small bias under the projection to $S$.
  - Such spaces of small size are deterministically constructible.
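As a toy check of the definition, the bias condition can be verified by brute force for tiny $m$ (the function names are hypothetical; this enumerates all $2^m-1$ nonempty sets $S$, so it is only feasible for small $m$):

```python
from itertools import combinations

def bias(B, S):
    """|1/|B| * sum_{b in B} prod_{i in S} b_i| for a multiset B of {-1,+1} vectors."""
    total = 0
    for b in B:
        p = 1
        for i in S:
            p *= b[i]
        total += p
    return abs(total) / len(B)

def is_eps_biased(B, m, eps):
    """Brute-force check of the epsilon-bias condition over all nonempty S."""
    for r in range(1, m + 1):
        for S in combinations(range(m), r):
            if bias(B, S) > eps:
                return False
    return True
```

The full cube $\lbrace -1,+1\rbrace^2$ is $0$-biased, while a multiset repeating a single vector is maximally biased.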
- SINK is the following boolean function: Given a boolean vector $x\in\lbrace 0,1\rbrace^{\binom{k}{2}}$, define a complete directed graph (a tournament) on $k$ vertices where $x_{i,j}=1$ orients the edge as $v_i\rightarrow v_j$ and $x_{i,j}=0$ as $v_i \leftarrow v_j$.
- SINK$(x)=1$ if there is a sink in the graph.
- A monomial $m$ is called a sink monomial if $SINK(\vec{\oplus}(m))=1$.
- We call a polynomial a non-sink polynomial if it is completely supported on non-sink monomials.
- We call a polynomial a $\delta$-non-sink polynomial if every sink monomial's coefficient lies in the interval $[0,\delta]$ and every non-sink monomial's coefficient lies in the interval $[1-\delta, 1]$.
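A small Python sketch of SINK; the encoding of the tournament as a dictionary keyed by ordered pairs $(i,j)$ with $i<j$ is an assumption of this sketch:

```python
def sink(x, k):
    """SINK on a tournament on k vertices.

    x maps pairs (i, j) with i < j to a bit: x[(i, j)] == 1 orients the
    edge v_i -> v_j, and 0 orients it v_j -> v_i.
    Returns True iff some vertex has all incident edges pointing into it."""
    for s in range(k):
        if all((x[(i, s)] == 1 if i < s else x[(s, i)] == 0)
               for i in range(k) if i != s):
            return True
    return False
```

For $k=3$, orienting all edges towards $v_2$ yields a sink, while the cyclic tournament $v_0\to v_1\to v_2\to v_0$ has none.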
- I think: With our transfer theorem, we can show that for any function $f$ with large rectangular corruption, the polynomial supported exactly on the monomials with $f(\vec{\oplus}(m))=0$ has large monotone complexity.
- Except this is not quite true, it needs to be hard for $f\circ XOR$.
The measure
- Basics:
- We define the distribution $\lambda$ on pairs $(x,y)$ by $\lambda(x,y) = \mu(x\oplus y)$.
- Define the measure $W$ of a monomial $\kappa$ as $\mu(\vec{\oplus}(\kappa)) \cdot 2^m/m^n$. Remember that $\vec{\oplus}(\kappa)$ here is the parity vector of $\kappa$.
- Notice that the scaling is needed in bounding the summation which stems from the constraints in the optimization problem in the proof.
- We define the measure $W$ of a polynomial by extending it linearly.
- Question This measure is kind of weird because of parity vectors, perhaps there is something easier we can do?
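As a toy illustration, here is the measure with the $2^m/m^n$ scaling dropped (as announced at the top); `mu` is assumed to be a dictionary mapping parity vectors to probabilities, and the helper names are hypothetical:

```python
def parity_vector(sigma, m):
    """Parity set vector of the monomial encoded by sigma."""
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def W_monomial(sigma, mu, m):
    """Measure of one monomial: mu at its parity vector (scaling dropped)."""
    return mu[parity_vector(sigma, m)]

def W_polynomial(poly, mu, m):
    """Linear extension: poly maps monomials (sigma tuples) to coefficients."""
    return sum(c * W_monomial(sigma, mu, m) for sigma, c in poly.items())
```

With the uniform `mu` every monomial gets the same weight, so $W$ of a polynomial is just the (scaled) sum of its coefficients.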
- Lemma 4.1: Let $H=\alpha \cdot \beta$ be a balanced product polynomial with coefficients at most one. If $H$ satisfies $W(H\cap \mathcal{K}(f^{-1}(z))) \leq (\eta /3) W(H)$, then $W(H)\leq O(2^{-\mathrm{Corr}_{\lambda,\eta}^z(f\circ XOR)})$.
- Proof:
- For $u\in \lbrace 0,1\rbrace ^m$, let $\tilde \alpha[u] = \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa]$, i.e., we sum over all coefficients of monomials whose parity vector equals $u$.
- Hence, we can write $\alpha \beta$ as $\left(\sum_{u\in \lbrace 0,1\rbrace ^m} \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa] \kappa \right) \cdot \left(\sum_{v\in \lbrace 0,1\rbrace ^m} \sum_{\kappa': \oplus(\kappa')=v} \beta[\kappa'] \kappa' \right)$.
- This is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} \left( \sum_{u\in \lbrace 0,1\rbrace ^m} \left( \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa] \kappa \right) \cdot \left( \sum_{\kappa': \oplus(\kappa')=u\oplus x} \beta[\kappa'] \kappa' \right) \right)$ as we can obviously replace $v$ by $u\oplus x$.
- For the next step, notice that (ignoring coefficients) our measure on this is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} \sum_{u\in \lbrace 0,1\rbrace ^m} \sum_{\kappa: \oplus(\kappa)=u} \sum_{\kappa': \oplus(\kappa')=u\oplus x} W(\kappa\cdot \kappa')$.
- But each inner term only depends on $u\oplus (u\oplus x)=x$: notice that for monomials with $\oplus(\kappa)=u$, $\oplus(\kappa')=u'$, the parity vector of $\kappa\cdot \kappa'$ is exactly $u\oplus u'$, so $W(\kappa\cdot \kappa')$ only depends on $u\oplus u'$.
- Hence, we can write this as $W(\alpha \beta) = \sum_{x\in \lbrace 0,1\rbrace ^m} W(\kappa''_x) \left(\sum_{u\in \lbrace 0,1\rbrace ^m} \left(\sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa]\right) \left(\sum_{\kappa': \oplus(\kappa')=u\oplus x} \beta[\kappa']\right)\right)$ where $\kappa''_x$ is any monomial with $\oplus(\kappa''_x)=x$.
- Now with our new notation this is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} W(\kappa''_x) \sum_{u\in \lbrace 0,1\rbrace^m } \tilde\alpha[u] \tilde\beta[u\oplus x]$.
- We denote this sum as $W(\tilde\alpha,\tilde\beta)$ and as $W_z(\tilde\alpha, \tilde\beta)$ if we enforce that $f(x)=z$.
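The rearrangement above can be sanity-checked by brute force on a tiny instance (uniform $\mu$, scaling dropped, random coefficients; all of these are assumptions of the sketch):

```python
from itertools import product
import random

m, n = 2, 2  # tiny instance; the 2^m/m^n scaling is dropped

def parity_vector(sigma):
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

random.seed(0)
monos = list(product(range(m), repeat=n))
alpha = {s: random.random() for s in monos}
beta = {s: random.random() for s in monos}
mu = {u: 1 / 2 ** m for u in product((0, 1), repeat=m)}  # uniform for the demo

# Left-hand side: expand alpha*beta monomial by monomial; the parity vector
# of a product monomial is the XOR of the factors' parity vectors.
lhs = sum(alpha[s] * beta[t] * mu[xor(parity_vector(s), parity_vector(t))]
          for s in monos for t in monos)

# Right-hand side: collapse to the folded coefficients alpha~, beta~.
def fold(coeffs):
    out = {}
    for s, c in coeffs.items():
        u = parity_vector(s)
        out[u] = out.get(u, 0) + c
    return out

at, bt = fold(alpha), fold(beta)
rhs = sum(mu[x] * sum(at.get(u, 0) * bt.get(xor(u, x), 0) for u in mu)
          for x in mu)
```

Substituting $v = u\oplus x$ turns the right-hand double sum back into the left-hand one, so the two values agree.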
- There is an optimization problem $B$ which maximizes $W(\tilde\alpha,\tilde\beta)$ with constraints $0\leq \tilde\alpha[u] \leq 4m^{\lvert I(\alpha)\rvert}$ and similarly for $\tilde\beta[v]$.
- This follows from a corollary I skip.
- For an optimal solution $\tilde\alpha^{*},\tilde\beta^{*}$ to $B$, all coordinates except possibly one have to be either $0$ or $4m^{\lvert I(\alpha)\rvert}$ ($4m^{\lvert I(\beta)\rvert}$, respectively).
- Now we can define a rectangle consisting of all $(u,v)$ such that $\tilde\alpha^{*}[u]\neq 0$ and $\tilde\beta^{*}[v]\neq 0$, after removing the exceptional coordinates $u^{*},v^{*}$.
- Hence, we can bound the overall weight by three sums: the sum over all $(u,v)$ in the rectangle, the sum over $v$ outside the rectangle with $u$ arbitrary, and the sum over $u$ outside the rectangle with $v$ arbitrary.
- We upper bound the last two by using the optimal solution, i.e., $u^{*}$ and $v^{*}$.
- The optimal solution might be a good upper bound already but does not have anything to do with the rectangle.
- This is just $\sum_{u\in A,v\in B} \tilde\alpha^{*}[u]\tilde\beta^{*}[v] W(u\oplus v)$. The rest follows from the upper bounds given by our optimization program and inserting the definition of the measure.
- The other two summations can be upper bounded in a similar fashion, where the sum over $W(u^{*} \oplus v)$ is upper bounded by $1$ as $\mu$ is a probability measure.
- Corollary 4.1
Construction of depth 3 formulas and the monotone lower bound
- We look at the function $SINK \circ XOR$.
- We define the distribution $\mu$ as follows:
- With probability $1/2$, sample a vertex $i\in [k]$ uniformly at random and then sample $x$ uniformly under the condition that $v_i$ is a sink; otherwise (i.e., with the remaining probability $1/2$) sample $x$ uniformly at random.
- Define $\lambda =\mu(x\oplus y)$.
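For a tiny $k$ the distribution $\mu$ can be computed exactly by enumerating all tournaments (the pair-indexed encoding is an assumption of this sketch); note that a tournament has at most one sink, so the inner sum below has at most one term:

```python
from itertools import combinations, product

k = 3
pairs = list(combinations(range(k), 2))  # edge slots of the tournament

def is_sink(x, s):
    """Vertex s is a sink under the convention x[(i, j)] == 1 <=> v_i -> v_j."""
    return all((x[(i, s)] == 1 if i < s else x[(s, i)] == 0)
               for i in range(k) if i != s)

def mu():
    """Exact distribution: w.p. 1/2 pick a uniform vertex i and a uniform
    tournament conditioned on i being a sink; w.p. 1/2 a uniform tournament."""
    dist = {}
    tournaments = [dict(zip(pairs, bits))
                   for bits in product((0, 1), repeat=len(pairs))]
    for x in tournaments:
        key = tuple(x[p] for p in pairs)
        p_uniform = 1 / len(tournaments)
        # Conditioned on i being a sink, the k-1 edges at i are fixed and
        # the remaining edges are uniform.
        p_sink = sum((1 / k) * (1 / 2 ** (len(pairs) - (k - 1)))
                     for i in range(k) if is_sink(x, i))
        dist[key] = 0.5 * p_sink + 0.5 * p_uniform
    return dist
```

For $k=3$ this puts mass $7/48$ on each of the $6$ tournaments with a sink and $1/16$ on each of the $2$ cyclic ones, so sink tournaments are noticeably heavier.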
- Theorem 5.2 (Lemma from [CMS20]): $\mathrm{Corr}_{\lambda,\nu}^1(SINK\circ XOR) = \Omega(\sqrt{m})$.
- Proof of the lower bound:
- Let $Q$ be a $\delta$-non sink polynomial.
- By a common structure theorem (Theorem 2.1) we can write $Q=\sum_{i=1}^s \alpha_i \beta_i$.
- We take the measure $W$ described in Lemma 4.1.
- $W(Q\cap \mathcal{K}(SINK^{-1}(1))) = \sum_{i=1}^s W(\alpha_i \beta_i \cap \mathcal{K}(SINK^{-1}(1)))$ (by linearity)
- $\geq \sum_{i=1}^s (\nu/3) W(\alpha_i \beta_i) - 48 s 2^{-\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR)}$ by Corollary 4.1
- Notice that Theorem 2.1 gives us the bound on the coefficients.
- Hence, solving $48 s 2^{-\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR)} \geq (\nu/3) W(Q) - W(Q\cap \mathcal{K}(SINK^{-1}(1)))$ for $s$ gives us a lower bound.
- $W(Q\cap \mathcal{K}(SINK^{-1}(1)))\leq 4\delta$.
- $W(Q)\geq (1-\delta)/3$.
- Hence $s\geq 2^{\Omega(\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR))}$, which is $2^{\Omega(\sqrt{m})}$ by Theorem 5.2.