Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
Christian Engels, November 11, 2020 [TCS, Arithmetic Circuits]
Paper
- Chattopadhyay, Datta, Mukhopadhyay - Lower Bounds for Monotone Arithmetic Circuits via Communication Complexity
- While the paper uses probabilities that are scaled by their support to always be correct, we will ignore this scaling for clarity. For details please check the original source.
- This text also sometimes overloads $m$, using it both for a monomial and for a number.
Results
- There is a polynomial that has depth 3 arithmetic circuits of size $O(nm^4)$ and every polynomial of this type needs $2^{\Omega(\sqrt{n})}$ size monotone circuits.
- This is the first constant depth vs monotone circuit bound.
- All other known bounds have to go via depth reduction and hence will only give size $2^{\Omega(\sqrt{d}\log n)} = n^{\Omega(\sqrt{d})}$, which is polynomial for constant depth $d$.
- Both $F_{n,m} -\varepsilon P$ and $F_{n,m} + \varepsilon P$ have monotone circuit complexity $2^{\Omega(m)}$. Here $F_{n,m} = \prod_{i=1}^n (x_{i,1} + \dots + x_{i,m})$ and $P=P_{n,m}^{\bmod 3} = \sum_{\sigma:[n]\rightarrow [m],\ \oplus(\sigma) \equiv 0 \pmod 3} \prod_{i=1}^n x_{i,\sigma(i)}$.
- Hrubeš recently showed how one could prove hardness against general arithmetic circuits via monotone lower bounds.
  - He showed that if $f$ is computed efficiently by general arithmetic circuits, then $(1+\sum_i x_i)^d + \varepsilon f$ has efficient monotone circuits.
  - This theorem is a first step towards proving bounds similar to what we would need for Hrubeš' argument, but it is not quite the same.
- We transfer lower bounds on communication complexity (on a certain rectangle we cannot control) to polynomials that are given by multiplication of two polynomials with coefficients being at most one.
- This is similar to rank bounds: writing $f=\sum_{i} g_i h_i$, if the associated coefficient matrix has high rank, then the non-commutative complexity is high.
- Here, however, we deal with a more complex measure (corruption) and monotone computation.
- From this, Lemma 4.1 gives us a connection between the measure $W$ and any function $f$ fulfilling $W(\alpha \beta \cap \mathcal{K}(f^{-1}(z))) \leq \gamma W(\alpha\beta)$.
Definitions
- For a given set multilinear polynomial with variables $X = \cup_i X_i = \cup_i \lbrace x_{i,1},\dots, x_{i,m}\rbrace$ and a monomial $m$, we define $\sigma:[n]\rightarrow [m]$ by $\sigma(i)=j_i$, i.e., $m$ picks the $j_i$th variable from the set $X_i$. (The function should probably be called $\sigma_m$.)
- This forms a bijection between the space of all functions $[n]\rightarrow [m]$ ($\mathcal{F}_{n,m}$) and the set of all set multilinear monomials.
- Parity set vectors: Define the parity set vector of $m$, $\oplus(m)\in \lbrace 0,1\rbrace ^m$, where the $j$th entry is $|\sigma^{-1}(j)| \bmod 2$, i.e., the number of variables $x_{i,j}$ with this fixed $j$ occurring in $m$, taken $\bmod 2$.
- We define for a subset $S\subseteq \lbrace 0,1\rbrace^m$: $\mathcal{K}(S)=\lbrace m \mid \oplus(m)\in S\rbrace$. This just takes all monomials whose parity vector lies in $S$.
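A minimal Python sketch of these two definitions; the encoding of a monomial as the tuple $\sigma$ and the helper names are assumptions of this sketch:

```python
def parity_vector(sigma, m):
    """Parity set vector of the monomial encoded by sigma: [n] -> [m].

    Entry j records how often column j is picked, mod 2."""
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def K(S, monomials, m):
    """All monomials (given as sigma tuples) whose parity vector lies in S."""
    return [sigma for sigma in monomials if parity_vector(sigma, m) in S]
```

For example, $\sigma=(0,0,1)$ picks column $0$ twice (even) and column $1$ once, so its parity vector is $(0,1)$.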
- Rectangular corruption: We assume familiarity with rectangles and communication complexity. $\mathrm{Corr}_{\lambda,\varepsilon}^z(F) = \min_{R:\ \Pr_\lambda [R\cap F^{-1}(z)]\leq \varepsilon\Pr_\lambda [R]} \log(1/\Pr_\lambda [R])$, where the minimum ranges over rectangles $R$.
- Here $\Pr_\lambda [R]$ is the probability that an element under the distribution $\lambda$ is in $R$.
- The intuition behind this measure: $F$ is hard (has large communication complexity) if every rectangle that is approximately monochromatic has small measure.
- Small bias spaces: A multiset $\mathcal{B}\subseteq \lbrace -1,+1\rbrace ^m$ is called an $\varepsilon$-biased space if for every nonempty $S\subseteq [m]$: $\left\lvert \frac{1}{\lvert \mathcal{B}\rvert} \sum_{b\in \mathcal{B}}\prod_{i\in S} b_i \right\rvert \leq \varepsilon$.
  - I.e., no matter which $S$ we take, $\mathcal{B}$ still has small bias under the projection to $S$.
  - Such spaces of small size are deterministically constructible.
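As a toy check of the definition, the bias condition can be verified by brute force for tiny $m$ (the function names are hypothetical; this enumerates all $2^m-1$ nonempty sets $S$, so it is only feasible for small $m$):

```python
from itertools import combinations

def bias(B, S):
    """|1/|B| * sum_{b in B} prod_{i in S} b_i| for a multiset B of {-1,+1} vectors."""
    total = 0
    for b in B:
        p = 1
        for i in S:
            p *= b[i]
        total += p
    return abs(total) / len(B)

def is_eps_biased(B, m, eps):
    """Brute-force check of the epsilon-bias condition over all nonempty S."""
    for r in range(1, m + 1):
        for S in combinations(range(m), r):
            if bias(B, S) > eps:
                return False
    return True
```

The full cube $\lbrace -1,+1\rbrace^2$ is $0$-biased, while a multiset repeating a single vector is maximally biased.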
- SINK is the following boolean function: Given a boolean vector $x\in\lbrace 0,1\rbrace^{\binom{k}{2}}$, define a complete directed graph (a tournament) on $k$ vertices where $x_{i,j}=1$ orients the edge as $v_i\rightarrow v_j$ and $x_{i,j}=0$ as $v_i \leftarrow v_j$.
- SINK$(x)=1$ if there is a sink in the graph.
- A monomial $m$ is called a sink monomial if $SINK(\vec{\oplus}(m))=1$.
- We call a polynomial a non-sink polynomial if it is completely supported on non-sink monomials.
- We call a polynomial a $\delta$-non-sink polynomial if every sink monomial's coefficient lies in the interval $[0,\delta]$ and every non-sink monomial's coefficient lies in the interval $[1-\delta, 1]$.
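A small Python sketch of SINK; the encoding of the tournament as a dictionary keyed by ordered pairs $(i,j)$ with $i<j$ is an assumption of this sketch:

```python
def sink(x, k):
    """SINK on a tournament on k vertices.

    x maps pairs (i, j) with i < j to a bit: x[(i, j)] == 1 orients the
    edge v_i -> v_j, and 0 orients it v_j -> v_i.
    Returns True iff some vertex has all incident edges pointing into it."""
    for s in range(k):
        if all((x[(i, s)] == 1 if i < s else x[(s, i)] == 0)
               for i in range(k) if i != s):
            return True
    return False
```

For $k=3$, orienting all edges towards $v_2$ yields a sink, while the cyclic tournament $v_0\to v_1\to v_2\to v_0$ has none.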
- I think: With our transfer theorem, we can show that for any function $f$ with large rectangular corruption, the polynomial supported exactly on the monomials with $f(\vec{\oplus}(m))=0$ has large monotone complexity.
- Except this is not quite true, it needs to be hard for $f\circ XOR$.
The measure
- Basics:
- We define the distribution $\lambda$ on pairs $(x,y)$ by $\lambda(x,y) = \mu(x\oplus y)$.
- Define the measure $W$ of a monomial $\kappa$ as $\mu(\vec{\oplus}(\kappa)) \cdot 2^m/m^n$. Remember that $\vec{\oplus}(\kappa)$ here is the parity vector of $\kappa$.
- Notice that the scaling is needed in bounding the summation which stems from the constraints in the optimization problem in the proof.
- We define the measure $W$ of a polynomial by extending it linearly.
- Question This measure is kind of weird because of parity vectors, perhaps there is something easier we can do?
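As a toy illustration, here is the measure with the $2^m/m^n$ scaling dropped (as announced at the top); `mu` is assumed to be a dictionary mapping parity vectors to probabilities, and the helper names are hypothetical:

```python
def parity_vector(sigma, m):
    """Parity set vector of the monomial encoded by sigma."""
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def W_monomial(sigma, mu, m):
    """Measure of one monomial: mu at its parity vector (scaling dropped)."""
    return mu[parity_vector(sigma, m)]

def W_polynomial(poly, mu, m):
    """Linear extension: poly maps monomials (sigma tuples) to coefficients."""
    return sum(c * W_monomial(sigma, mu, m) for sigma, c in poly.items())
```

With the uniform `mu` every monomial gets the same weight, so $W$ of a polynomial is just the (scaled) sum of its coefficients.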
- Lemma 4.1: Let $H=\alpha \cdot \beta$ be a balanced product polynomial with coefficients at most one. If $H$ satisfies $W(H\cap \mathcal{K}(f^{-1}(z))) \leq (\eta /3) W(H)$, then $W(H)\leq O(2^{-\mathrm{Corr}_{\lambda,\eta}^z(f\circ XOR)})$.
- Proof:
- For $u\in \lbrace 0,1\rbrace ^m$, let $\tilde \alpha[u] = \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa]$, i.e., we sum over all coefficients of monomials whose parity vector equals $u$.
- Hence, we can write $\alpha \beta$ as $\left(\sum_{u\in \lbrace 0,1\rbrace ^m} \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa] \kappa \right) \cdot \left(\sum_{v\in \lbrace 0,1\rbrace ^m} \sum_{\kappa': \oplus(\kappa')=v} \beta[\kappa'] \kappa' \right)$.
- This is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} \left( \sum_{u\in \lbrace 0,1\rbrace ^m} \left( \sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa] \kappa \right) \cdot \left( \sum_{\kappa': \oplus(\kappa')=u\oplus x} \beta[\kappa'] \kappa' \right) \right)$ as we can obviously replace $v$ by $u\oplus x$.
- For the next step, notice that (ignoring coefficients) our measure on this is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} \sum_{u\in \lbrace 0,1\rbrace ^m} \sum_{\kappa: \oplus(\kappa)=u} \sum_{\kappa': \oplus(\kappa')=u\oplus x} W(\kappa\cdot \kappa')$.
- But each inner term only depends on $u\oplus (u\oplus x)=x$: notice that for monomials with $\oplus(\kappa)=u$, $\oplus(\kappa')=u'$, the parity vector of $\kappa\cdot \kappa'$ is exactly $u\oplus u'$, so $W(\kappa\cdot \kappa')$ only depends on $u\oplus u'$.
- Hence, we can write this as $W(\alpha \beta) = \sum_{x\in \lbrace 0,1\rbrace ^m} W(\kappa''_x) \left(\sum_{u\in \lbrace 0,1\rbrace ^m} \left(\sum_{\kappa: \oplus(\kappa)=u} \alpha[\kappa]\right) \left(\sum_{\kappa': \oplus(\kappa')=u\oplus x} \beta[\kappa']\right)\right)$ where $\kappa''_x$ is any monomial with $\oplus(\kappa''_x)=x$.
- Now with our new notation this is equal to $\sum_{x\in \lbrace 0,1\rbrace ^m} W(\kappa''_x) \sum_{u\in \lbrace 0,1\rbrace^m } \tilde\alpha[u] \tilde\beta[u\oplus x]$.
- We denote this sum as $W(\tilde\alpha,\tilde\beta)$ and as $W_z(\tilde\alpha, \tilde\beta)$ if we enforce that $f(x)=z$.
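The rearrangement above can be sanity-checked by brute force on a tiny instance (uniform $\mu$, scaling dropped, random coefficients; all of these are assumptions of the sketch):

```python
from itertools import product
import random

m, n = 2, 2  # tiny instance; the 2^m/m^n scaling is dropped

def parity_vector(sigma):
    vec = [0] * m
    for j in sigma:
        vec[j] ^= 1
    return tuple(vec)

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

random.seed(0)
monos = list(product(range(m), repeat=n))
alpha = {s: random.random() for s in monos}
beta = {s: random.random() for s in monos}
mu = {u: 1 / 2 ** m for u in product((0, 1), repeat=m)}  # uniform for the demo

# Left-hand side: expand alpha*beta monomial by monomial; the parity vector
# of a product monomial is the XOR of the factors' parity vectors.
lhs = sum(alpha[s] * beta[t] * mu[xor(parity_vector(s), parity_vector(t))]
          for s in monos for t in monos)

# Right-hand side: collapse to the folded coefficients alpha~, beta~.
def fold(coeffs):
    out = {}
    for s, c in coeffs.items():
        u = parity_vector(s)
        out[u] = out.get(u, 0) + c
    return out

at, bt = fold(alpha), fold(beta)
rhs = sum(mu[x] * sum(at.get(u, 0) * bt.get(xor(u, x), 0) for u in mu)
          for x in mu)
```

Substituting $v = u\oplus x$ turns the right-hand double sum back into the left-hand one, so the two values agree.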
- There is an optimization problem $B$ which maximizes $W(\tilde\alpha,\tilde\beta)$ with constraints $0\leq \tilde\alpha[u] \leq 4m^{\lvert I(\alpha)\rvert}$ and similarly for $\tilde\beta[v]$.
- This follows from a corollary I skip.
- For an optimal solution $\tilde\alpha^{*},\tilde\beta^{*}$ to $B$, all coordinates except possibly one have to be either $0$ or $4m^{\lvert I(\alpha)\rvert}$ ($4m^{\lvert I(\beta)\rvert}$, respectively).
- Now we can define a rectangle consisting of all $(u,v)$ such that $\tilde\alpha^{*}[u]\neq 0$ and $\tilde\beta^{*}[v]\neq 0$, after removing the exceptional coordinates $u^{*},v^{*}$.
- Hence, we can bound the overall weight by three sums: the sum over all $(u,v)$ in the rectangle, the sum over $v$ outside the rectangle with $u$ arbitrary, and the sum over $u$ outside the rectangle with $v$ arbitrary.
- We upper bound the last two by using the optimal solution, i.e., $u^{*}$ and $v^{*}$.
- The optimal solution might be a good upper bound already but does not have anything to do with the rectangle.
- This is just $\sum_{u\in A,v\in B} \tilde\alpha^{*}[u]\tilde\beta^{*}[v] W(u\oplus v)$. The rest follows from the upper bounds given by our optimization program and inserting the definition of the measure.
- The other two summations can be upper bounded in a similar fashion, where the sum over $W(u^{*} \oplus v)$ is upper bounded by $1$ as $\mu$ is a probability measure.
- Corollary 4.1
Construction of depth 3 formulas and the monotone lower bound
- We look at the function $SINK \circ XOR$.
- We define the distribution $\mu$ as follows:
- With probability $1/2$, sample a vertex $i\in [k]$ uniformly at random and then sample $x$ uniformly under the condition that $v_i$ is a sink; otherwise (i.e., with the remaining probability $1/2$) sample $x$ uniformly at random.
- Define $\lambda =\mu(x\oplus y)$.
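For a tiny $k$ the distribution $\mu$ can be computed exactly by enumerating all tournaments (the pair-indexed encoding is an assumption of this sketch); note that a tournament has at most one sink, so the inner sum below has at most one term:

```python
from itertools import combinations, product

k = 3
pairs = list(combinations(range(k), 2))  # edge slots of the tournament

def is_sink(x, s):
    """Vertex s is a sink under the convention x[(i, j)] == 1 <=> v_i -> v_j."""
    return all((x[(i, s)] == 1 if i < s else x[(s, i)] == 0)
               for i in range(k) if i != s)

def mu():
    """Exact distribution: w.p. 1/2 pick a uniform vertex i and a uniform
    tournament conditioned on i being a sink; w.p. 1/2 a uniform tournament."""
    dist = {}
    tournaments = [dict(zip(pairs, bits))
                   for bits in product((0, 1), repeat=len(pairs))]
    for x in tournaments:
        key = tuple(x[p] for p in pairs)
        p_uniform = 1 / len(tournaments)
        # Conditioned on i being a sink, the k-1 edges at i are fixed and
        # the remaining edges are uniform.
        p_sink = sum((1 / k) * (1 / 2 ** (len(pairs) - (k - 1)))
                     for i in range(k) if is_sink(x, i))
        dist[key] = 0.5 * p_sink + 0.5 * p_uniform
    return dist
```

For $k=3$ this puts mass $7/48$ on each of the $6$ tournaments with a sink and $1/16$ on each of the $2$ cyclic ones, so sink tournaments are noticeably heavier.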
- Theorem 5.2 (Lemma from [CMS20]): $\mathrm{Corr}_{\lambda,\nu}^1(SINK\circ XOR) = \Omega(\sqrt{m})$.
- Proof of the lower bound:
- Let $Q$ be a $\delta$-non sink polynomial.
- By a common structure theorem (Theorem 2.1) we can write $Q=\sum_{i=1}^s \alpha_i \beta_i$.
- We take the measure $W$ described in Lemma 4.1.
- $W(Q\cap \mathcal{K}(SINK^{-1}(1))) = \sum_{i=1}^s W(\alpha_i \beta_i \cap \mathcal{K}(SINK^{-1}(1)))$ (by linearity)
- $\geq \sum_{i=1}^s (\nu/3) W(\alpha_i \beta_i) - 48 s 2^{-\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR)}$ by Corollary 4.1
- Notice that Theorem 2.1 gives us the bound on the coefficients.
- Hence, solving $48 s 2^{-\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR)} \geq (\nu/3) W(Q) - W(Q\cap \mathcal{K}(SINK^{-1}(1)))$ for $s$ gives us a lower bound.
- $W(Q\cap \mathcal{K}(SINK^{-1}(1)))\leq 4\delta$.
- $W(Q)\geq (1-\delta)/3$.
- Hence $s\geq 2^{\Omega(\mathrm{Corr}_{\lambda, \nu}(SINK \circ XOR))}$, which is $2^{\Omega(\sqrt{m})}$ by Theorem 5.2.