Factoring vs Identity Testing
Christian Engels, August 14, 2018
Sources
- Shpilka, Volkovich - On the relation between polynomial identity testing and finding variable disjoint factors
- Sudan - Lecture Notes
- Forbes, Shpilka
- Kopparty, Saraf, Shpilka - Equivalence of Polynomial Identity Testing and Deterministic Multivariate Polynomial Factoring
Basics
- Everybody knows what factoring is for univariate polynomials.
- Multivariate polynomials.
- A polynomial is irreducible if it is not constant and cannot be factored into the product of two non-constant polynomials with coefficients in F.
- Multivariate polynomials might not have an explicit algebraic expression (rather, we might have them as zeroes of some equation system).
- Over a so-called *Unique Factorization Domain*, every polynomial can be factored and the factorization is unique.
- There exist randomized algorithms for factoring. (Kaltofen 89)
- Deterministic algorithms are unknown.
Shpilka, Volkovich
- Multilinear Circuits
- Theorem: Assume we have a deterministic PIT for a circuit of the form $C_1 + C_2\times C_3$ where $C_i$ are $(n,s,d)$ circuits with running time $T(s,d)$. Then we have a deterministic factoring algorithm that outputs the factors (in circuit form) in time polynomial in $n,d, T(s,d)$. Additionally, the size of each factor is bounded by $s$.
Preliminaries
- Polynomial sized circuits with polynomial degree on $n$ variables.
- PIT: Is such a circuit identical to the zero polynomial?
- PIT implies circuit lower bounds (boolean or algebraic) and the other way around.
- Depth of a circuit.
- If there is a deterministic algorithm that outputs "true" iff a given circuit is decomposable, then there is a deterministic algorithm that decides PIT. (This is obvious, as for new variables $y,z$, the polynomial $f+yz$ is decomposable iff $f\equiv 0$.)
- Note that it is enough to find out the index sets as we are multilinear.
- Assume the field is large enough.
- $f$ depends on $x_i$ when there exists $a,b$ such that $f(a)\neq f(a_1,\dots,a_{i-1},b,a_{i+1},\dots, a_n)$.
- $var(f) = \{ i \mid \text{$f$ depends on $x_i$} \}$.
- Variable Partition: Let $I_1,\dots,I_k$ be a partition of the variables. Then it is a variable partition of $f$ if $f=h_1(X_{I_1})\cdots h_k(X_{I_k})$.
Justifying Assignments
- Justifying assignment: We say an assignment $a$ is justifying iff for all index sets $I$, $var(f\vert_{x_I \leftarrow a_I}) = var(f)\setminus I$. In essence, replacing the variables with assignments does not kill anything but the assigned variables.
- Let there be a deterministic PIT algorithm for $(n,s,d)$ circuits. Then there exists an algorithm that returns a justifying assignment in time polynomial in $n,s, T(s,d)$ where $T(s,d)$ is the running time of the PIT algorithm.
Proof
- Prerequisites:
- Let $f_{i,a} = f(a_1,\dots,a_{i-1},x_i,a_{i+1}, \dots, a_n)$.
- We define $f_{i,X}$ as above except that $a_j=x_j$ for all $j\neq i$.
- We will study $f_{i,X}(a_i) - f_{i,X}(0)$.
- We say $a_i$ is a witness for $x_i$ if $f_{i,X}(a_i) - f_{i,X}(0) \not\equiv 0$ or $i\not\in var(f)$.
- We say $a$ is a witness for $f$ if $a_i$ is a witness for $x_i$ for all $i$.
- Notice that a witness is not a justifying assignment; it just says that the line stays intact.
- Lemma 2: $a$ is a justifying assignment if $a$ is a witness and for every $i\in var(f)$, $f_{i,a}(a_i)-f_{i,a}(0) \neq 0$.
- Proof:
- Hence, $f_{i,X}(x_i)$ is of the form $x_iP(a) + Q(a)$ where $P(a), Q(a)$ are non-zero.
- Hence, it is "justifying" for all sets of size $n-1$ and hence justifying for all sets.
- Lemma 3: Let $W\subseteq \mathbb{F}$ be a subset of size $d+1$ then $W^n$ contains a witness for $f$.
- Proof:
- Notice that $f_{i,w}(a) - f_{i,w}(0)$ is a univariate polynomial of degree $d$.
- Hence such a set needs to exist, as a univariate with degree $d$ has at most $d$ roots.
- Continue this argument for all $i\in [n]$.
- Lemma 4: For all $i\neq j$, and $\alpha, \beta\in \mathbb{F}$, $(f_{i,X}(\alpha) - f_{i,X}(0))\vert_{x_j \leftarrow \beta} = f_{i,X}(\alpha)\vert_{x_j \leftarrow \beta} - f_{i,X}(0)\vert_{x_j \leftarrow \beta}$.
- Proof:
- Assume w.l.o.g. the left term has $x_j$ in it.
- Then we know that there exists $x_j$ in the final polynomial.
- Hence, replacing it earlier does not cancel it.
- Similarly in the other direction.
- Lemma 5: Given $f$ and a PIT for $\partial f$, we can find a witness for $f$ in time polynomial in $n,d,T(s,d)$.
- Proof:
- For every $i\in [n]$, check for every $c\in [d+1]$ if $f_{i,X}(c) - f_{i,X}(0)\not\equiv 0$.
- Lemma 3 says we will find such an assignment and with a similar argument as in the proof, we will find it in short time.
- Algorithm:
- Find a witness $\alpha$ for $f$ as in Lemma 5.
- Set $g_i=f_{i,X}(\alpha_i) - f_{i,X}(0)$ for all $i$.
- For $j=1,\dots, n$ do
- Find a value $c$ such that for every $k\neq j$ with $g_k\not\equiv 0$, we have $g_k\vert_{x_j \leftarrow c}\not\equiv 0$.
- Set $a_j \leftarrow c$.
- Set $g_k \leftarrow g_k\vert_{x_j \leftarrow c}$ for all $k\neq j$.
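The steps above can be sketched in code. This is a toy illustration, not the paper's implementation: $f$ is assumed to be given as a black-box Python function, and the PIT oracle is replaced by naive evaluation over the grid $\{0,\dots,d\}^n$ (exponential, but a correct identity test for individual degree at most $d$); all function names are hypothetical.

```python
from itertools import product

def line_diff(f, i, c, n):
    # x -> f(x with x_i = c) - f(x with x_i = 0), i.e. f_{i,X}(c) - f_{i,X}(0)
    def g(x):
        y = list(x)
        y[i] = c
        v = f(tuple(y))
        y[i] = 0
        return v - f(tuple(y))
    return g

def find_justifying_assignment(f, n, d):
    """Witness search (Lemma 5) plus sequential fixing, for a multilinear
    black-box f of individual degree <= d."""
    def is_nonzero(g):
        # Naive stand-in for the PIT oracle.
        return any(g(x) != 0 for x in product(range(d + 1), repeat=n))

    # Find a witness alpha coordinate by coordinate.
    alpha = [0] * n
    for i in range(n):
        for c in range(1, d + 2):
            if is_nonzero(line_diff(f, i, c, n)):
                alpha[i] = c
                break  # alpha[i] stays 0 if f does not depend on x_i

    # g_i = f_{i,X}(alpha_i) - f_{i,X}(0); fix x_1, ..., x_n one at a time.
    g = [line_diff(f, i, alpha[i], n) for i in range(n)]
    live = [i for i in range(n) if is_nonzero(g[i])]
    a = [0] * n
    for j in range(n):
        for c in range(d * (n - 1) + 2):  # enough values by the degree bound
            def fixed(h, j=j, c=c):
                return lambda x: h(tuple(c if t == j else x[t] for t in range(n)))
            if all(is_nonzero(fixed(g[k])) for k in live if k != j):
                a[j] = c
                g = [fixed(h) if k != j else h for k, h in enumerate(g)]
                break
    return a
```

For example, for $f(x) = x_0x_1 + x_2$ (individual degree $1$) this returns $(1,1,0)$, which is indeed a justifying assignment.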
- Theorem 2.4: The algorithm finds a justifying assignment in the usual time bounds.
- Proof:
- The proof has two parts; the running time is obvious.
- First:
- Notice that $g_i$ has degree bounded by $d$ and there are at most $n-1$ many values.
- Hence at most $d(n-1)+1$ many values for $c$ have to be searched in every step.
- Second:
- If $g_i$ is non-zero in step $j$ then it is non-zero after step $j$ by construction.
- $g_i$ is initially non-zero as $\alpha$ is a witness.
- After all steps, we found an assignment such that for all $i$ it holds that $f_{i,a}(\alpha_i) - f_{i,a}(0)\neq 0$.
- Lemma 2 finishes the proof.
Decomposition
Algorithm (given a justifying assignment $a$):
- $\mathcal{I}=\emptyset$, $J=[n]$.
- Set $x_n=a_n$ and recursively compute the variable-partition of $C\vert_{x_n \leftarrow a_n}$ or return the singleton set $\{\{1\}\}$.
- Let $\mathcal{I}'$ be the resulting partition.
- For every $I\in \mathcal{I}'$ check if $C(a)\cdot C \equiv C\vert_{x_I \leftarrow a_I} \cdot C\vert_{x_{[n]\setminus I} \leftarrow a_{[n]\setminus I}}$.
- If yes, add $I$ to $\mathcal{I}$ and set $J\leftarrow J\setminus I$.
- Otherwise move to the next $I$.
- Add the remaining elements to $\mathcal{I}$ as a single set ($\mathcal{I}\cup \{J\}$).
- Lemma 3.2: Let $a$ be a justifying assignment for $f$. Then for all $I$: $f(a) \cdot f \equiv f\vert_{x_I\leftarrow a_I} \cdot f\vert_{x_{[n]\setminus I}\leftarrow a_{[n]\setminus I}}$ iff $I$ is a disjoint union of sets from the variable-partition of $f$.
- Meaning:
- Notice, that this says it could be a really bad factoring algorithm (i.e., not factoring at all).
- Notice that in the first recursion step we have $n$ as a special variable. If some variable partition $I$ is not a valid partition by this lemma, then it means it has to be in a set with $n$.
- Proof:
- $\Rightarrow$ (Assume equality holds)
- As $a$ is a justifying assignment $f\vert_{x_I\leftarrow a_I}$ and $f\vert_{x_{[n]\setminus I} \leftarrow a_{[n]\setminus I}}$ are not equivalent to the zero polynomial.
- Hence, $f(a)$ is not equal to zero (by equality).
- Hence, $f(x) = f_{x_{[n]\setminus I}\leftarrow a_{[n]\setminus I}} \cdot f\vert_{x_I \leftarrow a_I}/f(a)$.
- Then the results follows by uniqueness of decomposition.
- Notice that this just says, we found a factorization (a unique factorization but not the smallest).
- $\Leftarrow$ (decomposition to equality)
- We have the decomposition $f(x)=h(X_I)g(X_{[n]\setminus I})$.
- Now plugging in our justifying assignment results in $f\vert_{x_I\leftarrow a_I} = h(a_I) g(X_{[n]\setminus I})$ and $f\vert_{x_{[n]\setminus I} \leftarrow a_{[n]\setminus I}} = h(X_I) g(a_{[n]\setminus I})$.
- Hence, $f\vert_{x_I\leftarrow a_I} \cdot f\vert_{x_{[n]\setminus I} \leftarrow a_{[n]\setminus I}} = h(a_I) g(X_{[n]\setminus I}) \cdot h(X_I) g(a_{[n]\setminus I})$.
- And this is equal to $f(a)f(x)$ by commutativity, as $f(a)=h(a_I)g(a_{[n]\setminus I})$.
- Notice, that we used here again the multilinearity.
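Lemma 3.2 turns directly into a checkable predicate. Here is a toy sketch (names hypothetical, not from the paper), using that a multilinear polynomial is identically zero iff it vanishes on $\{0,1\}^n$, so the identity test becomes a brute-force cube evaluation:

```python
from itertools import product

def is_union_of_classes(f, a, I, n):
    """Test f(a)*f == f|_{x_I <- a_I} * f|_{x_([n]-I) <- a_([n]-I)} on {0,1}^n.
    By Lemma 3.2, this holds iff I is a disjoint union of classes of the
    variable partition of the multilinear polynomial f (a justifying)."""
    def restrict(S):
        # Substitute a_i for x_i on the coordinates in S.
        return lambda x: f(tuple(a[i] if i in S else x[i] for i in range(n)))

    comp = set(range(n)) - set(I)
    fa = f(a)
    return all(
        fa * f(x) == restrict(set(I))(x) * restrict(comp)(x)
        for x in product((0, 1), repeat=n)
    )
```

For example, for $f = (x_0+x_1)(x_2+1)$ with the justifying assignment $a=(1,1,1)$, the set $\{0,1\}$ passes the test and $\{0\}$ does not.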
- Lemma 3.1: The algorithm outputs a variable partition for the polynomial computed by $C$.
- Proof:
- Proof by induction.
- The base case is clear, as a univariate multilinear polynomial has no non-trivial factors.
- Let $C=h_1(X_{I_1}) \cdots h_k(X_{I_k})$, where $\mathcal{I}=\{ I_1,\dots, I_k\}$ is the correct partition.
- Assume w.l.o.g. that $n\in I_k$.
- Now $C_{x_n\leftarrow a_n} = h_1\cdots h_{k-1} h_k\vert_{x_n\leftarrow a_n}$ by justifying.
- Now we can use our induction hypothesis on $C_{x_n\leftarrow a_n}$.
- Notice that this could have factors $h_k\vert_{x_n\leftarrow a_n} = g_1\cdots g_l$
- However, we will use the induction hypothesis on $C_{x_n\leftarrow a_n}$.
- This gives us then a partition (not variable partition) $I_1,\dots, I_{k-1}, I_{k,1},\dots, I_{k,l}$.
- By Lemma 3.2 $I_1,\dots,I_{k-1}$ will be added, however $I_{k,1},\dots,I_{k,l}$ will not be added (as $\mathcal{I}$ was already the partition).
- The runtime is obvious from the algorithm.
Proof of the Theorem
- The proof is now simple.
- By Theorem 2.4 we can compute a justifying assignment (notice that the PIT also works for the derivative) and the number of variables.
- Lemma 3.1 gives us the variable partition.
- Hence, for every $I$ we see $\alpha y^m h_I = C\vert_{x_{[n]\setminus I} \leftarrow a_{[n]\setminus I}}$ for some $m$.
- A simple interpolation gives us $h_I$.
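The "simple interpolation" in the last step is plain univariate Lagrange interpolation. A generic exact version (a sketch, not the paper's routine) might look like:

```python
from fractions import Fraction

def lagrange_coeffs(points):
    """Coefficients (lowest degree first) of the unique polynomial of degree
    < len(points) passing through the given (x, y) pairs."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        # Build the basis polynomial l_i with l_i(xi) = 1 and l_i(xj) = 0.
        basis = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            denom *= xi - xj
            # Multiply basis by (x - xj).
            new = [Fraction(0)] * (len(basis) + 1)
            for k, b in enumerate(basis):
                new[k + 1] += b
                new[k] -= xj * b
            basis = new
        for k in range(len(basis)):
            coeffs[k] += yi * basis[k] / denom
    return coeffs
```

For example, the points $(0,1),(1,2),(2,5)$ yield the coefficients of $x^2+1$.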
Kopparty, Saraf, Shpilka
Rough Overview of the Kaltofen Factorization Algorithm (reducing multivariate factorization to univariate factorization)
- Initialization:
- Assume $f(x,y)$.
- Assume $f$ is monic, i.e., highest degree coefficient is one.
- Assume $f(x,0)$ is square free.
- Univariate Factoring:
- Factor $f(x,0)$ as a univariate polynomial into $g(x,0)\cdot h(x,0)$.
- Various algorithms such as LLL and others.
- Lifting:
- Lift the factors by the following theorem:
- Let $R=\mathbb{F}(x)[y]$ and $I=y^{2^i}R$ and let $f=g\cdot h\mod I$ and $a,b\in R$ such that $ag+bh=1\mod I$. Then we can efficiently compute $g',h'$ such that $f=g'h' \mod I^2$ and $g=g'\mod I$, $h=h'\mod I$.
- Now we use this to compute polynomials $g_i(x,y), h_i(x,y)$ such that $f(x,y) = g_i(x,y)h_i(x,y) \mod y^{2^i}$.
- Notice that we essentially started with the factors $f(x,y)=g(x,y)h(x,y)\mod y$.
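The quadratic-lifting principle is easiest to see in its root form: lifting a simple root of $f$ from $\bmod\ p$ to $\bmod\ p^{2^k}$ by Newton steps. This is a simplified analogue of the factor-lifting theorem above, not the bivariate version itself (requires Python 3.8+ for the modular inverse via `pow`):

```python
def hensel_lift_root(f, df, r, p, k):
    """Lift a simple root r of f modulo p (df(r) invertible) to a root
    modulo p^(2^k) by k quadratic Newton/Hensel steps."""
    m = p
    for _ in range(k):
        m *= m
        # Newton step: r <- r - f(r) * f'(r)^{-1}  (mod m)
        r = (r - f(r) * pow(df(r), -1, m)) % m
    return r, m
```

For example, $f(x)=x^2+1$ has the root $r=2$ modulo $5$; two lifting steps give the root $182$ of $x^2+1$ modulo $625$.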
- Reducing the degree:
- We continue until we are at $y^{4d^2}$.
- Now $g_k$ divides (modulo $y^{4d^2}$) another low degree polynomial that has a common factor with $f$.
- Compute $g'$ such that $g'=g_k r \mod y^{4d^2}$ such that $\deg_x(g') < \deg_x(f)$ and $\deg_y(g')\leq \deg_y(f)$.
- Do this by solving linear equation system.
- This system is solvable as $g=g_k r_k \mod y^{4d^2}$.
- Computing a non-trivial factor:
- Compute $\gcd(f,g')$.
- One can show (with the resultant) that this has always a non-trivial gcd.
- Continue now by induction on the factors.
- Extending this to multivariate case:
- Take $f(x,y_1,\dots,y_n)$ and consider $f(x,\alpha_1 t + \beta_1,\dots, \alpha_n t+\beta_n)$ in $\mathbb{F}(\alpha_1,\dots,\alpha_n)[x,t]$ for some random subspace.
- Effective Hilbert Theorem: $\Pr_{\alpha,\beta}[f(x,\alpha_1t+\beta_1,\dots,\alpha_nt+\beta_n) \text{ is not irreducible}] \leq O(\frac{d^5}{\lvert S\rvert})$.
- Effective and efficient Hilbert Theorem: We can even compute a polynomial $h(z_1,\dots,z_n,y_1,\dots,y_n)$ such that $h(\alpha,\beta)\neq 0$ iff the restriction of $f$ is irreducible (follows from the proof of the Effective Hilbert Theorem).
- This is where we use our blackbox/hitting set: one of the elements in the hitting set will be enough, without needing to compute $h$.
- With high probability this restriction is enough (Effective Hilbert irreducibility).
Theorem
- Polynomial sized circuits with polynomial degree on $n$ variables.
- PIT: Is such a circuit identical to the zero polynomial?
- PIT implies circuit lower bounds (boolean or algebraic) and the other way around.
- Factoring: Compute irreducible factors of $f$.
- Depth of a circuit.
- $(n,s,d)$ circuit with $n$ variables, size $s$ and degree $d$.
- Degree of multivariate polynomial is the total degree.
- Theorem: Let $\mathbb{F}$ be a finite field or the rationals $\mathbb{Q}$. Suppose polynomial identity testing for $(n,s,d)$ circuits over $\mathbb{F}$ can be solved deterministically in time $\poly(n,s,d)$. Let the factors be given by $f=\prod_{i=1}^k g_i^{p^{e_i} j_i}$. Then we can compute an arithmetic circuit for the factor $g_i^{p^{e_i}}$ in deterministic time polynomial in $n,s,d,t$, where $t=l\cdot p$ if the field has characteristic $p$, and $t$ is the maximum bit complexity of the constants used in the circuit if $\mathbb{F}$ is the rationals.
Explanation of the theorem
- Bit complexity is needed.
- The details in the theorem change if we use black-box or white-box algorithms for PIT.
- Black-box can extract $g_i$ while whitebox only $g_i^{p^{e_i}}$.
- We want to replace the last.
- We know that given a circuit $C$ of size $s$ the factors can be computed by a circuit of size $\poly(s)$.
- There exists a polynomial-size (in the size of $g$) circuit $h$ such that if $h(\alpha,\beta)\neq 0$, then $g$ restricted to a 2-dimensional subspace given by $\alpha,\beta$ is irreducible.
- These two facts are enough for the black-box algorithm.
- Remember that a deterministic PIT is a polynomial size hitting set.
- Hence, a set such that no non-zero circuit of size $s$ evaluates to zero on all the elements in the hitting set.
- Hence, just doing the factoring on all the hitting set points is enough.
- The problem for whitebox is that we do not have such a hitting set!
- For whitebox, we work over the field $\mathbb{F}(a_1,\dots,a_n)$, viewing $f(x,ta_1,\dots,ta_n)$ as an element of $\mathbb{F}(a_1,\dots,a_n)[x,t]$.
- This field is too complicated to factor over directly.
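The blackbox argument above amounts to this: a hitting set turns PIT into a finite check. A toy illustration (the grid used below is a valid but exponentially large hitting set; the theorem assumes a polynomial-size one, and the names are hypothetical):

```python
from itertools import product

def pit(poly, hitting_set):
    # Deterministic PIT given a hitting set: a polynomial in the class is
    # non-zero iff it is non-zero on some point of the set.
    return any(poly(*pt) != 0 for pt in hitting_set)

def grid_hitting_set(n, d):
    # {0,...,d}^n hits every non-zero n-variate polynomial of individual
    # degree at most d.
    return list(product(range(d + 1), repeat=n))
```

For example, `pit(lambda x, y: (x - y) * (x + y), grid_hitting_set(2, 1))` returns `True`, while the identically zero polynomial returns `False`.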
The whitebox case
Overview
- Sadly, the proof that the factors have small circuits uses Hilbert's theorem.
- Idea:
- Reduce $f(x_1,\dots,x_n)$ to $f(x_1,tx_2,\dots,tx_n)$ as a polynomial in $K[x_2,\dots,x_n][x_1,t]$.
- Factor over this field and use the rest of Kaltofen's algorithm.
- Factoring over this field could still be hard but whitebox PIT will help us.
- Algorithm:
- Reduce to bivariate factoring over a large field.
- Define $f'(x,t) = f(x,ta_1,\dots,ta_n)$ and show that the factors correspond to factors of the original polynomial.
- Notice that in all this, we just look at our polynomial as a polynomial in $\mathbb{F}[a_1,\dots,a_n][x,t]$.
- Univariate factorization
- Find an irreducible factor for $f'(x,0)$.
- Write $f'(x,t)=g'(x,t,a_1,\dots,a_n) h'(x,t,a_1,\dots,a_n) \mod T$ and view this as an equation in $T$.
- Hensel Lifting
- Lift to $g'(x,t,a_1,\dots,a_n) h'(x,t,a_1,\dots,a_n) \mod T^{2^k}$.
- The following lemma (3.6) does not need PIT: There exists a single arithmetic circuit that takes inputs $a_1,\dots,a_n$ and computes the coefficients of $g',h'$ of the monomial $x^i t^j$ for all $i,j$.
- Solve linear equation
- Suppose $g=\sum_{i\leq d, j\leq D} c_{i,j}(a_1,\dots,a_n)x^i t^j$
- We know $\sum_{i\leq d, j\leq d} r_{i,j}x^i t^j = \left(\sum_{i\leq D, j\leq D} c_{i,j}x^i t^j\right) \left(\sum_{i\leq D, j\leq D} s_{i,j}x^i t^j\right) \mod T^{2^k}$.
- Solve this to get a non-trivial solution.
- This step needs PIT!
- If there is no solution, $f'$ is irreducible, otherwise find GCD.
- Proof: The proof works by using the Hensel Lifting Theorem.
- Solving linear equations is possible with PIT, main idea:
- We want to find a $j\times j$ minor that has full rank in the matrix (if $j=n$ then we only have the $0$ vector as a solution).
- Get a $1\times 1$ minor by trying all elements in a row.
- Check all extensions of a $j\times j$ minor to a $(j+1)\times (j+1)$ minor by checking if the determinant is non-zero.
- Cramer's formula tells us the following:
- Let $M_j^{(i)}$ be $M_j$ with the $i$th column replaced by $(M_{1,j+1},\dots,M_{j,j+1})$.
- Then our solution is: $(\det(M_j^{(1)}),\dots,\det(M_j^{(j)}), -\det(M_j),0,\dots,0)$.
- Compute GCD
- To get a non-trivial factor of $f$, i.e., $h(x,y_1,\dots,y_n)$ divides $f(x,y_1,\dots,y_n)$.
- Continue the recursion by factoring $f/h$ where $h$ is the resulting factor.
- PIT is needed here to check if $f/h=f$ in a recursion step.
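Cramer's formula above can be checked concretely. A sketch over the rationals, assuming (hypothetically) that the leading $j\times j$ minor is the full-rank one and $\mathrm{rank}(M)=j$; function names are illustrative only:

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    # Exact Leibniz-formula determinant; fine for the small minors used here.
    n = len(M)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = Fraction(-1) ** inversions
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

def cramer_null_vector(M, j):
    """Kernel vector (det(M_j^(1)), ..., det(M_j^(j)), -det(M_j), 0, ..., 0)
    built from the leading full-rank j x j minor M_j."""
    cols = len(M[0])
    Mj = [row[:j] for row in M[:j]]
    c = [M[i][j] for i in range(j)]  # the (j+1)-st column, restricted
    v = []
    for i in range(j):
        Mji = [row[:] for row in Mj]
        for t in range(j):
            Mji[t][i] = c[t]  # replace the i-th column by c
        v.append(det(Mji))
    v.append(-det(Mj))
    v.extend([Fraction(0)] * (cols - j - 1))
    return v
```

For the rank-2 matrix $M=\begin{pmatrix}1&2&4\\3&4&10\end{pmatrix}$ and $j=2$ this yields $(-4,-2,2)$, which indeed satisfies $Mv=0$.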
Some more Details
- Let $h'$ be a factor of $f'$. Then there exists $h$ such that $h'(x,t,a_1,\dots,a_n)=h(x,ta_1,\dots,ta_n)$ and $h$ divides $f$.
- Proof:
- Let $h'(x,t,a_1,\dots,a_n) = \sum_{i,j} x^i t^j h'_{i,j}(a_1,\dots,a_n)$.
- $h'_{i,j}$ is homogeneous by the following argument:
- Let $f'=h'\cdot g'$.
- $h'$ and $g'$ are relatively prime (squarefreeness and monicity).
- Note that $f'(x,zt,a_1,\dots,a_n) = f'(x,t,za_1,\dots,za_n)$.
- Hence $h'(x,zt, a_1,\dots,a_n) g'(x,zt,a_1,\dots,a_n) = h'(x,t,za_1,\dots,za_n) g'(x,t,za_1,\dots,za_n)$.
- Hence by substituting $z=1$, i.e., calculate $\mod Z-1$.
- Applying Hensel Lifting, we get that $h'(x,zt,a_1,\dots,a_n) = h'(x,t,za_1,\dots,za_n)$.
- This implies $h'_{i,j}$ is a homogeneous polynomial of degree $j$.
- Define $h(x,y_1,\dots,y_n) = h'(x,1,y_1,\dots,y_n)$.
- $h$ has now the required properties.
- Use normal univariate factoring
- Ignore a bunch of other stuff.
- The polynomials $g_k,h_k$ (remember that $g_k(x,t) h_k(x,t) = f'(x,t) \mod T^{2^k}$) have small circuits in the following sense: There exists a circuit $C$ of size $size(f)+poly(n,k,d)$ such that $C$ computes all the coefficients of $g_k$ and $h_k$ of the monomials $x^j t^{j'}$.
- If $f'$ is reducible in $\mathbb{F}(a_1,\dots,a_n)[x,t]$ then the linear equation system from 4. has a non-trivial solution.
- Consider the polynomial $S$ (with $s_{i,j}$ coefficients) and $R$ (with $r_{i,j}$ coefficients). Then $R$ and $f'$ have non-trivial GCD in the ring $\mathbb{F}(t,a_1,\dots,a_n)[x]$.
- Let $h$ be the gcd of $f'$ and $R$
- Lemma 2.1: Homogeneous component extraction.
- Lemma 2.9: Computing gcd with PIT is easy.
- Lemma 2.3: If $f$ is monic and $g$ is monic and a factor in $\mathbb{F}(y_1,\dots,y_n)[x]$ then it is in $\mathbb{F}[x,y_1,\dots,y_n]$.
- Lemma 3.1: Let $h'(x,t,a_1,\dots,a_n)$ be a factor of $f'(x,t,a_1,\dots,a_n)$. Then there exist $h(x,y_1,\dots,y_n)$ such that $h'(x,t,a_1,\dots,a_n)= h(x,ta_1,\dots,ta_n)$.
- Write $h'=\sum_{i,j} x^i t^j h'_{i,j}(a_1,\dots,a_n)$.
- $h'_{i,j}$ is a homogeneous polynomial in $a_1,\dots,a_n$ of degree $j$. This is obvious by our construction.
- Corollary 3.2: Let $h'(x,t,a_1,\dots,a_n)$ be a factor of $f'(x,t,a_1,\dots,a_n)$. Then there exists $h(x,y_1,\dots,y_n)$ such that $h'(x,t,a_1,\dots,a_n)=h(x,ta_1,\dots,ta_n)$ and $h(x,y_1,\dots,y_n)=h'(x,1,y_1,\dots,y_n)$.
- With this we can do the following after getting the gcd of $f'$.
- Compute a small circuit for $h'$ for every coefficient of $x^i$.
- We have a circuit for $f,h$ and can compute one for $f/h$. Try to factor $f/h$ and test if $f=h$ to stop, i.e., we found a trivial factor.