Kronecker Product and Other Special Matrices

Kronecker product

  • Let $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{B} \in \mathbb{R}^{p \times q}$. The $mp \times nq$ matrix defined by $$ \mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} a_{11} \mathbf{B} & \cdots & a_{1n} \mathbf{B} \\ \vdots & & \vdots \\ a_{m1} \mathbf{B} & \cdots & a_{mn} \mathbf{B} \end{pmatrix} $$ is called the Kronecker product of $\mathbf{A}$ and $\mathbf{B}$.

    Note the Kronecker product is defined for any pair of matrices of arbitrary shapes.

  • The following facts are easy to verify.

    1. $$ \mathbf{A} \otimes \mathbf{B} \otimes \mathbf{C} = (\mathbf{A} \otimes \mathbf{B}) \otimes \mathbf{C} = \mathbf{A} \otimes (\mathbf{B} \otimes \mathbf{C}). $$
    2. $$ (\mathbf{A} + \mathbf{B}) \otimes (\mathbf{C} + \mathbf{D}) = \mathbf{A} \otimes \mathbf{C} + \mathbf{A} \otimes \mathbf{D} + \mathbf{B} \otimes \mathbf{C} + \mathbf{B} \otimes \mathbf{D} $$ if $\mathbf{A} + \mathbf{B}$ and $\mathbf{C} + \mathbf{D}$ exist.
    3. $$ (\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{A} \mathbf{C} \otimes \mathbf{B} \mathbf{D} $$ if $\mathbf{A} \mathbf{C}$ and $\mathbf{B} \mathbf{D}$ exist.
    4. $$ (\mathbf{A} \otimes \mathbf{B})' = \mathbf{A}' \otimes \mathbf{B}'. $$
    5. For square matrices $\mathbf{A}$ and $\mathbf{B}$, $$ \text{tr}(\mathbf{A} \otimes \mathbf{B}) = \text{tr}(\mathbf{A}) \text{tr}(\mathbf{B}). $$
    6. If $\mathbf{A}$ and $\mathbf{B}$ are non-singular, then $$ (\mathbf{A} \otimes \mathbf{B})^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}. $$ The same identity holds for generalized inverses as well.

      These formulas are useful for reducing computations on a Kronecker product to computations on the original (much smaller) matrices.
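A quick numerical sanity check of facts 3, 4, and 6 (a sketch; the sizes are arbitrary, and the random matrices are non-singular with probability one):

```julia
using LinearAlgebra, Random

Random.seed!(123)
A, C = randn(3, 3), randn(3, 3)   # so that A*C exists
B, D = randn(4, 4), randn(4, 4)   # so that B*D exists
# fact 3: (A ⊗ B)(C ⊗ D) = AC ⊗ BD
@assert norm(kron(A, B) * kron(C, D) - kron(A * C, B * D)) < 1e-8
# fact 4: (A ⊗ B)' = A' ⊗ B'
@assert norm(kron(A, B)' - kron(A', B')) < 1e-12
# fact 6: (A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹
@assert norm(inv(kron(A, B)) - kron(inv(A), inv(B))) < 1e-6
```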

  • Eigenvalues of Kronecker product. Let $\mathbf{A}$ be an $m \times m$ matrix with eigenvalues $\lambda_1,\ldots, \lambda_m$, and $\mathbf{B}$ a $p \times p$ matrix with eigenvalues $\mu_1, \ldots, \mu_p$. Then the $mp$ eigenvalues of $\mathbf{A} \otimes \mathbf{B}$ are $\lambda_i \mu_j$, $i=1,\ldots,m$, $j=1,\ldots,p$.

In [1]:
using LinearAlgebra, Random

Random.seed!(216)
m, p = 3, 5
# two symmetric matrices
A = randn(m, m); A = (A + A') / 2
B = randn(p, p); B = (B + B') / 2
AkronB = kron(A, B)
Out[1]:
15×15 Array{Float64,2}:
 -1.17678    -0.539279   -0.577007   …  -0.0854543  -0.0442409  -0.24452  
 -0.539279   -1.09005     0.112559       0.0166698   0.0760435  -0.1132   
 -0.577007    0.112559    1.87634        0.277885    0.0118849   0.184425 
 -0.298725    0.513463    0.0802494      0.0118849  -0.203365    0.0558072
 -1.65106    -0.764354    1.24528        0.184425    0.0558072  -0.241604 
 -0.81132    -0.371801   -0.397812   …  -0.188355   -0.0975142  -0.538963 
 -0.371801   -0.751521    0.0776023      0.0367431   0.167612   -0.249512 
 -0.397812    0.0776023   1.29363        0.612504    0.0261962   0.406504 
 -0.205952    0.354002    0.0553271      0.0261962  -0.44825     0.123008 
 -1.1383     -0.526976    0.858547       0.406504    0.123008   -0.532535 
 -0.174281   -0.0798669  -0.0854543  …  -1.0033     -0.51942    -2.87084  
 -0.0798669  -0.161435    0.0166698      0.195716    0.892805   -1.32905  
 -0.0854543   0.0166698   0.277885       3.26257     0.139537    2.16529  
 -0.0442409   0.0760435   0.0118849      0.139537   -2.38765     0.655217 
 -0.24452    -0.1132      0.184425       2.16529     0.655217   -2.83661  
In [2]:
Aeig = eigen(A)
Out[2]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
3-element Array{Float64,1}:
 -2.3628974441499135
 -1.4183402387758672
  2.1240378985254527
eigenvectors:
3×3 Array{Float64,2}:
 0.294179   0.923448   0.246377 
 0.155313   0.208168  -0.965683 
 0.943046  -0.32235    0.0821851
In [3]:
Beig = eigen(B)
Out[3]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
5-element Array{Float64,1}:
 -2.0431527398771667
  0.1881453152888417
  0.4336962375445035
  1.3050654590574953
  2.7581708443189585
eigenvectors:
5×5 Array{Float64,2}:
  0.303014     0.670194  -0.130404   0.378069    0.546883
 -0.00301878  -0.277426  -0.815987  -0.336317    0.379581
 -0.87123      0.428099  -0.208816   0.0324859  -0.114152
 -0.0761851   -0.394763  -0.283267   0.859999   -0.136089
 -0.378588    -0.367108   0.439677   0.0574597   0.724766
In [4]:
eigen(AkronB)
Out[4]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
eigenvalues:
15-element Array{Float64,1}:
 -6.517274838570046  
 -4.33973385197522   
 -3.912024693915976  
 -3.083735837655283  
 -1.851026854817742  
 -1.024779731231333  
 -0.6151288251150611 
 -0.44456808462477326
 -0.26685407141133055
  0.3996277801035184 
  0.9211872449924243 
  2.772008495094637  
  2.8978857449329496 
  4.827760387063657  
  5.858459403941414  
eigenvectors:
15×15 Array{Float64,2}:
  0.160882    0.0746556    -0.505018   …  -0.0891403     0.13474   
  0.111665   -0.00074376   -0.350523       0.000888064   0.0935201 
 -0.0335811  -0.214651      0.105413       0.256298     -0.0281244 
 -0.0400346  -0.0187703     0.125671       0.0224121    -0.0335293 
  0.213211   -0.0932754    -0.669284       0.111373      0.178566  
  0.0849383  -0.292615     -0.113844   …  -0.047062     -0.528116  
  0.058954    0.00291519   -0.0790166      0.000468857  -0.366555  
 -0.0177293   0.841332      0.0237627      0.135314      0.110234  
 -0.0211365   0.0735707     0.0283294      0.0118326     0.131419  
  0.112566    0.365596     -0.150873       0.0587997    -0.699894  
  0.515736    0.0249032     0.176288   …  -0.285756      0.0449457 
  0.357962   -0.000248099   0.122358       0.00284685    0.0311959 
 -0.10765    -0.0716021    -0.0367968      0.82161      -0.00938157
 -0.128338   -0.00626128   -0.0438683      0.0718461    -0.0111845 
  0.683488   -0.0311143     0.233628       0.357026      0.059565  
In [5]:
[Aeig.values[i] * Beig.values[j] for i in 1:3, j in 1:5]
Out[5]:
3×5 Array{Float64,2}:
  4.82776  -0.444568  -1.02478   -3.08374  -6.51727
  2.89789  -0.266854  -0.615129  -1.85103  -3.91202
 -4.33973   0.399628   0.921187   2.77201   5.85846
  • If $\mathbf{x}_1$ is an eigenvector of $\mathbf{A}$ and $\mathbf{x}_2$ an eigenvector of $\mathbf{B}$, then $\mathbf{x}_1 \otimes \mathbf{x}_2$ is an eigenvector of $\mathbf{A} \otimes \mathbf{B}$. It is not generally true that every eigenvector of $\mathbf{A} \otimes \mathbf{B}$ is the Kronecker product of an eigenvector of $\mathbf{A}$ and an eigenvector of $\mathbf{B}$.

    Example: $$ \mathbf{A} = \mathbf{B} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. $$
    $\mathbf{e}_1$ is (up to scaling) the only eigenvector of $\mathbf{A}$ and of $\mathbf{B}$. But the eigenvectors of $\mathbf{A} \otimes \mathbf{B}$ include not only $\mathbf{e}_1 \otimes \mathbf{e}_1$, but also $\mathbf{e}_1 \otimes \mathbf{e}_2$ and $\mathbf{e}_2 \otimes \mathbf{e}_1$.
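This counterexample is easy to check directly (a small sketch):

```julia
using LinearAlgebra

A = [0 1; 0 0]
M = kron(A, A)               # 4×4 nilpotent matrix; all eigenvalues are 0
e1, e2 = [1, 0], [0, 1]
# e1⊗e2 and e2⊗e1 are eigenvectors of A⊗A (eigenvalue 0),
# even though e2 is not an eigenvector of A (A*e2 = e1)
@show M * kron(e1, e2)       # the zero vector
@show M * kron(e2, e1)       # the zero vector
```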

Corollaries:

  • Definiteness of Kronecker product. If $\mathbf{A}$ and $\mathbf{B}$ are positive (semi)definite, then $\mathbf{A} \otimes \mathbf{B}$ is positive (semi)definite.

  • Determinant of Kronecker product. For $\mathbf{A} \in \mathbb{R}^{m \times m}$ and $\mathbf{B} \in \mathbb{R}^{p \times p}$, $$ \text{det}(\mathbf{A} \otimes \mathbf{B}) = \text{det}(\mathbf{A})^p \text{det}(\mathbf{B})^m. $$

  • Rank of Kronecker product. $$ \text{rank}(\mathbf{A} \otimes \mathbf{B}) = \text{rank}(\mathbf{A}) \text{rank}(\mathbf{B}). $$

Kronecker sum

  • Let $\mathbf{A} \in \mathbb{R}^{m \times m}$ and $\mathbf{B} \in \mathbb{R}^{n \times n}$. The Kronecker sum is defined as the $mn \times mn$ matrix: $$ \mathbf{A} \oplus \mathbf{B} = \mathbf{A} \otimes \mathbf{I}_n + \mathbf{I}_m \otimes \mathbf{B}. $$

  • The eigenvalues of the Kronecker sum $\mathbf{A} \oplus \mathbf{B}$ are $\lambda_i + \mu_j$, where the $\lambda_i$ are the eigenvalues of $\mathbf{A}$ and the $\mu_j$ are those of $\mathbf{B}$.
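A numerical sketch of this eigenvalue fact; `kronsum` is a helper defined here for illustration (the Julia standard library has no Kronecker-sum function):

```julia
using LinearAlgebra

# Kronecker sum A ⊕ B = A ⊗ I_n + I_m ⊗ B (helper for illustration)
kronsum(A, B) = kron(A, Matrix(1.0I, size(B, 1), size(B, 1))) +
                kron(Matrix(1.0I, size(A, 1), size(A, 1)), B)

A = [2.0 1.0; 1.0 3.0]   # symmetric, so eigenvalues are real
B = [1.0 0.0; 0.0 4.0]
λ, μ = eigvals(A), eigvals(B)
sums = sort(vec([l + m for m in μ, l in λ]))   # all pairwise sums λᵢ + μⱼ
@assert sort(eigvals(kronsum(A, B))) ≈ sums
```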

Vec and vech operator

  • The vec operator stacks the columns of a matrix into one long vector. For $$ \mathbf{A} = \begin{pmatrix} \mid & & \mid \\ \mathbf{a}_1 & \cdots & \mathbf{a}_n \\ \mid & & \mid \end{pmatrix} \in \mathbb{R}^{m \times n}, $$ we have $$ \text{vec} \, \mathbf{A} = \begin{pmatrix} \mathbf{a}_1 \\ \vdots \\ \mathbf{a}_n \end{pmatrix}. $$

  • The following facts are easy to verify:
    $$ \text{vec} \, \mathbf{a} \mathbf{b}' = \mathbf{b} \otimes \mathbf{a}. $$ $$ (\text{vec} \, \mathbf{A})' \text{vec} \, \mathbf{B} = \text{tr}(\mathbf{A}' \mathbf{B}). $$
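Both identities can be confirmed numerically (sketch):

```julia
using LinearAlgebra, Random

Random.seed!(123)
a, b = randn(3), randn(4)
# vec(ab') = b ⊗ a: column j of ab' is b[j] * a
@assert vec(a * b') ≈ kron(b, a)
A, B = randn(3, 4), randn(3, 4)
# (vec A)' vec B = tr(A'B), the trace inner product
@assert vec(A)' * vec(B) ≈ tr(A' * B)
```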

  • Let $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ be matrices such that the product $\mathbf{A} \mathbf{B} \mathbf{C}$ is defined. Then $$ \text{vec} \, \mathbf{A} \mathbf{B} \mathbf{C} = (\mathbf{C}' \otimes \mathbf{A}) \text{vec} \, \mathbf{B}. $$

    Because of the extreme usefulness of this identity, we give it a name: Roth’s Kronecker product identity, or Roth column lemma, or simply the vec trick.
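A quick check of the vec trick (sketch):

```julia
using LinearAlgebra, Random

Random.seed!(123)
A, B, C = randn(2, 3), randn(3, 4), randn(4, 5)
# Roth's identity: vec(ABC) = (C' ⊗ A) vec(B)
@assert vec(A * B * C) ≈ kron(C', A) * vec(B)
```

This is, for example, how a matrix equation $\mathbf{A} \mathbf{X} \mathbf{B} = \mathbf{C}$ is turned into the ordinary linear system $(\mathbf{B}' \otimes \mathbf{A}) \text{vec} \, \mathbf{X} = \text{vec} \, \mathbf{C}$.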

  • A generalization to the product of 4 matrices: $$ \text{tr} \, \mathbf{A} \mathbf{B} \mathbf{C} \mathbf{D} = (\text{vec} \, \mathbf{D}' )' (\mathbf{C}' \otimes \mathbf{A}) (\text{vec} \, \mathbf{B}) = (\text{vec} \, \mathbf{D})' (\mathbf{A} \otimes \mathbf{C}') (\text{vec} \, \mathbf{B}'). $$
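Both forms of the four-matrix trace identity check out numerically (sketch; the sizes are chosen so that $\mathbf{A} \mathbf{B} \mathbf{C} \mathbf{D}$ is square):

```julia
using LinearAlgebra, Random

Random.seed!(123)
m, n, p, q = 2, 3, 4, 5
A, B, C, D = randn(m, n), randn(n, p), randn(p, q), randn(q, m)
t = tr(A * B * C * D)
@assert t ≈ vec(D')' * kron(C', A) * vec(B)
@assert t ≈ vec(D)' * kron(A, C') * vec(B')
```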

vech operator

  • For a symmetric $n \times n$ matrix $$ \mathbf{A} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}, $$ the vech operator enumerates the $\frac{n (n+1)}{2}$ non-redundant entries of $\mathbf{A}$ in the column-major order $$ \text{vech} \, \mathbf{A} = \begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \\ a_{22} \\ a_{32} \\ \vdots \\ a_{n2} \\ a_{33} \\ \vdots \\ a_{nn} \end{pmatrix}. $$
In [6]:
using LinearAlgebra, Permutations

vech(A) = [A[i, j] for i in 1:size(A, 1), j in 1:size(A, 2) if i ≥ j]
A = randn(3, 3)
Out[6]:
3×3 Array{Float64,2}:
  0.333068   -0.474861   -0.254432
  1.66378     0.0561321  -0.529029
 -0.0676266   0.0104737  -0.904305
In [7]:
vech(A)
Out[7]:
6-element Array{Float64,1}:
  0.3330681151463122  
  1.6637814534906514  
 -0.06762655185911629 
  0.05613206900523923 
  0.010473695134034552
 -0.9043054412667461  

Commutation matrix

  • Let $\mathbf{A} \in \mathbb{R}^{m \times n}$. $\text{vec} \, \mathbf{A}$ and $\text{vec} \, \mathbf{A}'$ clearly contain the same $mn$ components. There exists a unique $mn \times mn$ permutation matrix that transforms $\text{vec} \, \mathbf{A}$ into $\text{vec} \, \mathbf{A}'$. We denote it $\mathbf{K}_{mn}$: $$ \mathbf{K}_{mn} \text{vec} \, \mathbf{A} = \text{vec} \, \mathbf{A}'. $$

  • Since $\mathbf{K}_{mn}$ is an $mn \times mn$ permutation matrix, it is orthogonal, i.e., $$ \mathbf{K}_{mn}^{-1} = \mathbf{K}_{mn}' = \mathbf{K}_{nm}. $$ Also $$ \mathbf{K}_{n1} = \mathbf{K}_{1n} = \mathbf{I}_n. $$

In [8]:
A = [1 3 5; 2 4 6]
Out[8]:
2×3 Array{Int64,2}:
 1  3  5
 2  4  6
In [9]:
vec(A)
Out[9]:
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6
In [10]:
vec(A')
Out[10]:
6-element reshape(::Adjoint{Int64,Array{Int64,2}}, 6) with eltype Int64:
 1
 3
 5
 2
 4
 6
In [11]:
K23 = Matrix(inv(Permutation(vec(A'))))
Out[11]:
6×6 Array{Int64,2}:
 1  0  0  0  0  0
 0  0  1  0  0  0
 0  0  0  0  1  0
 0  1  0  0  0  0
 0  0  0  1  0  0
 0  0  0  0  0  1
In [12]:
K23 * vec(A) == vec(A')
Out[12]:
true
In [13]:
# for general m, n
commutation(m, n) = Matrix(Permutation(vec(reshape(1:m*n, m, n)')))'
K23 = commutation(2, 3)
Out[13]:
6×6 Adjoint{Int64,Array{Int64,2}}:
 1  0  0  0  0  0
 0  0  1  0  0  0
 0  0  0  0  1  0
 0  1  0  0  0  0
 0  0  0  1  0  0
 0  0  0  0  0  1
  • The commutation matrix allows us to exchange the two matrices in a Kronecker product. Let $\mathbf{A} \in \mathbb{R}^{m \times n}$, $\mathbf{B} \in \mathbb{R}^{p \times q}$, and $\mathbf{b} \in \mathbb{R}^{p}$. Then \begin{eqnarray*} \mathbf{K}_{pm} (\mathbf{A} \otimes \mathbf{B}) &=& (\mathbf{B} \otimes \mathbf{A}) \mathbf{K}_{qn} \\ \mathbf{K}_{pm} (\mathbf{A} \otimes \mathbf{B}) \mathbf{K}_{nq} &=& \mathbf{B} \otimes \mathbf{A} \\ \mathbf{K}_{pm} (\mathbf{A} \otimes \mathbf{b}) &=& \mathbf{b} \otimes \mathbf{A} \\ \mathbf{K}_{mp} (\mathbf{b} \otimes \mathbf{A}) &=& \mathbf{A} \otimes \mathbf{b}. \end{eqnarray*}
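These exchange rules can be verified with an explicitly constructed commutation matrix. The `commutation` helper below is defined for illustration (it builds $\mathbf{K}_{mn}$ by permuting the rows of the identity, avoiding the Permutations package):

```julia
using LinearAlgebra, Random

# K_{mn} vec(A) = vec(A') for A ∈ R^{m×n}: permute rows of the identity
commutation(m, n) = Matrix{Int}(I, m * n, m * n)[vec(reshape(1:m*n, m, n)'), :]

Random.seed!(123)
m, n, p, q = 2, 3, 3, 4
A, B, b = randn(m, n), randn(p, q), randn(p)
@assert commutation(p, m) * kron(A, B) ≈ kron(B, A) * commutation(q, n)
@assert commutation(p, m) * kron(A, b) ≈ kron(b, A)
```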

  • The commutation matrix also allows us to transform the vec of a Kronecker product into the Kronecker product of the vecs. Let $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{B} \in \mathbb{R}^{p \times q}$. Then $$ \text{vec} \, (\mathbf{A} \otimes \mathbf{B}) = (\mathbf{I}_n \otimes \mathbf{K}_{qm} \otimes \mathbf{I}_p) (\text{vec} \, \mathbf{A} \otimes \text{vec} \, \mathbf{B}). $$

  • Let $$ \mathbf{N}_n = \frac 12 (\mathbf{I}_{n^2} + \mathbf{K}_{nn}). $$ Then \begin{eqnarray*} \mathbf{N}_n &=& \mathbf{N}_n' = \mathbf{N}_n^2 \\ \text{rank}(\mathbf{N}_n) &=& \text{tr}(\mathbf{N}_n) = \frac{n(n+1)}{2} \\ \mathbf{N}_n \mathbf{K}_{nn} &=& \mathbf{N}_n = \mathbf{K}_{nn} \mathbf{N}_n. \end{eqnarray*}
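A numerical sketch of these properties of $\mathbf{N}_n$ (reusing the same explicit construction of $\mathbf{K}_{nn}$ as a row-permuted identity):

```julia
using LinearAlgebra

commutation(m, n) = Matrix{Int}(I, m * n, m * n)[vec(reshape(1:m*n, m, n)'), :]

n = 4
Knn = commutation(n, n)
N = (I + Knn) / 2
@assert N ≈ N'                              # symmetric
@assert N * N ≈ N                           # idempotent, i.e. a projection
@assert rank(N) == tr(N) == n * (n + 1) / 2
@assert N * Knn ≈ N
@assert Knn * N ≈ N
```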

Duplication matrix

  • Since the elements of $\text{vec} \, \mathbf{A}$ are those of $\text{vech} \, \mathbf{A}$ with some repetition, there exists an $n^2 \times \frac{n(n+1)}{2}$ duplication matrix $\mathbf{D}_n$ that transforms, for symmetric $\mathbf{A}$, $\text{vech} \, \mathbf{A}$ to $\text{vec} \, \mathbf{A}$: $$ \mathbf{D}_n \text{vech} \, \mathbf{A} = \text{vec} \, \mathbf{A}. \quad \quad \quad (\mathbf{A} = \mathbf{A}'). $$

  • $\mathbf{D}_n$ has full column rank. Why?

  • The Moore-Penrose inverse of $\mathbf{D}_n$ is $$ \mathbf{D}_n^+ = (\mathbf{D}_n'\mathbf{D}_n)^{-1} \mathbf{D}_n'. $$ Thus $\text{vech} \, \mathbf{A}$ can be uniquely solved from $\text{vec} \, \mathbf{A}$ $$ \text{vech} \, \mathbf{A} = \mathbf{D}_n^+ \text{vec} \, \mathbf{A}. \quad \quad \quad (\mathbf{A} = \mathbf{A}'). $$

  • Some properties of $\mathbf{D}_n$. \begin{eqnarray*} \mathbf{K}_{nn} \mathbf{D}_n &=& \mathbf{D}_n \\ \mathbf{D}_n \mathbf{D}_n^+ &=& \frac 12 (\mathbf{I}_{n^2} + \mathbf{K}_{nn}) \\ \mathbf{D}_n \mathbf{D}_n^+ (\mathbf{b} \otimes \mathbf{A}) &=& \frac 12 (\mathbf{b} \otimes \mathbf{A} + \mathbf{A} \otimes \mathbf{b}). \end{eqnarray*}

  • Let $\mathbf{A} \in \mathbb{R}^{n \times n}$. Then \begin{eqnarray*} \mathbf{D}_n' \text{vec} \, \mathbf{A} &=& \text{vech}(\mathbf{A} + \mathbf{A}' - \text{diag}(\mathbf{A})) \\ \text{det}(\mathbf{D}_n^+ (\mathbf{A} \otimes \mathbf{A}) \mathbf{D}_n^{+\prime}) &=& 2^{-n(n-1)/2} \text{det}(\mathbf{A})^{n+1}. \end{eqnarray*}

In [14]:
n = 3
A = zeros(Int, n, n)
A[tril(trues(n, n))] = 1:6
LinearAlgebra.copytri!(A, 'L')
A
Out[14]:
3×3 Array{Int64,2}:
 1  2  3
 2  4  5
 3  5  6
In [15]:
vech(A)
Out[15]:
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6
In [16]:
vec(A)
Out[16]:
9-element Array{Int64,1}:
 1
 2
 3
 2
 4
 5
 3
 5
 6
In [17]:
function duplication(n)
    D = zeros(Int, abs2(n), n * (n + 1) >> 1)
    vechidx = 1
    for j in 1:n
        for i in j:n
            D[(j - 1) * n + i, vechidx] = 1
            D[(i - 1) * n + j, vechidx] = 1
            vechidx += 1
        end
    end
    D
end
duplication(3)
Out[17]:
9×6 Array{Int64,2}:
 1  0  0  0  0  0
 0  1  0  0  0  0
 0  0  1  0  0  0
 0  1  0  0  0  0
 0  0  0  1  0  0
 0  0  0  0  1  0
 0  0  1  0  0  0
 0  0  0  0  1  0
 0  0  0  0  0  1
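Putting the pieces together: for the symmetric $\mathbf{A}$ above, $\mathbf{D}_n \, \text{vech} \, \mathbf{A} = \text{vec} \, \mathbf{A}$, and $\mathbf{D}_n^+ \text{vec} \, \mathbf{A}$ recovers $\text{vech} \, \mathbf{A}$ (a self-contained sketch restating `vech` and `duplication` from the cells above):

```julia
using LinearAlgebra

vech(A) = [A[i, j] for i in 1:size(A, 1), j in 1:size(A, 2) if i ≥ j]
function duplication(n)
    D = zeros(Int, abs2(n), n * (n + 1) >> 1)
    vechidx = 1
    for j in 1:n, i in j:n
        D[(j - 1) * n + i, vechidx] = 1
        D[(i - 1) * n + j, vechidx] = 1
        vechidx += 1
    end
    D
end

A = [1 2 3; 2 4 5; 3 5 6]        # symmetric
D = duplication(3)
@assert D * vech(A) == vec(A)
# vech A = D_n⁺ vec A, with D_n⁺ = (D_n'D_n)⁻¹ D_n'
@assert (D' * D) \ (D' * vec(A)) ≈ vech(A)
```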

Image compression by JPEG

Example at https://cs.stanford.edu/people/eroberts/courses/soco/projects/data-compression/lossy/jpeg/dct.htm.

  • The DCT (discrete cosine transform) matrix $$ \mathbf{C}_n = \left( \cos \left( j + \frac 12 \right) \frac{k \pi}{n} \right)_{0 \le j, k \le n-1} $$ emerges as the matrix of eigenvectors of the second-difference matrix with free ends $$ \mathbf{D}_{\text{free}} = \begin{pmatrix} -1 & 1 & & \\ 1 & -2 & 1 & \\ & & \ddots & \\ & & 1 & -1 \end{pmatrix}. $$
In [18]:
n = 8
C = [cos((j + 0.5) * k * π / 8) for j in 0:n-1, k in 0:n-1]
Out[18]:
8×8 Array{Float64,2}:
 1.0   0.980785   0.92388    0.83147   …   0.55557    0.382683   0.19509 
 1.0   0.83147    0.382683  -0.19509      -0.980785  -0.92388   -0.55557 
 1.0   0.55557   -0.382683  -0.980785      0.19509    0.92388    0.83147 
 1.0   0.19509   -0.92388   -0.55557       0.83147   -0.382683  -0.980785
 1.0  -0.19509   -0.92388    0.55557      -0.83147   -0.382683   0.980785
 1.0  -0.55557   -0.382683   0.980785  …  -0.19509    0.92388   -0.83147 
 1.0  -0.83147    0.382683   0.19509       0.980785  -0.92388    0.55557 
 1.0  -0.980785   0.92388   -0.83147      -0.55557    0.382683  -0.19509 

The columns of the DCT matrix $\mathbf{C}_n$ are orthogonal.

In [19]:
C'C
Out[19]:
8×8 Array{Float64,2}:
  8.0           6.66134e-16  -4.44089e-16  …  -1.05471e-15  -4.996e-16  
  6.66134e-16   4.0           2.97753e-16      2.64855e-16   7.98239e-16
 -4.44089e-16   2.97753e-16   4.0              1.32339e-16  -8.98239e-17
  3.33067e-16  -3.0757e-16    3.81368e-16     -2.43375e-16   8.06468e-17
  2.22045e-16   8.87903e-17  -9.60906e-16     -8.60757e-16  -3.18502e-16
  1.77636e-15  -1.04433e-15   6.98248e-16  …   1.26541e-15  -1.2366e-15 
 -1.05471e-15   2.64855e-16   1.32339e-16      4.0           8.42704e-16
 -4.996e-16     7.98239e-16  -8.98239e-17      8.42704e-16   4.0        
  • JPEG compression:
    • Step 1 (DCT): apply the 2D cosine transform $\mathbf{C}_8 \otimes \mathbf{C}_8$ to each $8 \times 8$ block of the image, producing 64 coefficients $c_{jk}$. Many of these cosine coefficients (those in the lower right corner, corresponding to high frequencies) will be small.
    • Step 2 (Quantization and compression): compute $$ q_{jk} = \frac{c_{jk}}{j + k + 3} \text{ rounded to the nearest integer} $$ and discard the quantized coefficients that are zero or negligible.
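A toy version of the two steps on a single $8 \times 8$ block (a sketch only: real JPEG uses a scaled DCT, a standard quantization table, and entropy coding; the smooth gradient block here is hypothetical):

```julia
using LinearAlgebra

n = 8
C = [cos((j + 0.5) * k * π / n) for j in 0:n-1, k in 0:n-1]
C = C ./ sqrt.(sum(abs2, C, dims = 1))    # normalize columns so C is orthogonal
block = [8.0 * i + j for i in 0:n-1, j in 0:n-1]   # smooth "pixel" block
coef = C' * block * C                     # Step 1: 2D DCT coefficients c_jk
# Step 2: quantize; for a smooth block most entries round to 0 and can be discarded
q = [round(Int, coef[j + 1, k + 1] / (j + k + 3)) for j in 0:n-1, k in 0:n-1]
@assert count(iszero, q) >= 49            # the high-frequency coefficients vanish
```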

Spectral clustering

  • TODO

  • Example: Fisher iris data.