Biostat 216 Homework 5

Due Dec 11 @ 11:59PM

Turn it in to me directly (CHS 21-254A) or, if I'm not in my office, put it in my mailbox @ CHS 51-254 by the deadline.

    1. If $\mathcal{C}(\mathbf{A}')=\mathcal{C}(\mathbf{A})$ and $\mathcal{N}(\mathbf{A}) = \mathcal{N}(\mathbf{A}')$, is $\mathbf{A}$ symmetric?
    2. Give an example of $\mathbf{A}$ such that $\mathcal{C}(\mathbf{A}) = \mathcal{N}(\mathbf{A})$.
    3. Can you find an example of $\mathbf{A}$ such that $\mathcal{C}(\mathbf{A}) = \mathcal{N}(\mathbf{A}')$?
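
    For parts 2 and 3, a quick numerical check of a candidate matrix can be reassuring before arguing anything in general. The sketch below is minimal and assumes NumPy and SciPy are available; the matrix `A` is only a placeholder, not an intended answer, and the helper `same_subspace` is made up for illustration.

    ```python
    import numpy as np
    from scipy.linalg import orth, null_space

    def same_subspace(B1, B2, tol=1e-10):
        """Do the columns of B1 and B2 span the same subspace?"""
        if B1.shape[1] != B2.shape[1]:      # dimensions must match
            return False
        P2 = B2 @ np.linalg.pinv(B2)        # orthogonal projector onto span(B2)
        return np.allclose(P2 @ B1, B1, atol=tol)

    # candidate matrix (placeholder; substitute your own); it must be square
    # for C(A) to live in the same space as N(A) or N(A')
    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    col_A   = orth(A)            # orthonormal basis of C(A)
    null_A  = null_space(A)      # orthonormal basis of N(A)
    null_At = null_space(A.T)    # orthonormal basis of N(A')

    print("C(A) == N(A)?  ", same_subspace(col_A, null_A))
    print("C(A) == N(A')? ", same_subspace(col_A, null_At))
    ```
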
  1. Four possibilities for the rank $r$ and the size $m \times n$ of $\mathbf{A}$ match four possibilities for the solutions of $\mathbf{A} \mathbf{x} = \mathbf{b}$. Find four matrices $\mathbf{A}_1$ to $\mathbf{A}_4$ that exhibit these properties (a checking sketch follows this list):

    1. $r=m=n$, $\mathbf{A}_1 \mathbf{x} = \mathbf{b}$ has 1 solution for every $\mathbf{b}$.
    2. $r=m<n$, $\mathbf{A}_2 \mathbf{x} = \mathbf{b}$ has $\infty$ solutions for every $\mathbf{b}$.
    3. $r=n<m$, $\mathbf{A}_3 \mathbf{x} = \mathbf{b}$ has 0 or 1 solution.
    4. $r<m, r<n$, $\mathbf{A}_4 \mathbf{x} = \mathbf{b}$ has 0 or $\infty$ solutions.
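
    A minimal sketch for checking candidates, assuming NumPy: compare a matrix's rank against its dimensions (the solvability statements then follow from the rank). The two matrices shown are placeholders, not intended answers.

    ```python
    import numpy as np

    def classify(A):
        """Report m, n, r and which of the four cases A falls into."""
        m, n = A.shape
        r = np.linalg.matrix_rank(A)
        if r == m == n:
            case = "r = m = n: exactly 1 solution for every b"
        elif r == m and m < n:
            case = "r = m < n: infinitely many solutions for every b"
        elif r == n and n < m:
            case = "r = n < m: 0 or 1 solution, depending on b"
        else:
            case = "r < m and r < n: 0 or infinitely many solutions"
        print(f"m = {m}, n = {n}, r = {r} -> {case}")

    # placeholders; substitute your own A_1, ..., A_4
    classify(np.eye(2))                                  # square, full rank
    classify(np.array([[1.0, 0.0, 2.0],
                       [0.0, 1.0, 3.0]]))                # wide, full row rank
    ```
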
  2. Show that $$ \|\mathbf{A}\|_{\text{nuc}} = \min_{\mathbf{A} = \mathbf{U} \mathbf{V}} \|\mathbf{U}\|_{\text{F}} \|\mathbf{V}\|_{\text{F}} = \min_{\mathbf{A} = \mathbf{U} \mathbf{V}} \frac 12 \|\mathbf{U}\|_{\text{F}}^2 + \frac 12 \|\mathbf{V}\|_{\text{F}}^2. $$
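
    A numerical sanity check of the attainment half of this identity, assuming NumPy: with the SVD $\mathbf{A} = \mathbf{P} \mathbf{D} \mathbf{Q}'$, the factorization $\mathbf{U} = \mathbf{P} \mathbf{D}^{1/2}$, $\mathbf{V} = \mathbf{D}^{1/2} \mathbf{Q}'$ makes both right-hand sides equal $\|\mathbf{A}\|_{\text{nuc}}$. (The proof must still show that no factorization does better.)

    ```python
    import numpy as np

    rng = np.random.default_rng(216)
    A = rng.standard_normal((5, 3))

    # SVD A = P diag(s) Q'; take U = P * sqrt(s), V = sqrt(s) * Q'
    P, s, Qt = np.linalg.svd(A, full_matrices=False)
    U = P * np.sqrt(s)                 # scale column j of P by sqrt(s_j)
    V = np.sqrt(s)[:, None] * Qt       # scale row i of Q' by sqrt(s_i)

    nuc  = s.sum()                     # nuclear norm = sum of singular values
    prod = np.linalg.norm(U, 'fro') * np.linalg.norm(V, 'fro')
    half = 0.5 * np.linalg.norm(U, 'fro')**2 + 0.5 * np.linalg.norm(V, 'fro')**2

    print(np.allclose(U @ V, A))       # the factorization is exact
    print(nuc, prod, half)             # all three values coincide
    ```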

  3. (MLE of multivariate normal model) Let $\mathbf{y}_1, \ldots, \mathbf{y}_n$ be iid samples from a multivariate normal distribution $N(\boldsymbol{\mu}, \boldsymbol{\Omega})$. The log-likelihood (up to an additive constant) is $$ \ell(\boldsymbol{\mu}, \boldsymbol{\Omega}) = - \frac n2 \log \det \boldsymbol{\Omega} - \frac 12 \sum_{i=1}^n (\mathbf{y}_i - \boldsymbol{\mu})' \boldsymbol{\Omega}^{-1} (\mathbf{y}_i - \boldsymbol{\mu}). $$ Show that the maximum likelihood estimate (MLE) is achieved by \begin{eqnarray*} \widehat{\boldsymbol{\mu}} &=& \frac{\sum_{i=1}^n \mathbf{y}_i}{n} \\ \widehat{\boldsymbol{\Omega}} &=& \frac{\sum_{i=1}^n (\mathbf{y}_i - \widehat{\boldsymbol{\mu}})(\mathbf{y}_i - \widehat{\boldsymbol{\mu}})'}{n}. \end{eqnarray*} That is, show that $\widehat{\boldsymbol{\mu}}$ and $\widehat{\boldsymbol{\Omega}}$ maximize $\ell$.

    Hint: To show the optimality of $\widehat{\boldsymbol{\Omega}}$, work with the Cholesky factor of $\boldsymbol{\Omega}$.
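
    A small numerical sanity check of the claimed maximizer on simulated data, assuming NumPy and SciPy: evaluate the log-likelihood at $(\widehat{\boldsymbol{\mu}}, \widehat{\boldsymbol{\Omega}})$ and at nearby perturbations and confirm it is never exceeded. SciPy's logpdf differs from $\ell$ only by an additive constant, so the comparison is unaffected; this is a check, not a proof.

    ```python
    import numpy as np
    from scipy.stats import multivariate_normal

    rng = np.random.default_rng(216)
    n, p = 200, 3
    Y = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)   # simulated data

    def loglik(mu, Omega):
        # full Gaussian log-likelihood; differs from ell only by a constant
        return multivariate_normal.logpdf(Y, mean=mu, cov=Omega).sum()

    mu_hat = Y.mean(axis=0)
    Omega_hat = (Y - mu_hat).T @ (Y - mu_hat) / n      # note the divisor n, not n - 1

    ll_hat = loglik(mu_hat, Omega_hat)
    for _ in range(5):
        dmu = 0.05 * rng.standard_normal(p)
        E = 0.05 * rng.standard_normal((p, p))
        print(loglik(mu_hat + dmu, Omega_hat + E @ E.T) <= ll_hat)   # expect True
    ```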

  4. (Smallest matrix subject to linear constraints) Find the matrix $\mathbf{X}$ with the smallest Frobenius norm subject to the constraint $\mathbf{X} \mathbf{U} = \mathbf{V}$, assuming $\mathbf{U}$ has full column rank.

    Hint: write down the optimization problem and use the method of Lagrange multipliers.
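
    As a numerical companion to the Lagrangian derivation (not a substitute for it), a minimal sketch assuming NumPy: np.linalg.lstsq applied to the transposed system $\mathbf{U}' \mathbf{X}' = \mathbf{V}'$ returns a minimum-Frobenius-norm solution, and random feasible matrices never beat it. Its output can be compared against whatever closed form you derive.

    ```python
    import numpy as np

    rng = np.random.default_rng(216)
    m, n, k = 4, 6, 2
    U = rng.standard_normal((n, k))     # full column rank with probability 1
    V = rng.standard_normal((m, k))

    # minimum-Frobenius-norm solution of X U = V via the transposed system U' X' = V'
    X0 = np.linalg.lstsq(U.T, V.T, rcond=None)[0].T
    print(np.allclose(X0 @ U, V))       # the constraint holds

    # other feasible X: add rows orthogonal to C(U); they do not change X U
    P_perp = np.eye(n) - U @ np.linalg.pinv(U)    # projector onto N(U')
    for _ in range(5):
        X = X0 + rng.standard_normal((m, n)) @ P_perp
        print(np.allclose(X @ U, V),
              np.linalg.norm(X, 'fro') >= np.linalg.norm(X0, 'fro'))   # expect True True
    ```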

  5. (Linear mixed model) Given data $\mathbf{y} \in \mathbb{R}^n$, $\mathbf{X} \in \mathbb{R}^{n \times p}$, and $\mathbf{Z} \in \mathbb{R}^{n \times q}$, the log-likelihood of a linear mixed model (LMM) takes the form (up to an additive constant) $$ \ell(\boldsymbol{\beta}, \boldsymbol{\Sigma}, \sigma_0^2) = - \frac 12 \log \det \left( \mathbf{Z} \boldsymbol{\Sigma} \mathbf{Z}' + \sigma_0^2 \mathbf{I}_n \right) - \frac 12 (\mathbf{y} - \mathbf{X} \boldsymbol{\beta})' \left( \mathbf{Z} \boldsymbol{\Sigma} \mathbf{Z}' + \sigma_0^2 \mathbf{I}_n \right)^{-1} (\mathbf{y} - \mathbf{X} \boldsymbol{\beta}), $$ where $\boldsymbol{\beta} \in \mathbb{R}^p$, $\boldsymbol{\Sigma} \in \mathbb{R}^{q \times q}$ is positive definite, and $\sigma_0^2>0$.

    1. Show that the matrix $\mathbf{Z} \boldsymbol{\Sigma} \mathbf{Z}' + \sigma_0^2 \mathbf{I}_n$ is positive definite.
    2. The maximum likelihood estimate (MLE) is the maximizer of the log-likelihood. It is traditionally difficult to incorporate the positive definiteness constraint in optimization algorithms. The trick is to work with the Cholesky factor $\mathbf{L}$ of $\boldsymbol{\Sigma}$. Recall that the Cholesky decomposition of $\boldsymbol{\Sigma}$ is $$ \boldsymbol{\Sigma} = \mathbf{L} \mathbf{L}', $$ where the Cholesky factor $\mathbf{L}$ is a lower triangular matrix with positive diagonal entries. Derive the derivatives \begin{eqnarray*} \operatorname{D}_{\boldsymbol{\beta}} \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2) &=& \frac{\partial \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2)}{\partial \boldsymbol{\beta}'}, \\ \operatorname{D}_{\operatorname{vech} \mathbf{L}} \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2) &=& \frac{\partial \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2)}{\partial (\operatorname{vech} \mathbf{L})'}, \\ \operatorname{D}_{\sigma_0^2} \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2) &=& \frac{\partial \ell(\boldsymbol{\beta}, \mathbf{L}, \sigma_0^2)}{\partial \sigma_0^2}. \end{eqnarray*} Write your solution in the most compact way possible.

      Hint: (1) Write $\ell$ as a composition of a sequence of functions, (2) find the Jacobian matrix of each function (feel free to use the formulas in the lecture notes), (3) use the chain rule to assemble the pieces, and (4) simplify the expression using various formulas involving the Kronecker product, commutation matrix, and duplication matrix (reviewed in the lecture notes).
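
      Once you have derived the three derivatives, a finite-difference check on simulated data is a good way to catch algebra mistakes. The sketch below is a minimal setup, assuming NumPy; the helper names `loglik`, `num_grad`, and `unpack` are made up for illustration, and $\operatorname{vech} \mathbf{L}$ is stored column by column, matching the usual vech convention.

      ```python
      import numpy as np

      rng = np.random.default_rng(216)
      n, p, q = 50, 2, 3
      X = rng.standard_normal((n, p))
      Z = rng.standard_normal((n, q))
      y = rng.standard_normal(n)

      # index pairs of vech(L): lower-triangular entries, stacked column by column
      vech_idx = ([i for j in range(q) for i in range(j, q)],
                  [j for j in range(q) for i in range(j, q)])

      def unpack(theta):
          """Split theta into (beta, L, sigma0^2) with L lower triangular."""
          beta = theta[:p]
          L = np.zeros((q, q))
          L[vech_idx] = theta[p:p + q * (q + 1) // 2]
          return beta, L, theta[-1]

      def loglik(theta):
          beta, L, s2 = unpack(theta)
          Omega = Z @ (L @ L.T) @ Z.T + s2 * np.eye(n)
          resid = y - X @ beta
          _, logdet = np.linalg.slogdet(Omega)
          return -0.5 * logdet - 0.5 * resid @ np.linalg.solve(Omega, resid)

      def num_grad(f, theta, h=1e-6):
          """Central-difference gradient, to compare against the analytic formulas."""
          g = np.zeros_like(theta)
          for j in range(theta.size):
              e = np.zeros_like(theta)
              e[j] = h
              g[j] = (f(theta + e) - f(theta - e)) / (2.0 * h)
          return g

      # evaluate at beta = 0, L = I_q, sigma0^2 = 1; the three blocks of the numerical
      # gradient should match D_beta, D_vech(L), and D_sigma0^2 from your derivation
      theta0 = np.concatenate([np.zeros(p), np.eye(q)[vech_idx], [1.0]])
      print(num_grad(loglik, theta0))
      ```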