Matrix variety: basic geometry

Author

Bin Gao

Published

December 20, 2024

Abstract

The set of bounded-rank matrices, $\mathbb{R}_{\le r}^{m\times n}=\{X\in\mathbb{R}^{m\times n}:\mathrm{rank}(X)\le r\}$, is a non-smooth algebraic variety called the matrix variety. More specifically, it consists of the matrices whose $(r+1)\times(r+1)$ minors all vanish, so it is also called a determinantal variety. This post presents the basic geometry of $\mathbb{R}_{\le r}^{m\times n}$ that is foundational for optimization involving low-rank structures.

1. Algebraic variety

Definition

Algebraic varieties are the central objects of study in algebraic geometry. In the classical definition, an algebraic variety $V$ is the set of solutions of a system of polynomial equations,
$$V=\{x\in\mathbb{R}^s:\ p_i(x)=0 \ \text{for}\ i=1,\dots,d\}=\{x\in\mathbb{R}^s:\ P(x)=(p_1(x),\dots,p_d(x))=0\},$$

where each $p_i(x)$ is a polynomial, and we only consider real coefficients.

Singularity (local property)

A point $x\in V$ is called singular if the tangent space at $x$ is not regularly defined, i.e., either it does not exist or a special definition must be provided. In other words, $V$ is locally non-flat at $x$.
A point $x\in V$ is called regular if $x$ is not singular.
A singular point $x\in V$ is called a node if the Hessian matrix at $x$ is non-singular.

Local structure: For an algebraic variety V, almost all points in V are regular. Moreover, the set of regular points is both open and dense in V, and V is a manifold near every regular point.

Jacobian criterion: A point $x\in V$ is singular if the Jacobian matrix $J(x)=[\nabla p_1(x)\ \nabla p_2(x)\ \cdots\ \nabla p_d(x)]$ of the polynomials $P(x)$ has lower rank at $x$ than at the other points of $V$, i.e., $\mathrm{rank}(J(x))<\mathrm{rank}(J(y))$ for generic $y\in V$. The rationale behind this is that singular points are those at which the first-order partial derivatives degenerate simultaneously, implying that the tangent space is not well defined.

Smoothness

An algebraic variety V is said to be non-singular or smooth if it has no singular points. The Jacobian criterion provides a computational way to test smoothness.

Example: a cubic curve
The cubic curve $y^2-x^2(x+1)=0$ defines a non-smooth algebraic variety. The origin $(0,0)$ is a singular point because the Jacobian $J(x,y)=(-3x^2-2x,\ 2y)$ vanishes at the origin while having rank $1$ at the other points of the curve. Moreover, the Hessian of the defining polynomial is $H(x,y)=\begin{bmatrix}-6x-2 & 0\\ 0 & 2\end{bmatrix}$, which has both a positive and a negative eigenvalue at the singular point $(0,0)$; hence the Hessian is non-singular there, implying that $(0,0)$ is also a node.
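The cubic-curve example can be checked numerically. The following minimal sketch (assuming NumPy is available) evaluates the gradient at the origin and at another point on the curve, and inspects the Hessian at the origin:

```python
import numpy as np

# The cubic curve p(x, y) = y^2 - x^2 (x + 1) = 0.
def p(x, y):
    return y**2 - x**2 * (x + 1)

# Gradient of p: J(x, y) = (-3x^2 - 2x, 2y).
def jacobian(x, y):
    return np.array([-3 * x**2 - 2 * x, 2 * y])

# The gradient vanishes at the origin, so (0, 0) is singular.
grad_origin = jacobian(0.0, 0.0)

# (3, 6) lies on the curve since 6^2 = 3^2 * 4; the gradient there is
# nonzero, so (3, 6) is a regular point.
assert p(3.0, 6.0) == 0.0
grad_regular = jacobian(3.0, 6.0)

# Hessian of p is diag(-6x - 2, 2); at the origin it is diag(-2, 2),
# which is non-singular, so the origin is a node.
hessian_origin = np.diag([-6 * 0.0 - 2, 2.0])
```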

2. Matrix variety

Definition

The set of bounded-rank matrices with $r\le\min\{m,n\}$,
$$\mathbb{R}_{\le r}^{m\times n}:=\{X\in\mathbb{R}^{m\times n}:\mathrm{rank}(X)\le r\},$$

is an algebraic variety (also called the matrix variety) since it can be defined by the vanishing of all $(r+1)\times(r+1)$ minors. Let $\mathcal{I}_{r+1}=\{\mathbf{i}_{r+1}=(i_1,\dots,i_{r+1}):1\le i_1<\cdots<i_{r+1}\le m\}$ and $\mathcal{J}_{r+1}=\{\mathbf{j}_{r+1}=(j_1,\dots,j_{r+1}):1\le j_1<\cdots<j_{r+1}\le n\}$ be the index sets of $(r+1)\times(r+1)$ submatrices. It holds that
$$\mathbb{R}_{\le r}^{m\times n}=\{X\in\mathbb{R}^{m\times n}:\det(X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}})=0 \ \text{for all}\ \mathbf{i}_{r+1}\in\mathcal{I}_{r+1},\ \mathbf{j}_{r+1}\in\mathcal{J}_{r+1}\},$$
where the number of equalities is $d=\binom{m}{r+1}\binom{n}{r+1}$. Thus, it is also called a determinantal variety.
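The defining equations can be verified directly on a random example. The sketch below (assuming NumPy) builds a rank-$r$ matrix and evaluates all $(r+1)\times(r+1)$ minors, which should vanish up to rounding:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 5, 4, 2

# A random matrix of rank r: product of m x r and r x n factors.
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Evaluate every (r+1) x (r+1) minor of X.
minors = [
    np.linalg.det(X[np.ix_(rows, cols)])
    for rows in itertools.combinations(range(m), r + 1)
    for cols in itertools.combinations(range(n), r + 1)
]
max_minor = max(abs(d) for d in minors)

# The number of defining equations is C(m, r+1) * C(n, r+1) = 10 * 4 = 40.
num_equations = len(minors)
```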

Singular points

The singular points of a determinantal variety can be derived via tools in algebraic geometry; e.g., [Theorem 10.3.3] or [Proposition 1.1]. Here, we provide an elementary way to determine singularity by using the Jacobian criterion.

Given a square matrix $A$, Jacobi's formula gives $\frac{\partial}{\partial A}\det(A)=\mathrm{adj}(A)^{\top}$, where $\mathrm{adj}(A)$ is the adjugate matrix of $A$. Thus, the partial gradient with respect to the index set $\{\mathbf{i}_{r+1},\mathbf{j}_{r+1}\}$ of each defining polynomial is
$$\frac{\partial}{\partial X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}}}\det(X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}})=\mathrm{adj}(X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}})^{\top},$$

which leads to the full gradient $\nabla_X\det(X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}})$, whose $\{\mathbf{i}_{r+1},\mathbf{j}_{r+1}\}$ entries are those of $\mathrm{adj}(X_{\mathbf{i}_{r+1},\mathbf{j}_{r+1}})^{\top}$ and whose other entries are $0$. According to the representation of an adjugate matrix via the cofactor matrix, the Jacobian matrix of the $d$ defining polynomials reduces to a large matrix $J(X)\in\mathbb{R}^{mn\times d}$ in which each entry corresponds to either an $r\times r$ minor $\det(X_{\mathbf{i}_{r},\mathbf{j}_{r}})$ (up to sign) or $0$. In fact, the Jacobian consists of all $r\times r$ minors. It directly implies that the Jacobian vanishes at any point
$$X\in\mathbb{R}_{<r}^{m\times n}:=\{X\in\mathbb{R}^{m\times n}:\mathrm{rank}(X)<r\},$$

since all $r\times r$ minors vanish there, i.e., $\det(X_{\mathbf{i}_{r},\mathbf{j}_{r}})=0$. Therefore, the set of singular points of the matrix variety is $\mathbb{R}_{<r}^{m\times n}$. Note that it is also a matrix variety since $\mathbb{R}_{<r}^{m\times n}=\mathbb{R}_{\le r-1}^{m\times n}$. Moreover, the set of regular points is
$$\mathbb{R}_{r}^{m\times n}:=\{X\in\mathbb{R}^{m\times n}:\mathrm{rank}(X)=r\},$$

which is actually a smooth manifold.
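Jacobi's formula, the key ingredient above, can be sanity-checked numerically. The following sketch (assuming NumPy) compares a central-difference gradient of the determinant with the transposed adjugate:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# Adjugate via adj(A) = det(A) * inv(A), valid for invertible A.
adjugate = np.linalg.det(A) * np.linalg.inv(A)

# Central-difference gradient of det at A; Jacobi's formula predicts
# d det(A) / dA_{ij} = adj(A)_{ji}, i.e., the transposed adjugate.
eps = 1e-6
grad = np.zeros_like(A)
for i in range(4):
    for j in range(4):
        E = np.zeros_like(A)
        E[i, j] = eps
        grad[i, j] = (np.linalg.det(A + E) - np.linalg.det(A - E)) / (2 * eps)

max_error = np.abs(grad - adjugate.T).max()
```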

In summary, $\mathbb{R}_{\le r}^{m\times n}$ is a non-smooth algebraic variety whose set of singular points is $\mathbb{R}_{<r}^{m\times n}$.

Topology

The matrix variety $\mathbb{R}_{\le r}^{m\times n}$ is closed. Moreover, it is straightforward to verify that the set of regular points $\mathbb{R}_{r}^{m\times n}$ is open and dense in $\mathbb{R}_{\le r}^{m\times n}$, which validates the local structure of an algebraic variety. Specifically, a sequence in $\mathbb{R}_{r}^{m\times n}$ can converge to any singular point, and the closure of $\mathbb{R}_{r}^{m\times n}$ is the matrix variety itself.
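This convergence behavior is easy to see on a diagonal example. A minimal sketch (assuming NumPy): rank-$2$ matrices $\mathrm{diag}(1,t,0)$, which are regular points of $\{X\in\mathbb{R}^{3\times 3}:\mathrm{rank}(X)\le 2\}$, converge to the rank-$1$ singular point $\mathrm{diag}(1,0,0)$ as $t\to 0$:

```python
import numpy as np

# Sequence of rank-2 matrices diag(1, t, 0) converging to diag(1, 0, 0),
# a singular point (rank 1 < 2) of the variety {rank <= 2}.
ranks_along_sequence = []
for t in [1.0, 0.1, 0.01, 1e-3]:
    Xt = np.diag([1.0, t, 0.0])
    ranks_along_sequence.append(np.linalg.matrix_rank(Xt))

limit = np.diag([1.0, 0.0, 0.0])
rank_of_limit = np.linalg.matrix_rank(limit)
```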

3. Fixed-rank manifold

The set of all fixed-rank matrices $\mathbb{R}_{r}^{m\times n}$ is a smooth manifold called the fixed-rank manifold, which is also an embedded submanifold of $\mathbb{R}^{m\times n}$ with dimension $r(m+n-r)$. To see this, one can find a local defining mapping in [Example 5.30] of the classic textbook or [Section 7.5] of a nice book for optimization on manifolds.
The matrix variety can also be interpreted as a "stratified" or "layered" space in the sense of
$$\mathbb{R}_{\le r}^{m\times n}=\bigcup_{\underline{r}=0}^{r}\mathbb{R}_{\underline{r}}^{m\times n}.$$

Note that $\mathbb{R}_{r}^{m\times n}$ is not closed.

Tangent space

Given $X\in\mathbb{R}_{r}^{m\times n}$ with $r\le\min\{m,n\}$. The tangent space at $X$ can be parameterized by the derivative of a low-rank decomposition of $X$. For instance, one can follow $X=AB^{\top}$ in [Lemma 1] or the thin singular value decomposition (SVD) $X=U\Sigma V^{\top}$ in [Proposition 2.1]. Here, we follow the latter for the sake of computational convenience.

Given the thin SVD $X=U\Sigma V^{\top}$, where $U\in\mathrm{St}(r,m)$, $V\in\mathrm{St}(r,n)$, $\mathrm{St}(p,n):=\{X\in\mathbb{R}^{n\times p}:X^{\top}X=I\}$ is the Stiefel manifold, and $\Sigma=\mathrm{Diag}(\sigma_1,\sigma_2,\dots,\sigma_r)$ with $\sigma_1\ge\sigma_2\ge\cdots\ge\sigma_r>0$. The tangent space can be parameterized as follows,
$$T_X\mathbb{R}_{r}^{m\times n}=\left\{[U\ U_{\perp}]\begin{bmatrix}\mathbb{R}^{r\times r} & \mathbb{R}^{r\times(n-r)}\\ \mathbb{R}^{(m-r)\times r} & 0\end{bmatrix}[V\ V_{\perp}]^{\top}\right\},$$

where "$\perp$" denotes the orthogonal complement. A tangent vector can be illustrated as a block matrix $[U\ U_{\perp}]\,[\,\cdot\,]\,[V\ V_{\perp}]^{\top}$, where the shaded squares (the $(1,1)$, $(1,2)$, and $(2,1)$ blocks) represent arbitrary matrices and the blank $(2,2)$ block represents the zero matrix.

Normal space

Consider the standard Euclidean metric. The normal space at $X$ is
$$N_X\mathbb{R}_{r}^{m\times n}=\left\{[U\ U_{\perp}]\begin{bmatrix}0 & 0\\ 0 & \mathbb{R}^{(m-r)\times(n-r)}\end{bmatrix}[V\ V_{\perp}]^{\top}\right\},$$
i.e., a normal vector has an arbitrary $(2,2)$ block and zero blocks elsewhere.

With the help of these block pictures, it is direct to see that the direct sum of the tangent and normal spaces forms the Euclidean space, i.e., $\mathbb{R}^{m\times n}=T_X\mathbb{R}_{r}^{m\times n}\oplus N_X\mathbb{R}_{r}^{m\times n}$.

Projection

Given a matrix $A\in\mathbb{R}^{m\times n}$, the orthogonal projections onto the tangent and normal spaces are given by
$$P_{T_X\mathbb{R}_{r}^{m\times n}}A=P_U A P_V+P_{U_{\perp}}A P_V+P_U A P_{V_{\perp}},\qquad P_{N_X\mathbb{R}_{r}^{m\times n}}A=P_{U_{\perp}}A P_{V_{\perp}},$$
where $P_U:=UU^{\top}$, $P_{U_{\perp}}:=I_m-UU^{\top}$, $P_V:=VV^{\top}$, and $P_{V_{\perp}}:=I_n-VV^{\top}$ are projection matrices.
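The projection formulas and the direct-sum decomposition can be verified numerically. A minimal NumPy sketch at a random rank-$r$ point:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 6, 5, 2

# A rank-r point X with thin SVD X = U Sigma V^T.
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U, V = U[:, :r], Vt[:r, :].T

# Projection matrices onto the column/row spaces and their complements.
PU, PV = U @ U.T, V @ V.T
PUp, PVp = np.eye(m) - PU, np.eye(n) - PV

A = rng.standard_normal((m, n))

# Tangent- and normal-space projections at X.
PT_A = PU @ A @ PV + PUp @ A @ PV + PU @ A @ PVp
PN_A = PUp @ A @ PVp

# The two projections decompose A orthogonally: R^{m x n} = T_X (+) N_X.
residual = A - (PT_A + PN_A)
inner_product = np.sum(PT_A * PN_A)
```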

4. Geometry of $\mathbb{R}_{\le r}^{m\times n}$

Although $\mathbb{R}_{\le r}^{m\times n}$ is a non-smooth algebraic variety, its geometry is closely related to fixed-rank manifolds. If $X\in\mathbb{R}_{\le r}^{m\times n}$ is regular, i.e., $\mathrm{rank}(X)=r$, the geometry at $X$ locally behaves as that of a manifold. Alternatively, if $X$ is singular, one can investigate the local geometry via tangent and normal cones.

Tangent cone

Given $X\in\mathbb{R}_{\le r}^{m\times n}$ with rank $\underline{r}\le r$. The (Bouligand) tangent cone of $\mathbb{R}_{\le r}^{m\times n}$ at $X$ is defined by
$$T_X\mathbb{R}_{\le r}^{m\times n}:=\left\{\Xi\in\mathbb{R}^{m\times n}:\ \exists\, t^{(i)}\downarrow 0,\ X^{(i)}\to X \ \text{in}\ \mathbb{R}_{\le r}^{m\times n},\ \text{s.t.}\ \tfrac{X^{(i)}-X}{t^{(i)}}\to\Xi\right\}.$$
The parametrization of the tangent cone can be obtained in different ways; e.g., [Theorem 6.1] with a diagonal normalization, or [Theorem 3.2] with a space decomposition. Specifically, the tangent cone is
$$T_X\mathbb{R}_{\le r}^{m\times n}=\left\{[U\ U_{\perp}]\begin{bmatrix}\mathbb{R}^{\underline{r}\times\underline{r}} & \mathbb{R}^{\underline{r}\times(n-\underline{r})}\\ \mathbb{R}^{(m-\underline{r})\times\underline{r}} & \mathbb{R}_{\le(r-\underline{r})}^{(m-\underline{r})\times(n-\underline{r})}\end{bmatrix}[V\ V_{\perp}]^{\top}\right\}.$$
A new parametrization: Here, we give a new reduced parametrization for the tangent cone by further decomposing an element $F=\tilde{U}S\tilde{V}^{\top}\in\mathbb{R}_{\le(r-\underline{r})}^{(m-\underline{r})\times(n-\underline{r})}$. By some matrix computations, we yield a new parametrization as follows,
$$T_X\mathbb{R}_{\le r}^{m\times n}=\left\{[U\ U_1\ U_2]\begin{bmatrix}\mathbb{R}^{\underline{r}\times\underline{r}} & \mathbb{R}^{\underline{r}\times(r-\underline{r})} & \mathbb{R}^{\underline{r}\times(n-r)}\\ \mathbb{R}^{(r-\underline{r})\times\underline{r}} & \mathbb{R}^{(r-\underline{r})\times(r-\underline{r})} & 0\\ \mathbb{R}^{(m-r)\times\underline{r}} & 0 & 0\end{bmatrix}[V\ V_1\ V_2]^{\top}\right\},$$

where $U_1\in\mathrm{St}(r-\underline{r},m)$, $V_1\in\mathrm{St}(r-\underline{r},n)$, $U_2\in\mathrm{St}(m-r,m)$, $V_2\in\mathrm{St}(n-r,n)$ satisfying $[U\ U_1\ U_2]\in\mathcal{O}(m)$ and $[V\ V_1\ V_2]\in\mathcal{O}(n)$. Furthermore, a tangent vector can be illustrated as a $3\times 3$ block matrix $[U\ U_1\ U_2]\,[\,\cdot\,]\,[V\ V_1\ V_2]^{\top}$ whose first block row, first block column, and $(2,2)$ block are arbitrary while the remaining blocks are zero.

Space decomposition at singular points: Note that if $X\in\mathbb{R}_{\le r}^{m\times n}$ is regular, then the tangent cone boils down to the tangent space. Again from the block pattern, it is easy to see the decomposition
$$T_X\mathbb{R}_{\le r}^{m\times n}=T_X\mathbb{R}_{\underline{r}}^{m\times n}\oplus N_{\le(r-\underline{r})}(X),$$
where $N_{\le(r-\underline{r})}(X):=\{N\in N_X\mathbb{R}_{\underline{r}}^{m\times n}:\mathrm{rank}(N)\le(r-\underline{r})\}$ denotes a low-rank cone in the normal space. It indicates that the local geometry can be identified via the local (fixed-rank) manifold $\mathbb{R}_{\underline{r}}^{m\times n}$: a tangent vector of the cone is the sum of a tangent vector of $\mathbb{R}_{\underline{r}}^{m\times n}$ and a normal vector of rank at most $r-\underline{r}$.

Rank increase along the normal part: Another interesting fact is that, given a matrix $X\in\mathbb{R}_{\le r}^{m\times n}$ with rank $\underline{r}<r$ and a vector in the cone $N\in N_{\le(r-\underline{r})}(X)\setminus\{0\}$, it holds that
$$\mathrm{rank}(X+tN)\in(\underline{r},r]\quad\text{for}\ t>0.$$
As suggested in [Section 3.3], this principle can be applied to increase the rank of $X$, which facilitates the development of rank-adaptive methods.
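The rank-increase fact can be demonstrated numerically. A minimal NumPy sketch with $\underline{r}=1$, $r=3$, and a normal vector $N$ of rank $r-\underline{r}=2$, for which $\mathrm{rank}(X+tN)=\underline{r}+\mathrm{rank}(N)=3$:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r = 6, 5, 3
r_low = 1  # rank of X, strictly less than r

# X of rank r_low; full SVD gives the orthogonal complements U_perp, V_perp.
X = rng.standard_normal((m, r_low)) @ rng.standard_normal((r_low, n))
U, s, Vt = np.linalg.svd(X, full_matrices=True)
U_perp = U[:, r_low:]
V_perp = Vt[r_low:, :].T

# A normal vector N = U_perp S V_perp^T of rank r - r_low = 2,
# obtained by truncating a random block S to rank 2.
S = rng.standard_normal((m - r_low, n - r_low))
Us, ss, Vts = np.linalg.svd(S, full_matrices=False)
S2 = (Us[:, :2] * ss[:2]) @ Vts[:2, :]
N = U_perp @ S2 @ V_perp.T

# Moving along N strictly increases the rank, up to r.
ranks = [np.linalg.matrix_rank(X + t * N) for t in (0.5, 1.0, 2.0)]
```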

Normal cone

Given $X\in\mathbb{R}_{\le r}^{m\times n}$ with rank $\underline{r}\le r$. The set
$$N_X\mathbb{R}_{\le r}^{m\times n}=(T_X\mathbb{R}_{\le r}^{m\times n})^{\circ}:=\{N\in\mathbb{R}^{m\times n}:\langle N,\Xi\rangle\le 0 \ \text{for all}\ \Xi\in T_X\mathbb{R}_{\le r}^{m\times n}\}$$
is called the normal cone of $\mathbb{R}_{\le r}^{m\times n}$. It is easy to verify that
$$N_X\mathbb{R}_{\le r}^{m\times n}=\begin{cases}N_X\mathbb{R}_{r}^{m\times n}, & \text{if}\ \underline{r}=r;\\ \{0\}, & \text{if}\ \underline{r}<r.\end{cases}$$

Projection

Given a matrix $A\in\mathbb{R}^{m\times n}$, the projection onto the tangent cone is
$$P_{T_X\mathbb{R}_{\le r}^{m\times n}}A=P_{T_X\mathbb{R}_{\underline{r}}^{m\times n}}A+P_{\le(r-\underline{r})}(P_{U_{\perp}}A P_{V_{\perp}}),$$

where $P_{\le(r-\underline{r})}$ is the projection onto $\mathbb{R}_{\le(r-\underline{r})}^{m\times n}$, which can be achieved by the truncated SVD.
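The tangent-cone projection can be sketched in a few lines of NumPy. The function below follows the formula above: the tangent-space part at the local fixed-rank manifold plus a truncated SVD of the normal component. It is a minimal sketch (function names are my own), not a robust implementation:

```python
import numpy as np

def truncated_svd(A, k):
    """Best rank-k approximation of A via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def project_tangent_cone(A, X, r):
    """Project A onto the tangent cone of {rank <= r} at X (rank(X) <= r)."""
    r_low = np.linalg.matrix_rank(X)
    U, s, Vt = np.linalg.svd(X, full_matrices=True)
    U1, V1 = U[:, :r_low], Vt[:r_low, :].T
    Up, Vp = U[:, r_low:], Vt[r_low:, :].T
    PU, PV = U1 @ U1.T, V1 @ V1.T
    PUp, PVp = Up @ Up.T, Vp @ Vp.T
    # Tangent-space part at the local fixed-rank manifold.
    tangent_part = PU @ A @ PV + PUp @ A @ PV + PU @ A @ PVp
    # Best rank-(r - r_low) approximation of the normal component.
    normal_part = truncated_svd(PUp @ A @ PVp, r - r_low)
    return tangent_part + normal_part

rng = np.random.default_rng(4)
m, n, r, r_low = 6, 5, 3, 1
X = rng.standard_normal((m, r_low)) @ rng.standard_normal((r_low, n))
A = rng.standard_normal((m, n))
Xi = project_tangent_cone(A, X, r)
```

Since the projection of a tangent-cone element is the element itself, the function should be idempotent, which provides a simple consistency check.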

Footnotes

  1. Robin Hartshorne. “Algebraic geometry”. Vol. 52. Springer Science & Business Media, 2013.↩︎

  2. The Jacobian criterion for testing smoothness can be found in a textbook on computational algebraic geometry, e.g., [Corollary 5.6.14] of the book: Gert-Martin Greuel, Gerhard Pfister. “A Singular introduction to commutative algebra”. Vol. 348. Berlin: Springer, 2008.↩︎

  3. V. Lakshmibai, and Justin Brown. “The Grassmannian variety.” Developments in Mathematics. Vol. 42. Springer New York, 2015.↩︎

  4. Winfried Bruns, and Udo Vetter. “Determinantal rings”. Vol. 1327. Springer, 2006.↩︎

  5. John M. Lee, “Introduction to Smooth Manifolds”. Version 3.0. December 31, 2000↩︎

  6. Nicolas Boumal. “An introduction to optimization on smooth manifolds”. Cambridge University Press, 2023.↩︎

  7. Uri Shalit, Daphna Weinshall, and Gal Chechik. “Online learning in the manifold of low-rank matrices.” Advances in neural information processing systems 23 (2010).↩︎

  8. Bart Vandereycken. “Low-rank matrix completion by Riemannian optimization”. In: SIAM Journal on Optimization 23.2 (2013), pp. 1214–1236.↩︎

  9. Bin Gao, Renfeng Peng, Ya-xiang Yuan. “Low-rank optimization on Tucker tensor varieties.” arXiv:2311.18324 (2023).↩︎

  10. T. P. Cason, P.-A. Absil, and P. Van Dooren. “Iterative methods for low rank approximation of graph similarity matrices”. In: Linear Algebra and its Applications 438 (2013), pp. 1863–1882.↩︎

  11. Reinhold Schneider and André Uschmajew. “Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality”. In: SIAM Journal on Optimization 25.1 (2015), pp. 622–646.↩︎

  12. See [Section 2.1] of the work by Bin Gao, Renfeng Peng, Ya-xiang Yuan. “Low-rank optimization on Tucker tensor varieties.” arXiv:2311.18324 (2023).↩︎

  13. Bin Gao and P.-A. Absil. “A Riemannian rank-adaptive method for low-rank matrix completion”. In: Computational Optimization and Applications 81 (2022), pp. 67–90.↩︎