Module_SVD_Procedures¶

This file is part of statpack.

statpack is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

statpack is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You can find a copy of the GNU Lesser General Public License in the statpack/doc directory.

MODULE EXPORTING SUBROUTINES AND FUNCTIONS FOR COMPUTING FULL OR PARTIAL SVD DECOMPOSITION AND GENERALIZED INVERSE OF A MATRIX.

LATEST REVISION : 19/11/2018

`subroutine bd_cmp ( mat, d, e, tauq, taup )`¶

Purpose¶

BD_CMP reduces a general m-by-n matrix MAT to upper or lower bidiagonal form BD by an orthogonal transformation :

Q’ * MAT * P = BD

where Q and P are orthogonal. If:

m >= n, BD is upper bidiagonal;

m < n, BD is lower bidiagonal.

BD_CMP computes BD, Q and P.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the general m-by-n matrix to be reduced.

On exit, if:

m >= n, the elements on and below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors;

m < n, the elements below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements on and above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors.

See Further Details.

D (OUTPUT) real(stnd), dimension(:)

The diagonal elements of the bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT) real(stnd), dimension(:)

The off-diagonal elements of the bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

TAUQ (OUTPUT) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix Q. See Further Details.

The size of TAUQ must be min( size(MAT,1) , size(MAT,2) ).

TAUP (OUTPUT) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix P. See Further Details.

The size of TAUP must be min( size(MAT,1) , size(MAT,2) ).

Further Details¶

The matrices Q and P are represented as products of elementary reflectors:

If m >= n,

Q = H(1) * H(2) * … * H(n) and P = G(1) * G(2) * … * G(n-1)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors; u(1:i-1) = 0 and u(i:m) is stored on exit in MAT(i:m,i); v(1:i) = 0 and v(i+1:n) is stored on exit in MAT(i,i+1:n); tauq is stored in TAUQ(i) and taup in TAUP(i).

If m < n,

Q = H(1) * H(2) * … * H(m-1) and P = G(1) * G(2) * … * G(m)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors; u(1:i) = 0 and u(i+1:m) is stored on exit in MAT(i+1:m,i); v(1:i-1) = 0 and v(i:n) is stored on exit in MAT(i,i:n); tauq is stored in TAUQ(i) and taup in TAUP(i).

The contents of MAT on exit are illustrated by the following examples:

m = 6 and n = 5 (m >= n):

( u1 v1 v1 v1 v1 )

( u1 u2 v2 v2 v2 )

( u1 u2 u3 v3 v3 )

( u1 u2 u3 u4 v4 )

( u1 u2 u3 u4 u5 )

( u1 u2 u3 u4 u5 )

m = 5 and n = 6 (m < n):

( v1 v1 v1 v1 v1 v1 )

( u1 v2 v2 v2 v2 v2 )

( u1 u2 v3 v3 v3 v3 )

( u1 u2 u3 v4 v4 v4 )

( u1 u2 u3 u4 v5 v5 )

where ui denotes an element of the vector defining H(i), and vi an element of the vector defining G(i).

This subroutine is adapted from the routine DGEBD2 in LAPACK. An efficient variant of the classic Golub and Kahan Householder bidiagonalization algorithm is used. This variant reduces the traffic on the data bus from four reads and two writes per column-row elimination of the bidiagonalization process to one read and one write. Furthermore, the algorithm is parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the efficient variant used here, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Howell, G.W., Demmel, J., Fulton, C.T., Hammarling, S., and Marmol, K., 2008:

Cache efficient bidiagonalization using BLAS 2.5 operators. ACM Transactions on Mathematical Software (TOMS) Volume 34, Issue 3.

`subroutine bd_cmp ( mat, d, e, tauq )`¶

Purpose¶

BD_CMP reduces a general m-by-n matrix MAT to upper or lower bidiagonal form BD by an orthogonal transformation :

Q’ * MAT * P = BD

where Q and P are orthogonal. If:

m >= n, BD is upper bidiagonal;

m < n, BD is lower bidiagonal.

BD_CMP computes only BD and Q.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the general m-by-n matrix to be reduced.

On exit, if:

m >= n, the elements on and below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements above the diagonal are destroyed;

m < n, the elements below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements on and above the diagonal are destroyed.

See Further Details.

D (OUTPUT) real(stnd), dimension(:)

The diagonal elements of the bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT) real(stnd), dimension(:)

The off-diagonal elements of the bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

TAUQ (OUTPUT) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix Q. See Further Details.

The size of TAUQ must be min( size(MAT,1) , size(MAT,2) ).

Further Details¶

The matrice Q is represented as products of elementary reflectors:

If m >= n,

Q = H(1) * H(2) * … * H(n)

Each H(i) has the form:

H(i) = I + tauq * u * u’

where tauq is a real scalar and u is a real vector; u(1:i-1) = 0 and u(i:m) is stored on exit in MAT(i:m,i); tauq is stored in TAUQ(i).

If m < n,

Q = H(1) * H(2) * … * H(m-1)

Each H(i) has the form:

H(i) = I + tauq * u * u’

where tauq is a real scalar and u is a real vector; u(1:i) = 0 and u(i+1:m) is stored on exit in MAT(i+1:m,i); tauq is stored in TAUQ(i).

The contents of MAT on exit are illustrated by the following examples:

m = 6 and n = 5 (m > n):

( u1 xx xx xx xx )

( u1 u2 xx xx xx )

( u1 u2 u3 xx xx )

( u1 u2 u3 u4 xx )

( u1 u2 u3 u4 u5 )

( u1 u2 u3 u4 u5 )

m = 5 and n = 6 (m < n):

( xx xx xx xx xx xx )

( u1 xx xx xx xx xx )

( u1 u2 xx xx xx xx )

( u1 u2 u3 xx xx xx )

( u1 u2 u3 u4 xx xx )

where ui denotes an element of the vector defining H(i). The upper triangular part of MAT is destroyed on exit.

This subroutine is adapted from the routine DGEBD2 in LAPACK. An efficient variant of the classic Golub and Kahan Householder bidiagonalization algorithm is used. This variant reduces the traffic on the data bus from four reads and two writes per column-row elimination of the bidiagonalization process to one read and one write. Furthermore, the algorithm is parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the efficient variant used here, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Howell, G.W., Demmel, J., Fulton, C.T., Hammarling, S., and Marmol, K., 2008:

Cache efficient bidiagonalization using BLAS 2.5 operators. ACM Transactions on Mathematical Software (TOMS) Volume 34, Issue 3.

`subroutine bd_cmp ( mat, d, e )`¶

Purpose¶

BD_CMP reduces a general m-by-n matrix MAT to upper or lower bidiagonal form BD by an orthogonal transformation :

Q’ * MAT * P = BD

where Q and P are orthogonal. If:

m >= n, BD is upper bidiagonal;

m < n, BD is lower bidiagonal.

BD_CMP computes only BD and the matrices Q and P are not saved.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the general m-by-n matrix to be reduced.

On exit, the general m-by-n matrix is destroyed.

D (OUTPUT) real(stnd), dimension(:)

The diagonal elements of the bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT) real(stnd), dimension(:)

The off-diagonal elements of the bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

Further Details¶

This subroutine is adapted from the routine DGEBD2 in LAPACK. An efficient variant of the classic Golub and Kahan Householder bidiagonalization algorithm is used. This variant reduces the traffic on the data bus from four reads and two writes per column-row elimination of the bidiagonalization process to one read and one write. Furthermore, the algorithm is parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the efficient variant used here, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Howell, G.W., Demmel, J., Fulton, C.T., Hammarling, S., and Marmol, K., 2008:

Cache efficient bidiagonalization using BLAS 2.5 operators. ACM Transactions on Mathematical Software (TOMS) Volume 34, Issue 3.

`subroutine bd_cmp2 ( mat, d, e, p, failure, gen_p )`¶

Purpose¶

BD_CMP2 reduces a m-by-n matrix MAT with m >= n to upper bidiagonal form BD by an orthogonal transformation :

Q’ * MAT * P = BD

where Q and P are orthogonal.

BD_CMP2 computes BD, Q and P using the one-sided Ralha-Barlow bidiagonal reduction algorithm.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the general m-by-n matrix to be reduced.

On exit, the first n columns of Q are stored in in MAT(1:m,1:n).

The shape of MAT must verify: size( MAT, 1 ) >= size( MAT, 2 ) = n .

D (OUTPUT) real(stnd), dimension(:)

The diagonal elements of the bidiagonal matrix BD

The size of D must be size( MAT, 2) = n .

E (OUTPUT) real(stnd), dimension(:)

The off-diagonal elements of the bidiagonal matrix BD:

E(i) = BD(i-1,i) for i = 2,3,…,n;

The size of E must be size( MAT, 2 ) = n .

P (OUTPUT) real(stnd), dimension(:,:)

On exit, the n-by-n matrix P.

The shape of P must verify: size( P, 1 ) = size( P, 2 ) = n .

FAILURE (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that MAT is nearly singular and some loss of orthogonality of Q can be expected in the Ralha-Barlow algorithm. See further details.

GEN_P (INPUT,OPTIONAL) logical(lgl)

If the optional argument GEN_P is used and is set to true, the orthogonal matrix P is generated on output of the subroutine. If this argument is set to false, the orthogonal matrix is stored in factored form as products of elementary reflectors in the lower triangle of the array P. See further details.

The default is GEN_P = true.

Further Details¶

This subroutine is an implementation of the Ralha-Barlow one-sided method to reduce a rectangular matrix MAT to bidiagonal form BD. Q is computed by a recurrence relationship and P as a product of n-1 elementary reflectors (e.g. Householder transformations):

P = G(1) * G(2) * … * G(n-1)

Each G(i) has the form:

G(i) = I + taup * v * v’

where taup is a real scalar, and v is a real vector. IF GEN_P is used and set to false, the n-1 G(i) elementary reflectors are stored in the lower triangle of the array P. For the G(i) reflector, taup is stored in P(i+1,1) and v is stored in P(i+1:n,i+1). In addition, P(1,1) is set to -1 if GEN_P=false and is equal to 1 if GEN_P=true. In other words, the value of P(1,1) indicates if the orthogonal matrix P is stored in factored form or not. Note that if n is equal to 1, no elementary reflectors are needed and consequently P(1,1) is set to 1, independently of the value of GEN_P.

This is the blocked version of the algorithm for the special case of blocks of size 2. See the references (1), (2) and (3) for further details. Furthermore the algorithm is parallelized if OPENMP is used.

Since Q is computed by a recurrence relationship, a loss of orthogonality of Q can be observed when the rectangular matrix MAT is singular or nearly singular. To correct this deficiency, partial reorthogonalization is performed to ensure orthogonality at the expense of speed of computation. The reorthogonalization uses the Gram-Schmidt method described in the reference (4).

The reference (2) also explains how to handle the case of an exactly singular matrix MAT (a very rare event). However, in this subroutine, the partial reorthogonalization described in the reference (4) corrects automatically this problem.

For further details, see:

Ralha, R.M.S., 2003: One-sided reduction to bidiagonal form.

Linear Algebra Appl., No 358, pp. 219-238.

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Stewart, G.W., 2007: Block Gram-Schmidt Orthogonalization. Report TR-4823,

Department of Computer Science, College Park, University of Maryland.

`subroutine bd_cmp2 ( mat, d, e, failure )`¶

Purpose¶

BD_CMP2 reduces a m-by-n matrix MAT with m >= n to upper bidiagonal form BD by an orthogonal transformation :

Q’ * MAT * P = BD

where Q and P are orthogonal.

BD_CMP2 computes BD and Q using the one-sided Ralha-Barlow bidiagonal reduction algorithm.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the general m-by-n matrix to be reduced.

On exit, the first n columns of Q are stored in in MAT(1:m,1:n).

The shape of MAT must verify: size( MAT, 1 ) >= size( MAT, 2 ) = n .

D (OUTPUT) real(stnd), dimension(:)

The diagonal elements of the bidiagonal matrix BD

The size of D must be size( MAT, 2 ) = n .

E (OUTPUT) real(stnd), dimension(:)

The off-diagonal elements of the bidiagonal matrix BD:

E(i) = BD(i-1,i) for i = 2,3,…,n;

The size of E must be size( MAT, 2 ) = n .

FAILURE (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that MAT is nearly singular and some loss of orthogonality of Q can be expected in the Ralha-Barlow algorithm. See further details.

Further Details¶

This subroutine is an implementation of the Ralha-Barlow one-sided method to reduce a rectangular matrix MAT to bidiagonal form BD. Q is computed by a recurrence relationship.

This is the blocked version of the algorithm for the special case of blocks of size 2. See the references (1), (2) and (3) for further details. Furthermore the algorithm is parallelized if OPENMP is used.

Since Q is computed by a recurrence relationship, a loss of orthogonality of Q can be observed when the rectangular matrix MAT is singular or nearly singular. To correct this deficiency, partial reorthogonalization is performed to ensure orthogonality at the expense of speed of computation. The reorthogonalization uses the Gram-Schmidt method described in the reference (4).

The reference (2) also explains how to handle the case of an exactly singular matrix MAT (a very rare event). However, in this subroutine, the partial reorthogonalization described in the reference (4) corrects automatically this problem.

For further details, see:

Ralha, R.M.S., 2003: One-sided reduction to bidiagonal form.

Linear Algebra Appl., No 358, pp. 219-238.

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Stewart, G.W., 2007: Block Gram-Schmidt Orthogonalization. Report TR-4823,

Department of Computer Science, College Park, University of Maryland.

`subroutine ortho_gen_bd ( mat, tauq, taup, p )`¶

Purpose¶

ORTHO_GEN_BD generates the real orthogonal matrices Q and P determined by BD_CMP when reducing a m-by-n real matrix MAT to bidiagonal form :

MAT = Q * BD * P’.

Q and P are defined as products of elementary reflectors H(i) and G(i), respectively, determined by BD_CMP and stored in its array arguments MAT, TAUQ and TAUP :

if m >= n,

Q = H(1) * H(2) * … * H(n) and ORTHO_GEN_BD returns the first n columns of Q in MAT;

P = G(1) * G(2) * … * G(n-1) and ORTHO_GEN_BD returns P as an n-by-n matrix in P.

if m < n,

Q = H(1) * H(2) * … * H(m-1) and ORTHO_GEN_BD returns Q as an m-by-m matrix in MAT(1:m,1:m);

P = G(1) * G(2) * … * G(m) and ORTHO_GEN_BD returns the first m columns of P, in P.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the vectors which define the elementary reflectors H(i) and G(i), as returned by BD_CMP in its array argument MAT.

On exit, the first min(m,n) columns of Q are stored in MAT(1:m,1:min(m,n)).

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i), which determines Q, as returned by BD_CMP in its array argument TAUQ.

The size of TAUQ must verify: size( TAUQ ) = min(m,n) .

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i), which determines P, as returned by BD_CMP in its array argument TAUP.

The size of TAUP must verify: size( TAUP ) = min(m,n) .

P (OUTPUT) real(stnd), dimension(:,:)

On exit, the first min(m,n) columns of the n-by-n matrix P

The shape of p must verify:

size( P, 1 ) = n ,

size( P, 2 ) = min(m,n).

Further Details¶

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in MAT and generating the orthogonal matrices Q and P of the bidiagonal decomposition of MAT.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the blocked algorithm used here, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine ortho_gen_bd2 ( mat, tauq, taup, q_pt )`¶

Purpose¶

ORTHO_GEN_BD2 generates the real orthogonal matrices Q and P’ determined by BD_CMP when reducing a m-by-n real matrix MAT to bidiagonal form :

MAT = Q * BD * P’.

Q and P’ are defined as products of elementary reflectors H(i) and G(i), respectively, determined by BD_CMP and stored in its array arguments MAT, TAUQ and TAUP :

if m >= n,

Q = H(1) * H(2) * … * H(n) and ORTHO_GEN_BD2 returns the first n columns of Q in MAT;

P’ = G(n-1) * … * G(2) * G(1) and ORTHO_GEN_BD2 returns P’ as an n-by-n matrix in Q_PT.

if m < n,

Q = H(1) * H(2) * … * H(m-1) and ORTHO_GEN_BD2 returns Q as an m-by-m matrix in Q_PT;

P’ = G(m) * … * G(2) * G(1) and ORTHO_GEN_BD2 returns the first m rows of P’, in MAT.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the vectors which define the elementary reflectors H(i) and G(i), as returned by BD_CMP in its array argument MAT.

On exit:

the first n columns of Q if m >= n ;

the first m rows of P’ if m < n .

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i), which determines Q, as returned by BD_CMP in its array argument TAUQ.

The size of TAUQ must verify: size( TAUQ ) = min(m,n) .

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i), which determines P’, as returned by BD_CMP in its array argument TAUP.

The size of TAUP must verify: size( TAUP ) = min(m,n) .

Q_PT (OUTPUT) real(stnd), dimension(:,:)

On exit:

the n-by-n matrix P’ if m >= n ;

the m-by-m matrix Q if m < n .

The shape of Q_PT must verify:

size( Q_PT, 1 ) = size( Q_PT, 2 ) = min(m,n).

Further Details¶

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in MAT and generating the orthogonal matrices Q and P of the bidiagonal decomposition of MAT.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the blocked algorithm, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine ortho_gen_q_bd ( mat, tauq )`¶

Purpose¶

ORTHO_GEN_Q_BD generates the real orthogonal matrix Q determined by BD_CMP when reducing a m-by-n real matrix MAT to bidiagonal form :

MAT = Q * BD * P’.

Q is defined as products of elementary reflectors H(i) determined by BD_CMP and stored in its array arguments MAT and TAUQ:

if m >= n,

Q = H(1) * H(2) * … * H(n) and ORTHO_GEN_Q_BD returns the first n columns of Q in MAT.

if m < n,

Q = H(1) * H(2) * … * H(m-1) and ORTHO_GEN_Q_BD returns Q as an m-by-m matrix in MAT(:m,:m).

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the vectors which define the elementary reflectors H(i), as returned by BD_CMP.

On exit, the first min(m,n) columns of Q.

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i), which determines Q, as returned by BD_CMP in its array argument TAUQ.

The size of TAUQ must verify: size( TAUQ ) = min(m,n) .

Further Details¶

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in MAT and generating the orthogonal matrix Q of the bidiagonal decomposition of MAT.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the blocked algorithm, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine ortho_gen_p_bd ( mat, taup, p )`¶

Purpose¶

ORTHO_GEN_P_BD generates the real orthogonal matrix P determined by BD_CMP when reducing a m-by-n real matrix MAT to bidiagonal form :

MAT = Q * BD * P’.

P is defined as products of elementary reflectors G(i) determined by BD_CMP and stored in its array arguments MAT and TAUP :

if m >= n,

P = G(1) * G(2) * … * G(n-1) and ORTHO_GEN_P_BD returns P as an n-by-n matrix in P.

if m < n,

P = G(1) * G(2) * … * G(m) and ORTHO_GEN_P_BD returns the first m columns of P, in P.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the vectors which define the elementary reflectors G(i), as returned by BD_CMP in its array argument MAT.

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i), which determines P, as returned by BD_CMP in its array argument TAUP.

The size of TAUP must verify: size( TAUP ) = min(m,n) .

P (OUTPUT) real(stnd), dimension(:,:)

On exit, the first min(m,n) columns of the n-by-n matrix P

The shape of p must verify:

size( P, 1 ) = n ,

size( P, 2 ) = min(m,n).

Further Details¶

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in MAT and generating the orthogonal matrix P of the bidiagonal decomposition of MAT.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and its use or the blocked algorithm, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine apply_q_bd ( mat, tauq, c, left, trans )`¶

Purpose¶

APPLY_Q_BD overwrites the general real m-by-n matrix C with:

Q * C if LEFT = true and TRANS = false ;

Q’ * C if LEFT = true and TRANS = true ;

C * Q if LEFT = false and TRANS = false ;

C * Q’ if LEFT = false and TRANS = true .

Here Q is the orthogonal matrix determined by BD_CMP when reducing a real matrix MAT to bidiagonal form:

MAT = Q * BD * P’

and Q is defined as products of elementary reflectors H(i).

Let nq = m if LEFT = true and nq = n if LEFT = false. Thus nq is the order of the orthogonal matrix Q that is applied. MAT is assumed to have been an nq-by-k matrix and

Q = H(1) * H(2) * … * H(k) , if nq >= k ;

or

Q = H(1) * H(2) * … * H(nq-1) , if nq < k .

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

The vectors which define the elementary reflectors H(i), whose products determine the matrix Q, as returned by BD_CMP. MAT must be specified as in BD_CMP and is not modified by the routine.

The shape of MAT must verify:

if LEFT = true : size( C, 1 ) = size( MAT, 1 ) = nq ;

if LEFT = false : size( C, 2 ) = size( MAT, 1 ) = nq .

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i) which determines Q, as returned by BD_CMP in the array argument TAUQ.

The size of TAUQ must verify:

size( TAUQ ) = min( size( MAT, 1 ), ( MAT, 2 ) ) .

C (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m by n matrix C.

On exit, C is overwritten by Q * C or Q’ * C or C * Q’ or C * Q .

The shape of C must verify:

if LEFT = true : size( C, 1 ) = size( MAT, 1 ) = nq ;

if LEFT = false : size( C, 2 ) = size( MAT, 1 ) = nq .

LEFT (INPUT) logical(lgl)

On entry, if:

LEFT= true : apply Q or Q’ from the left

LEFT= false : apply Q or Q’ from the right

TRANS (INPUT) logical(lgl)

On entry, if:

TRANS = false : apply Q (no transpose)

TRANS = true : apply Q’ (transpose)

Further Details¶

This subroutine is adapted from the routine DORMBR in LAPACK.

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in the lower triangle of MAT and applying the orthogonal matrix Q of the bidiagonal factorization to the real m-by-n matrix C.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and the blocked version of the algorithm used here, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine apply_p_bd ( mat, taup, c, left, trans )`¶

Purpose¶

APPLY_P_BD overwrites the general real m-by-n matrix C with

P * C if LEFT = true and TRANS = false ;

P’ * C if LEFT = true and TRANS = true ;

C * P if LEFT = false and TRANS = false ;

C * P’ if LEFT = false and TRANS = true .

Here P is the orthogonal matrix determined by BD_CMP when reducing a real matrix MAT to bidiagonal form:

MAT = Q * BD * P’

and P is defined as products of elementary reflectors G(i).

Let np = m if LEFT = true and np = n if LEFT = false. Thus np is the order of the orthogonal matrix P that is applied. MAT is assumed to have been an k-by-np matrix and

P = G(1) * G(2) * … * G(k) , if k < np ;

or

P = G(1) * G(2) * … * G(np-1) , if k >= np .

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

The vectors which define the elementary reflectors G(i), whose products determine the matrix P, as returned by BD_CMP. MAT must be specified as in BD_CMP and is not modified by the routine.

The shape of MAT must verify:

if LEFT = true : size( C, 1 ) = size( MAT, 2 ) = np ;

if LEFT = false : size( C, 2 ) = size( MAT, 2 ) = np .

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i) which determines P, as returned by BD_CMP in the array argument TAUP.

The size of TAUP must verify:

size( TAUP ) = min( size( MAT, 1 ), ( MAT, 2 ) ) .

C (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m by n matrix C.

On exit, C is overwritten by P * C or P’ * C or C * P or C * P’.

The shape of C must verify:

if LEFT = true : size( C, 1 ) = size( MAT, 2 ) = np ;

if LEFT = false : size( C, 2 ) = size( MAT, 2 ) = np .

LEFT (INPUT) logical(lgl)

On entry, if:

LEFT= true : apply P or P’ from the left

LEFT= false : apply P or P’ from the right

TRANS (INPUT) logical(lgl)

On entry, if:

TRANS = false : apply P (no transpose)

TRANS = true : apply P’ (transpose)

Further Details¶

This subroutine is adapted from the routine DORMBR in LAPACK.

This subroutine used a blocked algorithm for agregating the Householder transformations (e.g. the elementary reflectors) stored in the upper triangle of MAT and applying the orthogonal matrix P of the bidiagonal factorization to the real m-by-n matrix C.

Furthermore, the computations are parallelized if OPENMP is used.

For further details on the bidiagonal reduction algorithm and the blocked version of the algorithm used here, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Dongarra, J.J., Sorensen, D.C., and Hammarling, S.J., 1989: Block

reduction of matrices to condensed form for eigenvalue computations. J. of Computational and Applied Mathematics, Vol. 27, pp. 215-227.

Walker, H.F., 1988: Implementation of the GMRES method using Householder

transformations. Siam J. Sci. Stat. Comput., Vol. 9, No 1, pp. 152-163.

`subroutine bd_svd ( upper, d, e, failure, u, v, sort, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

BD_SVD computes the singular value decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal matrix B:

B = Q * S * P’

, where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

The routine computes S, U * Q, and V * P, for given real input matrices U, V.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : B is upper bidiagonal ;

UPPER = false : B is lower bidiagonal.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

On exit, D contains the singular values of B.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose SVD is desired. E(1) is arbitrary.

On exit, E is destroyed.

The size of E must verify: size( E ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of B.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the matrix U.

On exit, U is overwritten by U * Q.

The shape of U must verify: size( U, 2 ) = size( D ) .

V (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the matrix V.

On exit, V is overwritten by V * P.

The shape of V must verify: size( V, 2 ) = size( D ) .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors are rearranged accordingly.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the algorithm. The algorithm fails to converge if the number of QR sweeps exceeds MAXITER * size(D). Convergence usually occurs in about 2 * size(D) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the implicit QR algorithm. MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If, on entry, arguments U and V are n-by-n identity matrices, on exit they are replaced by Q and P, respectively.

This subroutine is adapted from subroutine QRBD given in the reference (1), with modifications suggested in the reference (2) and extensions to the bidiagonal case of the methods presented in the references (3) and (4).

Furthermore, the computation of the singular vectors is parallelized if OPENMP is used.

Note, finally, that the bidiagonal matrix is not scaled before computing the singular values and vectors. If some of the elements of the bidiagonal matrix are very small or large, it may be appropriate to scale the bidiagonal matrix before calling BD_SVD.

For further details, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Van Zee, F.G., Van de Geijn, R., and Quintana-Orti, G., 2011:

Restructuring the QR Algorithm for High-Performance Application of Givens Rotations. FLAME Working Note 60. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-11-36.

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. SIAM J. MATRIX ANAL. APPL, Vol. 18, 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

`subroutine bd_svd2 ( upper, d, e, failure, u, vt, sort, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

BD_SVD2 computes the singular value decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal matrix B:

B = Q * S * P’

, where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

The routine computes S, U * Q, and P’ * VT, for given real input matrices U, VT.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : B is upper bidiagonal ;

UPPER = false : B is lower bidiagonal.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

On exit, D contains the singular values of B.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose SVD is desired. E(1) is arbitrary.

On exit, E is destroyed.

The size of E must verify: size( E ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of B.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the matrix U.

On exit, U is overwritten by U * Q.

The shape of U must verify: size( U, 2 ) = size( D ) .

VT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the matrix VT.

On exit, VT is overwritten by P’ * VT.

The shape of VT must verify: size( VT, 1 ) = size( D ) .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors are rearranged accordingly.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the algorithm. The algorithm fails to converge if the number of QR sweeps exceeds MAXITER * size(D). Convergence usually occurs in about 2 * size(D) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the implicit QR algorithm. MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If arguments U and VT are n-by-n identity matrices, on exit they are replaced by Q and P’, respectively.

This subroutine is adapted from subroutine QRBD given in the reference (1), with modifications suggested in the reference (2) and extensions to the bidiagonal case of the methods presented in the references (3) and (4).

Furthermore, the computation of the singular vectors is parallelized if OPENMP is used.

Note, finally, that the bidiagonal matrix is not scaled before computing the singular values and vectors. If some of the elements of the bidiagonal matrix are very small or large, it may be appropriate to scale the bidiagonal matrix before calling BD_SVD2.

For further details, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Van Zee, F.G., Van de Geijn, R., and Quintana-Orti, G., 2011:

Restructuring the QR Algorithm for High-Performance Application of Givens Rotations. FLAME Working Note 60. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-11-36.

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. SIAM J. MATRIX ANAL. APPL, Vol. 18, 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

`subroutine bd_svd ( upper, d, e, failure, u, sort, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

BD_SVD computes the singular value decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal matrix B:

B = Q * S * P’

, where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

The routine computes S and U * Q for a given real input matrix U.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : B is upper bidiagonal ;

UPPER = false : B is lower bidiagonal.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

On exit, D contains the singular values of B.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose SVD is desired. E(1) is arbitrary.

On exit, E is destroyed.

The size of E must verify: size( E ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of B.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the matrix U.

On exit, U is overwritten by U * Q.

The shape of U must verify: size( U, 2 ) = size( D ) .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors U are rearranged accordingly.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the algorithm. The algorithm fails to converge if the number of QR sweeps exceeds MAXITER * size(D). Convergence usually occurs in about 2 * size(D) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the implicit QR algorithm. MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If argument U is a n-by-n identity matrix, on exit it is replaced by Q.

This subroutine is adapted from subroutine QRBD given in the reference (1), with modifications suggested in the reference (2) and extensions to the bidiagonal case of the methods presented in the references (3) and (4).

Furthermore, the computation of the singular vectors is parallelized if OPENMP is used.

Note, finally, that the bidiagonal matrix is not scaled before computing the singular values and vectors. If some of the elements of the bidiagonal matrix are very small or large, it may be appropriate to scale the bidiagonal matrix before calling BD_SVD.

For further details, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

Van Zee, F.G., Van de Geijn, R., and Quintana-Orti, G., 2011:

Restructuring the QR Algorithm for High-Performance Application of Givens Rotations. FLAME Working Note 60. The University of Texas at Austin, Department of Computer Sciences. Technical Report TR-11-36.

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. SIAM J. MATRIX ANAL. APPL, Vol. 18, 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

`subroutine bd_svd ( upper, d, e, failure, sort, maxiter )`¶

Purpose¶

BD_SVD computes the singular values, S, of a real n-by-n (upper or lower) bidiagonal matrix B:

B = Q * S * P’

, where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : B is upper bidiagonal ;

UPPER = false : B is lower bidiagonal.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

On exit, D contains the singular values of B.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose singular values are desired. E(1) is arbitrary.

On exit, E is destroyed.

The size of E must verify: size( E ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit ;

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of B.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the algorithm. The algorithm fails to converge if the number of QR sweeps exceeds MAXITER * size(D). Convergence usually occurs in about 2 * size(D) QR sweeps.

The default is 10.

Further Details¶

This subroutine is adapted from subroutine QRBD in the reference (1).

For further details, see:

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine bd_singval ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )`¶

Purpose¶

BD_SINGVAL computes all or some of the greatest singular values of a real n-by-n (upper or lower) bidiagonal matrix B by a bisection algorithm.

The Singular Value Decomposition of B is:

B = Q * S * P’

where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

Arguments¶

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose singular values are desired. E(1) is arbitrary.

The size of E must verify: size( E ) = size( D ) .

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to size(D) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of B. The other values in S ( S(NSING+1:size(D)) ) are flagged by a quiet NAN.

The size of S must verify: size( S ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. For other values of SORT nothing is done and S(:nsing) may not be sorted in decreasing order of of magnitude.

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if it has been determined to lie in an interval whose width is ABSTOL or less.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | T(GK) | will be used, where | T(GK) | means the 1-norm of the GOLUB-KAHAN tridiagonal form of the bidiagonal matrix B and ULP is the machine precision (distance from 1 to the next larger floating point number).

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to size( D ) .

The default is LS = size(D).

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix B is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated B’ * B tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

Further Details¶

Let S(i), i=1,…,N=size(D), be the N singular values of the bidiagonal matrix B in decreasing order of magnitude. BD_SINGVAL then computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of B by a bisection method (see the reference (1) below, Sec.8.5 ). The bisection method is applied to an associated 2N by 2N symmetric tridiagonal matrix T (the so-called GOLUB-KAHAN form of B) whose eigenvalues are the singular values of B and their negatives (see the reference (2) below, Sec.3.3 ).

For further details, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine bd_singval2 ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )`¶

Purpose¶

BD_SINGVAL2 computes all or some of the greatest singular values of a real n-by-n (upper or lower) bidiagonal matrix B by a bisection algorithm.

The Singular Value Decomposition of B is:

B = Q * S * P’

where S is a diagonal matrix with non-negative diagonal elements (the singular values of B), and, Q and P are orthogonal matrices (P’ denotes the transpose of P).

Arguments¶

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix B.

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix whose singular values are desired. E(1) is arbitrary.

The size of E must verify: size( E ) = size( D ) .

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to size(D) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of B. The other values in S ( S(NSING+1:size(D)) ) are flagged by a quiet NAN.

The size of S must verify: size( S ) = size( D ) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. For other values of SORT nothing is done and S(:nsing) may not be sorted in decreasing order of of magnitude.

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if its square has been determined to lie in an interval whose width is ABSTOL or less.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | B’ * B | will be used, where | B’ * B | means the 1-norm of the tridiagonal matrix B’ * B ( B’ means the transpose of B) and ULP is the machine precision (distance from 1 to the next larger floating point number).

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to size( D ) .

The default is LS = size(D).

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix B is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated B’ * B tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

Further Details¶

Let S(i), i=1,…,N=size(D), be the N singular values of the bidiagonal matrix B in decreasing order of magnitude. BD_SINGVAL2 then computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of B by a bisection method (see the reference (1) below, Sec.8.5 ). The bisection method is applied (implicitly) to the associated N by N symmetric tridiagonal matrix B’ * B whose eigenvalues are the squares of the singular values of B by using the differential stationary form of the qd algorithm of Rutishauser (see the reference (2) below, Sec.3.1 ).

BD_SINGVAL2 is faster than BD_SINGVAL, however if relative accuracy for small singular values is required, BD_SINGVAL (which is based on the Golub-Kahan form of the bidiagonal matrix) is the best choice.

For further details, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`function singvalues ( mat, sort, mul_size, maxiter )`¶

Purpose¶

Function SINGVALUES computes the singular values of a real m-by-n matrix MAT. The Singular Value Decomposition (SVD) is written

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER*min(m,n). Convergence usually occurs in about 2*min(m,n) QR sweeps.

The default is 10.

Further Details¶

If the SVD algorithm did not converge and full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT, function SINGVALUES returns a min(m,n)-vector filled with NAN() function.

`subroutine select_singval_cmp ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )`¶

Purpose¶

SELECT_SINGVAL_CMP computes all or some of the greatest singular values of a real m-by-n matrix MAT.

The Singular Value Decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

The original matrix MAT is first reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal (see the reference (1) below).

The singular values SIGMA of the bidiagonal matrix BD, which are also the singular values of MAT, are then computed by a bisection algorithm applied to the Tridiagonal Golub-Kahan form of the bidiagonal matrix BD (see the reference (2) below, Sec.3.3 ).

The routine outputs (parts of) SIGMA and optionally Q and P (in packed form), and BD for a given matrix MAT. SIGMA, Q, P and BD may then be used to obtain selected singular vectors with subroutines BD_INVITER, BD_INVITER2, BD_DEFLATE or BD_DEFLATE2.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is destroyed and if TAUQ or TAUP are present MAT is overwritten as follows:

if m >= n, the elements on and below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors;

if m < n, the elements below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements on and above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors.

See Further Details.

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to min( size(MAT,1) , size(MAT,2) ) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of MAT. The other values in S ( S(NSING+1:) ) are flagged by a quiet NAN.

The size of S must be min( size(MAT,1) , size(MAT,2) ).

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= n, otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is min( 32, n) .

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used to compute the singular values SIGMA of the bidiagonal matrix BD.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if it has been determined to lie in an interval whose width is ABSTOL or less.

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | T(GK) | will be used, where | T(GK) | means the 1-norm of the GOLUB-KAHAN tridiagonal form of the bidiagonal matrix BD and ULP is the machine precision (distance from 1 to the next larger floating point number).

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to min( size(MAT,1) , size(MAT,2) ).

The default is LS = min( size(MAT,1) , size(MAT,2) )

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0.

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

TAUQ (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix Q. See Further Details.

The size of TAUQ must be min( size(MAT,1) , size(MAT,2) ).

TAUP (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix P. See Further Details.

The size of TAUP must be min( size(MAT,1) , size(MAT,2) ).

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated BD’ * BD tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

Further Details¶

The matrices Q and P are represented as products of elementary reflectors:

If m >= n,

Q = H(1) * H(2) * … * H(n) and P = G(1) * G(2) * … * G(n-1)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i-1) = 0 and v(1:i) = 0.

If TAUQ or TAUP are present:

u(i:m) is stored on exit in MAT(i:m,i);

v(i+1:n) is stored on exit in MAT(i,i+1:n).

If TAUQ is present : tauq is stored in TAUQ(i).

If TAUP is present : taup is stored in TAUP(i).

If m < n,

Q = H(1) * H(2) * … * H(m-1) and P = G(1) * G(2) * … * G(m)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i) = 0 and v(1:i-1) = 0.

If TAUQ or TAUP are present:

u(i+1:m) is stored on exit in MAT(i+1:m,i);

v(i:n) is stored on exit in MAT(i,i:n).

If TAUQ is present : tauq is stored in TAUQ(i).

If TAUP is present : taup is stored in TAUP(i).

The contents of MAT on exit, if TAUQ or TAUP are present, are illustrated by the following examples:

m = 6 and n = 5 (m >= n):

( u1 v1 v1 v1 v1 )

( u1 u2 v2 v2 v2 )

( u1 u2 u3 v3 v3 )

( u1 u2 u3 u4 v4 )

( u1 u2 u3 u4 u5 )

( u1 u2 u3 u4 u5 )

m = 5 and n = 6 (m < n):

( v1 v1 v1 v1 v1 v1 )

( u1 v2 v2 v2 v2 v2 )

( u1 u2 v3 v3 v3 v3 )

( u1 u2 u3 v4 v4 v4 )

( u1 u2 u3 u4 v5 v5 )

where ui denotes an element of the vector defining H(i), and vi an element of the vector defining G(i).

Now, let SIGMA(i), i=1,…,N=min(m,n), be the singular values of the intermediate bidiagonal matrix BD in decreasing order of magnitude. The subroutine computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of BD by a bisection method (see the reference (1) below, Sec.8.5 ). The bisection method is applied to an associated 2N by 2N symmetric tridiagonal matrix T (the so-called GOLUB-KAHAN form of BD) whose eigenvalues are the singular values of BD and their negatives (see the reference (2) below).

For further details, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine select_singval_cmp2 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )`¶

Purpose¶

SELECT_SINGVAL_CMP2 computes all or some of the greatest singular values of a real m-by-n matrix MAT.

The Singular Value Decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

The original matrix MAT is first reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal (see the reference (1) below).

The singular values SIGMA of the bidiagonal matrix BD, which are also the singular values of MAT, are then computed by a bisection algorithm (see the reference (1) below, Sec.8.5 ). The bisection method is applied (implicitly) to the associated min(m,n)-by-min(m,n) symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD by using the differential stationary form of the qd algorithm of Rutishauser (see the reference (2) below, Sec.3.1 ).

The routine outputs (parts of) SIGMA and optionally Q and P (in packed form), and BD for a given matrix MAT. SIGMA, Q, P and BD may then be used to obtain selected singular vectors with subroutines BD_INVITER, BD_INVITER2, BD_DEFLATE or BD_DEFLATE2.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is destroyed and if TAUQ or TAUP are present MAT is overwritten as follows:

if m >= n, the elements on and below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors;

if m < n, the elements below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements on and above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors.

See Further Details.

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to min( size(MAT,1) , size(MAT,2) ) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of MAT. The other values in S ( S(NSING+1:) ) are flagged by a quiet NAN.

The size of S must be min( size(MAT,1) , size(MAT,2) ).

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= n, otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is min( 32, n) .

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used to compute the singular values SIGMA of the bidiagonal matrix BD.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if its square has been determined to lie in an interval whose width is ABSTOL or less.

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | BD’ * BD | will be used, where | BD’ * BD | means the 1-norm of the tridiagonal matrix BD’ * BD ( BD’ means the transpose of BD) and ULP is the machine precision (distance from 1 to the next larger floating point number).

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to min( size(MAT,1) , size(MAT,2) ).

The default is LS = min( size(MAT,1) , size(MAT,2) )

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0.

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

TAUQ (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix Q. See Further Details.

The size of TAUQ must be min( size(MAT,1) , size(MAT,2) ).

TAUP (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix P. See Further Details.

The size of TAUP must be min( size(MAT,1) , size(MAT,2) ).

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated BD’ * BD tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

Further Details¶

The matrices Q and P are represented as products of elementary reflectors:

If m >= n,

Q = H(1) * H(2) * … * H(n) and P = G(1) * G(2) * … * G(n-1)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i-1) = 0 and v(1:i) = 0.

If TAUQ or TAUP are present:

u(i:m) is stored on exit in MAT(i:m,i);

v(i+1:n) is stored on exit in MAT(i,i+1:n).

If TAUQ is present : tauq is stored in TAUQ(i). If TAUP is present : taup is stored in TAUP(i).

If m < n,

Q = H(1) * H(2) * … * H(m-1) and P = G(1) * G(2) * … * G(m)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i) = 0 and v(1:i-1) = 0.

If TAUQ or TAUP are present:

u(i+1:m) is stored on exit in MAT(i+1:m,i);

v(i:n) is stored on exit in MAT(i,i:n).

If TAUQ is present : tauq is stored in TAUQ(i).

If TAUP is present : taup is stored in TAUP(i).

The contents of MAT on exit, if TAUQ or TAUP are present, are illustrated by the following examples:

m = 6 and n = 5 (m >= n):

( u1 v1 v1 v1 v1 )

( u1 u2 v2 v2 v2 )

( u1 u2 u3 v3 v3 )

( u1 u2 u3 u4 v4 )

( u1 u2 u3 u4 u5 )

( u1 u2 u3 u4 u5 )

m = 5 and n = 6 (m < n):

( v1 v1 v1 v1 v1 v1 )

( u1 v2 v2 v2 v2 v2 )

( u1 u2 v3 v3 v3 v3 )

( u1 u2 u3 v4 v4 v4 )

( u1 u2 u3 u4 v5 v5 )

where ui denotes an element of the vector defining H(i), and vi an element of the vector defining G(i).

Now, let SIGMA(i), i=1,…,N=min(m,n), be the singular values of the intermediate bidiagonal matrix BD in decreasing order of magnitude. The subroutine computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of BD by a bisection method (see the reference (1) below, Sec.8.5 ). The bisection method is applied (implicitly) to the associated N by N symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD by using the differential stationary form of the qd algorithm of Rutishauser (see the reference (2) below, Sec.3.1 ).

SELECT_SINGVAL_CMP2 subroutine is less accurate, but faster than SELECT_SINGVAL_CMP subroutine since SELECT_SINGVAL_CMP works on the 2N by 2N symmetric tridiagonal GOLUB-KAHAN form of BD, while SELECT_SINGVAL_CMP2 works implicitly on the associated N by N symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD.

For further details, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine select_singval_cmp3 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, p, gen_p, scaling, init, failure_bd )`¶

Purpose¶

SELECT_SINGVAL_CMP3 computes all or some of the greatest singular values of a real m-by-n matrix MAT with m>=n.

The Singular Value Decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

The original matrix MAT is first reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal (see the reference (5) below). The Ralha-Barlow one-sided method is used for this purpose (see the references (1) to (3) below).

The singular values SIGMA of the bidiagonal matrix BD, which are also the singular values of MAT, are then computed by a bisection algorithm applied to the Golub-Kahan form of the bidiagonal matrix BD (see the references (5) and (6) below).

The routine outputs (parts of) SIGMA, Q and optionally P (in packed form) and BD for a given matrix MAT. SIGMA, Q, P and BD may then be used to obtain selected singular vectors with subroutines BD_INVITER, BD_INVITER2, BD_DEFLATE or BD_DEFLATE2.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is overwritten with the first n columns of Q (stored column-wise), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument MAT.

The shape of MAT must verify:

size( MAT, 1 ) >= size( MAT, 2 ) = n .

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to size(MAT,2) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of MAT. The other values in S ( S(NSING+1:) ) are flagged by a quiet NAN.

The size of S must be equal to size( MAT, 2 ) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= n, otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is min(32,n).

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used to compute the singular values SIGMA of the bidiagonal matrix BD.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if it has been determined to lie in an interval whose width is ABSTOL or less.

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | T(GK) | will be used, where | T(GK) | means the 1-norm of the GOLUB-KAHAN tridiagonal form of the bidiagonal matrix BD and ULP is the machine precision (distance from 1 to the next larger floating point number).

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to size( MAT, 2 ) = n .

The default is LS = size( MAT, 2 ) = n .

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0 .

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate bidiagonal matrix BD

The size of D must be equal to size( MAT, 2 ) = n .

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate upper bidiagonal matrix BD:

E(i) = B(i-1,i) for i = 2,3,…,n;

E(1) is arbitrary.

The size of E must be equal to size( MAT, 2 ) = n .

P (OUTPUT, OPTIONAL) real(stnd), dimension(:,:)

On exit, P is overwritten with the n-by-n orthogonal matrix P (stored column-wise or in packed form), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument P.

The shape of P must verify:

size( P, 1 ) = size( P, 2 ) = n .

GEN_P (INPUT,OPTIONAL) logical(lgl)

On entry, this optional argument has an effect only if the optional argument P is also used.

In this case, if the optional argument GEN_P is used and is set to true, the orthogonal matrix P used to reduce MAT to bidiagonal form is generated on output of the subroutine in its argument P.

If GEN_P is set to false, the orthogonal matrix P is stored in factored form as products of elementary reflectors in the lower triangle of the array P. See the description of BD_CMP2 subroutine for more details.

The default is true.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated BD’ * BD tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

FAILURE_BD (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE_BD = false : indicates that maximum accuracy was obtained in the Ralha-Barlow one-sided bidiagonalization of MAT.

FAILURE_BD = true : indicates that MAT is nearly singular and some loss of orthogonality can be expected in the Ralha-Barlow bidiagonalization algorithm.

Further Details¶

The matrices Q, P and BD are computed with the help of the Ralha-Barlow one-sided method. Q is computed by a recurrence relationship and P as a product of n-1 elementary reflectors (e.g. Householder transformations):

P = G(1) * G(2) * … * G(n-1)

Each G(i) has the form:

G(i) = I + taup * v * v’

where taup is a real scalar, and v is a real vector. IF GEN_P is used and set to false, the n-1 G(i) elementary reflectors are stored in the lower triangle of the array P.

For the G(i) reflector, taup is stored in P(i+1,1) and v is stored in P(i+1:n,i+1). In addition, P(1,1) is set to -1 if GEN_P=false and is equal to 1 if GEN_P=true.

In other words, the value of P(1,1) indicates if the orthogonal matrix P is stored in factored form or not. Note that if n is equal to 1, elementary reflectors are not needed and consequently P(1,1) is set to 1, independently of the value of GEN_P.

This is the blocked version of the algorithm for the special case of blocks of size 2. See the references (1), (2) and (3) for further details. Furthermore the algorithm is parallelized if OPENMP is used.

Since Q is computed by a recurrence relationship, a loss of orthogonality of Q can be observed when the rectangular matrix MAT is singular or nearly singular.

To correct this deficiency, partial reorthogonalization is performed to ensure orthogonality at the expense of speed of computation. The reorthogonalization uses the Gram-Schmidt method described in the reference (4).

The reference (2) also explains how to handle the case of an exactly singular matrix MAT (a very rare event). However, in this subroutine, the partial reorthogonalization described above corrects automatically this problem as described in the reference (4).

Now, let SIGMA(i), i=1,…,n, be the singular values of the intermediate bidiagonal matrix BD in decreasing order of magnitude. The subroutine computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of BD by a bisection method (see the reference (5) below, Sec.8.5 ). The bisection method is applied to an associated 2N by 2N symmetric tridiagonal matrix T (the so-called GOLUB-KAHAN form of BD) whose eigenvalues are the singular values of BD and their negatives (see the reference (6) below).

For further details, see:

Ralha, R.M.S., 2003: One-sided reduction to bidiagonal form.

Linear Algebra Appl., No 358, pp. 219-238.

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Stewart, G.W., 2007: Block Gram-Schmidt Orthogonalization. Report TR-4823,

Department of Computer Science, College Park, University of Maryland.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine select_singval_cmp4 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, p, gen_p, scaling, init, failure_bd )`¶

Purpose¶

SELECT_SINGVAL_CMP4 computes all or some of the greatest singular values of a real m-by-n matrix MAT with m>=n.

The Singular Value Decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

The original matrix MAT is first reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal (see the reference (5) below). The Ralha-Barlow one-sided method is used for this purpose (see the references (1) to (3) below).

The singular values SIGMA of the bidiagonal matrix BD, which are also the singular values of MAT, are then computed by a bisection algorithm (see the reference (5) below, Sec.8.5 ). The bisection method is applied (implicitly) to the associated min(m,n)-by-min(m,n) symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD by using the differential stationary form of the qd algorithm of Rutishauser (see the reference (6) below, Sec.3.1 ).

The routine outputs (parts of) SIGMA, Q and optionally P (in packed form) and BD for a given matrix MAT. SIGMA, Q, P and BD may then be used to obtain selected singular vectors with subroutines BD_INVITER, BD_INVITER2, BD_DEFLATE or BD_DEFLATE2.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is overwritten with the first n columns of Q (stored column-wise), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument MAT.

The shape of MAT must verify:

size( MAT, 1 ) >= size( MAT, 2 ) = n .

NSING (OUTPUT) integer(i4b)

On output, NSING specifies the number of singular values which have been computed. Note that NSING may be greater than the optional argument LS, if multiple singular values at index LS make unique selection impossible.

If none of the optional arguments LS and THETA are used, NSING is set to size(MAT,2) and all the singular values are computed.

S (OUTPUT) real(stnd), dimension(:)

On exit, S(1:NSING) contains the first NSING singular values of MAT. The other values in S ( S(NSING+1:) ) are flagged by a quiet NAN.

The size of S must be equal to size( MAT, 2 ) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit and the bisection algorithm converged for all the computed singular values to the desired accuracy ;

FAILURE = true : indicates that some or all of the singular values failed to converge or were not computed. This is generally caused by unexpectedly inaccurate arithmetic. The sign of the incorrect singular values is set to negative.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= n, otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is min(32,n).

VECTOR (INPUT, OPTIONAL) logical(lgl)

On entry, if VECTOR is set to TRUE, a vectorized version of the bisection algorithm is used to compute the singular values SIGMA of the bidiagonal matrix BD.

The default is VECTOR=false.

ABSTOL (INPUT, OPTIONAL) real(stnd)

On entry, the absolute tolerance for the singular values. A singular value (or cluster) is considered to be located if its square has been determined to lie in an interval whose width is ABSTOL or less.

Singular values will be computed most accurately when ABSTOL is set to the square root of the underflow threshold, sqrt(LAMCH(‘S’)), not zero.

If ABSTOL is less than or equal to zero, or is not specified, then ULP * | BD’ * BD | will be used, where | BD’ * BD | means the 1-norm of the tridiagonal matrix BD’ * BD ( BD’ means the transpose of BD) and ULP is the machine precision (distance from 1 to the next larger floating point number).

LS (INPUT, OPTIONAL) integer(i4b)

On entry, LS specifies the number of singular values which must be computed by the subroutine. On output, NSING may be different than LS if multiple singular values at index LS make unique selection impossible.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

LS must be greater than 0 and less or equal to size( MAT, 2 ) = n .

The default is LS = size( MAT, 2 ) = n .

THETA (INPUT, OPTIONAL) real(stnd)

On entry, THETA specifies that the singular values which are greater or equal to THETA must be computed. If none of the singular values are greater or equal to THETA, NSING is set to zero and S(:) to a quiet NAN.

Only one of the optional arguments LS and THETA must be specified, otherwise the subroutine will stop with an error message.

The default is THETA = 0.

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate bidiagonal matrix BD

The size of D must be equal to size( MAT, 2 ) = n .

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate upper bidiagonal matrix BD:

E(i) = B(i-1,i) for i = 2,3,…,n;

E(1) is arbitrary.

The size of E must be equal to size( MAT, 2 ) = n .

P (OUTPUT, OPTIONAL) real(stnd), dimension(:,:)

On exit, P is overwritten with the n-by-n orthogonal matrix P (stored column-wise or in packed form), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument P.

The shape of P must verify:

size( P, 1 ) = size( P, 2 ) = n .

GEN_P (INPUT,OPTIONAL) logical(lgl)

On entry, this optional argument has an effect only if the optional argument P is also used.

In this case, if the optional argument GEN_P is used and is set to true, the orthogonal matrix P used to reduce MAT to bidiagonal form is generated on output of the subroutine in its argument P..

If GEN_P is set to false, the orthogonal matrix P is stored in factored form as products of elementary reflectors in the lower triangle of the array P. See the description of BD_CMP2 subroutine for more details.

The default is true.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the singular values.

The default is to scale the bidiagonal matrix.

INIT (INPUT,OPTIONAL) logical(lgl)

On entry, if INIT=true the initial intervals for the bisection steps are computed from estimates of the eigenvalues of the associated BD’ * BD tridiagonal matrix obtained from the Pal-Walker-Kahan algorithm.

The default is not to use the Pal-Walker-Kahan algorithm.

FAILURE_BD (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE_BD = false : indicates that maximum accuracy was obtained in the Ralha-Barlow one-sided bidiagonalization of MAT.

FAILURE_BD = true : indicates that MAT is nearly singular and some loss of orthogonality can be expected in the Ralha-Barlow bidiagonalization algorithm.

Further Details¶

The matrices Q, P and BD are computed with the help of the Ralha-Barlow one-sided method. Q is computed by a recurrence relationship and P as a product of n-1 elementary reflectors (e.g. Householder transformations):

P = G(1) * G(2) * … * G(n-1)

Each G(i) has the form:

G(i) = I + taup * v * v’

where taup is a real scalar, and v is a real vector. IF GEN_P is used and set to false, the n-1 G(i) elementary reflectors are stored in the lower triangle of the array P.

For the G(i) reflector, taup is stored in P(i+1,1) and v is stored in P(i+1:n,i+1). In addition, P(1,1) is set to -1 if GEN_P=false and is equal to 1 if GEN_P=true.

In other words, the value of P(1,1) indicates if the orthogonal matrix P is stored in factored form or not. Note that if n is equal to 1, elementary reflectors are not needed and consequently P(1,1) is set to 1, independently of the value of GEN_P.

This is the blocked version of the algorithm for the special case of blocks of size 2. See the references (1), (2) and (3) for further details. Furthermore the algorithm is parallelized if OPENMP is used.

Since Q is computed by a recurrence relationship, a loss of orthogonality of Q can be observed when the rectangular matrix MAT is singular or nearly singular.

To correct this deficiency, partial reorthogonalization is performed to ensure orthogonality at the expense of speed of computation. The reorthogonalization uses the Gram-Schmidt method described in the reference (4).

The reference (2) also explains how to handle the case of an exactly singular matrix MAT (a very rare event). However, in this subroutine, the partial reorthogonalization described above corrects automatically this problem as described in the reference (4).

Now, let SIGMA(i), i=1,…,N=min(m,n), be the singular values of the intermediate bidiagonal matrix BD in decreasing order of magnitude. The subroutine computes the LS largest singular values ( or the singular values which are greater or equal to THETA) of BD by a bisection method (see the reference (5) below, Sec.8.5 ). The bisection method is applied (implicitly) to the associated N by N symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD by using the differential stationary form of the qd algorithm of Rutishauser (see the reference (6) below, Sec.3.1 ).

SELECT_SINGVAL_CMP4 subroutine is less accurate, but faster than SELECT_SINGVAL_CMP3 subroutine since SELECT_SINGVAL_CMP3 works on the 2N by 2N symmetric tridiagonal GOLUB-KAHAN form of BD, while SELECT_SINGVAL_CMP3 works implicitly on the associated N by N symmetric tridiagonal matrix BD’ * BD whose eigenvalues are the squares of the singular values of BD.

For further details, see:

Ralha, R.M.S., 2003: One-sided reduction to bidiagonal form.

Linear Algebra Appl., No 358, pp. 219-238.

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Stewart, G.W., 2007: Block Gram-Schmidt Orthogonalization. Report TR-4823,

Department of Computer Science, College Park, University of Maryland.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine svd_cmp ( mat, s, failure, v, sort, mul_size, maxiter, max_francis_steps, perfect_shift, use_svd2 )`¶

Purpose¶

SVD_CMP computes the Singular Value Decomposition (SVD) of a real m-by-n matrix MAT. The SVD is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative. The columns of U and V are the left and right singular vectors of MAT.

SVD_CMP computes only the first min(m,n) columns of U and V (e.g. the left and right singular vectors of MAT in the thin SVD of MAT).

The routine returns the first min(m,n) singular values and the associated left and right singular vectors.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is overwritten with the first min(m,n) columns of U, the left singular vectors.

S (OUTPUT) real(stnd), dimension(:)

The singular values of MAT.

The size of S must verify: size( S ) = min(m,n) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

V (OUTPUT) real(stnd), dimension(:,:)

On exit, V contains the first min(m,n) columns of V, the right singular vectors.

The shape of V must verify:

size( V, 1 ) = n,

size( V, 2 ) = min(m,n).

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors are rearranged accordingly.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. MUL_SIZE can be increased or decreased to improve the performance of the SVD algorithm when 2 * min(m,n) <= max(m,n). Maximum performance will be obtained when a real matrix of size MUL_SIZE * min(m,n) and kind stnd fits in the cache of the processors.

The default value is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

USE_SVD2 (INPUT, OPTIONAL) logical(lgl)

If the optional argument USE_SVD2 is used and is set to true, an alternate SVD algorithm which used less workspace (but which may be slower) is automatically used if m is much larger than n or if n is much larger than m (e.g. if max(m,n)>=2 * min(m,n) ).

Further Details¶

Computing the SVD of a rectangular matrix in SVD_CMP consists of three steps:

reduction of the rectangular matrix to bidiagonal form via orthogonal transformations (e.g. Householder transformations);

in place accumulation of the orthogonal transformations used in the reduction to bidiagonal form;

computation of the SVD of the bidiagonal matrix.

For further details, on the SVD of a rectangular matrix and the algorithm to compute it, see the references (1) or (2).

All the three steps of the SVD algorithm (e.g. the reduction to bidiagonal form, accumulation of the Householder transformations used in the reduction to bidiagonal form and computation of the SVD of the bidiagonal matrix) are parallelized if OPENMP is used.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine svd_cmp2 ( mat, s, failure, u_vt, sort, mul_size, maxiter, max_francis_steps, perfect_shift, use_svd2 )`¶

Purpose¶

SVD_CMP2 computes the Singular Value Decomposition (SVD) of a real m-by-n matrix MAT. The SVD is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative. The columns of U and V are the left and right singular vectors of MAT.

SVD_CMP2 computes only the first min(m,n) columns of U and V (e.g. the left and right singular vectors of MAT in the thin SVD of MAT).

The routine returns the first min(m,n) singular values and the associated left and right singular vectors. The right singular vectors are returned row-wise.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit:

if m>=n, MAT is overwritten with the first min(m,n) columns of U (the left singular vectors, stored column-wise);

if m<n, MAT is overwritten with the first min(m,n) rows of V’ (the right singular vectors, stored row-wise).

S (OUTPUT) real(stnd), dimension(:)

The singular values of MAT.

The size of S must verify: size( S ) = min(m,n) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

U_VT (OUTPUT) real(stnd), dimension(:,:)

On exit:

if m>=n, U_VT contains the n-by-n orthogonal matrix V’;

if m<n, U_VT contains the m-by-m orthogonal matrix U.

The shape of U_VT must verify:

size( U_VT, 1 ) = size( U_VT, 2 ) = min(m,n).

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors are rearranged accordingly.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. MUL_SIZE can be increased or decreased to improve the performance of the SVD algorithm when 1.6 * n <= m. Maximum performance will be obtained when a real matrix of size MUL_SIZE * n and kind stnd fits in the cache of the processors.

The default value is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

USE_SVD2 (INPUT, OPTIONAL) logical(lgl)

If the optional argument USE_SVD2 is used and is set to true, an alternate SVD algorithm which used less workspace (but which may be slower) is automatically used if m is much larger than n or if n is much larger than m (e.g. if max(m,n)>=1.6 * min(m,n) ).

Further Details¶

Computing the SVD of a rectangular matrix in SVD_CMP2 consists of three steps:

reduction of the rectangular matrix to bidiagonal form via orthogonal transformations (e.g. Householder transformations);

in place accumulation of the orthogonal transformations used in the reduction to bidiagonal form;

computation of the SVD of the bidiagonal matrix.

For further details, on the SVD of a rectangular matrix and the algorithm to compute it, see the references (1) or (2).

All the three steps of the SVD algorithm (e.g. the reduction to bidiagonal form, accumulation of the Householder transformations used in the reduction to bidiagonal form and computation of the SVD of the bidiagonal matrix) are parallelized if OPENMP is used.

For more informations, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine svd_cmp ( mat, s, failure, sort, mul_size, maxiter, d, e, tauq, taup )`¶

Purpose¶

SVD_CMP computes the singular values of a real m-by-n matrix MAT.

The Singular Value Decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative.

The original matrix MAT is first reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal. The singular values SIGMA of the bidiagonal matrix BD are then computed.

The routine outputs SIGMA and optionally Q and P (in packed form), and BD for a given matrix MAT. SIGMA, Q, P and BD may then be used to obtain selected singular vectors with subroutines BD_INVITER or BD_INVITER2.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is destroyed and if TAUQ or TAUP are present MAT is overwritten as follows:

if m >= n, the elements on and below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors;

if m < n, the elements below the diagonal, with the array TAUQ, represent the orthogonal matrix Q as a product of elementary reflectors, and the elements on and above the diagonal, with the array TAUP, represent the orthogonal matrix P as a product of elementary reflectors.

See Further Details.

S (OUTPUT) real(stnd), dimension(:)

The singular values SIGMA of MAT.

The size of S must be min( size(MAT,1) , size(MAT,2) ).

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form BD of MAT.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. For good performance, at the expense of more workspace, a large value can be used.

The default is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form BD of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate bidiagonal matrix BD

The size of D must be min( size(MAT,1) , size(MAT,2) ).

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate bidiagonal matrix BD:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

The size of E must be min( size(MAT,1) , size(MAT,2) ).

TAUQ (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix Q. See Further Details.

The size of TAUQ must be min( size(MAT,1) , size(MAT,2) ).

TAUP (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The scalar factors of the elementary reflectors which represent the orthogonal matrix P. See Further Details.

The size of TAUP must be min( size(MAT,1) , size(MAT,2) ).

Further Details¶

The matrices Q and P are represented as products of elementary reflectors:

If m >= n,

Q = H(1) * H(2) * … * H(n) and P = G(1) * G(2) * … * G(n-1)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i-1) = 0 and v(1:i) = 0.

If TAUQ or TAUP are present :

u(i:m) is stored on exit in MAT(i:m,i);

v(i+1:n) is stored on exit in MAT(i,i+1:n).

If TAUQ is present : tauq is stored in TAUQ(i).

If TAUP is present : taup is stored in TAUP(i).

If m < n,

Q = H(1) * H(2) * … * H(m-1) and P = G(1) * G(2) * … * G(m)

Each H(i) and G(i) has the form:

H(i) = I + tauq * u * u’ and G(i) = I + taup * v * v’

where tauq and taup are real scalars, and u and v are real vectors. Moreover, u(1:i) = 0 and v(1:i-1) = 0.

If TAUQ or TAUP are present :

u(i+1:m) is stored on exit in MAT(i+1:m,i);

v(i:n) is stored on exit in MAT(i,i:n).

If TAUQ is present : tauq is stored in TAUQ(i).

If TAUP is present : taup is stored in TAUP(i).

The contents of MAT on exit, if TAUQ or TAUP are present, are illustrated by the following examples:

m = 6 and n = 5 (m >= n):

( u1 v1 v1 v1 v1 )

( u1 u2 v2 v2 v2 )

( u1 u2 u3 v3 v3 )

( u1 u2 u3 u4 v4 )

( u1 u2 u3 u4 u5 )

( u1 u2 u3 u4 u5 )

m = 5 and n = 6 (m < n):

( v1 v1 v1 v1 v1 v1 )

( u1 v2 v2 v2 v2 v2 )

( u1 u2 v3 v3 v3 v3 )

( u1 u2 u3 v4 v4 v4 )

( u1 u2 u3 u4 v5 v5 )

where ui denotes an element of the vector defining H(i), and vi an element of the vector defining G(i).

For further details, on the SVD of a rectangular matrix and the algorithm to compute it, see the references (1) or (2). In SVD_CMP subroutine, the reduction to bidiagonal form by orthogonal transformations is parallelized if OPENMP is used, but not the computation of the singular values.

For more informations, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine svd_cmp3 ( mat, s, failure, u_v, sort, maxiter, max_francis_steps, perfect_shift, failure_bd )`¶

Purpose¶

SVD_CMP3 computes the Singular Value Decomposition (SVD) of a real m-by-n matrix MAT. The SVD is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative. The columns of U and V are, respectively, the left and right singular vectors of MAT.

The routine returns the first min(m,n) singular values and the associated left and right singular vectors. The right singular vectors are returned row-wise if m<n.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit:

if m>=n, MAT is overwritten with the first n columns of U (the left singular vectors, stored column-wise);

if m<n, MAT is overwritten with the first m rows of V’ (the first m right singular vectors, stored row-wise);

S (OUTPUT) real(stnd), dimension(:)

The singular values of MAT.

The size of S must verify: size( S ) = min(m,n) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the bidiagonal SVD algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

U_V (OUTPUT) real(stnd), dimension(:,:)

On exit:

if m>=n, U_V contains the n-by-n orthogonal matrix V;

if m<n, U_V contains the m-by-m orthogonal matrix U.

The shape of U_V must verify:

size( U_V, 1 ) = size( U_V, 2 ) = min(m,n) .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. If this argument is not used the singular values are not sorted. The singular vectors are rearranged accordingly.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

FAILURE_BD (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE_BD = false : indicates that maximum accuracy was obtained in the Ralha-Barlow one-sided bidiagonalization of MAT.

FAILURE_BD = true : indicates that MAT is nearly singular and some loss of orthogonality can be expected in the Ralha-Barlow bidiagonalization algorithm.

Further Details¶

Computing the SVD of a rectangular matrix in SVD_CMP3 consists of three steps:

reduction of the rectangular matrix to bidiagonal form via the Ralha-Barlow one-sided bidiagonalization algorithm, see the references (1) and (2).

in place accumulation of the orthogonal transformations used in the reduction to bidiagonal form;

computation of the SVD of the bidiagonal matrix, see the references (3) and (4).

For further details, on the SVD of a rectangular matrix and the algorithms to compute it, see the references below.

The three steps of the SVD algorithm used here (e.g. the reduction to bidiagonal form, in place accumulation of the orthogonal transformations and computation of the SVD of the bidiagonal matrix) are parallelized if OPENMP is used.

For more details, see:

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine svd_cmp4 ( mat, s, failure, v, sort, maxiter, max_francis_steps, perfect_shift, sing_vec, gen_p, failure_bd, d, e )`¶

Purpose¶

SVD_CMP4 computes the Singular Value Decomposition (SVD) of a real m-by-n matrix MAT with m>=n. The SVD is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative. The columns of U and V are, respectively, the left and right singular vectors of MAT.

The routine returns the first n singular values and the associated left and right singular vectors.

Optionally, if the logical argument SING_VEC is used with the value false, the routine computes only the singular values and the orthogonal matrices Q and P used to reduce MAT to bidiagonal form B. This is useful for computing a partial SVD of the matrix MAT with subroutines BD_INVITER2 or BD_DEFLATE2 for example.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit:

if SING_VEC=true, MAT is overwritten with the first n columns of U (the left singular vectors, stored column-wise);

if SING_VEC=false, MAT is overwritten with the first n columns of Q (stored column-wise), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument MAT.

The shape of MAT must verify:

size( MAT, 1 ) >= size( MAT, 2 ) = n .

S (OUTPUT) real(stnd), dimension(:)

The singular values of MAT.

The size of S must verify: size( S ) = size( MAT, 2 ) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the bidiagonal SVD algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

V (OUTPUT) real(stnd), dimension(:,:)

On exit:

if SING_VEC=true, V is overwritten with the n-by-n orthogonal matrix V (the right singular vectors, stored column-wise);

if SING_VEC=false, V is overwritten with the n-by-n orthogonal matrix P (stored column-wise or in packed form), the orthogonal matrix used to reduce MAT to bidiagonal form as returned by subroutine BD_CMP2 in its argument P.

The shape of V must verify:

size( V, 1 ) = size( V, 2 ) = n .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. If this argument is not used the singular values are not sorted.

The singular vectors are rearranged accordingly if they are computed by the subroutine.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

On entry, this optional argument has an effect only if the optional argument SING_VEC has the value true.

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

On entry, this optional argument has an effect only if the optional argument SING_VEC has the value true.

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

SING_VEC (INPUT,OPTIONAL) logical(lgl)

On entry:

if SING_VEC=true, the routine computes the singular values and vectors of MAT.

If SING_VEC=false the routine computes only the singular values of MAT and the orthogonal matrices Q and P used to reduce MAT to upper bidiagonal form as returned by subroutine BD_CMP2. See the description of BD_CMP2 subroutine for more details.

The default is true.

GEN_P (INPUT,OPTIONAL) logical(lgl)

On entry, this optional argument has an effect only if the optional argument SING_VEC is also used with the value false.

In this case, if the optional argument GEN_P is used and is set to true, the orthogonal matrix P used to reduce MAT to bidiagonal form is generated on output of the subroutine in its argument V.

If this argument is set to false, the orthogonal matrix P is stored in factored form as products of elementary reflectors in the lower triangle of the array V. See the description of BD_CMP2 subroutine for more details.

The default is true.

FAILURE_BD (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE_BD = false : indicates that maximum accuracy was obtained in the Ralha-Barlow one-sided bidiagonalization of MAT.

FAILURE_BD = true : indicates that MAT is nearly singular and some loss of orthogonality can be expected in the Ralha-Barlow bidiagonalization algorithm.

D (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The diagonal elements of the intermediate upper bidiagonal matrix B.

The size of D must be size( D ) = size( MAT, 2 ) = n .

E (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The off-diagonal elements of the intermediate upper bidiagonal matrix B:

E(i) = B(i-1,i) for i = 2,3,…,n;

E(1) is arbitrary.

The size of E must be size( E ) = size( MAT, 2 ) = n .

Further Details¶

Computing the SVD of a rectangular matrix in SVD_CMP4 consists of three steps:

reduction of the rectangular matrix to bidiagonal form via the Ralha-Barlow one-sided bidiagonalization algorithm, see the references (1) and (2).

in place accumulation of the orthogonal transformations used in the reduction to bidiagonal form;

computation of the SVD of the bidiagonal matrix, see the references (3) and (4).

For further details, on the SVD of a rectangular matrix and the algorithms to compute it, see the references below.

The three steps of the SVD algorithm used here (e.g. the reduction to bidiagonal form, in place accumulation of the orthogonal transformations and computation of the SVD of the bidiagonal matrix) are parallelized if OPENMP is used.

Optionally, the intermediate bidiagonal decomposition of MAT can be output by the subroutine if the optional logical argument SING_VEC is used with the value false and the optional arguments D and E are specified.

For more details, see:

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine svd_cmp3 ( mat, s, failure, sort, maxiter, save_mat, failure_bd )`¶

Purpose¶

SVD_CMP3 computes the singular values of a real m-by-n matrix MAT. The singular value decomposition (SVD) is written:

MAT = U * SIGMA * V’

where SIGMA is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m orthogonal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of MAT; they are real and non-negative. The columns of U and V are, respectively, the left and right singular vectors of MAT.

The routine returns only the first min(m,n) singular values of MAT.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, the m-by-n matrix MAT is destroyed if m>=n and the optional argument SAVE_MAT is not used with the value true.

S (OUTPUT) real(stnd), dimension(:)

The singular values of MAT.

The size of S must verify: size( S ) = min(m,n) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the bidiagonal SVD algorithm did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. If this argument is not used the singular values are not sorted.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

SAVE_MAT (INPUT, OPTIONAL) logical(lgl)

On entry, if SAVE_MAT is set to true, the m-by-n matrix MAT is not modified by the routine.

The default is false.

FAILURE_BD (OUTPUT,OPTIONAL) logical(lgl)

On exit:

FAILURE_BD = false : indicates that maximum accuracy was obtained in the Ralha-Barlow one-sided bidiagonalization of MAT.

FAILURE_BD = true : indicates that MAT is nearly singular and some loss of orthogonality can be expected in the Ralha-Barlow bidiagonalization algorithm.

Further Details¶

Computing the singular values of a rectangular matrix in SVD_CMP3 consists of two steps:

reduction of the rectangular matrix to bidiagonal form via the Ralha-Barlow one-sided bidiagonalization algorithm, see the references (1) and (2).

computation of the singular values of the bidiagonal matrix, see the references (3) and (4).

For further details, on the SVD of a rectangular matrix and the algorithms to compute it, see the references below.

The first step of the SVD algorithm used here (e.g. the reduction to bidiagonal form) is parallelized if OPENMP is used.

For more details, see:

Barlow, J.L., Bosner, N., and Drmac, Z., 2005: A new stable bidiagonal

reduction algorithm. Linear Algebra Appl., No 397, pp. 35-84.

Bosner, N., and Barlow, J.L., 2007: Block and Parallel versions of one-sided

bidiagonalization. SIAM J. Matrix Anal. Appl., Volume 29, No 3, pp. 927-953.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`function maxdiag_gkinv_qr ( e, lambda )`¶

Purpose¶

This function computes the index of the element of maximum absolute value in the diagonal entries of

( GK - LAMBDA * I )**(-1)

where GK is a n-by-n symmetric tridiagonal matrix with a zero diagonal, I is the identity matrix and LAMBDA is a scalar.

Arguments¶

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the tridiagonal matrix.

LAMBDA (INPUT) real(stnd)

On entry, the eigenvalue or shift used in the QR factorization.

Further Details¶

The diagonal entries of ( GK - LAMBDA * I )**(-1) are computed by means of the QR factorization of ( GK - LAMBDA * I ). For the latter computation, the semiseparable structure of ( GK - LAMBDA * I )**(-1) is used, see the reference (1). Moreover, it is assumed that GK is unreduced, but no check is done in the subroutine to verify this assumption.

This subroutine is adapted from the pseudo-code trace_Tinv given in the reference (1).

For further details, see:

Bini, D.A., Gemignani, L., and Tisseur, F., 2005: The Ehrlich-Aberth method for the

nonsymmetric tridiagonal eigenvalue problem. SIAM J. Matrix Anal. Appl., 27, 153-175.

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

`function maxdiag_gkinv_ldu ( e, lambda )`¶

Purpose¶

This function computes the index of the element of maximum absolute value in the diagonal entries of

( GK - LAMBDA * I )**(-1)

where GK is a n-by-n symmetric tridiagonal matrix with a zero diagonal, I is the identity matrix and LAMBDA is a scalar.

Arguments¶

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the tridiagonal matrix.

LAMBDA (INPUT) real(stnd)

On entry, the eigenvalue or shift used.

Further Details¶

The diagonal entries of ( GK - LAMBDA * I )**(-1) are computed by means of two triangular factorizations of ( GK - LAMBDA * I ) of the forms L(+) * D(+) * U(+) and U(-) * D(-) * L(-) where L(+) and L(-) are unit lower bidiagonal, U(+) and U(-) are unit upper bidiagonal, and D(+) and D(-) are diagonal.

It is assumed that GK is unreduced, but no check is done in the subroutine to verify this assumption.

This subroutine is adapted from the references (1) and (2).

For further details, on Fernando’s method for computing eigenvectors of tridiagonal matrices, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

`subroutine gk_qr_cmp ( e, lambda, cs, sn, diag, sup1, sup2, maxdiag_gkinv )`¶

Purpose¶

GK_QR_CMP factorizes the symmetric matrix GK - LAMBDA * I, where GK is an n-by-n symmetric tridiagonal matrix with a zero diagonal, I is the identity matrix and LAMBDA is a scalar, as

GK - LAMBDA * I = Q * R

where Q is an orthogonal matrix represented as the product of n-1 Givens rotations and R is an upper triangular matrix with at most two non-zero super-diagonal elements per column.

The parameter LAMBDA is included in the routine so that GK_QR_CMP may be used to obtain eigenvectors of GK by inverse iteration.

The subroutine also computes the index of the entry of maximum absolute value in the diagonal of ( GK - LAMBDA * I )**(-1), which provides a good initial approximation to start the inverse iteration process for computing the eigenvector associated with the eigenvalue LAMBDA, see the references (1), (2) and (3) for further details.

Arguments¶

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the tridiagonal matrix.

LAMBDA (INPUT) real(stnd)

On entry, the eigenvalue or shift used in the QR factorization.

CS (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations for the QR factorization of GK - LAMBDA * I.

The size of CS must be size(CS) = size(E) = n - 1.

SN (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations for the QR factorization of GK - LAMBDA * I.

The size of SN must be size(SN) = size(E) = n - 1.

DIAG (OUTPUT) real(stnd), dimension(:)

On exit, DIAG(:) contains the n diagonal elements of the upper triangular matrix R of the QR factorization of GK - LAMBDA * I.

The size of DIAG must verify: size(DIAG) = size(E) + 1 = n .

SUP1 (OUTPUT) real(stnd), dimension(:)

On exit, SUP1(:n-1) contains the n-1 superdiagonal elements of the upper triangular matrix R of the QR factorization of GK - LAMBDA * I, SUP1(n) is arbitrary .

The size of SUP1 must verify: size(SUP1) = size(E) + 1 = n .

SUP2 (OUTPUT) real(stnd), dimension(:)

On exit, SUP2(:n-2) contains the n-2 second superdiagonal elements of the upper triangular matrix R of the QR factorization of GK - LAMBDA * I, SUP2(n-1:n) is arbitrary .

The size of SUP2 must verify: size(SUP2) = size(E) + 1 = n .

MAXDIAG_GKINV (OUPTPUT) integer(i4b)

On exit, MAXDIAG_GKINV is the index of the entry of maximum modulus in the main diagonal of ( GK - LAMBDA * I )**(-1).

Further Details¶

The QR factorization of ( GK - LAMBDA * I ) is obtained by means of n-1 unitary Givens rotations.

The diagonal entries of ( GK - LAMBDA * I )**(-1) are computed by means of this QR factorization of ( GK - LAMBDA * I ). For the latter computation, the semiseparable structure of ( GK - LAMBDA * I )**(-1) is used, see the reference (1). Moreover, it is assumed that GK is unreduced for computing the index of the entry of maximum absolute value in the diagonal of ( GK - LAMBDA * I )**(-1), but no check is done in the subroutine to verify this assumption.

For further details, see:

Bini, D.A., Gemignani, L., and Tisseur, F., 2005: The Ehrlich-Aberth method for the

nonsymmetric tridiagonal eigenvalue problem. SIAM J. Matrix Anal. Appl., 27, 153-175.

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

`subroutine bd_inviter ( upper, d, e, s, leftvec, rightvec, failure, maxiter, scaling, initvec )`¶

Purpose¶

BD_INVITER computes the left and right singular vectors of a real n-by-n bidiagonal matrix BD corresponding to a specified singular value, using Fernando’s method and inverse iteration on the tridiagonal Golub-Kahan (TGK) form of the bidiagonal matrix BD.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : BD is upper bidiagonal ;

UPPER = false : BD is lower bidiagonal.

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD. E(1) is arbitrary.

The size of E must verify: size( E ) = size( D ) = n .

S (INPUT) real(stnd)

On entry, the selected singular value of the bidiagonal matrix BD. The singular value must be positive or zero.

LEFTVEC (OUTPUT) real(stnd), dimension(:)

On exit, the computed left singular vector.

The shape of LEFTVEC must verify:

size(LEFTVEC) = size(D) = n .

RIGHTVEC (OUTPUT) real(stnd), dimension(:)

On exit, the computed right singular vector.

The shape of RIGHTVEC must verify:

size(RIGHTVEC) = size(D) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = FALSE : indicates successful exit,

FAILURE = TRUE : indicates that some singular vectors failed to converge in MAXITER iterations.

MAXITER (INPUT,OPTIONAL) integer(i4b)

The number of inverse iterations performed in the subroutine. By default, 2 inverse iterations are performed.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the bidiagonal matrix BD is scaled before computing the singular vector;

SCALING=false, the bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INITVEC (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INITVEC=true, a Fernando vector is used to start the inverse iteration process for computing the singular vectors of the bidiagonal matrix BD (e.g. the eigenvector of the associated tridiagonal Golub-Kahan matrix);

INITVEC=false, a random uniform starting vector is used.

The default is to use a Fernando starting vector if the Golub-Kahan form of the input bidiagonal matrix is unreduced, and a random uniform starting vector otherwise.

Further Details¶

A first estimate of the singular vectors is computed by the Fernando method applied to the tridiagonal Golub-Kahan matrix associated with the bidiagonal matrix BD (see the reference (1) for details) if this Golub-Kahan form of the input bidiagonal matrix is unreduced. Otherwise, a random start is used as a first estimate of the singular vectors as in the standard inverse-iteration algorithm.

The singular vectors are then computed or refined using inverse iteration on the tridiagonal Golub-Kahan matrix.

For further details, on Fernando’s method for computing eigenvectors of tridiagonal matrices or inverse iteration, see

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Bini, D.A., Gemignani, L., and Tisseur, F., 2005: The Ehrlich-Aberth method for the

nonsymmetric tridiagonal eigenvalue problem. SIAM J. Matrix Anal. Appl., 27, 153-175.

`subroutine bd_inviter ( upper, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

Purpose¶

BD_INVITER computes the left and right singular vectors of a real n-by-n bidiagonal matrix BD corresponding to specified singular values, using Fernando’s method and inverse iteration on the tridiagonal Golub-Kahan (TGK) form of the bidiagonal matrix BD.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : BD is upper bidiagonal ;

UPPER = false : BD is lower bidiagonal.

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD. E(1) is arbitrary.

The size of E must verify: size( E ) = size( D ) = n .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and must be positive or zero.

The size of S must verify: size(S) <= size(D) = n .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size(D) = n ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size(D) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = FALSE : indicates successful exit,

FAILURE = TRUE : indicates that some singular vectors failed to converge in MAXITER iterations.

MAXITER (INPUT,OPTIONAL) integer(i4b)

The number of inverse iterations performed in the subroutine. By default, 2 inverse iterations are performed for all the singular vectors.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry, if:

ORTHO=true, all the singular vectors are orthogonalized by the Modified Gram-Schmidt algorithm;

ORTHO=false, the singular vectors are not orthogonalized by the Modified Gram-Schmidt algorithm.

The default is to orthogonalize the singular vectors only for the singular values, which are not well-separated.

BACKWARD_SWEEP (INPUT,OPTIONAL) logical(lgl)

On entry, if:

BACKWARD_SWEEP=true and the singular vectors are orthogonalized by the modified Gram-Schmidt algorithm, a backward sweep of the modified Gram-Schmidt algorithm is also performed;

BACKWARD_SWEEP=false, a backward sweep is not performed.

The default is not to perform a backward sweep of the modified Gram-Schmidt algorithm.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the bidiagonal matrix BD is scaled before computing the singular vectors;

SCALING=false, the bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INITVEC (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INITVEC=true, Fernando vectors are used to start the inverse iteration process for computing the singular vectors of the bidiagonal matrix BD (e.g. the eigenvectors of the associated Golub-Kahan tridiagonal matrix);

INITVEC=false, random uniform starting vectors are used.

The default is to use Fernando starting vectors if the singular values are well-separated and the Golub-Kahan form of the input bidiagonal matrix is unreduced, and random uniform starting vectors otherwise.

Further Details¶

A first estimate of the singular vectors is computed by the Fernando method applied to the tridiagonal Golub-Kahan matrix associated with the bidiagonal matrix BD (see the reference (1) for details) for the singular values which are well-separated and if the Golub-Kahan form of the input bidiagonal matrix is unreduced. For the other singular values, a random start is used as a first estimate of the singular vectors as in the standard inverse-iteration algorithm.

The singular vectors are then computed or refined using inverse iteration on the tridiagonal Golub-Kahan matrix for all the singular values at one step.

By default, the singular vectors are then orthogonalized by the Modified Gram-Schmidt algorithm only if the singular values are not well-separated.

The computation of the singular vectors is parallelized if OPENMP is used.

BD_INVITER may fail if clusters of tiny singular values are present in parameter S.

For further details, on Fernando’s method for computing eigenvectors of tridiagonal matrices or inverse iteration, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed. The Johns Hopkins

University Press, Baltimore.

Bini, D.A., Gemignani, L., and Tisseur, F., 2005: The Ehrlich-Aberth method for the

nonsymmetric tridiagonal eigenvalue problem. SIAM J. Matrix Anal. Appl., 27, 153-175.

`subroutine bd_inviter2 ( mat, tauq, taup, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

Purpose¶

BD_INVITER2 computes the left and right singular vectors of a full real m-by-n matrix MAT corresponding to specified singular values, using inverse iteration.

It is required that the original matrix MAT has been reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal. This can be done with a call to BD_CMP with parameters TAUQ and TAUP, before calling BD_SVD for computing singular values and BD_INVITER2 for computing selected singular vectors. If m >= n, BD is upper bidiagonal and if m < n, BD is lower bidiagonal.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the original m-by-n matrix after reduction by BD_CMP. MAT must contains the vectors which define the elementary reflectors H(i) and G(i) whose products determine the matrices Q and P, as returned by BD_CMP. MAT must be specified as returned by BD_CMP and is not modified by the routine.

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i) which determines Q, as returned by BD_CMP in the array argument TAUQ.

The size of TAUQ must verify:

size( TAUQ ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i), which determines P, as returned by BD_CMP in its array argument TAUP.

The size of TAUP must verify:

size( TAUP ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD as returned by BD_CMP.

The size of D must verify:

size( D ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD as returned by BD_CMP:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

E(1) is arbitrary.

The size of E must verify:

size( E ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and are assumed to be positive or zero.

The size of S must verify:

size(S) <= min( size( MAT, 1 ), size( MAT, 2 ) ) .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size( MAT, 1 ) = m ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size( MAT, 2 ) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = FALSE : indicates successful exit,

FAILURE = TRUE : indicates that some singular vectors of BD failed to converge in MAXITER iterations.

MAXITER (INPUT,OPTIONAL) integer(i4b)

The number of inverse iterations performed in the subroutine.

By default, 2 inverse iterations are performed for all the singular vectors.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry, if:

ORTHO=true, all the singular vectors of the bidiagonal matrix BD are orthogonalized by the Modified Gram-Schmidt algorithm;

ORTHO=false, the singular vectors of the bidiagonal matrix BD are not orthogonalized by the Modified Gram-Schmidt algorithm.

The default is to orthogonalize the singular vectors only for the singular values, which are not well-separated.

BACKWARD_SWEEP (INPUT,OPTIONAL) logical(lgl)

On entry, if:

BACKWARD_SWEEP=true and the singular vectors of the bidiagonal matrix BD are orthogonalized by the modified Gram-Schmidt algorithm, a backward sweep of the modified Gram-Schmidt algorithm is also performed;

BACKWARD_SWEEP=false, a backward sweep is not performed.

The default is not to perform a backward sweep of the modified Gram-Schmidt algorithm.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the bidiagonal matrix BD is scaled before computing the singular vectors;

SCALING=false, the bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INITVEC (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INITVEC=true, Fernando vectors are used to start the inverse iteration process for computing the singular vectors of the bidiagonal matrix BD (e.g. the eigenvectors of the associated Golub-Kahan tridiagonal matrix);

INITVEC=false, random uniform starting vectors are used.

The default is to use Fernando starting vectors if the singular values are well-separated and the Golub-Kahan form of the input bidiagonal matrix is unreduced, and random uniform starting vectors otherwise.

Further Details¶

A first estimate of the singular vectors is computed by the Fernando method applied to the tridiagonal Golub-Kahan matrix associated with the bidiagonal matrix BD (see the reference (1) for details) for the singular values which are well-separated and if the Golub-Kahan form of the input bidiagonal matrix is unreduced. For the other singular values, a random start is used as a first estimate of the singular vectors as in the standard inverse-iteration algorithm.

The singular vectors of BD are then computed or refined using inverse iteration on the tridiagonal Golub-Kahan matrix for all the singular values at one step.

By default, the singular vectors of BD are then orthogonalized by the Modified Gram-Schmidt algorithm only if the singular values are not well-separated.

The singular vectors of MAT are finally computed by a blocked back-transformation algorithm.

The computation of the singular vectors of BD and the blocked back-transformation algorithm to find the singular vectors of MAT are parallelized if OPENMP is used.

BD_INVITER2 may fail if some singular values specified in parameter S are nearly identical for some pathological matrices.

For further details, on Fernando method for computing eigenvectors of tridiagonal matrices, the blocked back-transformation algorithm or inverse iteration, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. SIAM J. MATRIX ANAL. APPL, Vol. 18, 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed. The Johns Hopkins

University Press, Baltimore.

`subroutine bd_inviter2 ( mat, p, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

Purpose¶

BD_INVITER2 computes the left and right singular vectors of a full real m-by-n matrix MAT with m>=n corresponding to specified singular values, using inverse iteration.

It is required that the original matrix MAT has been reduced to upper bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal. This can be done with a call to BD_CMP2 (or a call to BD_CMP followed by a call to ORTHO_GEN_BD), before calling BD_SVD for computing singular values and BD_INVITER2 for computing selected singular vectors.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the m-by-n orthogonal matrix Q after reduction by BD_CMP2 or by BD_CMP and ORTHO_GEN_BD. MAT is not modified by the routine.

The shape of MAT must verify:

size( MAT, 1 ) >= size( MAT, 2 ) .

P (INPUT) real(stnd), dimension(:,:)

On entry, the n-by-n orthogonal matrix P after reduction by BD_CMP2 or by BD_CMP and ORTHO_GEN_BD. If P has been computed by BD_CMP2, P can be stored in factored form or not. Both cases are handled by the subroutine. P is not modified by the routine.

The shape of P must verify:

size( P, 1 ) = size( P, 2 ) = size( MAT, 2 ) .

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD as returned by BD_CMP or BD_CMP2.

The size of D must verify:

size( D ) = size( MAT, 2 ) .

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD as returned by BD_CMP or BD_CMP2:

E(i) = BD(i-1,i) for i = 2,3,…,n;

E(1) is arbitrary.

The size of E must verify:

size( E ) = size( MAT, 2 ) .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and are assumed to be positive or zero.

The size of S must verify:

size(S) <= size( MAT, 2 ) .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size( MAT, 1 ) = m ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size( MAT, 2 ) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = FALSE : indicates successful exit,

FAILURE = TRUE : indicates that some singular vectors of BD failed to converge in MAXITER iterations.

MAXITER (INPUT,OPTIONAL) integer(i4b)

The number of inverse iterations performed in the subroutine.

By default, 2 inverse iterations are performed for all the singular vectors.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry, if:

ORTHO=true, all the singular vectors of the bidiagonal matrix BD are orthogonalized by the Modified Gram-Schmidt algorithm;

ORTHO=false, the singular vectors of the bidiagonal matrix BD are not orthogonalized by the Modified Gram-Schmidt algorithm.

The default is to orthogonalize the singular vectors only for the singular values, which are not well-separated.

BACKWARD_SWEEP (INPUT,OPTIONAL) logical(lgl)

On entry, if:

BACKWARD_SWEEP=true and the singular vectors of the bidiagonal matrix BD are orthogonalized by the modified Gram-Schmidt algorithm, a backward sweep of the modified Gram-Schmidt algorithm is also performed;

BACKWARD_SWEEP=false, a backward sweep is not performed.

The default is not to perform a backward sweep of the modified Gram-Schmidt algorithm.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the bidiagonal matrix BD is scaled before computing the singular vectors;

SCALING=false, the bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INITVEC (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INITVEC=true, Fernando vectors are used to start the inverse iteration process for computing the singular vectors of the bidiagonal matrix BD (e.g. the eigenvectors of the associated Golub-Kahan tridiagonal matrix);

INITVEC=false, random uniform starting vectors are used.

The default is to use Fernando starting vectors if the singular values are well-separated and the Golub-Kahan form of the input bidiagonal matrix is unreduced, and random uniform starting vectors otherwise.

Further Details¶

A first estimate of the singular vectors is computed by the Fernando method applied to the tridiagonal Golub-Kahan matrix associated with the bidiagonal matrix BD (see the reference (1) for details) for the singular values which are well-separated and if the Golub-Kahan form of the input bidiagonal matrix is unreduced. For the other singular values, a random start is used as a first estimate of the singular vectors as in the standard inverse-iteration algorithm.

The singular vectors of BD are then computed or refined using inverse iteration on the tridiagonal Golub-Kahan matrix for all the singular values at one step.

By default, the singular vectors of BD are then orthogonalized by the Modified Gram-Schmidt algorithm only if the singular values are not well-separated.

The singular vectors of MAT are finally computed by a blocked back-transformation algorithm.

The computation of the singular vectors of BD and the blocked back-transformation algorithm to find the singular vectors of MAT are parallelized if OPENMP is used.

BD_INVITER2 may fail if some singular values specified in parameter S are nearly identical for some pathological matrices.

For further details, on Fernando method for computing eigenvectors of tridiagonal matrices, the blocked back-transformation algorithm or inverse iteration, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. SIAM J. MATRIX ANAL. APPL, Vol. 18, 1013-1034.

Parlett, B.N., and Dhillon, I.S., 1997: Fernando’s solution to Wilkinson’s problem:

An application of double factorization. Linear Algebra and its Appl., 267, pp.247-279.

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed. The Johns Hopkins

University Press, Baltimore.

`subroutine upper_bd_dsqd2 ( q2, e2, shift, flip, d )`¶

Purpose¶

UPPER_BD_DSQD2 computes:

the L * D * L’ factorization of the matrix BD’ * BD - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD * BD’ - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the stationary QD algorithm of Rutishauser is used to compute the factorization from the squared elements of the bidiagonal matrix BD (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization.

Arguments¶

Q2 (INPUT) real(stnd), dimension(:)

On entry, Q2 contains the squared diagonal elements of the bidiagonal matrix BD.

E2 (INPUT) real(stnd), dimension(:)

On entry, the n-1 squared off-diagonal elements of the bidiagonal matrix BD.

The size of E2 must be size(E2) = size(Q2) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD’ * BD - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD * BD’ - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(Q2).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dpqd2 ( q2, e2, shift, flip, d )`¶

Purpose¶

UPPER_BD_DPQD2 computes:

the L * D * L’ factorization of the matrix BD * BD’ - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD’ * BD - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the progressive QD algorithm of Rutishauser is used to compute the factorization from the squared elements of the bidiagonal matrix BD (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization.

Arguments¶

Q2 (INPUT) real(stnd), dimension(:)

On entry, Q2 contains the squared diagonal elements of the bidiagonal matrix BD.

E2 (INPUT) real(stnd), dimension(:)

On entry, the n-1 squared off-diagonal elements of the bidiagonal matrix BD.

The size of E2 must be size(E2) = size(Q2) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD * BD’ - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD’ * BD - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(Q2).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dsqd2 ( q2, e2, shift, flip, d, t )`¶

Purpose¶

UPPER_BD_DSQD2 computes:

the L * D * L’ factorization of the matrix BD’ * BD - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD * BD’ - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the stationary QD algorithm of Rutishauser is used to compute the factorization from the squared elements of the bidiagonal matrix BD (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization and the auxiliary variable T in the differential form of the stationary QD algorithm.

Arguments¶

Q2 (INPUT) real(stnd), dimension(:)

On entry, Q2 contains the squared diagonal elements of the bidiagonal matrix BD.

E2 (INPUT) real(stnd), dimension(:)

On entry, the n-1 squared off-diagonal elements of the bidiagonal matrix BD.

The size of E2 must be size(E2) = size(Q2) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD’ * BD - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD * BD’ - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(Q2).

T (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values T(i) in the differential form of the stationary QD algorithm.

The size of T must be size(T) = size(D) = size(Q2).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dpqd2 ( q2, e2, shift, flip, d, s )`¶

Purpose¶

UPPER_BD_DPQD2 computes:

the L * D * L’ factorization of the matrix BD * BD’ - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD’ * BD - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the progressive QD algorithm of Rutishauser is used to compute the factorization from the squared elements of the bidiagonal matrix BD (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization and the auxiliary variable S in the differential form of the progressive QD algorithm.

Arguments¶

Q2 (INPUT) real(stnd), dimension(:)

On entry, Q2 contains the squared diagonal elements of the bidiagonal matrix BD.

E2 (INPUT) real(stnd), dimension(:)

On entry, the n-1 squared off-diagonal elements of the bidiagonal matrix BD.

The size of E2 must be size(E2) = size(Q2) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD * BD’ - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD’ * BD - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(Q2).

S (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values S(i) in the differential form of the progressive QD algorithm.

The size of S must be size(S) = size(D) = size(Q2).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dsqd ( a, b, shift, flip, d )`¶

Purpose¶

UPPER_BD_DSQD computes:

the L * D * L’ factorization of the matrix BD’ * BD - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD * BD’ - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the stationary QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD’ * BD - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD * BD’ - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dpqd ( a, b, shift, flip, d )`¶

Purpose¶

UPPER_BD_DPQD computes:

the L * D * L’ factorization of the matrix BD * BD’ - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD’ * BD - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the progressive QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD * BD’ - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD’ * BD - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dsqd ( a, b, shift, flip, d, t )`¶

Purpose¶

UPPER_BD_DSQD computes:

the L * D * L’ factorization of the matrix BD’ * BD - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD * BD’ - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the stationary QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization and the auxiliary variable T in the differential form of the stationary QD algorithm.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD’ * BD - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD * BD’ - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

T (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values T(i) in the differential form of the stationary QD algorithm.

The size of T must be size(T) = size(D) = size(A).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dpqd ( a, b, shift, flip, d, s )`¶

Purpose¶

UPPER_BD_DPQD computes:

the L * D * L’ factorization of the matrix BD * BD’ - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD’ * BD - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the progressive QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization and the auxiliary variable S in the differential form of the progressive QD algorithm.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD * BD’ - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD’ * BD - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

S (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values S(i) in the differential form of the progressive QD algorithm.

The size of S must be size(S) = size(D) = size(A).

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dsqd ( a, b, shift, flip, d, t, l )`¶

Purpose¶

UPPER_BD_DSQD computes:

the L * D * L’ factorization of the matrix BD’ * BD - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD * BD’ - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the stationary QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization, the off-diagonal entries of L (or of U if FLIP=true) and the auxiliary variable T in the differential form of the stationary QD algorithm.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD’ * BD - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD * BD’ - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

T (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values T(i) in the differential form of the stationary QD algorithm.

The size of T must be size(T) = size(D) = size(A).

L (OUTPUT) real(stnd), dimension(:)

On exit, the off-diagonal entries of L if FLIP=false or the off-diagonal entries of U if FLIP=true.

The size of L must be size(L) = size(B) = size(A) - 1.

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine upper_bd_dpqd ( a, b, shift, flip, d, s, l )`¶

Purpose¶

UPPER_BD_DPQD computes:

the L * D * L’ factorization of the matrix BD * BD’ - shift * I , if FLIP=false;

the U * D * U’ factorization of the matrix BD’ * BD - shift * I , if FLIP=true;

for a n-by-n (upper) bidiagonal matrix BD and a given shift. L and U are, respectively, unit lower and unit upper bidiagonal matrices and D is a diagonal matrix.

The differential form of the progressive QD algorithm of Rutishauser is used to compute the factorization (see the reference (1) below for further details).

The subroutine outputs the diagonal matrix D of the factorization, the off-diagonal entries of L (or of U if FLIP=true) and the auxiliary variable S in the differential form of the progressive QD algorithm.

Arguments¶

A (INPUT) real(stnd), dimension(:)

On entry, A contains the diagonal elements of the bidiagonal matrix BD.

B (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of B must be size(B) = size(A) - 1.

SHIFT (INPUT) real(stnd)

On entry, the shift.

FLIP (INPUT) logical(lgl)

On entry, if FLIP=false the L * D * L’ factorization of the matrix BD * BD’ - shift * I is computed. Otherwise, if FLIP=true the U * D * U’ factorization of the matrix BD’ * BD - shift * I is computed.

D (OUTPUT) real(stnd), dimension(:)

On exit, the elements of the diagonal matrix D.

The size of D must be size(D) = size(A).

S (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the auxiliary values S(i) in the differential form of the progressive QD algorithm.

The size of S must be size(S) = size(D) = size(A).

L (OUTPUT) real(stnd), dimension(:)

On exit, the off-diagonal entries of L if FLIP=false or the off-diagonal entries of U if FLIP=true.

The size of L must be size(L) = size(B) = size(A) - 1.

Further Details¶

The bidiagonal matrix BD must be scaled appropriately before using this subroutine in order to avoid overflows (see the reference (1) below for further details).

This subroutine is adapted from the algorithms given in reference (1). See:

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine dflgen_bd ( d, e, lambda, cs_left, sn_left, cs_right, sn_right, scaling )`¶

Purpose¶

DFLGEN_BD computes deflation parameters (e.g. two chains of Givens rotations) for a n-by-n (upper) bidiagonal matrix BD and a given singular value of BD.

On output, the arguments CS_LEFT, SN_LEFT, CS_RIGHT and SN_RIGHT contain, respectively, the vectors of the cosines and sines coefficients of the chain of n-1 planar rotations that deflates the real n-by-n bidiagonal matrix BD corresponding to a singular value LAMBDA.

Arguments¶

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

The size of E must be size(E) = size(D) - 1.

LAMBDA (INPUT) real(stnd)

On entry, a singular value of the bidiagonal matrix BD.

CS_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of CS_LEFT must be size(CS_LEFT) = size(E) = size(D) - 1.

SN_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of SN_LEFT must be size(SN_LEFT) = size(E) = size(D) - 1.

CS_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of CS_RIGHT must be size(CS_RIGHT) = size(E) = size(D) - 1.

SN_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of SN_RIGHT must be size(SN_RIGHT) = size(E) = size(D) - 1.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows.

The default is to scale the bidiagonal matrix.

Further Details¶

This subroutine is adapted from the matlab routine DFLGEN in the reference (1) and algorithms given in reference (2).

For further details, see:

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine dflgen2_bd ( d, e, lambda, cs_left, sn_left, cs_right, sn_right, deflate, scaling )`¶

Purpose¶

DFLGEN2_BD computes and applies deflation parameters (e.g. two chains of Givens rotations) for a n-by-n (upper) bidiagonal matrix BD and a given singular value of BD.

On input:

The arguments D and E contain, respectively, the main diagonal and off-diagonal of the bidiagonal matrix, and the argument LAMBDA contains an estimate of the singular value.

On output:

The arguments D and E contain, respectively, the new main diagonal and off-diagonal of the deflated bidiagonal matrix if DEFLATE is set to true, otherwise D and E are not changed.

The arguments CS_LEFT, SN_LEFT, CS_RIGHT and SN_RIGHT contain, respectively, the vectors of the cosines and sines coefficients of the chain of n-1 planar rotations that deflates the real n-by-n bidiagonal matrix BD corresponding to the singular value LAMBDA. One chain is applied to the left of BD (CS_LEFT, SN_LEFT) and the other is applied to the right of BD (CS_RIGHT, SN_RIGHT).

Arguments¶

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

On exit, the new main diagonal of the bidiagonal matrix if DEFLATE=true. Otherwise, D is not changed.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

On exit, the new off-diagonal of the bidiagonal matrix if DEFLATE=true. Otherwise, E is not changed.

The size of E must be size(E) = size(D) - 1.

LAMBDA (INPUT) real(stnd)

On entry, a singular value of the bidiagonal matrix BD.

CS_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of CS_LEFT must be size(CS_LEFT) = size(E) = size(D) - 1.

SN_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of SN_LEFT must be size(SN_LEFT) = size(E) = size(D) - 1.

CS_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of CS_RIGHT must be size(CS_RIGHT) = size(E) = size(D) - 1.

SN_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of SN_RIGHT must be size(SN_RIGHT) = size(E) = size(D) - 1.

DEFLATE (OUTPUT) logical(lgl)

On exit:

DEFLATE = true : indicates successful exit.

DEFLATE = false: indicates that full accuracy was not attained in the deflation of the bidiagonal matrix.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows.

The default is to scale the bidiagonal matrix.

Further Details¶

This subroutine is adapted from the matlab routine DFLGEN in the reference (1) and algorithms given in reference (2).

For further details, see:

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Fernando, K.V., 1998: Accurately counting singular values of bidiagonal matrices and

eigenvalues of skew-symmetric tridiagonal matrices. SIAM J. Matrix Anal. Appl., Vol. 20, no 2, pp.373-399.

`subroutine dflapp_bd ( d, e, cs_left, sn_left, cs_right, sn_right, deflate )`¶

Purpose¶

DFLAPP_BD deflates a real n-by-n (upper) bidiagonal matrix BD by two chains of planar rotations produced by DFLGEN_BD or DFLGEN2_BD.

On entry, the arguments D and E contain, respectively, the main diagonal and off-diagonal of the bidiagonal matrix.

On output, the arguments D and E contain, respectively, the new main diagonal and off-diagonal of the deflated bidiagonal matrix if DEFLATE is set to true.

Arguments¶

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

On exit, the new main diagonal of the bidiagonal matrix if DEFLATE=true. Otherwise, D is not changed.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the n-1 off-diagonal elements of the bidiagonal matrix BD.

On exit, the new off-diagonal of the bidiagonal matrix if DEFLATE=true. Otherwise, E is not changed.

The size of E must be size(E) = size(D) - 1.

CS_LEFT (INPUT) real(stnd), dimension(:)

On entry, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of CS_LEFT must be size(CS_LEFT) = size(E) = size(D) - 1.

SN_LEFT (INPUT) real(stnd), dimension(:)

On entry, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the left.

The size of SN_LEFT must be size(SN_LEFT) = size(E) = size(D) - 1.

CS_RIGHT (INPUT) real(stnd), dimension(:)

On entry, the vector of the cosines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of CS_RIGHT must be size(CS_RIGHT) = size(E) = size(D) - 1.

SN_RIGHT (INPUT) real(stnd), dimension(:)

On entry, the vector of the sines coefficients of the chain of n-1 Givens rotations that deflates the bidiagonal matrix BD on the right.

The size of SN_RIGHT must be size(SN_RIGHT) = size(E) = size(D) - 1.

DEFLATE (OUTPUT) logical(lgl)

On exit:

DEFLATE = true : indicates successful exit.

DEFLATE = false: indicates that full accuracy was not attained in the deflation of the bidiagonal matrix.

Further Details¶

For further details, see:

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Dhillon, I.S., 1998: Reliable computation of the condition number of a

tridiagonal matrix in O(n) time. SIAM J. MATRIX ANAL. APPL, Vol. 19, 776-796.

`subroutine qrstep_bd ( d, e, lambda, cs_left, sn_left, cs_right, sn_right, deflate, update_bd )`¶

Purpose¶

QRSTEP_BD performs one QR step with a given shift LAMBDA on a n-by-n real (upper) bidiagonal matrix BD.

On entry, the arguments D and E contain, respectively, the main diagonal and superdiagonal of the bidiagonal matrix.

On output, the arguments D and E contain, respectively, the new main diagonal and superdiagonal of the updated (e.g. deflated) bidiagonal matrix, if DEFLATE is set to true or if the optional logical argument UPDATE_BD is used with the value true, otherwise they are not changed.

The two chains of n-1 planar rotations produced during the QR step with shift LAMBDA are saved in the arguments CS_LEFT, SN_LEFT, CS_RIGHT, SN_RIGHT.

Arguments¶

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

On exit, the new main diagonal of the bidiagonal matrix if DEFLATE=true or if UPDATE_BD=true. Otherwise, D is not changed.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the n-1 superdiagonal elements of the bidiagonal matrix BD.

On exit, the new superdiagonal of the bidiagonal matrix if DEFLATE=true or if UPDATE_BD=true. Otherwise, E is not changed.

The size of E must be size(E) = size(D) - 1.

LAMBDA (INPUT) real(stnd)

On entry, the shift used in the current QR step.

CS_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the left in the current QR step.

The size of CS_LEFT must be size(CS_LEFT) = size(E) = size(D) - 1.

SN_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the left in the current QR step.

The size of SN_LEFT must be size(SN_LEFT) = size(E) = size(D) - 1.

CS_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the right in the current QR step.

The size of CS_RIGHT must be size(CS_RIGHT) = size(E) = size(D) - 1.

SN_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the right in the current QR step.

The size of SN_RIGHT must be size(SN_RIGHT) = size(E) = size(D) - 1.

DEFLATE (OUTPUT) logical(lgl)

On exit:

DEFLATE = true : indicates that deflation occured at the end of the step.

DEFLATE = false: indicates that the last superdiagonal element of the bidiagonal matrix is not small.

UPDATE_BD (INPUT,OPTIONAL) logical(lgl)

On entry:

UPDATE_BD = true : indicates that the bidiagonal matrix will be updated on exit.

UPDATE_BD = false: indicates that the bidiagonal matrix will be updated on exit only if DEFLATE = true.

The default value for UPDATE_BD is false.

Further Details¶

This subroutine is adapted from the matlab routine QRSTEP given in the reference (1). The bidiagonal matrix BD is assumed to be unreduced, but no checks are done in the subroutine to verify this hypothesis.

For further details, see:

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

Demmel, J.W., and Kahan, W., 1990: Accurate singular values

of bidiagonal matrices. SIAM J. Sci. Statist. Comput., 11:5, 873-912.

`subroutine qrstep_zero_bd ( d, e, cs_left, sn_left, cs_right, sn_right, deflate, update_bd )`¶

Purpose¶

QRSTEP_ZERO_BD performs one implicit QR step with a zero shift on a n-by-n real (upper) bidiagonal matrix BD.

On entry, the arguments D and E contain, respectively, the main diagonal and superdiagonal of the bidiagonal matrix.

On output, the arguments D and E contain, respectively, the new main diagonal and superdiagonal of the updated (e.g. deflated) bidiagonal matrix, if DEFLATE is set to true or if the optional logical argument UPDATE_BD is used with the value true, otherwise they are not changed.

The two chains of n-1 planar rotations produced during the QR step with zero shift are saved in the arguments CS_LEFT, SN_LEFT, CS_RIGHT, SN_RIGHT.

Arguments¶

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

On exit, the new main diagonal of the bidiagonal matrix if DEFLATE=true or if UPDATE_BD=true. Otherwise, D is not changed.

E (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the n-1 superdiagonal elements of the bidiagonal matrix BD.

On exit, the new superdiagonal of the bidiagonal matrix if DEFLATE=true or if UPDATE_BD=true. Otherwise, E is not changed.

The size of E must be size(E) = size(D) - 1.

CS_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the left in the current QR step.

The size of CS_LEFT must be size(CS_LEFT) = size(E) = size(D) - 1.

SN_LEFT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the left in the current QR step.

The size of SN_LEFT must be size(SN_LEFT) = size(E) = size(D) - 1.

CS_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the cosines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the right in the current QR step.

The size of CS_RIGHT must be size(CS_RIGHT) = size(E) = size(D) - 1.

SN_RIGHT (OUTPUT) real(stnd), dimension(:)

On exit, the vector of the sines coefficients of the chain of n-1 Givens rotations applied to the bidiagonal matrix BD on the right in the current QR step.

The size of SN_RIGHT must be size(SN_RIGHT) = size(E) = size(D) - 1.

DEFLATE (OUTPUT) logical(lgl)

On exit:

DEFLATE = true : indicates that deflation occured at the end of the step.

DEFLATE = false: indicates that the last superdiagonal element of the bidiagonal matrix is not small.

UPDATE_BD (INPUT,OPTIONAL) logical(lgl)

On entry:

UPDATE_BD = true : indicates that the bidiagonal matrix will be updated on exit.

UPDATE_BD = false: indicates that the bidiagonal matrix will be updated on exit only if DEFLATE = true.

The default value for UPDATE_BD is false.

Further Details¶

This subroutine is adapted from the implicit zero-shift QR algorithm given in the reference (1).

For further details, see:

Demmel, J.W., and Kahan, W., 1990: Accurate singular values

of bidiagonal matrices. SIAM J. Sci. Statist. Comput., 11:5, 873-912.

`subroutine upper_bd_deflate ( d, e, singval, leftvec, rightvec, failure, max_qr_steps, scaling )`¶

Purpose¶

UPPER_BD_DEFLATE computes the left and right singular vectors of a real (upper) bidiagonal matrix BD corresponding to a specified singular value, using a deflation technique.

Arguments¶

D (INPUT) real(stnd), dimension(:)

On entry, the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 superdiagonal elements of the bidiagonal matrix BD.

The size of E must be size(E) = size(D) - 1 = n - 1.

SINGVAL (INPUT) real(stnd)

On entry, a singular value of the bidiagonal matrix. SINGVAL is assumed to be positive or zero.

LEFTVEC (OUTPUT) real(stnd), dimension(:)

On exit, the computed left singular vector associated with the singular value SINGVAL.

The shape of LEFTVEC must verify:

size(LEFTVEC) = size(D) = n .

RIGHTVEC (OUTPUT) real(stnd), dimension(:)

On exit, the computed right singular vector associated with the singular value SINGVAL.

The shape of RIGHTVEC must verify:

size(RIGHTVEC) = size(D) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the deflation procedure of the bidiagonal matrix.

MAX_QR_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_QR_STEPS controls the maximum number of QR sweeps for deflating the bidiagonal matrix for a given singular value.

The algorithm fails to converge if the total number of QR sweeps exceeds MAX_QR_STEPS.

The default is 4.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows.

The default is to scale the bidiagonal matrix.

Further Details¶

UPPER_BD_DEFLATE is a low-level subroutine used by BD_DEFLATE subroutines. Its use as a stand-alone method for computing singular vectors of a bidiagonal matrix is not recommended.

Note also that the sign of the singular vectors computed by this subroutine is arbitrary and not necessarily consistent between the left and right singular vectors. In order to compute consistent singular triplets, subroutine BD_DEFLATE must be used instead.

For further details, see:

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

Demmel, J.W., and Kahan, W., 1990: Accurate singular values

of bidiagonal matrices. SIAM J. Sci. Statist. Comput., 11:5, 873-912.

`subroutine upper_bd_deflate ( d, e, singval, leftvec, rightvec, failure, max_qr_steps, scaling )`¶

Purpose¶

UPPER_BD_DEFLATE computes the left and right singular vectors of a real (upper) bidiagonal matrix BD corresponding to specified singular values, using a deflation technique.

Arguments¶

D (INPUT) real(stnd), dimension(:)

On entry, the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, the n-1 superdiagonal elements of the bidiagonal matrix BD.

The size of E must be size(E) = size(D) - 1 = n - 1.

SINGVAL (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix. The singular values can be given in any order, but are assumed to be positive or zero.

The size of SINGVAL must verify: size(SINGVAL) <= size(D) = n .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value SINGVAL(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size(D) = n ,

size(LEFTVEC,2) = size(SINGVAL) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value SINGVAL(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size(D) = n ,

size(RIGHTVEC,2) = size(SINGVAL) .

FAILURE (OUTPUT) logical(lgl), dimension(:)

On exit:

FAILURE(j) = FALSE : indicates successful exit for the jth singular triplet.

FAILURE(j) = TRUE : indicates that the algorithm did not converge and full accuracy was not attained in the deflation procedure of the bidiagonal matrix for the jth singular triplet.

The size of FAILURE must verify: size(FAILURE) = size(SINGVAL) .

MAX_QR_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_QR_STEPS controls the maximum number of QR sweeps for deflating the bidiagonal matrix for a given singular value. The algorithm fails to converge if the total number of QR sweeps for all eigenvalues exceeds MAX_QR_STEPS * size(EIGVAL).

The default is 4.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if SCALING=true the bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows.

The default is to scale the bidiagonal matrix.

Further Details¶

UPPER_BD_DEFLATE is a low-level subroutine used by BD_DEFLATE subroutines. Its use as a stand-alone method for computing singular vectors of a bidiagonal matrix is not recommended.

Note also that the sign of the singular vectors computed by this subroutine is arbitrary and not necessarily consistent between the left and right singular vectors. In order to compute consistent singular triplets, subroutine BD_DEFLATE must be used instead.

For further details, see:

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

Demmel, J.W., and Kahan, W., 1990: Accurate singular values

of bidiagonal matrices. SIAM J. Sci. Statist. Comput., 11:5, 873-912.

`subroutine bd_deflate ( upper, d, e, s, leftvec, rightvec, failure, max_qr_steps, ortho, scaling, inviter )`¶

Purpose¶

BD_DEFLATE computes the left and right singular vectors of a real n-by-n bidiagonal matrix BD corresponding to specified singular values, using deflation techniques on the bidiagonal matrix BD.

Arguments¶

UPPER (INPUT) logical(lgl)

On entry, if:

UPPER = true : BD is upper bidiagonal ;

UPPER = false : BD is lower bidiagonal.

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD.

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD. E(1) is arbitrary.

The size of E must verify: size( E ) = size( D ) = n .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and are assumed to be positive or zero.

The size of S must verify: size(S) <= size(D) = n .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size(D) = n ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size(D) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the deflation procedure of the bidiagonal matrix.

MAX_QR_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_QR_STEPS controls the maximum number of QR sweeps for deflating the bidiagonal matrix for a given singular value. The algorithm fails to converge if the total number of QR sweeps for all singular values exceeds MAX_QR_STEPS * size(S).

The default is 4.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry, if:

ORTHO=true, the bidiagonal matrix BD is deflated sequentially for all the specified singular values; this implies that the singular vectors of the bidiagonal matrix BD will be automatically orthogonal on exit.

ORTHO=false, the bidiagonal matrix BD is deflated in parallel for the different clusters of singular values or isolated singular values; this implies that orthogonality of the singular vectors of bidiagonal matrix BD is preserved inside each cluster, but not automatically between clusters.

The default is ORTHO=false.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows;

SCALING=false, the bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INVITER (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INVITER=true, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by inverse iteration instead of deflation.

INVITER=false, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by deflation.

The default is INVITER=true.

Further Details¶

The singular vectors are computed using deflation techniques applied to the bidiagonal matrix BD. The first deflation technique used in BD_DEFLATE combines an extension to bidiagonal matrices of Fernando’s approach for computing eigenvectors of tridiagonal matrices with a deflation procedure by Givens rotations originally developed by Godunov and his collaborators (see references (1) and (2) for more details). If this deflation technique failed, QR iterations are used instead as described in (3) and (4).

Optionally, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros may be also computed by inverse iteration on the Golub-Kahan tridiagonal form of the bidiagonal matrix BD. This is the default since in these cases inverse iteration is safer and faster than the deflation algorithms.

The computation of the singular vectors is parallelized if OPENMP is used.

It is essential that singular values given on entry of BD_DEFLATE are computed to high relative accuracy. Subroutines BD_SINGVAL or BD_SINVAL2 may be used for this purpose.

BD_DEFLATE may fail if some the singular values specified in parameter S are nearly identical or for clusters of small singular values.

For further details, on the deflation techniques used in BD_DEFLATE, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

Demmel, J.W., and Kahan, W., 1990: Accurate singular values

of bidiagonal matrices. SIAM J. Sci. Statist. Comput., 11:5, 873-912.

`subroutine bd_deflate2 ( mat, tauq, taup, d, e, s, leftvec, rightvec, failure, max_qr_steps, ortho, scaling, inviter )`¶

Purpose¶

BD_DEFLATE2 computes the left and right singular vectors of a full real m-by-n matrix MAT corresponding to specified singular values, using deflation techniques.

It is required that the original matrix MAT has been reduced to upper or lower bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal. This can be done with a call to BD_CMP with parameters TAUQ and TAUP, before calling BD_SINGVAL (or BD_SINGVAL2) for computing singular values and a call to BD_DEFLATE2 for computing selected singular vectors.

If m >= n, BD is upper bidiagonal and if m < n, BD is lower bidiagonal.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the original m-by-n matrix after reduction by BD_CMP. MAT must contains the vectors which define the elementary reflectors H(i) and G(i) whose products determine the matrices Q and P, as returned by BD_CMP. MAT must be specified as returned by BD_CMP and is not modified by the routine.

TAUQ (INPUT) real(stnd), dimension(:)

TAUQ(i) must contain the scalar factor of the elementary reflector H(i) which determines Q, as returned by BD_CMP in the array argument TAUQ.

The size of TAUQ must verify:

size( TAUQ ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

TAUP (INPUT) real(stnd), dimension(:)

TAUP(i) must contain the scalar factor of the elementary reflector G(i), which determines P, as returned by BD_CMP in its array argument TAUP.

The size of TAUP must verify:

size( TAUP ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD as returned by BD_CMP.

The size of D must verify:

size( D ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD as returned by BD_CMP:

if m >= n, E(i) = BD(i-1,i) for i = 2,3,…,n;

if m < n, E(i) = BD(i,i-1) for i = 2,3,…,m.

E(1) is arbitrary.

The size of E must verify:

size( E ) = min( size( MAT, 1 ), size( MAT, 2 ) ) .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and are assumed to be positive or zero.

The size of S must verify:

size(S) <= min( size( MAT, 1 ), size( MAT, 2 ) ) .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size( MAT, 1 ) = m ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size( MAT, 2 ) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the deflation procedure of the bidiagonal matrix BD.

MAX_QR_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_QR_STEPS controls the maximum number of QR sweeps for deflating the bidiagonal matrix BD for a given singular value. The algorithm fails to converge if the total number of QR sweeps for all singular values exceeds MAX_QR_STEPS * size(S).

The default is 4.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry:

ORTHO=true, the bidiagonal matrix BD is deflated sequentially for all the specified singular values; this implies that the singular vectors of the bidiagonal matrix BD will be automatically orthogonal on exit.

ORTHO=false, the bidiagonal matrix BD is deflated in parallel for the different clusters of singular values or isolated singular values; this implies that orthogonality of the singular vectors of bidiagonal matrix BD is preserved inside each cluster, but not automatically between clusters.

The default is ORTHO=false.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the intermediate bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows;

SCALING=false, the intermediate bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INVITER (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INVITER=true, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by inverse iteration instead of deflation.

INVITER=false, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by deflation.

The default is INVITER=true.

Further Details¶

The singular vectors are computed using deflation techniques applied implicitly to the associated tridiagonal forms BD’ * BD and BD * BD’ of the bidiagonal matrix BD. See description of the BD_DEFLATE subroutine for more details.

The computation of the singular vectors is parallelized if OPENMP is used.

It is essential that singular values given on entry of BD_DEFLATE2 are computed to high (relative) accuracy. Subroutines BD_SINGVAL or BD_SINVAL2 may be used for this purpose.

BD_DEFLATE2 may fail if some the singular values specified in parameter S are nearly identical or for clusters of small singular values for some pathological matrices.

The deflation algorithms used in BD_DEFLATE2 are competitive with the inverse iteration procedure implemented in BD_INVITER2.

For further details, on the deflation techniques used in BD_DEFLATE2, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

`subroutine bd_deflate2 ( mat, p, d, e, s, leftvec, rightvec, failure, max_qr_steps, ortho, scaling, inviter )`¶

Purpose¶

BD_DEFLATE2 computes the left and right singular vectors of a full real m-by-n matrix MAT with m>=n corresponding to specified singular values, using deflation techniques.

It is required that the original matrix MAT has been reduced to upper bidiagonal form BD by an orthogonal transformation:

Q’ * MAT * P = BD

where Q and P are orthogonal. This can be done with a call to BD_CMP2 (or a call to BD_CMP followed by a call to ORTHO_GEN_BD), before calling BD_SINGVAL (or BD_SINGVAL2) for computing singular values and a call to BD_DEFLATE2 for computing selected singular vectors.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the m-by-n orthogonal matrix Q after reduction by BD_CMP2 or by BD_CMP and ORTHO_GEN_BD. MAT is not modified by the routine.

The shape of MAT must verify:

size( MAT, 1 ) >= size( MAT, 2 ) .

P (INPUT) real(stnd), dimension(:,:)

On entry, the n-by-n orthogonal matrix P after reduction by BD_CMP2 or by BD_CMP and ORTHO_GEN_BD. If P has been computed by BD_CMP2, P can be stored in factored form or not. Both cases are handled by the subroutine. P is not modified by the routine.

The shape of P must verify:

size( P, 1 ) = size( P, 2 ) = size( MAT, 2 ) .

D (INPUT) real(stnd), dimension(:)

On entry, D contains the diagonal elements of the bidiagonal matrix BD as returned by BD_CMP or BD_CMP2.

The size of D must verify:

size( D ) = size( MAT, 2 ) .

E (INPUT) real(stnd), dimension(:)

On entry, E contains the off-diagonal elements of the bidiagonal matrix BD as returned by BD_CMP or BD_CMP2:

E(i) = BD(i-1,i) for i = 2,3,…,n;

E(1) is arbitrary.

The size of E must verify:

size( E ) = size( MAT, 2 ) .

S (INPUT) real(stnd), dimension(:)

On entry, selected singular values of the bidiagonal matrix BD. The singular values must be given in decreasing order and are assumed to be positive or zero.

The size of S must verify:

size(S) <= size( MAT, 2 ) .

LEFTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed left singular vectors. The left singular vector associated with the singular value S(j) is stored in the j-th column of LEFTVEC.

The shape of LEFTVEC must verify:

size(LEFTVEC,1) = size( MAT, 1 ) = m ,

size(LEFTVEC,2) = size(S) .

RIGHTVEC (OUTPUT) real(stnd), dimension(:,:)

On exit, the computed right singular vectors. The right singular vector associated with the singular value S(j) is stored in the j-th column of RIGHTVEC.

The shape of RIGHTVEC must verify:

size(RIGHTVEC,1) = size( MAT, 2 ) = n ,

size(RIGHTVEC,2) = size(S) .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the deflation procedure of the bidiagonal matrix BD.

MAX_QR_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_QR_STEPS controls the maximum number of QR sweeps for deflating the bidiagonal matrix BD for a given singular value. The algorithm fails to converge if the total number of QR sweeps for all singular values exceeds MAX_QR_STEPS * size(S).

The default is 4.

ORTHO (INPUT,OPTIONAL) logical(lgl)

On entry:

ORTHO=true, the bidiagonal matrix BD is deflated sequentially for all the specified singular values; this implies that the singular vectors of the bidiagonal matrix BD will be automatically orthogonal on exit.

ORTHO=false, the bidiagonal matrix BD is deflated in parallel for the different clusters of singular values or isolated singular values; this implies that orthogonality of the singular vectors of bidiagonal matrix BD is preserved inside each cluster, but not automatically between clusters.

The default is ORTHO=false.

SCALING (INPUT,OPTIONAL) logical(lgl)

On entry, if:

SCALING=true, the intermediate bidiagonal matrix BD is scaled before computing the deflation parameters in order to avoid overflows;

SCALING=false, the intermediate bidiagonal matrix BD is not scaled.

The default is to scale the bidiagonal matrix.

INVITER (INPUT,OPTIONAL) logical(lgl)

On entry, if:

INVITER=true, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by inverse iteration instead of deflation.

INVITER=false, singular vectors corresponding to isolated singular values or singular vectors of bidiagonal matrices with zeros are computed by deflation.

The default is INVITER=true.

Further Details¶

The singular vectors are computed using deflation techniques applied implicitly to the associated tridiagonal forms BD’ * BD and BD * BD’ of the bidiagonal matrix BD. See description of the BD_DEFLATE subroutine for more details.

The computation of the singular vectors is parallelized if OPENMP is used.

It is essential that singular values given on entry of BD_DEFLATE2 are computed to high (relative) accuracy. Subroutines BD_SINGVAL or BD_SINVAL2 may be used for this purpose.

BD_DEFLATE2 may fail if some the singular values specified in parameter S are nearly identical or for clusters of small singular values for some pathological matrices.

The deflation algorithms used in BD_DEFLATE2 are competitive with the inverse iteration procedure implemented in BD_INVITER2.

For further details, on the deflation techniques used in BD_DEFLATE2, see:

Fernando, K.V., 1997: On computing an eigenvector of a tridiagonal matrix.

Part I: Basic results. Siam J. Matrix Anal. Appl., Vol. 18, pp. 1013-1034.

Malyshev, A.N., 2000: On deflation for symmetric tridiagonal matrices.

Report 182 of the Department of Informatics, University of Bergen, Norway.

Mastronardi, M., Van Barel, M., Van Camp, E., and Vandebril, R., 2006: On computing

the eigenvectors of a class of structured matrices. Journal of Computational and Applied Mathematics, 189, 580-591.

`subroutine svd_sort ( sort, d, u, v )`¶

Purpose¶

Given the singular values D and singular vectors U and V as output from BD_SVD, SVD_CMP or SVD_CMP3, this subroutine sorts the singular values into ascending or descending order, and, rearranges the columns of U and V correspondingly.

Arguments¶

SORT (INPUT) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

The singular vectors are rearranged accordingly.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the singular values.

On exit, the singular values in ascending or decreasing order.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the columns of U are the (left) singular vectors.

On exit, U contains the rearranged (left) singular vectors.

The shape of U must verify: size(U,2) = size(D) .

V (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the columns of V are the (right) singular vectors.

On exit, V contains the rearranged (right) singular vectors.

The shape of V must verify: size(V,2) = size(D) .

Further Details¶

The method is straight insertion.

`subroutine svd_sort2 ( sort, d, u, vt )`¶

Purpose¶

Given the singular values D and singular vectors U and VT as output from BD_SVD2 or SVD_CMP2, this subroutine sorts the singular values into ascending or descending order, and, rearranges the columns of U and the rows of VT correspondingly.

Arguments¶

SORT (INPUT) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

The singular vectors are rearranged accordingly.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the singular values.

On exit, the singular values in ascending or decreasing order.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the columns of U are the (left) singular vectors.

On exit, U contains the rearranged (left) singular vectors.

The shape of U must verify: size(U,2) = size(D) .

VT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the rows of VT are the (right) singular vectors.

On exit, VT contains the rearranged (right) singular vectors.

The shape of VT must verify: size(VT,1) = size(D) .

Further Details¶

The method is straight insertion.

`subroutine singvec_sort ( sort, d, u )`¶

Purpose¶

Given the singular values D and singular vectors U, stored columwise, as output from BD_SVD, SVD_CMP, BD_SVD2, SVD_CMP2 or SVD_CMP3, this subroutine sorts the singular values into ascending or descending order, and, rearranges the columns of U correspondingly.

Arguments¶

SORT (INPUT) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

The singular vectors are reordered accordingly.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the singular values.

On exit, the singular values in ascending or decreasing order.

U (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the columns of U are the singular vectors.

On exit, U contains the reordered singular vectors.

The shape of U must verify: size(U,2) = size(D) = n .

Further Details¶

The method is straight insertion.

`subroutine singval_sort ( sort, d )`¶

Purpose¶

Given the singular values D as output from BD_SVD, BD_SVD2, SVD_CMP, SVD_CMP2 or SVD_CMP3, this routine sorts the singular values into ascending or descending order.

Arguments¶

SORT (INPUT) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’.

D (INPUT/OUTPUT) real(stnd), dimension(:)

On entry, the singular values.

On exit, the singular values in ascending or decreasing order.

Further Details¶

The method is quick sort.

`subroutine product_svd_cmp ( a, b, s, failure, sort, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

This subroutine computes the singular value decomposition of the product of a m-by-n matrix A by the transpose of a p-by-n matrix B:

A * B’ = U * SIGMA * V’

where A and B have more rows than columns ( n<=min(m,p) ), SIGMA is an n-by-n matrix which is zero except for its diagonal elements, U is an m-by-n orthogonal matrix, and V is an p-by-n orthogonal matrix. The diagonal elements of SIGMA are the singular values of A * B’; they are real and non-negative. The columns of U and V are the left and right singular vectors of A * B’, respectively.

Arguments¶

A (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix A.

On exit, the m-by-n left-singular matrix U.

B (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the p-by-n matrix B.

On exit, the p-by-n right-singular matrix V.

S (OUTPUT) real(stnd), dimension(:)

The singular values of A * B’.

The size of S must verify: size( S ) = n .

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit

FAILURE = true : indicates that the algorithm did not converge and that full accuracy was not attained in the SVD.

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’. The singular vectors are rearranged accordingly.

MAXITER (INPUT,OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form of A * B’ fails to converge if the number of QR sweeps exceeds MAXITER * n. Convergence usually occurs in about 2 * n QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

The size of S must match: size(S) = size(A,2) = size(B,2) .

`function ginv ( mat, tol, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

GINV returns the generalized inverse of a m-by-n real matrix, MAT. The generalized inverse of MAT is a n-by-m matrix.

Arguments¶

MAT (INPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

TOL (INPUT, OPTIONAL) real(stnd)

On entry:

If TOL is less than or equal to zero or is absent, the function computes the generalized inverse of MAT.

If TOL is greater than zero, the subroutine computes the generalized inverse of a matrix close to MAT, but having condition number in the 2-norm less than 1/TOL.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD phase of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If MAT is the null matrix or the SVD algorithm used to compute the generalized inverse of MAT did not converge and full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form of MAT, function GINV returns a n-by-m matrix filled with NAN() function.

The computation of the generalized inverse is parallelized if OPENMP is used.

For further details, on the generalized inverse of a rectangular matrix and the algorithm to compute it, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations. The Johns

Hopkins University Press, Baltimore, Maryland.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine comp_ginv ( mat, failure, matginv, tol, singvalues, krank, mul_size, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

COMP_GINV computes the generalized inverse of a m-by-n real matrix, MAT.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT is destroyed.

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that MAT is the null matrix or that the SVD algorithm which is used to compute the generalized inverse of MAT did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

MATGINV (OUTPUT) real(stnd), dimension(:,:)

On exit, MATGINV contains the generalized inverse of MAT or the generalized inverse of a matrix close to MAT.

The shape of MATGINV must verify:

size(MATGINV,1) = size(MAT,2) = n ,

size(MATGINV,2) = size(MAT,1) = m .

TOL (INPUT, OPTIONAL) real(stnd)

On entry, if:

TOL is less than or equal to zero or is absent, the subroutine computes the generalized inverse of MAT.

TOL is greater than zero, the subroutine computes the generalized inverse of a matrix close to MAT, but having condition number in the 2-norm less than 1/TOL.

SINGVALUES (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The singular values of MAT in decreasing order. The condition number of MAT in the 2-norm is

SINGVALUES(1)/SINGVALUES(min(m,n)).

The size of SINGVALUES must verify : size( SINGVALUES ) = min(m,n) .

KRANK (OUTPUT, OPTIONAL) integer(i4b)

On exit, the effective rank of MAT, i.e., the number of singular values which are greater than TOL * SINGVALUES(1).

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. MUL_SIZE can be increased or decreased to improve the performance of the algorithm.

The default value is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm. MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If all the elements of MAT are equal to zero, subroutine COMP_GINV returns a n-by-m matrix filled with NAN() function in argument MATGINV and the logical argument FAILURE is set to .true. .

The computation of the generalized inverse is parallelized if OPENMP is used.

For further details, on the generalized inverse of a rectangular matrix and the algorithm to compute it, see:

Golub, G.H., and Van Loan, C.F., 1996: Matrix Computations, 3rd ed.

The Johns Hopkins University Press, Baltimore.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine comp_ginv ( mat, failure, tol, singvalues, krank, mul_size, maxiter, max_francis_steps, perfect_shift )`¶

Purpose¶

COMP_GINV computes the generalized inverse of a m-by-n real matrix, MAT.

Arguments¶

MAT (INPUT/OUTPUT) real(stnd), dimension(:,:)

On entry, the m-by-n matrix MAT.

On exit, MAT contains the transpose of the generalized inverse of MAT or of the generalized inverse of a matrix close to MAT.

FAILURE (OUTPUT) logical(lgl)

On exit:

FAILURE = false : indicates successful exit.

FAILURE = true : indicates that MAT is the null matrix or that the SVD algorithm which is used to compute the generalized inverse of MAT did not converge and that full accuracy was not attained in the bidiagonal SVD of an intermediate bidiagonal form B of MAT.

TOL (INPUT, OPTIONAL) real(stnd)

On entry, if:

TOL is less than or equal to zero or is absent, the subroutine computes the generalized inverse of MAT.

TOL is greater than zero, the subroutine computes the generalized inverse of a matrix close to MAT, but having condition number in the 2-norm less than 1/TOL.

SINGVALUES (OUTPUT, OPTIONAL) real(stnd), dimension(:)

The singular values of MAT in decreasing order. The condition number of MAT in the 2-norm is

SINGVALUES(1)/SINGVALUES(min(m,n)).

The size of SINGVALUES must verify: size( SINGVALUES ) = min(m,n) .

KRANK (OUTPUT, OPTIONAL) integer(i4b)

On exit, the effective rank of MAT, i.e., the number of singular values which are greater than TOL * SINGVALUES(1).

MUL_SIZE (INPUT, OPTIONAL) integer(i4b)

Internal parameter. MUL_SIZE must verify: 1 <= MUL_SIZE <= max(m,n), otherwise a default value is used. MUL_SIZE can be increased or decreased to improve the performance of the algorithm.

The default value is 32.

MAXITER (INPUT, OPTIONAL) integer(i4b)

MAXITER controls the maximum number of QR sweeps in the bidiagonal SVD phase of the SVD algorithm.

The bidiagonal SVD algorithm of an intermediate bidiagonal form B of MAT fails to converge if the number of QR sweeps exceeds MAXITER * min(m,n). Convergence usually occurs in about 2 * min(m,n) QR sweeps.

The default is 10.

MAX_FRANCIS_STEPS (INPUT,OPTIONAL) integer(i4b)

MAX_FRANCIS_STEPS controls the maximum number of Francis sets (e.g. QR sweeps) of Givens rotations which must be saved before applying them with a wavefront algorithm to accumulate the singular vectors in the bidiagonal SVD algorithm.

MAX_FRANCIS_STEPS is a strictly positive integer, otherwise the default value is used.

The default is 10.

PERFECT_SHIFT (INPUT,OPTIONAL) logical(lgl)

PERFECT_SHIFT determines if a perfect shift strategy is used in the implicit QR algorithm in order to minimize the number of QR sweeps in the bidiagonal SVD algorithm.

The default is true.

Further Details¶

If all the elements of MAT are equal to zero, subroutine COMP_GINV returns a m-by-n matrix filled with NAN() function in argument MAT and the logical argument FAILURE is set to .true. .

The computation of the generalized inverse is parallelized if OPENMP is used.

For further details, on the generalized inverse of a rectangular matrix and the algorithm to compute it, see:

Golub, G.H., and van Loan, C.F., 1996: Matrix Computations. The Johns

Hopkins University Press, Baltimore, Maryland.

Lawson, C.L., and Hanson, R.J., 1974: Solving least square problems.

Prentice-Hall.

`subroutine gen_bd_mat ( type, d, e, failure, known_singval, from_tridiag, singval, sort, val1, val2, l0, glu0 )`¶

Purpose¶

GEN_BD_MAT generates different types of bidiagonal matrices with known singular values or specific numerical properties such as clustered singular values for testing purposes of singular value decomposition bidiagonal solvers.

Optionally, the singular values of the selected bidiagonal matrix can be computed analytically, if possible, or by a bisection algorithm with high absolute and relative accuracies.

Arguments¶

TYPE (INPUT) integer(i4b)

Select the type of bidiagonal matrix BD to be generated by the subroutine.

If TYPE is between 1 and 56, the subroutine generates a specific bidiagonal matrix as described in the comments inside the code of the subroutine. For other values of TYPE, all diagonal and off-diagonal elements of the bidiagonal matrix are generated from an uniform random numbers distribution between 0 and 1.

For TYPE between 1 and 17, the singular values of the bidiagonal matrix are known analytically. For other values of TYPE, the singular values may be estimated by a bisection algorithm with high accuracy. In all cases, the singular values may be output in the optional parameter SINGVAL.

For TYPE between 1 and 11 or 52 and 56, the bidiagonal matrix BD is computed as the Cholesky factor of symmetric positive-definite tridiagonal matrices.

D (OUTPUT) real(stnd), dimension(:)

On exit, D contains the diagonal elements of the bidiagonal matrix BD.

The size of D must verify: size( D )>=2 .

E (OUTPUT) real(stnd), dimension(:)

On exit, E(2:) contains the off-diagonal elements of the bidiagonal matrix BD. E(1) is arbitrary, but is set to zero.

The size of E must verify: size( E ) = size( D ) .

FAILURE (OUTPUT, OPTIONAL) logical(lgl)

On exit:

FAILURE = false : indicates that the singular values of BD are known analytically or have been computed with high accuracy;

FAILURE = true : indicates that the singular values of BD are not known analytically and have not been computed with maximum accuracy with the bisection algorithm.

KNOWN_SINGVAL (OUTPUT, OPTIONAL) logical(lgl)

On exit:

KNOWN_SINGVAL = true : indicates that the singular values of BD are known analytically for the selected TYPE.

KNOWN_SINGVAL = false : indicates that the eigenvalues of BD are not known analytically for the selected TYPE.

FROM_TRIDIAG (OUTPUT, OPTIONAL) logical(lgl)

On exit:

FROM_TRIDIAG = true : indicates that the bidiagonal matrix BD has been computed as the Cholesky factor of a positive-definite tridiagonal matrix for the selected TYPE.

FROM_TRIDIAG = false : indicates that the bidiagonal matrix BD has not been computed as the Cholesky factor of a positive-definite tridiagonal matrix for the selected TYPE.

SINGVAL (OUTPUT, OPTIONAL) real(stnd), dimension(:)

On exit, the singular values of BD computed analytically or estimated to high accuracy with a bisection algorithm.

The size of SINGVAL must verify: size( SINGVAL ) = size( D ) .

SORT (INPUT, OPTIONAL) character

Sort the singular values into ascending order if SORT = ‘A’ or ‘a’, or in descending order if SORT = ‘D’ or ‘d’, if the optional argument SINGVAL is present. For other values of SORT nothing is done and SINGVAL(:) may not be sorted.

VAL1 (INPUT, OPTIONAL) real(stnd)

On entry, specifies the parameter d0 for parametrized bidiagonal matrices (e.g. TYPE= 2-8, 10, 15, 32-35).

If this parameter is changed for TYPE between 2 and 8, care must be taken to insure that the initial symmetric tridiagonal matrix, which is used to derive the bidiagonal matrix BD, is positive-definite. If this is not the case, the subroutine will issue an error message and stop the program.

Also, if this parameter is changed for TYPE between 32 and 35, which correspond to graded (or reversely graded) matrices with an arithmetic or geometric progression, care must be taken to insure that some elements of the arithmetic or geometric progression will not underflow or overflow as no checks are done in the subroutine for such errors.

The default for VAL1 is:

for TYPE between 2 and 7;

for TYPE equal to 8;

for TYPE equal to 10;

for TYPE equal to 15;

for TYPE between 32 and 35.

VAL2 (INPUT, OPTIONAL) real(stnd)

On entry, specifies the parameter e0 for parametrized bidiagonal matrices (e.g. TYPE= 2-8, 10, 32-35).

If this parameter is changed for TYPE between 2 and 8, care must be taken to insure that the initial symmetric tridiagonal matrix, which is used to derive the bidiagonal matrix BD, is positive-definite. If this is not the case, the subroutine will issue an error message and stop the program.

Also, if this parameter is changed for TYPE between 32 and 35, which correspond to graded (or reversely graded) matrices with an arithmetic or geometric progression, care must be taken to insure that some elements of the arithmetic or geometric progression will not underflow or overflow as no checks are done in the subroutine for such errors.

The default for VAL2 is:

for TYPE between 2 and 7;

for TYPE equal to 8;

for TYPE equal to 10;

for TYPE between 32 and 35.

L0 (INPUT, OPTIONAL) integer(i4b)

On entry, specify the radius of the initial matrix for parametrized form of glued bidiagonal matrices (e.g. for TYPE equal to 44, 46, 48, 53, 55).

L0 must be greater than 0 and preferably less or equal to size( D )/2 . The default is 5.

GLU0 (INPUT, OPTIONAL) real(stnd)

On entry, specify the glue parameter for parametrized form of glued bidiagonal matrices (e.g. for TYPE equal to 44, 46, 48, 53, 55).

The default is sqrt( epsilon(GLU0) ).

Further Details¶

This subroutine tries to take care of imprecisions in intrinsic subroutines (e.g. like the cos function in the gfortran compiler) when computing singular values by analytic formulae.

For further details on the bidiagonal matrices used for testing in GEN_BD_MAT subroutine, see:

Gladwell, G.M.L., Jones, T.H., Willms N.B., 2014: A test matrix for an inverse eigenvalue problem.

Journal of Applied Mathematics, 14, 6 pages, Article ID 515082, DOI 10.1155/2014/515082.

Clement, P.A., 1959: A class of triple-diagonal matrices for test purposes. SIAM Review

1(1):50-52 DOI 10.1137/1001006.

Gregory, R.T., Karney, D.L., 1969: A collection of matrices for testing computational

algorithms. New York: Wiley. Reprinted with corrections by Robert E. Krieger, Huntington, New York, 1978.

Higham, N.J., 1991: Algorithm 694: a collection of test matrices in MATLAB. ACM

Transactions on Mathematical Software 17(3):289-305 DOI 10.1145/114697.116805.

Godunov, S.K., Antonov, A.G., Kirillyuk, O.P., and Kostin, V.I., 1993: Guaranteed Accuracy

in numerical linear algebra. Kluwer Academic Publishers.

Parlett, B.N., and Vomel, C., 2005: How the MRRR algorithm can fail on tight eigenvalue clusters.

Lapack Working Note 163.

Nakatsukasa, Y., Aishima, K., and Yamazaki, I., 2012: dqds with agressive early deflation.

SIAM J. Matrix Anal. Appl., 33(1): 22-51.

Fernando, K.V., and Parlett, B.N., 1994: Accurate singular values and differenial qd algorithms.

Numer. Math., 67: 191-229.

Module_SVD_Procedures¶

subroutine bd_cmp ( mat, d, e, tauq, taup )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_cmp ( mat, d, e, tauq )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_cmp ( mat, d, e )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_cmp2 ( mat, d, e, p, failure, gen_p )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_cmp2 ( mat, d, e, failure )¶

Purpose¶

Arguments¶

Further Details¶

subroutine ortho_gen_bd ( mat, tauq, taup, p )¶

Purpose¶

Arguments¶

Further Details¶

subroutine ortho_gen_bd2 ( mat, tauq, taup, q_pt )¶

Purpose¶

Arguments¶

Further Details¶

subroutine ortho_gen_q_bd ( mat, tauq )¶

Purpose¶

Arguments¶

Further Details¶

subroutine ortho_gen_p_bd ( mat, taup, p )¶

Purpose¶

Arguments¶

Further Details¶

subroutine apply_q_bd ( mat, tauq, c, left, trans )¶

Purpose¶

Arguments¶

Further Details¶

subroutine apply_p_bd ( mat, taup, c, left, trans )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_svd ( upper, d, e, failure, u, v, sort, maxiter, max_francis_steps, perfect_shift )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_svd2 ( upper, d, e, failure, u, vt, sort, maxiter, max_francis_steps, perfect_shift )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_svd ( upper, d, e, failure, u, sort, maxiter, max_francis_steps, perfect_shift )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_svd ( upper, d, e, failure, sort, maxiter )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_singval ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )¶

Purpose¶

Arguments¶

Further Details¶

subroutine bd_singval2 ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )¶

Purpose¶

Arguments¶

Further Details¶

function singvalues ( mat, sort, mul_size, maxiter )¶

Purpose¶

Arguments¶

Further Details¶

subroutine select_singval_cmp ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )¶

Purpose¶

Arguments¶

Further Details¶

subroutine select_singval_cmp2 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )¶

Purpose¶

Arguments¶

`subroutine bd_cmp ( mat, d, e, tauq, taup )`¶

`subroutine bd_cmp ( mat, d, e, tauq )`¶

`subroutine bd_cmp ( mat, d, e )`¶

`subroutine bd_cmp2 ( mat, d, e, p, failure, gen_p )`¶

`subroutine bd_cmp2 ( mat, d, e, failure )`¶

`subroutine ortho_gen_bd ( mat, tauq, taup, p )`¶

`subroutine ortho_gen_bd2 ( mat, tauq, taup, q_pt )`¶

`subroutine ortho_gen_q_bd ( mat, tauq )`¶

`subroutine ortho_gen_p_bd ( mat, taup, p )`¶

`subroutine apply_q_bd ( mat, tauq, c, left, trans )`¶

`subroutine apply_p_bd ( mat, taup, c, left, trans )`¶

`subroutine bd_svd ( upper, d, e, failure, u, v, sort, maxiter, max_francis_steps, perfect_shift )`¶

`subroutine bd_svd2 ( upper, d, e, failure, u, vt, sort, maxiter, max_francis_steps, perfect_shift )`¶

`subroutine bd_svd ( upper, d, e, failure, u, sort, maxiter, max_francis_steps, perfect_shift )`¶

`subroutine bd_svd ( upper, d, e, failure, sort, maxiter )`¶

`subroutine bd_singval ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )`¶

`subroutine bd_singval2 ( d, e, nsing, s, failure, sort, vector, abstol, ls, theta, scaling, init )`¶

`function singvalues ( mat, sort, mul_size, maxiter )`¶

`subroutine select_singval_cmp ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )`¶

`subroutine select_singval_cmp2 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, tauq, taup, scaling, init )`¶

`subroutine select_singval_cmp3 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, p, gen_p, scaling, init, failure_bd )`¶

`subroutine select_singval_cmp4 ( mat, nsing, s, failure, sort, mul_size, vector, abstol, ls, theta, d, e, p, gen_p, scaling, init, failure_bd )`¶

`subroutine svd_cmp ( mat, s, failure, v, sort, mul_size, maxiter, max_francis_steps, perfect_shift, use_svd2 )`¶

`subroutine svd_cmp2 ( mat, s, failure, u_vt, sort, mul_size, maxiter, max_francis_steps, perfect_shift, use_svd2 )`¶

`subroutine svd_cmp ( mat, s, failure, sort, mul_size, maxiter, d, e, tauq, taup )`¶

`subroutine svd_cmp3 ( mat, s, failure, u_v, sort, maxiter, max_francis_steps, perfect_shift, failure_bd )`¶

`subroutine svd_cmp4 ( mat, s, failure, v, sort, maxiter, max_francis_steps, perfect_shift, sing_vec, gen_p, failure_bd, d, e )`¶

`subroutine svd_cmp3 ( mat, s, failure, sort, maxiter, save_mat, failure_bd )`¶

`function maxdiag_gkinv_qr ( e, lambda )`¶

`function maxdiag_gkinv_ldu ( e, lambda )`¶

`subroutine gk_qr_cmp ( e, lambda, cs, sn, diag, sup1, sup2, maxdiag_gkinv )`¶

`subroutine bd_inviter ( upper, d, e, s, leftvec, rightvec, failure, maxiter, scaling, initvec )`¶

`subroutine bd_inviter ( upper, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

`subroutine bd_inviter2 ( mat, tauq, taup, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

`subroutine bd_inviter2 ( mat, p, d, e, s, leftvec, rightvec, failure, maxiter, ortho, backward_sweep, scaling, initvec )`¶

`subroutine upper_bd_dsqd2 ( q2, e2, shift, flip, d )`¶