As Demmel and others have noted, SVD is both more reliable and more expensive than QR as a method of solving rank-deficient least squares problems.
SVD is the method that LinearAlgebra:-LeastSquares will choose when the Matrix has more columns than rows (n>m), unless instructed otherwise using the optional 'method' parameter.
LinearAlgebra:-SingularValues always computes a full U and Vt. But for least squares computations, such as when n>m, this is not necessary. Including the smaller singular values may just be (re-)introducing noise. See here for more detail.
Here's a 20x2000 example, using wrapperless external calling and the SVD routine dgesvd in the CLAPACK library. The effective speedup by using the Thin SVD for that 20x2000 least squares example is about a factor of 100 (ie, 2000/20), with a similar reduction in additional memory allocation.
I'm using Linux, but perhaps someone can add a note indicating whether it works on MS-Windows.
dgesvd:=proc(
a::Matrix
, jobu::{identical("S","A","N","O")}
, jobvt::{identical("S","A","N","O")}
, $
)
local m,n,lda,locala,s,ldu,u,ldvt,vt,lwork,work,linfo,dgesvd_external,
JOBU,JOBVT,M,N,AA,LDA,SS,U,LDU,VT,LDVT,WORK,LWORK,INFO,LIB;
dgesvd_external:=define_external('dgesvd',
JOBU::string[1],
JOBVT::string[1],
M::REF(integer[4]),
N::REF(integer[4]),
AA::ARRAY(datatype=float[8],order=Fortran_order),
LDA::REF(integer[4]),
SS::ARRAY(datatype=float[8],order=Fortran_order),
U::ARRAY(datatype=float[8],order=Fortran_order),
LDU::REF(integer[4]),
VT::ARRAY(datatype=float[8],order=Fortran_order),
LDVT::REF(integer[4]),
WORK::ARRAY(datatype=float[8],order=Fortran_order),
LWORK::REF(integer[4]),
INFO::REF(integer[4]),
LIB=ExternalCalling:-ExternalLibraryName("clapack",'HWFloat')):
if not ( jobu="O" and jobvt="O" ) then
locala:=Matrix(a,'storage'='rectangular','order'='Fortran_order','datatype'='float[8]');
else
locala := a;
end if;
(m,n):=op(1,locala);
lda:=m;
s:=Vector[row](min(m,n),datatype=float[8]):
if jobu="S" then
ldu:=max(1,m);
u:=Matrix(ldu,min(m,n),datatype=float[8]):
elif jobu="A" then
ldu:=max(1,m);
u:=Matrix(ldu,m,datatype=float[8]):
else
ldu:=1;
u:=Matrix(ldu,1,datatype=float[8]):
end if;
if jobvt="S" then
ldvt:=max(1,min(m,n));
vt:=Matrix(ldvt,min(m,n),datatype=float[8]):
elif jobvt="A" then
ldvt:=max(1,n);
vt:=Matrix(ldvt,n,datatype=float[8]):
else
ldvt:=1;
vt:=Matrix(ldvt,1,datatype=float[8]):
end if;
vt:=Matrix(ldvt,n,datatype=float[8]):
lwork:=max(1,3*min(m,n)+max(m,n),5*min(m,n));
work:=Vector[row](lwork,datatype=float[8]):
linfo:=0:
dgesvd_external(jobu,jobvt,m,n,locala,lda,s,u,ldu,vt,ldvt,
work,lwork,linfo);
if linfo <> 0 then error -linfo; end if;
return `if`(jobu<>"N" and jobu<>"O",u,NULL),s,
`if`(jobvt<>"N" and jobvt<>"O",vt,NULL);
end proc:
with(LinearAlgebra):
#A:=Matrix([[3,2],[3,1.5],[3,1]],datatype=float[8]):
#SingularValues(A,output=[':-U',':-S',':-Vt']):
#dgesvd(A,"A","A");
#dgesvd(A,"S","S");
#dgesvd(A,"S","N");
A:=RandomMatrix(20,2000,'outputoptions'=['datatype'='float'[8]]):
st,ba,bu := time(),kernelopts(bytesalloc),kernelopts(bytesused):
dgesvd(A,"S","S"):
time()-st,kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;
st,ba,bu := time(),kernelopts(bytesalloc),kernelopts(bytesused):
SingularValues(A,output=[':-U',':-S',':-Vt']):
time()-st,kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;
B := RandomVector(20,generator=0.1..1.0,'outputoptions'=['datatype'='float'[8]]):
# Compute least squares solution, using dgesvd routine.
st:=time():
u,s,vt := dgesvd(A,"S","S"):
LSsol1 := Transpose(vt)
. Matrix(20,20,(i,j)->`if`(i=j,1/s[i],0))
. (Transpose(u).B) :
time()-st;
# Compute least squares solution, using LeastSquares routine.
st:=time():
#infolevel[LinearAlgebra]:=1:
LSsol2 := LeastSquares(A,B):
#infolevel[LinearAlgebra]:=0:
time()-st;
Norm(LSsol2-LSsol1);
acer
Comments
symbol name adjustment
There is a minor mistake in the above code.
The external function name used in the define_external() call is 'dgesvd'. But that will not work on MS-Windows. The symbol for the external function should instead be 'dgesvd_' with a trailing underscore. That is what is actually in the external dynamic library.
In other words, in the Maple procedure degsvd above there should instead be a line like,
dgesvd_external:=define_external('dgesvd_',It seems that it works with either symbol name, 'dgesvd' or 'dgesvd_', on Linux.
acer