Some years ago, before the advent of the Statistics package, a colleague asked for a fast way to generate thousands of normally distributed random numbers in Maple. The suggestion that worked quickest and most easily (using existing, simple Maple Library routines) was to generate the random deviates with the usual formula associated with the distribution, but to replace the scalar values (representing the uniformly distributed input) with a whole Matrix of input values. That way, all the random deviates get generated at the same time. The reason it was fast is that all the arithmetic was done not only in precompiled external functions, but in fast BLAS functions tuned for L2 cache size, SIMD chipset extensions, and so on.
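For concreteness, here is a minimal sketch of that trick. It assumes the Box-Muller transform as the "usual formula", and a Maple release with the elementwise ~ operators (in older releases, zip over a pair of Matrices does the same job):

> # uniform input deviates, as whole hardware-float Matrices
> U1 := LinearAlgebra:-RandomMatrix(1000,1000,'generator'=0.0..1.0,
>     'outputoptions'=['datatype'='float[8]']):
> U2 := LinearAlgebra:-RandomMatrix(1000,1000,'generator'=0.0..1.0,
>     'outputoptions'=['datatype'='float[8]']):
> # Box-Muller: sqrt(-2*ln(u1))*cos(2*Pi*u2) is a standard normal deviate
> # when u1 and u2 are uniform on (0,1); here the formula is applied to
> # a million entries at once
> Z := sqrt~(-2.0 * ln~(U1)) *~ cos~(evalf(2*Pi) * U2):

The point is that the arithmetic gets driven over the whole float[8] Matrix in a handful of calls, rather than deviate by deviate at the interpreted level.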

The lesson there was that offloading repetitive floating-point arithmetic to LinearAlgebra can bring about a speed-up.

So, I was looking at the "principal component analysis" section of the Maple code used in this software performance review. I noticed that a rather large part of the time was being spent inside Statistics:-CovarianceMatrix. But an earlier part of that code had already set up the columns of the input Matrix and subtracted the column means, to get what it called centered data. It turns out that getting the covariance matrix from that is as easy as doing lots of pairwise dot products on the centered data. With the Vectors as columns of a Matrix M, this amounts to computing Transpose(M).M, scaled by 1/m where m is the number of rows. It then became apparent that this alternative way is much faster than CovarianceMatrix. The reason appears to be that CovarianceMatrix computes the individual entries one at a time, within a double loop of interpreted Library code. It breaks the original Matrix data up into columns, so as to pass them in pairs to an external routine which computes just a single entry. With a Matrix-Matrix multiplication, all of that arithmetic can instead benefit from a single tuned level-3 BLAS call.
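To spell out the identity: once each column has had its mean subtracted, entry (i,j) of the covariance matrix is just the dot product of centered columns i and j divided by m, which is exactly entry (i,j) of Transpose(M).M/m. Here is a tiny illustration (the Matrix A is made up purely for the example):

> A := Matrix([[1.0, 2.0], [3.0, 5.0], [5.0, 4.0]], 'datatype'='float[8]'):
> m := 3:
> C := copy(A):                      # copy, then center each column
> for j from 1 to 2 do
>     mu := Statistics:-Mean(A[1..m, j]);
>     C[1..m, j] := map(x -> x - mu, C[1..m, j]);
> end do:
> # a single pairwise dot product ...
> LinearAlgebra:-DotProduct(C[1..m, 1], C[1..m, 2]) / m;
> # ... agrees with entry (1,2) of the one Matrix-Matrix product
> LinearAlgebra:-Transpose(C) . C / m;

The same 1/m scaling is what the Cov procedure below uses.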

As can be seen below, the speed-up increases with the size of the input data.

> kernelopts(printbytes=false):      # suppress the "bytes used" messages
> Digits := trunc(evalhf(Digits)):   # work at hardware double precision

> Cov := proc(M)
> local i, m, n, c, means, tempV, res;
>     # ensure at least hardware double precision
>     Digits := max(Digits, trunc(evalhf(Digits)));
>     m, n := op(1, M);
>     # hardware-float working copy of the input data
>     c := Matrix(m, n, M, 'datatype' = 'float[8]');
>     # compute the mean of each column
>     for i from 1 to n do
>         means[i] := Statistics:-Mean(M[1 .. -1, i]);
>     end do;
>     # center the data: overwrite each column of c with (column - mean),
>     # writing straight back into c's column-major float[8] storage
>     tempV := Vector(m, 'datatype' = 'float[8]');
>     for i from 1 to n do
>         ArrayTools:-Fill(m, means[i], tempV, 0, 1);
>         LinearAlgebra:-LA_Main:-VectorAdd(tempV, c[1 .. m, i], -1, 1,
>             'inplace' = true, 'outputoptions' = []);
>         ArrayTools:-Copy(m, tempV, 0, 1, c, (i - 1) * m, 1);
>     end do;
>     # all the pairwise dot products in one level-3 BLAS call
>     res := LinearAlgebra:-Transpose(c) . c;
>     LinearAlgebra:-LA_Main:-MatrixScalarMultiply(res, evalf(1/m),
>         'inplace' = true, 'outputoptions' = []);
> end proc:

> M := LinearAlgebra:-RandomMatrix(200,200,'generator'=0.0..1.0,
> 'outputoptions'=['datatype'='float[8]']):
> st:=time(): cv1 := Cov(M): time()-st;
0.099

> st:=time(): cv2 := Statistics:-CovarianceMatrix(M): time()-st;
0.978

> LinearAlgebra:-Norm(cv1-cv2);
0.244463449456484502e-14

> M := LinearAlgebra:-RandomMatrix(10000,200,'generator'=0.0..1.0,
> 'outputoptions'=['datatype'='float[8]']):
> st:=time(): cv1 := Cov(M): time()-st;
1.078

> st:=time(): cv2 := Statistics:-CovarianceMatrix(M): time()-st;
22.094

> LinearAlgebra:-Norm(cv1-cv2);
0.126576304945475327e-14
