acer

32405 Reputation

29 Badges

19 years, 346 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

Yes, I knew that it was builtin, thanks. But I can think of no reason for that last one's returning a float. I will submit an SCR against it.

acer

The calls to combinat:-randperm produce a lot of collectible garbage, which takes time to manage.

In a way, this boils down to discussions of a few years back: the give and take between producing a lot of random values up front (with corresponding large memory allocation) versus repeated Vector (or list) creation of many smaller samples (and the ensuing time cost of memory management of those temp objects). I still say that the ideal solution would be to be able to repopulate a single Vector with a multitude of random samples (inplace, using a compiled routine). That's how other good RNGs work. I got the impression, Joe, that you were expressing the same sort of idea.
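As a rough sketch of what such inplace repopulation could look like: a compiled routine that overwrites a single preallocated Vector on each call. (The generator below is the classic Park-Miller LCG, used purely for illustration; it is not Maple's own RNG.)

> RefillRandom := proc(V::Vector(datatype=integer[4]),
>                      N::integer, m::integer, seed::integer)
> # Overwrites V[1..N] inplace with values in 1..m, using the
> # Park-Miller LCG (illustrative only, not Maple's own RNG).
> # Returns the final state, for reseeding a subsequent call.
> local i::integer, s::integer;
>   s := seed;
>   for i from 1 to N do
>     s := irem(16807*s, 2147483647);
>     V[i] := 1 + irem(s, m);
>   end do;
>   return s;
> end proc:
> cRefillRandom := Compiler:-Compile(RefillRandom):
> V := Vector(10^6, 'datatype'=integer[4]):
> state := cRefillRandom(V, 10^6, 3, 1234567):  # no fresh allocation per refill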

My solution below is to generate a huge sample up front, based on how many iterations I plan to do and how many values I expect to need (at most, with a bit of leeway).

First, if I may make a few minor corrections, Joe. Your ShuffleMatrix3 had a typo in the typecheck of parameter T. And your example M and T would need subtype=Matrix to get past the parameter typing of ShuffleMatrix3. But I'll get rid of ShuffleMatrix3 altogether below and call PermuteMatrixRows directly in the loop, to save all those extra function calls.

I also commented out the part of PermuteMatrixRows that updated M, to try to cut its work and time in half. I think that it's fine to have a routine which re-uses a container like T, and that, with such a routine in hand, it's not necessary to "fake" inplace operation on input M.

> restart:
>
> PermuteMatrixRows := proc(M :: Array(datatype=integer[4])
>                           , T :: Array(datatype=integer[4])
>                           , P :: Vector(datatype=integer[4])
>                           , m :: nonnegint
>                           , n :: nonnegint
>                          )
> local i,j;
>     for i to m do
>         for j to n do
>             T[i,j] := M[P[i],j];
>         end do;
>     end do;
>     ## Letting it off the hook for updating M.
>     #for i to m do
>     #    for j to n do
>     #        M[i,j] := T[i,j];
>     #    end do;
>     #end do;
>     return NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n):=3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cPermuteMatrixRows:=Compiler:-Compile(PermuteMatrixRows):
> for k from 1 to maxnum do
>   P := Vector(combinat:-randperm(m), 'datatype'=integer[4]):
>   cPermuteMatrixRows(M, T, P, m, n);
> end do:
> time()-st;
                                    20.465

Now, using a method which explicitly creates its own permutation of 1..m. (I hope that it is a valid implementation of Fisher-Yates, and that any mistake will be pointed out.) It's a bit wasteful in this manner: whenever it needs a random value in 1..j with 1<=j<=m, it draws from the same big sample and rejects values >j, which wastes quite a lot of the precomputed values. So the memory use could be improved by utilizing several reservoirs instead.

> restart:
>
> Mshuffle:=proc(M1::Matrix(datatype=integer[4]),
>                M2::Matrix(datatype=integer[4]),
>                m::integer, n::integer,
>                V::Vector(datatype=integer[4]),
>                ransam::Vector(datatype=integer[4]))
# Copies row i of mxn M1 into row V[i] of mxn M2,
# for i=1..m, where V is an inplace generated random
# shuffle of 1..m.
# ransam is a source of precomputed random values in 1..m,
# with ransam[1] storing the position of the previous run's
# last accessed entry.
> local i::integer, j::integer, count::integer,
>       need::integer, cand::integer;
>   count:=ransam[1];
>   for i from 1 to m do
>     V[i]:=i;
>   end do;
>   for i from m to 2 by -1 do
>     need:=1;
>     while need=1 do
>       count:=count+1;
>       cand:=ransam[1+count];
>       if cand<=i then
>         need:=0;
>       end if;
>     end do;
>     V[i],V[cand]:=V[cand],V[i];
>   end do;
>   for i from 1 to m do
>     for j from 1 to n do
>        M2[V[i],j]:=M1[i,j];
>     end do;
>   end do;
>   ransam[1]:=count;
>   NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n) := 3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cMshuffle:=Compiler:-Compile(Mshuffle):
> B:=Vector[row](m,datatype=integer[4]):
> rsrc:=LinearAlgebra:-RandomVector(maxnum*m*m,generator=1..m,
>                            outputoptions=[datatype=integer[4]]):
> for k from 1 to maxnum do
>   cMshuffle(M,T,m,n,B,rsrc);
> end do:
> time()-st;
                                     0.837

For 1000000 repetitions, cMshuffle took 7.5 sec and 110MB for the session, while cPermuteMatrixRows&randperm took 201.5 sec and 35.8MB for the session.

I'd like to get a little philosophical. I don't think that implementing simple algorithms (for sorting, or standard deviation, or shuffling, etc) is too much of a pain. The Compiler gives the current best performance possible in Maple for many purely numerical problems (short of calling out externally to some other program...). I like to think that examples which illustrate it (along with inplace and memory-management refinements) are going to help some people with other similar tasks.

BTW, I think that your ideas, Joe, about efficiently permuting the entries of an rtable of any datatype are feasible. The OpenMaple API allows for it, no problem. But it might require writing a (reusable) compiled external routine.
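For the record, the binding of such a compiled external routine would go through define_external. Everything named below is hypothetical (there is no "permute_inplace" or "libshuffle.so"); it only shows the shape of the call:

> # Hypothetical names: one would write and compile the C routine separately.
> extPermute := define_external('permute_inplace',
>                  'A'::ARRAY(datatype=integer[4]),
>                  'P'::ARRAY(datatype=integer[4]),
>                  'm'::integer[4], 'n'::integer[4],
>                  'LIB'="libshuffle.so"):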

acer

Yes, numapprox, also mentioned in an earlier reply above, is a good approach for this.

But be careful: the achieved accuracy might not be as good as that last argument to minimax suggests.

For 8 digits (was that actually requested by the OP!?), one could try something like this, after evaluating at the values for Km, S0, and v:

> Digits := 15:
> numapprox:-minimax(s, t=0..135, [5,6], .5e-9);

And so on.

That gives a result that is easy to code in double precision, with 15-digit float coefficients.

And the numerator and denominator are even returned already in Horner form.
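As a self-contained illustration (exp(-t) below is only a stand-in for the OP's expression s, which isn't reproduced here; 'maxerr' is a name that minimax assigns the achieved minimax error):

> Digits := 15:
> approx := numapprox:-minimax(exp(-t), t = 0..1, [5,6], 'maxerr'):
> maxerr;

Comparing maxerr against the desired tolerance is one way to address the accuracy caveat above.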

A tougher question might be: how to get or demonstrate that a numapprox result is adequate for all the parameter ranges mentioned in the OP's followup replies.

acer

Hi Alec,

The main difference between d01amc and d01smc is that the latter is supposed to be thread-safe (in its own right). So, if the Maple integrand were also known to be thread-safe, then the define_external call to the wrapper around d01smc could reasonably get the 'THREAD_SAFE' option. Without that option to define_external (which is presently the case under `evalf/int` in M13 and M14), accessing the library containing the NAG function entry points is only allowed for one thread at a time. See ?define_external for a paragraph on this.
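For reference, the option in question sits right in the define_external call. All the names below are hypothetical; this only shows where 'THREAD_SAFE' would go:

> # Hypothetical wrapper and library names.
> extQuad := define_external('my_d01smc_wrapper',
>               'a'::float[8], 'b'::float[8],
>               'RETURN'::float[8],
>               'THREAD_SAFE',
>               'LIB'="libmywrapper.so"):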

But it's a tricky thing to automate correctly in general: the Maple integrand might be thread-unsafe. It's easier for everyone when the evalf/Int call is explicit in the user's code, but what about code that calls a Maple Library routine which in turn calls evalf/Int internally? In that case there is no convenient and automatic way to pass an evalf/Int option which correctly toggles the relevant define_external call. The user's code might be using Maple Threads, while the computation might still depend on recognizing that some integrand is actually thread-unsafe.

The d01smc routine may also have an extra argument (new, over d01amc) of type (struct) Nag_Comm, for communicating with the function. But exposing that via evalf/Int might not be worth much to most people.

What I'm trying to suggest is that, in the absence of the relevant define_external call having its 'THREAD_SAFE' option, the choice between calling out from Maple to d01amc versus d01smc may be of little or no consequence. And the question of when and how to allow that define_external option might be problematic.

I don't know of a fast, efficient, and robust piece of code to ascertain thread-safety of a Maple routine. What with lexical scoping and other things, there are many devious ways for a Maple procedure to write to a higher scope. A user-defined integrand in operator form might have 'option threadsafe' or a similar flag. But in expression form, an integrand could contain a quoted function call to some other thread-unsafe procedure. And so on...

So perhaps it would be better for evalf/Int only to allow the user to select explicitly the thread-safe NAG function, either by explicit name like _d01smc, _d01skc, etc or by a general option such as, say, 'allowthreads', or by passing the integrand as a proc with 'option threadsafe' (which is a made-up term).
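For comparison, forcing a particular NAG method by name is already possible today (unlike the 'allowthreads' option, which is made up), e.g.

> evalf(Int(exp(-x^2), x = 0..infinity, 'method' = _d01amc));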

acer

Hi Axel,

Yes, it looks as if `evalf/int/control` might use `evalf/int/CreateProc` or something to make a proc from an expression-form integrand.

But before that happens, `evalf/int/control` seems to poke quite expensively at an expression-form integrand when a NAG method is not specified. It may be looking for singularities. I can understand why it might do that: it wants to compute tough problems correctly by default. I wouldn't mind seeing another way to prevent that cost.
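One existing way to sidestep that cost is to force a NAG method up front, since a specified method appears to skip the symbolic probing. The integrand here is just an arbitrary stand-in:

> evalf(Int(exp(-x)*cos(10*x), x = 0..1, 'method' = _d01ajc));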

acer

Thanks, Alec. The minor mystery is now solved (see your Edit).

But it does puzzle me that evalf/Int can do so much more (symbolic?) work, digging away at investigating the integrand when it is in expression form. I suppose that passing the integrand as a procedure/operator merely tricks it into treating it as a black box that cannot be poked. It's a little worrisome that one has to use a cheap trick, or know which forced method is appropriate, to disable it. It might be nicer to have an optional parameter that controls it, e.g. discont=false (assuming that I'm right and that it is discontinuity checking that makes the difference) or whatever.
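To make the black-box trick concrete (the integrand is just a stand-in):

> f := x -> exp(-x^2)*ln(x + 2):
> evalf(Int(f(x), x = 0..100));   # expression form: gets probed symbolically
> evalf(Int(f, 0..100));          # operator form: treated as a black box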

acer

Are there any ideas for a more lightweight interface to mapleprimes which might be more convenient on a mobile device (e.g., a smartphone)?

I am thinking about the sorts of issues that one encounters when accessing a dense site from a mobile device. There are a few sites (google, yahoo, wikipedia, facebook) where a specialized entry-point can make a world of difference.

Accessing the present (v.1) mapleprimes isn't that great on a smartphone. I am wondering whether the imminent v.2 mapleprimes might eventually offer a better experience.

acer

Hi Alec,

On my 64bit Linux Maple 13.01, the operator form I showed and the forced _d01amc method took very similar times. Trying each many times, always in fresh sessions (plotting from 2..100, say), showed as much timing variation for either alone as between the two.

I have found that the disable-discont-probing-using-operator trick can avoid timing blowups for some finite ranges too. Using forced methods like _d01amc for semi-infinite ranges, or _d01akc for oscillating integrands, etc, can mean that one has to remember or look up their names and purposes. So I tend to advise trying the operator form first for this trick (if one believes that there is less risk to avoiding discont checks).
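For instance (stand-in integrands, just to show the forced-method style):

> evalf(Int(1/(1 + x^2), x = 0..infinity, 'method' = _d01amc));  # semi-infinite range
> evalf(Int(cos(100*x)*exp(-x), x = 0..10, 'method' = _d01akc)); # oscillatory integrand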

I saw plot output for only the range 100..110 using either method. But I think I might know what you mean. Using the Standard GUI, the very small value is not plotted if there is a much larger value shown. So, for example, plotting f (or PP) from 100 to 200 only shows the plotted red line up to about 118 or so. This seems to be a Standard plotting bug. Similarly if plotting from only 200 to 300, or from only 300 to 400. I wonder if it's a single-precision display issue, or something. It seems to have trouble with the plotted red line varying by more than a factor of about 10^8. The problem does not seem to occur in the commandline interface.

acer
