epostma

1429 Reputation

18 Badges

14 years, 177 days
Maplesoft

Social Networks and Content at Maplesoft.com

Maple Application Center
I am the manager of the Mathematical Software Group, working mostly on the Maple library. I have been working at Maplesoft since 2007, mostly on the Statistics and Units packages and on contract work with industry users. My background is in abstract algebra, in which I completed a PhD at Eindhoven University of Technology. During my studies I was always searching for interesting questions at the crossroads of math and computer science. When I came to Canada in 2007, I had nothing but a work permit, some contacts at Maplesoft (whom I had met at a computer algebra conference a year earlier), and a plan to travel around beautiful Canada for a few months. Since Maplesoft is a company that solves more interesting math and computer science related questions than just about anywhere else, I was quite eager to join them, and after about three months, I could start.

MaplePrimes Activity


These are replies submitted by epostma

Agreed, the best way is to make sure that X, Y, Z do not occur in the matrix when you do the substitution.

Of course we don't know how the original poster created the expression in the first place; but if possible, it might be best not to assign a value to R in the first place. You could just use

eval(intermediateResult, R = sqrt(X^2 + Y^2 + Z^2));

whenever you need to display an intermediate result or something like that, but not make the assignment - so that such problems do not crop up in the first place.

Hope this helps,

Erik Postma
Maplesoft.

Hi pagan,

That's a great answer, but I have two remarks that might improve it a little further. The first is a little convenience: you don't need to supply the ?rtable_dims  if you want ?rtable_scanblock  to go over the entire rtable - you can just call

rtable_scanblock(A, [], 'Average');

and similar for M etcetera.

The other is an optimization. For very large rtables, ?Statistics/Mean is faster even than rtable_scanblock (effectively it uses the BLAS linked to maple), but for smaller inputs it has more overhead just because the command does multiple very different things.

This graph shows the time per rtable entry in seconds that Statistics[Mean] and rtable_scanblock take on my developer machine in the current development version of Maple (the platform is 64-bit Linux). As you see, the crossover point is somewhere around 20000 entries. These are rtables created with the datatype=float[8] option (by calling ?Statistics/Sample , which does that automatically).

Hope this helps,

Erik Postma
Maplesoft.

Hi pagan,

That's a great answer, but I have two remarks that might improve it a little further. The first is a little convenience: you don't need to supply the ?rtable_dims  if you want ?rtable_scanblock  to go over the entire rtable - you can just call

rtable_scanblock(A, [], 'Average');

and similar for M etcetera.

The other is an optimization. For very large rtables, ?Statistics/Mean is faster even than rtable_scanblock (effectively it uses the BLAS linked to maple), but for smaller inputs it has more overhead just because the command does multiple very different things.

This graph shows the time per rtable entry in seconds that Statistics[Mean] and rtable_scanblock take on my developer machine in the current development version of Maple (the platform is 64-bit Linux). As you see, the crossover point is somewhere around 20000 entries. These are rtables created with the datatype=float[8] option (by calling ?Statistics/Sample , which does that automatically).

Hope this helps,

Erik Postma
Maplesoft.

@acer : Good catch. I should learn to read the documentation before posting... I presume that the unreliability is the reason we haven't documented it, and that this also means there are no promises of forward compatibility.

Erik.

@acer : Good catch. I should learn to read the documentation before posting... I presume that the unreliability is the reason we haven't documented it, and that this also means there are no promises of forward compatibility.

Erik.

You could consider doing this by calling sort['inplace'] on your whole data sample, then just using two pointers at the "current" positive and negative elements, which start at the end and beginning of the section of 0's and move outward from there. Not sure if this helps - just an idea, and maybe it's obvious.

Erik.

You could consider doing this by calling sort['inplace'] on your whole data sample, then just using two pointers at the "current" positive and negative elements, which start at the end and beginning of the section of 0's and move outward from there. Not sure if this helps - just an idea, and maybe it's obvious.

Erik.

@herclau : Values for options that are a specific name, such as U, S, and Vt in the SingularValues example, or any of the true/false options you see in many commands, almost always require the global name. That is, they require that your argument is identical to one of the things declared as legal values for the option, and an (unassigned) local name is not equal to the (unassigned) global name.

Typically, when you write such a procedure call in your program, you would use the scope resolution operator :- to indicate that you mean the global name, and quotes to prevent Maple from evaluating the global name if the user of your code would happen to have assigned something to that name. This is what you see in acer's suggestion to use ':-U' etc.

Hope this helps,

Erik Postma
Maplesoft.

@herclau : Values for options that are a specific name, such as U, S, and Vt in the SingularValues example, or any of the true/false options you see in many commands, almost always require the global name. That is, they require that your argument is identical to one of the things declared as legal values for the option, and an (unassigned) local name is not equal to the (unassigned) global name.

Typically, when you write such a procedure call in your program, you would use the scope resolution operator :- to indicate that you mean the global name, and quotes to prevent Maple from evaluating the global name if the user of your code would happen to have assigned something to that name. This is what you see in acer's suggestion to use ':-U' etc.

Hope this helps,

Erik Postma
Maplesoft.

Hi Axel,

We're definitely planning to make the paper available. Have patience! :)

Erik.

@John May : interestingly, the highlights seem to be cut off much more easily with this version than with the SVD version of the code. I'm trying to figure out if that makes sense.

@acer : Indeed, all of what I discussed is specific to user-defined distributions, but your question applies to both the predefined and the custom distributions, in principle: for both, there are situations where there is data we want to keep between invocations of the C level code. However, we currently have facilities for doing so only for the predefined distributions. Fixing this has been on my wish list for a while, but it hasn't bubbled to the top quite yet.

So in the mean time, for custom distributions, it is still best to, as you say, create a sample that is as big as your memory can bear and then slice and dice that to obtain your subsamples. For predefined distributions, it should be almost as efficient if you use this underdocumented calling sequence:

X := Statistics[RandomVariable](SomePredefinedDistribution(alpha, beta));
p := Statistics[Sample](X); # This does most of the library-level precomputation
for i to 10^5 do
  sample := p(10^4):
  process(sample):
end do:

This took 65s for SomePredefinedDistribution(alpha, beta) = Normal(0, 1) on one machine where I tested it (with the process statement omitted), versus 63s for 100 samples of size 10^7. We will try to document this calling sequence (where Sample returns a procedure)  better in a future version of Maple.

Hope this helps,

Erik Postma
Maplesoft. 

Note that you "lose" some properties relative to the 2nd order tensor case, where the SVD performs this task. For example, with SVD, the left singular vectors are orthogonal to each other, as are the right singular vectors, and thus you need at most min(n, m) of them to fully reconstruct your n by m matrix, and indeed the rank of such a matrix can be at most min(n, m). This does not hold for higher order tensors, at least with the definition of rank(v) as the minimal number of pure tensors needed to reconstruct v. For instance, not every colour image is a linear combination of at most three pure tensors, whereas it is a 3 by n by m tensor. This means that the putative left "singular vectors" of a full decomposition can never be orthogonal.

Erik.

I tried some heavier-tailed distributions (the Cauchy distribution and the cube of the normal distribution); these do not seem to lead to a convex polygon at all with the algorithm above! You quickly get a near square, with four points roughly on the axes. These four points do get further and further from the origin as you add more and more points, but new points land inside the convex hull with very high probability. Worse, the most likely spot to land outside of it is near the axes but further out than the current furthest point P in that direction, in such a manner that the new convex hull will contain P in its interior. I let these distributions try a couple tens of thousands of points a few times, but no dice.

For Student's t-distribution it seems to depend on the parameter: the higher, the more likely you are to get a convex polygon; for StudentT(2), the probability seems to be around a half, from ten tries.

A better approach for these distributions might be to reject all points outside a given circle with fixed radius.

Erik.

I tried some heavier-tailed distributions (the Cauchy distribution and the cube of the normal distribution); these do not seem to lead to a convex polygon at all with the algorithm above! You quickly get a near square, with four points roughly on the axes. These four points do get further and further from the origin as you add more and more points, but new points land inside the convex hull with very high probability. Worse, the most likely spot to land outside of it is near the axes but further out than the current furthest point P in that direction, in such a manner that the new convex hull will contain P in its interior. I let these distributions try a couple tens of thousands of points a few times, but no dice.

For Student's t-distribution it seems to depend on the parameter: the higher, the more likely you are to get a convex polygon; for StudentT(2), the probability seems to be around a half, from ten tries.

A better approach for these distributions might be to reject all points outside a given circle with fixed radius.

Erik.

First 7 8 9 10 11 12 13 Last Page 9 of 21