## 1429 Reputation

14 years, 117 days
Maplesoft

## Social Networks and Content at Maplesoft.com

I am the manager of the Mathematical Software Group, working mostly on the Maple library. I have been working at Maplesoft since 2007, mostly on the Statistics and Units packages and on contract work with industry users. My background is in abstract algebra, in which I completed a PhD at Eindhoven University of Technology. During my studies I was always searching for interesting questions at the crossroads of math and computer science. When I came to Canada in 2007, I had nothing but a work permit, some contacts at Maplesoft (whom I had met at a computer algebra conference a year earlier), and a plan to travel around beautiful Canada for a few months. Since Maplesoft is a company that solves more interesting math and computer science related questions than just about anywhere else, I was quite eager to join them, and after about three months, I could start.

## Final note...

As a final note, as I was investigating this I figured that there was something fishy going on with either the documentation of the GaussMarkovProcess, or the implementation of it. Looking at the literature about Ornstein-Uhlenbeck processes, I think I understand it now. It looks like the documentation is incorrect about the SDE used for GaussMarkovProcess instances: it should be dX = (g + h*X)dt + sigma*dW, rather than dX = h*(g-X)*dt + sigma*dW. This, at least, is a change I can easily find time for!

The Euler scheme expression above should therefore be

g(t)*dt + h(t)*X*dt + sigma(t)*sqrt(dt)*W

rather than what it says above. The "unbiased scheme" expression is consistent with what's in the code.

Erik.

## Hmmm...

@itsme Hmmm... it looks like I'll need to eat my words in this case!

I was trying to add this info to our help pages, but then found there is a way you can control the integration scheme that's used, in an indirect way. Many of the process creation procedures have a 'scheme' option, which has two values: Euler and unbiased. The default value is Euler, and in that case, integration proceeds as I described above. However, the unbiased case is more complicated.

I looked at the GaussMarkovProcess case to find out what exactly is going on. I haven't completely figured out why it works yet, but here is the short version. If you select 'scheme = unbiased', then Maple uses its own integration methods (as implemented in evalf/Int)  to do as much of the integration as possible behind the scenes. Concretely, here's what happens:

The SDE for this class of process is dX = h * (g - X) * dt + sigma * dW. Define the following auxiliary functions:

lambda(t, t + dt) = exp(int(h(z), z = t .. t+dt))

mu(t, t+dt) = int(exp(int(h(u, u = z .. t + dt)) * g(z), z = t .. t + dt)

s(t, t+dt) = sqrt(int(exp(2*int(h(u), u=z..t+dt))*sigma(z)^2, z=t..t+dt))

Integration proceeds *technically* via the Euler-Maruyama scheme in all cases, with a callback from QuantLib. QuantLib asks the process what the change in X should be as time progresses from t to t+dt; this delta is added to the current value of X. If the Euler scheme is selected, then (unsurprisingly) Maple returns

h(t) * g(t) * dt - h(t) * X * dt + sigma(t) * sqrt(dt) * W,

where W is a standard normally distributed random variable. On the other hand, if the unbiased scheme is selected, Maple returns:

(lambda(t, t+dt) - 1) * X + mu(t, t+dt) + s(t, t+dt)*W.

This means that for every time step, Maple needs to compute a number of (deterministic) numerical integrals (through evalf/Int). (Note that this happens only once for each new process-time grid combination; if you create many sample paths, the integrals for each time point are computed only once.) The time steps and integration scheme used for these integrations cannot be set manually.

I haven't figured out yet why, exactly, the second expression is a valid thing to return, but (at the cost of being many times slower) it appears to give excellent results.

As for documenting this, it's now become a much larger task. I'll see if I can schedule this in somewhere...

Best

Erik.

## I'll add it...

@itsme You're right - I'll try to add it for a future version of Maple.

Erik.

## @Mac Dude Yes, in order for the code to ...

@Mac Dude

Yes, in order for the code to work correctly you have to provide all inflection points. The code should not care about additional points (actually, any additional points, whether or not the 2nd derivative is 0).

I think you're correct that the code is less efficient than a simpler solution if the PDF changes for every invocation. If efficiency is very important, it might be possible to use some of the ideas there to get a tight fitting envelope in your situation, though I think it depends on being able to describe the inflection points relatively easily.

Recalling writing those posts, I think I just didn't consider non-normalized PDFs. The examples I had in mind were all easy to normalize so it just didn't come up. (In the context of the Statistics package, you would definitely need to normalize the PDF - there, everything relies on it being normalized.)

With regards to your PS: I don't know of an easy way to find such posts. I expect most of them would be "featured posts" when they are written, but there's no easy way to find them after the fact. Another favourite of mine is Joe Riel's post about Cartesian products of lists, at http://www.mapleprimes.com/posts/41838-Cartesian-Products-Of-Lists. I reread it once every year or so :)

Erik Postma
Maplesoft.

## Response from Erik...

Hi Mac Dude,

First off, I'm glad to hear you enjoyed my post. Here are a couple of notes in reply to your ideas and suggestions:

1. The process you arrived at looks sound, pretty efficient, and correct to me. That's what matters most!
2. Most of the time spent in generating the 600 points you show, is undoubtedly spent in finding the inflection points of the PDF. This is a step that's needed to make the particular implementation of Maple's envelope generation work. This implementation is explained in a series of four mapleprimes posts I posted four years ago. Finding the inflection points is done in pure Maple code, and it can be slow for complicated PDFs; once that's done, the actual generation of points is done in external C code, it requires very few PDF evaluations, and it's quite fast.
3. The technique from the end of the Sample help page where one leaves a parameter unassigned when calling Sample, and then sets it before calling the resulting procedure, is not efficient for use with the envelope rejection generator: the envelope needs to be recomputed for each invocation anyway, so there's no noticeable benefit to having the sampling procedure pre-computed. (For builtin distributions, it is efficient, because those have pre-written C code for generation of random numbers without substantial set up, and it's just a matter of sending the correct value in.)
4. If you know something about the conditional PDF for y given the value of x, that tells you where one can find its inflection points (as a function of x), then you can adapt the code given in part four of the series linked above to possibly speed the random generation up a bit more. As always, the question is, how much do you need the extra speed vs. how much time would it cost to write this adapted version.

Let me know if you have more questions or remarks.

Erik Postma
Maplesoft.

## Thanks for your question, @Majmaj. ...

Thanks for your question, @Majmaj.

The reason that Maple uses the formula that it uses, is the same reason that it divides by N-1, rather than by N, to obtain the variance: this is the formula that gives you an unbiased estimate of the distribution variance -- at least, if the distribution that your values come from is normal. This is really the only formula that makes sense for the general case: where the weights can be any positive numbers. (Consider, for example, if your weights are 1/10, 1/9, 1/8; then using sum(w[i])-1 does not make sense.)

You could argue, as @CarlLove does below, that Maple should detect the case where all weights are integers (or the weights sum to a number greater than 1, or some other condition) and in that case use a different formula - the formula that assumes that weights are just repetitions. I don't agree with this proposal - I think using formulas with different outcomes, in a manner that is controlled by the values of the arguments, and not even in a continuous manner, is just bad practice; it would make the behaviour of Maple much less predictable than it is currently.

If you want to view weights as repetitions in the Statistics package, then it's best to do the expansion yourself: if your data set and the numbers of repetitions are both given as lists, you can do this nicely as follows:

`L := [3,2,2];W := [7,5,8];LW := zip(`\$`, L, W);Statistics:-StandardDeviation(LW); # returns 0.489360...`

Hope this helps,

Erik Postma
Maplesoft.

## copy and LinearAlgebra:-Copy...

Note that both ?copy and ?LinearAlgebra:-Copy will work. They do almost exactly the same thing if their argument is a Matrix or Vector.

Erik Postma
Maplesoft.

## copy and LinearAlgebra:-Copy...

Note that both ?copy and ?LinearAlgebra:-Copy will work. They do almost exactly the same thing if their argument is a Matrix or Vector.

Erik Postma
Maplesoft.

## You set the random seed to a fixed argum...

Hi herclau,

If you call ?randomize with a fixed argument, you always set the random generator to the same state, so you will always get the same matrices. That's its purpose.

If you want to get different results, it's probably best to just not restart in between subsequent runs. Alternatively, you can omit the argument in the call to randomize(); that will set the random seed to something based on the current internal clock time of your computer (though I believe it only uses a granularity of seconds, so don't do it in, say, a loop - in which case you're doing something wrong anyway, because for obvious reasons you can't restart from within a loop.)

HTH,

Erik Postma
Maplesoft.

## You set the random seed to a fixed argum...

Hi herclau,

If you call ?randomize with a fixed argument, you always set the random generator to the same state, so you will always get the same matrices. That's its purpose.

If you want to get different results, it's probably best to just not restart in between subsequent runs. Alternatively, you can omit the argument in the call to randomize(); that will set the random seed to something based on the current internal clock time of your computer (though I believe it only uses a granularity of seconds, so don't do it in, say, a loop - in which case you're doing something wrong anyway, because for obvious reasons you can't restart from within a loop.)

HTH,

Erik Postma
Maplesoft.

## More details?...

Hi Daniel / SamuelTuvare,

Could you show us the details of what you have done so far?

Erik Postma
Maplesoft.

## The system cannot deal with length-0 tim...

Taking the time interval between t=0 and t=0 and dividing it into 100 time steps, you get steps of length 0. The system is not set up for this. Similarly, you also can't ask for the value at negative times.

Hope this helps,

Erik Postma
Maplesoft.

## The system cannot deal with length-0 tim...

Taking the time interval between t=0 and t=0 and dividing it into 100 time steps, you get steps of length 0. The system is not set up for this. Similarly, you also can't ask for the value at negative times.

Hope this helps,

Erik Postma
Maplesoft.

## Faster, faster!...

@Markiyan Hirnyk : Another option that is equally fast (measured to within measurement accuracy on an i7 running linux-64) is:

`nops(select(`<`, thelist, 0));`

The big advantage of this over the version with c -> c < 0 is that it only uses a kernel-internal procedure in the inner loop.

Met vriendelijke groet,

Erik Postma
Maplesoft.

## Faster, faster!...

@Markiyan Hirnyk : Another option that is equally fast (measured to within measurement accuracy on an i7 running linux-64) is:

`nops(select(`<`, thelist, 0));`

The big advantage of this over the version with c -> c < 0 is that it only uses a kernel-internal procedure in the inner loop.

Met vriendelijke groet,

Erik Postma
Maplesoft.

﻿