mmcdara

7891 Reputation

22 Badges

9 years, 54 days

MaplePrimes Activity


These are replies submitted by mmcdara

@Joe Riel 

With all due respect, are you sure this is the case for every operating system?

(If I'm not mistaken, I faced problems with Maple 2015 and 2016 when using ExcelTools with cmaple ... but I could be wrong.)

@vv 

SolveTools:-SemiAlgebraic({x^2 + y^2 + a*x*y<z},[x,y], 'parameters'=[a,z])

is not sufficient: the correct command is

SolveTools:-SemiAlgebraic({x^2 + y^2 + a*x*y<z, x > 0, x < 1, y > 0, y < 1},[x,y], 'parameters'=[a,z])  


Now try to run this command with no further information, that is, for any real value of a ...
The memory used to compute the solution (CodeTools:-Usage) exceeds 15 GiB, which gives a small idea of how complex the solution is.
 

Let's try a simple thing with a=4/3, as set in my previous file

SolveTools:-SemiAlgebraic({x^2 + y^2 + 4/3*x*y<z, x > 0, x < 1, y > 0, y < 1}, [x,y], 'parameters'=[z]);

and please give me the formal expression of "the (double) integral (not necessarily a closed form, of course, even if such a closed form exists)" given the output of SemiAlgebraic.
If you can and Maple can't, we can say Maple has some weaknesses. But if you can't, how could Maple?

I think you deeply underestimate the difficulty of the problem


SemiAlgebraic.mw


BTW: I probably made a mistake in my previous post. I think I confused z and sqrt(z) in the reduced form of the ellipses. I can correct the file ToyProblem.mw if needed.

@vv 

Not really a severe limitation of PDF.
Otherwise you would have to consider that "int has severe limitations" too ...

The problem here reduces to the integration of some function z=f(x,y) over a domain D(x,y | z), where "|" stands for "conditional on the value z of Z".
This is generally difficult, even before considering the fact that the CDF of a sum of independent random variables already involves a convolution product.


A few words about your toy-case to explain all of this.

Let Z = X^2 + Y^2 + a*X*Y.

First point: it is easy to show that the level curves Z = constant are ellipses as long as |a| < 2 (hyperbolas for |a| > 2, and degenerate conics for |a| = 2).

Second point: there are a few obvious cases that make the computation of Prob(Z < z) straightforward:

  • a=2 (Z = (X+Y)^2): the level curves of Z are parallel lines of slope -1
  • a=-2 (Z = (X-Y)^2): the level curves of Z are parallel lines of slope +1
  • a=0 (Z = X^2 + Y^2): the level curves of Z are concentric circles centered at (0,0)
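As a quick sanity check of the a=0 case, one can estimate Prob(Z < z) by sampling (a hedged sketch: the Uniform(0, 1) variables and the threshold z = 1/2 are illustrative choices, for which the exact value is the quarter-disk area Pi/8):

```maple
with(Statistics):
X := RandomVariable(Uniform(0, 1)):
Y := RandomVariable(Uniform(0, 1)):
N := 10^5:
S := Sample(X^2 + Y^2, N):
# empirical estimate of Prob(Z < 1/2); should be close to evalf(Pi/8) = 0.3927
evalf(add(`if`(S[i] < 1/2, 1, 0), i = 1 .. N)/N);
```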


Apart from these 3 elementary situations (which Maple 2015 doesn't even solve without a little bit of help), let us consider a simpler problem than your original one in order to introduce a geometrical approach.
Let X and Y be two Uniform random variables over [-H, H], where H is some strictly positive number.
Then, provided z is "sufficiently small" with respect to H, Prob(Z < z) is just the area of some ellipse (up to the normalization by the square's area (2*H)^2).
Under the change of variables {X=U*cos(Pi/4)+V*sin(Pi/4), Y=-U*sin(Pi/4)+V*cos(Pi/4)}, Z = X^2 + Y^2 + a*X*Y
becomes Z = (1-(1/2)*a)*U^2+(1+(1/2)*a)*V^2.
With A = 1/sqrt(1-(1/2)*a) and B = 1/sqrt(1+(1/2)*a) this relation becomes Z = U^2/A^2+V^2/B^2.
Then Prob(Z < z) = Pi*(A*z)*(B*z).
This result holds only if the ellipse U^2/A^2+V^2/B^2 = z^2 is entirely contained in the square [-H, H]^2.
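The rotation step can be checked symbolically (a minimal sketch):

```maple
Z := X^2 + Y^2 + a*X*Y:
tr := {X = U*cos(Pi/4) + V*sin(Pi/4), Y = -U*sin(Pi/4) + V*cos(Pi/4)}:
# the cross term U*V cancels, leaving (1 - a/2)*U^2 + (1 + a/2)*V^2
collect(expand(eval(Z, tr)), [U, V]);
```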

Now a more complex case.
First observe that U^2/A^2+V^2/B^2 = z^2 describes ellipses centered at the point (0,0): hence there exists no complete ellipse within the square (X,Y) = [0, H]^2, and Prob(Z < z) is the measure of some angular sector of the ellipse, defined by angles alpha and beta.
A general formula for this measure, with a proper definition of alpha and beta, is
A*B/2*(arctan(A/B*tan(alpha)) - arctan(A/B*tan(beta)))   (... = Prob(Z < z))
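This sector formula can be checked against a direct polar integration (a hedged sketch; the values of A, B and the angles are arbitrary illustrative choices):

```maple
A := 2: B := 1:
sector := (u, v) -> A*B/2*(arctan(A/B*tan(u)) - arctan(A/B*tan(v))):
# polar equation of x^2/A^2 + y^2/B^2 = 1:
#   r(theta)^2 = A^2*B^2/(B^2*cos(theta)^2 + A^2*sin(theta)^2)
# so the sector area between angles v and u is (1/2)*int(r(theta)^2, theta = v .. u)
direct := (u, v) -> int(A^2*B^2/2/(B^2*cos(theta)^2 + A^2*sin(theta)^2),
                        theta = v .. u):
evalf([sector(Pi/6, 0), direct(Pi/6, 0)]);  # the two values should agree
```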
Even if it is not very difficult to compute "by hand", you can already see that some auxiliary difficulties occur:

  • express Z = X^2 + Y^2 + a*X*Y in the more suitable form Z = (1-(1/2)*a)*U^2+(1+(1/2)*a)*V^2
  • identify the ellipse U^2/A^2+V^2/B^2 = z^2
  • compute alpha and beta

It is not very surprising that PDF (or int) is not capable of doing the job without any human help.


To give you a better idea of the details of the computations, you can look at the attached file.
It doesn't contain a complete solution for every value of a in [-2, 2], but just an example for the case a=4/3 that PDF doesn't handle (generalising the given example is straightforward).


ToyProblem.mw

 

If you want a more general solution for any value of "a", you have to analyse the different situations where the iso-probability curves are hyperbolas.
In the end, a complete solution of your toy problem becomes a very complex issue.




 

A 3x3x3 structured set of numbers is not a matrix from a mathematical point of view (just try to build such a matrix in Maple to see what happens).
So trying to compute the determinant of such an object (an Array in Maple) makes no sense.
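A quick illustration of the point (a hedged sketch; the initializer is arbitrary):

```maple
# a 3x3x3 Array is perfectly legal ...
A := Array(1 .. 3, 1 .. 3, 1 .. 3, (i, j, k) -> i + j + k):
# ... but it is not a Matrix, so this raises an error:
# Determinant expects a square Matrix, not a 3-dimensional Array
LinearAlgebra:-Determinant(A);
```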

@Preben Alsholm 

Crystal clear, 

Thank you Preben

@Preben Alsholm 

Hi, 

I allow myself to answer you from home.
You write "Incidentally, didn't you mean reinitialize instead of initialize?" ... it's likely, yes (I have no way to verify this right now).
I will try your last suggestion as soon as I get to work tomorrow.

Thanks for the help; I will keep you informed of the results

@maple2015 

I have not read your last post yet, I will do that tomorrow.
For the moment I have a suggestion that dramatically improves the quality of your model: fit log(Z), not Z.

You will find some preliminary results in the attached file:

  • Use of Z
    • The estimation of the representation error for all the models from [1, x] to [1, x, y, x^2, y^2, x*y]
    • The estimation of their generalization error GE with the Leave-One-Out method
  • Use of log(Z)
    • fit of the model [1, x, y, x^2, y^2, x*y]
    • plot of the exponential of this fitted model and comparison with the original data


LinearFit2.mw

Carl Love 
suggests looking at the p-values of the various terms of the model: this is obviously the first thing to do.
These p-values tell you which coefficients of the model are not significantly different from 0 ... but handle this information very carefully: some people use p-values to drop "non-significant" terms from a "big" model, thus obtaining a more parsimonious model. The problem is that this concise "small" model may be worse than some other model in between the original "big" one and the "small" one.




BTW: Student:-Statistics:-ChiSquareGoodnessOfFitTest is of no help in your case.
Goodness-of-fit tests (GoF tests) are used to compare empirical statistical distributions, for instance the empirical distributions of two samples possibly drawn (null hypothesis) from the same population.
You are not in this situation.
If you want to assess the quality of your model, the best thing to do is to analyse the residuals (output=residuals in LinearFit): plot these values against the values of Z (or X or Y) to detect some lack of representation in the model. If the cloud of points shows no particular structure, your model may be reasonably correct. If the cloud contains some pattern, then the model is probably biased or too poor to represent the data correctly.
Here is a simple reference showing a few classical patterns: https://stattrek.com/regression/residual-analysis.aspx
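In Maple this check might look as follows (a hedged sketch: the basis and the names XY, an N x 2 Matrix of regressors, and Z, the response Vector, are illustrative assumptions about how your data are stored):

```maple
with(Statistics):
res := LinearFit([1, x, y, x^2, y^2, x*y], XY, Z, [x, y], output = residuals):
# residuals vs observed values: look for the absence of any structure
ScatterPlot(Z, res, labels = ["Z", "residual"]);
```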

Another simple trick is to inspect the Cook's distances or the leverages (also returned by LinearFit):
https://en.wikipedia.org/wiki/Cook%27s_distance



One last thing to keep in mind: goodness-of-fit tests (Chi-square, for example) require some conditions to give reliable answers.
In particular, their answer is very sensitive to the sample size and, personally, I would never use a Chi-square GoF test the way you did.



 

@maple2015 

 

BTW: I had a look at your data (a few plots of Z vs X and Z vs Y) and it doesn't seem that the Z values are affected by some measurement error.
I don't know where your data come from, but if Z is, for instance, the result of some unknown function F of X and Y
(this is a classical situation which appears when you want to build a simplified model of a more complex and time-consuming one),
then a fruitful approach is to use a kriging model.
This feature is now offered by Maple 2018. Unfortunately I'm now at home with my personal Maple 2015 license and I can't tell you much more. If your problem is amenable to kriging modeling and if you have Maple 2018 at your disposal, it could be interesting for you to test it.
If you can't but are nevertheless interested in the kriging approach, please let me know and I will be able to do some work this Monday once at my office.

 

@maple2015 

One more time: what do you mean by "the best statistical model"?

Mac Dude gave you some hints here but, in essence, he didn't say anything else ...

Let's keep things simple: if you have N observations (17 in your case), then there exists an infinity of models with 17 parameters that fit the data perfectly. In some sense they are all "best statistical models".
One says that the "representation error" of all these models is 0.

Suppose you take one of them, M : (X,Y) --> Z.
For all n in {1..N} you have M(X[n], Y[n]) = Z[n] ... but what is the prediction error of M at some other point (X', Y') not in
{(X[n], Y[n]), n=1..N}?
This last error is also termed the "generalization error" and describes the capability of the model M to predict correctly outside of the sampling points (at least in some neighbourhood of their convex hull).
The most common practice is to accept as "a best model" a model M that realizes some balance between the representation error (RE) and the generalization error (GE). Models with a high number of parameters generally give small values of RE and high values of GE: they are called over-fitted (or over-learned) models and are systematically discarded.

One generally prefers good prediction capabilities (low GE) over a good representation of the data (which means nothing when the data are uncertain, for instance measurement values).

The simplest way (I do not say it is the best) to assess GE is to use the Leave-One-Out strategy.

  • Let S be your original sample and S(-n) this sample where the n-th observation (X[n], Y[n]) has been dropped.
  • Build the model M(-n) over S(-n) (just use Statistics:-Fit as you did).
  • Compute M(-n)(X[n], Y[n]): a quick estimation of GE is GE(-n) = M(-n)(X[n], Y[n]) - Z[n].

Repeating this sequence N times will give the point-wise estimators GE(-1), ..., GE(-N).
The variance of these N estimators is an estimator of the generalization (prediction) error of the model M fitted over the whole sample S.
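The Leave-One-Out loop above might be sketched in Maple like this (hedged: the basis and the names XY, an N x 2 Matrix of regressors, and Z, the response Vector, are illustrative assumptions):

```maple
with(Statistics):
LOO_GE := proc(XY::Matrix, Z::Vector)
  local N, n, i, idx, Mn, e;
  N := numelems(Z);
  e := Vector(N);
  for n to N do
    idx := [seq(`if`(i = n, NULL, i), i = 1 .. N)];  # drop the n-th observation
    Mn := LinearFit([1, x, y, x^2, y^2, x*y], XY[idx, ..], Z[idx], [x, y]);
    # point-wise estimator GE(-n): prediction error at the left-out point
    e[n] := eval(Mn, [x = XY[n, 1], y = XY[n, 2]]) - Z[n];
  end do;
  Variance(e);   # estimator of the generalization error
end proc:
```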

More sophisticated approaches are Leave-K-Out, Bootstrap (look at the Statistics package for more details; let me know if you are interested in bootstrapping a statistical model), Cross-Validation ... (the meaning of all these strategies can easily be found on Wikipedia).

 

Last but not least ...

Let's take a simpler model with only one regressor X and a single dependent variable Z.
When the experimental observation of Z is subject to some measurement/observation error U, the statistical approach begins by writing some a priori model Z = M[P](X) + U, where P is the set of parameters of M

(it's a little bit more subtle than that, because you should really write that the conditional expectation of Z given X = x is equal to M[P](x), and the conditional variance of Z given X = x is equal to the variance of U at X = x).


Fitting the parameters P by means of the least squares method, for instance, means that you are looking for some specific value p* of P that has a particular property (here, minimizing the sum of the squared residuals).
Then M[p*] is the model that minimizes the representation error RE over the class M[P] of models.

But this doesn't make M[p*] the best model in many other senses.
Consider this example:

  • The sample S is made of 2 points (X[1], Z[1]) and (X[2], Z[2])
  • M[P] : X --> a*X+b
  • Obviously the best "least squares" model satisfies a*=(Z[2]-Z[1])/(X[2]-X[1]) and b*=(X[1]*Z[2]-X[2]*Z[1])/(X[2]-X[1])


For many people this model is the best, for this couple (a*, b*) maximizes some likelihood function over S.
But this is true only when the likelihood function is a function of (M[a, b](X[1])-Z[1])^2+(M[a, b](X[2])-Z[2])^2, that is, when U has a Gaussian stationary distribution.

Consider the counter-example where U is Uniform on the interval [-h, +h].

Then all the couples (a, b) such that the graph (a straight line) of M[a,b](x) goes through the two "boxes"
(x=X[n], z in [Z[n]-h, Z[n]+h]) have exactly the same value of the likelihood function.
For instance, from the "maximization of the likelihood" point of view, the models
a=((Z[2]+h)-(Z[1]-h))/(X[2]-X[1]) and b=(X[1]*(Z[2]+h)-X[2]*(Z[1]-h))/(X[2]-X[1])
a=((Z[2]-h)-(Z[1]-h))/(X[2]-X[1]) and b=(X[1]*(Z[2]-h)-X[2]*(Z[1]-h))/(X[2]-X[1])
a=((Z[2]+h)-(Z[1]+h))/(X[2]-X[1]) and b=(X[1]*(Z[2]+h)-X[2]*(Z[1]+h))/(X[2]-X[1])
....

are as good as the model (a*, b*) is.

 

What do you mean by saying "the best statistical model" ?

 

@stefanv 

Wrong.
If you work in worksheet mode, not in document mode, the writeto function from Maple 2015 & 2016 redirects only the output, not the input.
And this even without invoking interface(echo=0).

With Maple 2018 (run on a Windows 7 PC ... I'll give you tomorrow the exact version of Maple 2018 I use at the office, for I'm now at home and I'm going to sleep), even if I put interface(echo=0) in a separate block just after the restart, I recover the input in the file.
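For reference, the pattern I am describing (a minimal sketch; the file name is just an example):

```maple
restart:
interface(echo = 0):             # in its own execution group, just after restart
writeto("session_output.txt"):   # from here on, output is redirected to the file
int(x^2, x);                     # with Maple 2018 the input line itself also ends
                                 # up in the file, which is the behaviour at issue
writeto(terminal):               # restore normal output
```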

 

@acer 

 

I forgot to say that I'm working with the preferences described in the attached file.

Capture_d’écran_2018-06-27_à_22.16.27.pdf

(Sorry for giving you this environment in French, but I'm at home and French is the default language for my personal license.)
Furthermore, I always work in worksheet mode, never in document mode.

 

@Preben Alsholm 

Thanks for all this advice.
It seems very interesting on first reading, but I need to test it myself to fully absorb the material.

You said "I succeeded in some attempts (after restarting) and other times it took too long for my patience" ... please don't worry about that, you have already done more than I ever expected!

Looking forward to reading you soon on other topics.
Have a nice weekend

@Preben Alsholm 

Before sending my answer to mehdibaghaee I had tried to obtain an explicit solution of this ODE system:
     dsolve(...);
     allvalues(%);


The result allvalues returned was extremely lengthy, but a quick look revealed some facts:

  1. From the result of dsolve(...) we can observe that the term RootOf(_Z^4+9*_Z^3+4*_Z^2-19*_Z-5, index = SomeInteger) is recurrent.
    Unfortunately allvalues(RootOf(_Z^4+9*_Z^3+4*_Z^2-19*_Z-5)) is dramatically intractable because, as a rule, Maple always represents (except in obvious cases) the formal solutions of equations of the 3rd or 4th degree in a complex form, even if the roots are real.
    Is there some "already implemented feature" to force Maple to return roots in a real form when they are indeed real?
    One might think of some analysis of the discriminant.
    Or, for 3rd degree equations, one might use the "trigonometric resolution", which gives expressions in terms of hyperbolic trig functions when only one root is real, or in terms of cosines when all 3 roots are real.
     
  2. The second question has probably already been asked here.
    Because the explicit solution for, let's say, phi__1(t) contains the same term many times, for instance
    (4*(3268+(3*I)*sqrt(10275765))^(2/3)+211*(3268+(3*I)*sqrt(10275765))^(1/3)+1876)/(3268+(3*I)*sqrt(10275765))^(1/3)
    or 
    (3268+(3*I)*sqrt(10275765))
    I had the idea, in order to get a more readable solution, of substituting some repetitive patterns with, let's say, a few auxiliary variables U, V, W, ...
    What I tried was
           dsolve(...):
           sol := allvalues(%):
           subs(SomePattern=U, rhs(sol[1]))

    Unfortunately this substitution did not always apply, leaving unchanged some terms which nevertheless contained the pattern "SomePattern".
    For instance, if SomePattern := (3268+(3*I)*sqrt(10275765))^(1/3), changes occur in expressions invoking SomePattern but not in 1/SomePattern (which comes from the fact that 1/SomePattern is in fact (3268+(3*I)*sqrt(10275765))^(-1/3)), or even in exp(SomePattern) ...

    Do you think that functions such as select/remove, patmatch ... could help to perform the substitution everywhere SomePattern appears?
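To make the question concrete, here is a hedged sketch of one direction I have in mind, using evalindets on every power of the base (the auxiliary name U and the mapping P^e --> U^(3*e), i.e. U = P^(1/3), are my own illustrative choices):

```maple
P := 3268 + 3*I*sqrt(10275765):
expr := P^(1/3) + 1/P^(1/3):   # subs(P^(1/3) = U, expr) misses the second term
# replace every power of P, whatever its exponent: P^e --> U^(3*e) with U = P^(1/3)
evalindets(expr, `^`, e -> `if`(op(1, e) = P, U^(3*op(2, e)), e));
# should return U + 1/U
```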


I would be pleased to have your opinion about these two "by-problems".
Thanks in advance


 

@tomleslie 

Sh... It's not the first time I've been caught out this way!
Sorry for asking this question, for I knew perfectly well that the restart has to be put in a separate block.

Thanks for reminding me how volatile my memory is :-)
