
Triangle Distribution Math

Rob Carnell

2022-12-12

Triangle Notation

Throughout, a is the minimum, b is the maximum, and c is the mode of the triangle distribution, with a \le c \le b.

Triangle PDF

f(x) = \left\{ \begin{array}{ll} \frac{h}{c-a}(x-a) & \mbox{if } a \le x \le c \\ \frac{h}{c-b}(x-b) & \mbox{if } c \lt x \le b \\ 0 & \mbox{otherwise} \end{array} \right. \ \ \ \ (1)

Integrating the pdf in equation (1) to solve for h

\int_{-\infty}^{\infty} f(x)\ dx = 1

\frac{h}{c-a}\int_a^c (x-a)\ dx + \frac{h}{c-b}\int_c^b (x-b)\ dx = \frac{h(b-a)}{2}

h = \frac{2}{b-a} \ \ \ \ (2)

Substituting back into equation (1),

f(x) = \left\{ \begin{array}{ll} \frac{2}{(b-a)(c-a)}(x-a) & \mbox{if } a \le x \le c \\ \frac{2}{(b-a)(c-b)}(x-b) & \mbox{if } c \lt x \le b \\ 0 & \mbox{otherwise} \end{array} \right. \ \ \ \ (3)
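
To make the result concrete, equation (3) can be checked numerically in base R; the sketch below uses arbitrary example parameters, and dtri is an illustrative helper name (the triangle package exposes this same density as dtriangle).

```r
# Equation (3) coded directly; 'dtri' is an illustrative name, and a = 1, b = 4,
# c = 2 below are arbitrary example parameters.
dtri <- function(x, a, b, c) {
  ifelse(x >= a & x <= c, 2 * (x - a) / ((b - a) * (c - a)),
  ifelse(x > c & x <= b,  2 * (x - b) / ((b - a) * (c - b)), 0))
}
integrate(function(x) dtri(x, 1, 4, 2), lower = 1, upper = 4)$value  # ~ 1, consistent with equation (2)
```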

Triangle Mean

Integrating equation (3) to find E(x),

E(X) = \int_{-\infty}^{\infty} x f(x)\ dx = \frac{h}{c-a}\int_a^c (x^2-ax)\ dx + \frac{h}{c-b}\int_c^b (x^2-bx)\ dx

= \frac{a+b+c}{3} \ \ \ \ (4)

Triangle Variance

V(X) = E(X^2) - (E(X))^2 = \int_{-\infty}^{\infty} x^2 f(x)\ dx - \left(\frac{a+b+c}{3}\right)^2

= \frac{h}{c-a}\int_a^c x^2(x-a)\ dx + \frac{h}{c-b}\int_c^b x^2(x-b)\ dx - \left(\frac{a+b+c}{3}\right)^2

= \frac{a^2+b^2+c^2-ab-ac-bc}{18}
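
These moment formulas can be spot-checked with numerical integration; the following base R sketch assumes the example parameters a = 1, b = 4, c = 2.

```r
# Spot-check E(X) = (a + b + c)/3 and the variance formula for a = 1, b = 4, c = 2.
a <- 1; b <- 4; c <- 2
f <- function(x) ifelse(x <= c, 2 * (x - a) / ((b - a) * (c - a)),
                                2 * (x - b) / ((b - a) * (c - b)))
m1 <- integrate(function(x) x   * f(x), a, b)$value  # ~ (a + b + c)/3 = 7/3
m2 <- integrate(function(x) x^2 * f(x), a, b)$value
m2 - m1^2  # ~ (a^2 + b^2 + c^2 - ab - ac - bc)/18 = 7/18
```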

Logarithmic Triangle Distribution

Define:

a_l = \log_\phi(a), \ \ b_l = \log_\phi(b), \ \ c_l = \log_\phi(c), \ \ h = \frac{2}{b_l-a_l}, \ \ \phi = \mbox{log base}

f(z) = \left\{ \begin{array}{ll} \frac{h}{c_l-a_l}(z-a_l) & \mbox{if } a_l \le z \le c_l \\ \frac{h}{c_l-b_l}(z-b_l) & \mbox{if } c_l \lt z \le b_l \\ 0 & \mbox{otherwise} \end{array} \right. \ \ \ \ (5)

However,

E(\phi^z) \ne \phi^{E(z)} \ \ \ \ (6)

Therefore, transforming variables:

Y = \phi^Z

Z = \log_\phi(Y)

w(y) = \log_\phi(y)

w'(y) = \frac{dz}{dy} = \frac{1}{y\log(\phi)}

g(y) = f(w(y))\ w'(y)

g(y) = \left\{ \begin{array}{ll} \frac{2}{(c_l-a_l)(b_l-a_l)\log(\phi)}\ \frac{\log_\phi(y)-a_l}{y} & \mbox{if } 0 \lt a \le y \le c \\ \frac{2}{(c_l-b_l)(b_l-a_l)\log(\phi)}\ \frac{\log_\phi(y)-b_l}{y} & \mbox{if } c \lt y \le b \\ 0 & \mbox{otherwise} \end{array} \right. \ \ \ \ (7)

Define:

\beta_1 = \frac{2}{(c_l-a_l)(b_l-a_l)}

\beta_2 = \frac{2}{(c_l-b_l)(b_l-a_l)}

Finding the CDF,

G(y) = \int_{-\infty}^{y} g(y)\ dy

\mbox{for } a \le y \le c, \ \ G(y) = \frac{\beta_1}{\log(\phi)}\int_a^y \frac{\log(y)}{y\log(\phi)} - \frac{a_l}{y}\ dy

= \beta_1\left[\frac{\log_\phi^2(y)}{2} - a_l\log_\phi(y) - \frac{a_l^2}{2} + a_l^2\right]

\mbox{for } c \lt y \le b, \ \ G(y) = G(c) + \frac{\beta_2}{\log(\phi)}\int_c^y \frac{\log(y)}{y\log(\phi)} - \frac{b_l}{y}\ dy

= G(c) + \beta_2\left[\frac{\log_\phi^2(y)}{2} - b_l\log_\phi(y) - \frac{c_l^2}{2} + b_lc_l\right]

Checking that the CDF is 1 at b,

G(b) = \frac{c_l^2 - 2a_lc_l + a_l^2}{(c_l-a_l)(b_l-a_l)} + \frac{-b_l^2 - c_l^2 + 2b_lc_l}{(c_l-b_l)(b_l-a_l)}

= \frac{c_l-a_l}{b_l-a_l} + \frac{-(c_l-b_l)}{b_l-a_l} = 1

Now calculating E(y),

E(y) = \int_{-\infty}^{\infty} y\ g(y)\ dy

= \frac{\beta_1}{\log(\phi)}\int_a^c \left[\frac{\log(y)}{\log(\phi)} - a_l\right] dy + \frac{\beta_2}{\log(\phi)}\int_c^b \left[\frac{\log(y)}{\log(\phi)} - b_l\right] dy

= \frac{c\beta_1}{\log^2(\phi)}\left[\log(c) - 1 - \log(a) + \frac{a}{c}\right] + \frac{c\beta_2}{\log^2(\phi)}\left[-\frac{b}{c} - \log(c) + 1 + \log(b)\right]
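
As a check on the log-triangle results, the density in equation (7) can be integrated numerically and compared against G(b) = 1 and the closed-form E(y); the values a = 1, b = 100, c = 10, and \phi = 10 below are assumed examples.

```r
# Check that g(y) integrates to 1 and that the closed-form E(y) matches numerical
# integration; a = 1, b = 100, c = 10, phi = 10 are example values only.
a <- 1; b <- 100; c <- 10; phi <- 10
al <- log(a, phi); bl <- log(b, phi); cl <- log(c, phi)
beta1 <- 2 / ((cl - al) * (bl - al)); beta2 <- 2 / ((cl - bl) * (bl - al))
g <- function(y) ifelse(y <= c, beta1 / log(phi) * (log(y, phi) - al) / y,
                                beta2 / log(phi) * (log(y, phi) - bl) / y)
integrate(g, a, b)$value                     # ~ 1, matching G(b) = 1
integrate(function(y) y * g(y), a, b)$value  # numerical E(y)
c * beta1 / log(phi)^2 * (log(c) - 1 - log(a) + a / c) +
  c * beta2 / log(phi)^2 * (-b / c - log(c) + 1 + log(b))  # closed-form E(y)
```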

Method of Moments

E(x) = \frac{a+b+c}{3}

c = 3E(x) - a - b = 3E(x) - \min(x) - \max(x)
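
In practice, the moments are replaced by their sample counterparts. A minimal sketch, assuming a and b are estimated by the sample extremes and using made-up example data:

```r
# Method-of-moments sketch: estimate a and b by the sample minimum and maximum,
# then back out c from the sample mean; the data are illustrative only.
x <- c(2.1, 2.7, 3.0, 3.4, 4.9)
a_hat <- min(x); b_hat <- max(x)
c_hat <- 3 * mean(x) - a_hat - b_hat
c(a_hat, b_hat, c_hat)
```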

Maximum Likelihood Estimation

The procedure for maximum likelihood estimation involves maximizing the likelihood with respect to c for a fixed a and b, followed by minimizing the negative log likelihood with respect to a and b for a fixed c.

Maximizing the Likelihood with respect to c

This discussion follows the results from Samuel Kotz and Johan René van Dorp, Beyond Beta.

For the purposes of this section, with a fixed a and b, the sample can easily be rescaled to a=0 and b=1. This section will proceed on [0,1] with the mode at 0 \le c \le 1.

w(x) = \left\{ \begin{array}{ll} \frac{2x}{c} & \mbox{if } 0 \le x \lt c \\ \frac{2(1-x)}{1-c} & \mbox{if } c \le x \leq 1 \\ 0 & \mbox{otherwise} \end{array} \right.

L(x|c) = \prod_{i=1}^{n} w(x_i|c)

Assume that the sample is ordered into order statistics X_{(1)} \lt \dots \lt X_{(n)}. Also, note that X_{(r)} \le c \lt X_{(r+1)}. In other words, the mode falls between the r^{th} and (r+1)^{th} order statistics.

L(x|c) = \prod_{i=1}^{r} \frac{2x_{(i)}}{c} \prod_{i=r+1}^{n} \frac{2(1-x_{(i)})}{1-c} = \frac{2^n \prod_{i=1}^{r} x_{(i)} \prod_{i=r+1}^{n} (1-x_{(i)})}{c^r(1-c)^{n-r}}

To maximize the likelihood, we can first maximize with respect to r and then locate c between the r^{th} and (r+1)^{th} order statistics. For notation purposes, also define X_{(0)} = 0 and X_{(n+1)} = 1.

\large \max_{0 \le c \le 1} L(x|c) = \max_{r \ \epsilon \ (0,\dots,n)} \ \ \max_{x_{(r)} \le c \le x_{(r+1)}} \ \ L(x|c)

Case 1. c is between the first and last order statistics, r \ \epsilon \ (1, \dots, n-1)

Noticing that, within each interval, maximizing the likelihood with respect to c is equivalent to minimizing the denominator:

\large \max L(x|c) = \max_{r \ \epsilon \ (1,\dots,n-1)} \ \ \min_{x_{(r)} \le c \le x_{(r+1)}} \ \ c^r(1-c)^{n-r}

Since c^r(1-c)^{n-r} is unimodal with respect to c, it is sufficient to test the end points of an interval to find the minimum on that interval.

\large = \max_{r \ \epsilon \ (1,\dots,n-1)} \ \ \min_{c \ \epsilon \ (x_{(r)},\ \ x_{(r+1)})} \ \ c^r(1-c)^{n-r}

Therefore, for this case, it is sufficient to test the likelihood using c at each of the sampled points and find the largest.

Side note on z=c^r(1-c)^{n-r} being unimodal

\frac{dz}{dc} = rc^{(r-1)}(1-c)^{n-r} + c^r(n-r)(1-c)^{n-r-1}(-1) = c^{(r-1)}(1-c)^{n-r-1}(r - cn)

\frac{dz}{dc} = 0 at c=0,\ 1,\ \frac{r}{n}. At 0 < c < \frac{r}{n}, \frac{dz}{dc} is positive, and at \frac{r}{n} < c < 1, \frac{dz}{dc} is negative. Therefore, z is unimodal on (0,1).

Case 2. c is between 0 and the first order statistic, r = 0

\large \max L(x|c) = \max_{0 \le c \le x_{(1)}} \prod_{i=1}^{n} \frac{1-x_{(i)}}{1-c} = \prod_{i=1}^{n} \frac{1-x_{(i)}}{1-x_{(1)}}

Choosing the largest endpoint of the interval creates the smallest denominator and therefore the largest likelihood.

Therefore, for this case, it is sufficient to test the likelihood using c at the first sampled point.

Case 3. c is between the last order statistic, r = n, and 1

\large \max L(x|c) = \max_{x_{(n)} \le c \le 1} \prod_{i=1}^{n} \frac{x_{(i)}}{c} = \prod_{i=1}^{n} \frac{x_{(i)}}{x_{(n)}}

Choosing the smallest option in the denominator creates the largest likelihood. Again, it is sufficient to test the likelihood using c at the largest sample point.

All Cases

For all cases, it is sufficient to compute the sample likelihood with c set equal to each of the sample points and to choose the largest of the n resulting likelihoods to find the corresponding c. This calculation is performed with a fixed a and b, so the test must be performed iteratively as a and b are separately optimized.
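
A minimal R sketch of this search on the rescaled [0, 1] sample follows; the function name loglik_c and the uniform test data are illustrative assumptions, not the package's fitting code.

```r
# Evaluate the log likelihood with c set to each sample point and keep the best,
# per the argument above.
loglik_c <- function(c, x) {
  lower <- x[x < c]; upper <- x[x >= c]
  sum(log(2 * lower / c)) + sum(log(2 * (1 - upper) / (1 - c)))
}
x <- runif(20)  # assumed sample, already rescaled to [0, 1]
c_hat <- x[which.max(sapply(x, function(ci) loglik_c(ci, x)))]
```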

Negative Log Likelihood

nLL = -\log(L) = -\log\left(\prod_i^n f(x_i)\right)

= - \sum_i^n \log\left(f(x_i)\right) = - \sum_{i: \ a \le x_i \lt c}^{n_1} \log\left(f(x_i)\right) - \sum_{i: \ c \le x_i \le b}^{n_2} \log\left(f(x_i)\right)

where n = n_1 + n_2

Case 1 a = c \lt b

nLL = - \sum_{i}^{n} \left[\log(2) + \log(b-x_i) - \log(b-a) - \log(b-c)\right]

= -n\log(2) + n\log(b-a) + n \log(b-c) - \sum_{i}^{n} \log(b-x_i)

Case 2 a \lt c = b

= - \sum_{i}^{n} \left[\log(2) + \log(x_i - a) - \log(b-a) - \log(c-a)\right]

= -n\log(2) + n\log(b-a) + n\log(c-a) - \sum_{i}^{n} \log(x_i - a)

Case 3 a \lt c \lt b

= - \sum_{i: \ a \lt x_i \lt c}^{n_1} \left[\log(2) + \log(x_i - a) - \log(b-a) - \log(c-a)\right] - \sum_{i: \ c \le x_i \lt b}^{n_2} \left[\log(2) + \log(b-x_i) - \log(b-a) - \log(b-c)\right]

= -n\log(2) + n\log(b-a) + n_1\log(c-a) + n_2 \log(b-c) - \sum_{i: \ a \lt x_i \lt c}^{n_1} \log(x_i - a) - \sum_{i: \ c \le x_i \lt b}^{n_2} \log(b-x_i)
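
Transcribed into code, the Case 3 expression is straightforward; this is a minimal sketch with an illustrative function name, not the package's internal objective function.

```r
# Case 3 (a < c < b) negative log likelihood, written from the expression above.
nll_triangle <- function(x, a, b, c) {
  x1 <- x[x < c]; x2 <- x[x >= c]   # the n1 and n2 subsets
  n <- length(x); n1 <- length(x1); n2 <- length(x2)
  -n * log(2) + n * log(b - a) + n1 * log(c - a) + n2 * log(b - c) -
    sum(log(x1 - a)) - sum(log(b - x2))
}
```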

Gradient of the Negative Log Likelihood Given c:

The negative log likelihood is not differentiable with respect to c because the limits of the sums (n_1 and n_2) are functions of c. Therefore, the gradient and Hessian are derived as if c were fixed.

Case 1 a = c \lt b

\frac{\partial nLL}{\partial a} = - \frac{n}{b-a}

\frac{\partial nLL}{\partial b} = \frac{n}{b-a} + \frac{n}{b-c} - \sum_i^{n} \frac{1}{b-x_i}

Case 2 a \lt c = b

\frac{\partial nLL}{\partial a} = - \frac{n}{b-a} - \frac{n}{c-a} + \sum_i^{n} \frac{1}{x_i - a}

\frac{\partial nLL}{\partial b} = \frac{n}{b-a}

Case 3 a \lt c \lt b

\frac{\partial nLL}{\partial a} = - \frac{n}{b-a} - \frac{n_1}{c-a} + \sum_i^{n_1} \frac{1}{x_i - a}

\frac{\partial nLL}{\partial b} = \frac{n}{b-a} + \frac{n_2}{b-c} - \sum_i^{n_2} \frac{1}{b-x_i}

Hessian of the Negative Log Likelihood Given c:

Case 1 a = c \lt b

\frac{\partial^2nLL}{\partial a^2} = - \frac{n}{(b-a)^2}

\frac{\partial^2 nLL}{\partial b^2} = -\frac{n}{(b-a)^2} - \frac{n}{(b-c)^2} + \sum_i^{n} \frac{1}{(b-x_i)^2}

\frac{\partial^2 nLL}{\partial a\partial b} = \frac{\partial^2 nLL}{\partial b\partial a} = \frac{n}{(b-a)^2}

Case 2 a \lt c = b

\frac{\partial^2 nLL}{\partial a^2} = - \frac{n}{(b-a)^2} - \frac{n}{(c-a)^2} + \sum_i^{n} \frac{1}{(x_i - a)^2}

\frac{\partial^2 nLL}{\partial b^2} = - \frac{n}{(b-a)^2}

\frac{\partial^2 nLL}{\partial a\partial b} = \frac{\partial^2 nLL}{\partial b\partial a} = \frac{n}{(b-a)^2}

Case 3 a \lt c \lt b

\frac{\partial^2 nLL}{\partial a^2} = - \frac{n}{(b-a)^2} - \frac{n_1}{(c-a)^2} + \sum_i^{n_1} \frac{1}{(x_i - a)^2}

\frac{\partial ^2 nLL}{\partial b^2} = -\frac{n}{(b-a)^2} - \frac{n_2}{(b-c)^2} + \sum_i^{n_2} \frac{1}{(b-x_i)^2}

\frac{\partial ^2 nLL}{\partial a\partial b} = \frac{\partial ^2 nLL}{\partial b\partial a} = \frac{n}{(b-a)^2}
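
For the (a, b) optimization with c held fixed, the Case 3 gradient and Hessian translate directly to R. The sketch below uses an illustrative function name and assumes a < c < b with all observations strictly inside (a, b).

```r
# Gradient and Hessian of the Case 3 negative log likelihood in (a, b) for fixed c.
grad_hess_ab <- function(x, a, b, c) {
  x1 <- x[x < c]; x2 <- x[x >= c]
  n <- length(x); n1 <- length(x1); n2 <- length(x2)
  grad <- c(-n / (b - a) - n1 / (c - a) + sum(1 / (x1 - a)),
             n / (b - a) + n2 / (b - c) - sum(1 / (b - x2)))
  hess <- matrix(c(-n / (b - a)^2 - n1 / (c - a)^2 + sum(1 / (x1 - a)^2),
                    n / (b - a)^2,
                    n / (b - a)^2,
                   -n / (b - a)^2 - n2 / (b - c)^2 + sum(1 / (b - x2)^2)),
                 nrow = 2, byrow = TRUE)
  list(gradient = grad, hessian = hess)
}
```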

MLE Variance-Covariance

For the optimization of (a,b) given c, we can use the inverse of the Hessian of the negative log likelihood as an estimate of the covariance matrix of \hat{a} and \hat{b}. For the variance of \hat{c}, we use the variance of the r^{th} order statistic which corresponds to c. The covariance of (a,b) and c is not computed because the negative log likelihood is not differentiable with respect to c.

Let H denote the Hessian matrix, and let H^{-1}[1,1] be V(\hat{a}), H^{-1}[2,2] be V(\hat{b}), and H^{-1}[1,2] = H^{-1}[2,1] be Cov(\hat{a}, \hat{b}). Then,

V([\hat{a}, \hat{b}, \hat{c}]) = \begin{bmatrix} H^{-1}[1,1] & H^{-1}[1,2] & 0 \\ H^{-1}[2,1] & H^{-1}[2,2] & 0 \\ 0 & 0 & V(\hat{c}) \\ \end{bmatrix}

r^{th} order statistic

f(x_{(r)}) = \frac{n!}{(r-1)!(n-r)!} f(x) [F(x)]^{(r-1)}[1-F(x)]^{(n-r)}

A closed form solution for V(X_{(r)}) is not easily obtainable for the triangle distribution, so numerical integration is used, with f(x) given by dtriangle and F(x) by ptriangle.

V\left(X_{(r)}\right) = \int_a^b x^2 f(x_{(r)}) dx - \left[\int_a^b x f(x_{(r)}) dx \right]^2
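
A minimal sketch of that numerical integration follows, using dtriangle and ptriangle from the triangle package as described above; the parameters, sample size n, and rank r are arbitrary example values, and the package is assumed to be installed.

```r
# Numerical V(X_(r)) via integrate(), with f(x) from dtriangle() and F(x) from
# ptriangle(); a, b, c, n, and r are example values only.
library(triangle)
a <- 0; b <- 1; c <- 0.5; n <- 10; r <- 5
f_r <- function(x) {
  factorial(n) / (factorial(r - 1) * factorial(n - r)) *
    dtriangle(x, a, b, c) * ptriangle(x, a, b, c)^(r - 1) *
    (1 - ptriangle(x, a, b, c))^(n - r)
}
m1 <- integrate(function(x) x   * f_r(x), a, b)$value
m2 <- integrate(function(x) x^2 * f_r(x), a, b)$value
m2 - m1^2  # V(X_(r))
```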