
Prosiding Seminar Nasional Metode Kuantitatif 2017 ISBN No. 978-602-98559-3-7

Parameter Estimation of Bernoulli Distribution using Maximum Likelihood and Bayesian Methods

Nurmaita Hamsyiah1), Khoirin Nisa1), & Warsono1)
1) Department of Mathematics, Faculty of Mathematics and Science, University of Lampung
Jl. Prof. Dr. Sumantri Brojonegoro No. 1 Bandar Lampung
Phone: +62 721 701609, Fax: +62 721 702767
E-mail: [email protected]

ABSTRACT
The term parameter estimation refers to the process of using sample data to estimate the parameters of a selected distribution. There are several methods that can be used to estimate distribution parameters. In this paper, the maximum likelihood and Bayesian methods are used for estimating the parameter of the Bernoulli distribution, i.e. θ, which is defined as the probability of success for two possible outcomes. The maximum likelihood and Bayesian estimators of the Bernoulli parameter are derived; for the Bayesian estimator the beta prior is used. The analytical calculation shows that the maximum likelihood estimator is unbiased while the Bayesian estimator is asymptotically unbiased. However, empirical analysis by Monte Carlo simulation shows that the mean square errors (MSE) of the Bayesian estimator are smaller than those of the maximum likelihood estimator for large sample sizes.
Keywords: Bernoulli distribution, beta distribution, conjugate prior, parameter estimation.

1. INTRODUCTION
Parameter estimation is a way to infer the characteristics of a population based on a sample taken from it. In general, parameter estimation is classified into two types, namely point estimation and interval estimation. A point estimate of a parameter is a value obtained from the sample that is used as an estimator of the parameter whose value is unknown.

Several point estimation methods can be used to calculate the estimator, such as the moment method, the maximum likelihood method, and the Bayesian method. The moment method estimates the parameters by equating the sample moments to the population moments and solving the resulting system of equations [1]. The maximum likelihood (ML) method uses differential calculus to determine the maximum of the likelihood function and thereby obtain the parameter estimates. The Bayesian method differs from the traditional methods by introducing a frequency function for the parameter being estimated, namely the prior distribution. The Bayesian method combines the prior distribution and the sample distribution. The prior distribution is the initial distribution that provides information about the parameters. The sample distribution combined with the prior distribution yields a new distribution, the posterior distribution, which expresses a degree of confidence regarding the location of the parameters after the sample is observed [2].

Research on parameter estimation using various methods for various distributions has been done, for example: Bayesian estimation of the exponential distribution [3], [4], ML and Bayesian estimation of the Poisson distribution [5], Bayesian estimation of the Poisson-Exponential distribution [6], and Bayesian estimation of the Rayleigh distribution [7].

The difference between the ML and the Bayesian methods is that the ML method considers the parameter to be an unknown quantity of fixed value, with inference based only on the information in the sample; the Bayesian method considers the parameter as a variable describing the initial knowledge about the parameter before the observation is performed, expressed in a distribution called the prior distribution. After the observation is performed, the information in the prior distribution is combined with the sample data information through Bayes' theorem, and the result is expressed in a distribution called the posterior distribution, which further becomes the basis for inference in the Bayesian method [8].

The Bayesian method has advantages over other methods; one of them is that it can be used for drawing conclusions in complicated or extreme cases that cannot be handled by other methods, such as complex hierarchical models. In addition, if the prior information does not provide complete and clear information about the prior distribution, appropriate assumptions may be made about its distributional characteristics. Thus, if the prior distribution can be determined, a posterior distribution can be obtained, which may require mathematical computation [8].

This paper examines the parameter estimation of the Bernoulli distribution using the ML and Bayesian methods. A review of the Bernoulli and beta distributions is presented in Section 2. The research methodology is described in Section 3. Section 4 provides the results and discussion. Finally, the conclusion is given in Section 5.

2. THEORETICAL FRAMEWORK

2.1 Bernoulli Distribution
The Bernoulli distribution was introduced by the Swiss mathematician Jacob Bernoulli (1654-1705). It is the probability distribution of an experiment with exactly two possible outcomes: success, with probability θ, and failure, with probability 1 − θ, where 0 ≤ θ ≤ 1.

Definition. A random variable X is called a Bernoulli random variable (or X is Bernoulli distributed) if and only if its probability distribution is given by

  f(x; θ) = θ^x (1 − θ)^(1−x), for x = 0, 1.

Proposition 1. The Bernoulli(θ) distribution has mean and variance as follows:

  E(X) = θ and Var(X) = θ(1 − θ).

Proof: The mean of the Bernoulli random variable X is

  E(X) = Σ_x x f(x; θ) = 0 · (1 − θ) + 1 · θ = θ.

The variance, i.e. Var(X) = E(X²) − [E(X)]², of the Bernoulli distribution is obtained as follows:

  E(X²) = Σ_x x² f(x; θ) = 0² · (1 − θ) + 1² · θ = θ.

Then,

  Var(X) = E(X²) − [E(X)]² = θ − θ² = θ(1 − θ). ∎
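Proposition 1 can be checked numerically by summing over the two-point support. The following sketch (with an arbitrary illustrative value θ = 0.3, not taken from the paper) does exactly that:

```python
# Numerical check of Proposition 1: for X ~ Bernoulli(theta),
# E(X) = theta and Var(X) = theta * (1 - theta).
theta = 0.3                      # illustrative value, not from the paper
pmf = {0: 1 - theta, 1: theta}   # f(x; theta) = theta^x (1 - theta)^(1 - x)

mean = sum(x * p for x, p in pmf.items())              # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
variance = second_moment - mean**2                     # Var(X)

assert abs(mean - theta) < 1e-12
assert abs(variance - theta * (1 - theta)) < 1e-12
```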

2.2. Beta Distribution
Definition. A random variable X is called a beta random variable with parameters a and b if the density function of X is given by

  f(x; a, b) = x^(a−1) (1 − x)^(b−1) / B(a, b) for 0 < x < 1, and f(x; a, b) = 0 otherwise,   (1)

where B(a, b) is the beta function defined as

  B(a, b) = ∫₀¹ x^(a−1) (1 − x)^(b−1) dx; a > 0, b > 0.

Proposition 2. The beta function and the gamma function are connected by

  B(a, b) = Γ(a) Γ(b) / Γ(a + b).   (2)

Proof:

  Γ(a) Γ(b) = ∫₀^∞ u^(a−1) e^(−u) du ∫₀^∞ v^(b−1) e^(−v) dv = ∫₀^∞ ∫₀^∞ u^(a−1) v^(b−1) e^(−(u+v)) du dv.

Let u = zt and v = z(1 − t), so that z = u + v ∈ (0, ∞), t = u/(u + v) ∈ (0, 1), and the Jacobian of the transformation is z. Then

  Γ(a) Γ(b) = ∫₀¹ ∫₀^∞ (zt)^(a−1) (z(1 − t))^(b−1) e^(−z) z dz dt
            = ∫₀¹ t^(a−1) (1 − t)^(b−1) dt ∫₀^∞ z^(a+b−1) e^(−z) dz
            = B(a, b) Γ(a + b).

Then,

  B(a, b) = Γ(a) Γ(b) / Γ(a + b). ∎

Proposition 3. The mean and variance of the beta distribution with parameters a and b are

  E(X) = a / (a + b) and Var(X) = ab / [(a + b)² (a + b + 1)].

Proof: The proposition can be proved by using the moments of the beta distribution as follows:

  E(X^k) = ∫₀¹ x^k x^(a−1) (1 − x)^(b−1) / B(a, b) dx = B(a + k, b) / B(a, b).

From equations (1) and (2) we obtain

  E(X^k) = [Γ(a + k) Γ(b) / Γ(a + b + k)] · [Γ(a + b) / (Γ(a) Γ(b))] = Γ(a + k) Γ(a + b) / [Γ(a) Γ(a + b + k)].   (3)

Thus the mean and variance of the beta distribution are obtained by substituting k = 1 and k = 2 into equation (3), then

  E(X) = Γ(a + 1) Γ(a + b) / [Γ(a) Γ(a + b + 1)] = a Γ(a) Γ(a + b) / [Γ(a) (a + b) Γ(a + b)] = a / (a + b),

and

  E(X²) = Γ(a + 2) Γ(a + b) / [Γ(a) Γ(a + b + 2)] = a(a + 1) / [(a + b)(a + b + 1)].

Since Var(X) = E(X²) − [E(X)]², then

  Var(X) = a(a + 1) / [(a + b)(a + b + 1)] − a² / (a + b)²
         = [a(a + 1)(a + b) − a²(a + b + 1)] / [(a + b)² (a + b + 1)]
         = ab / [(a + b)² (a + b + 1)]. ∎
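Propositions 2 and 3 can be sanity-checked numerically. The sketch below, with illustrative parameters a = 2, b = 5 (an assumption, not values from the paper), approximates the beta integral by a simple midpoint rule and compares it with the gamma-function form:

```python
import math

a, b = 2.0, 5.0          # illustrative parameters, not from the paper
n = 100_000              # midpoint-rule resolution
h = 1.0 / n

# B(a, b) = integral over (0, 1) of x^(a-1) (1 - x)^(b-1) dx (midpoint rule)
beta_integral = h * sum(
    ((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
    for i in range(n)
)

# Proposition 2: B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)
beta_gamma = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
assert abs(beta_integral - beta_gamma) < 1e-6

# Proposition 3: mean and variance of Beta(a, b)
mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))
```

For a = 2, b = 5 this gives B(2, 5) = 1/30, E(X) = 2/7, and Var(X) = 10/392, consistent with equations (2) and (3).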

3. RESEARCH METHOD
The method for estimating the parameter of the Bernoulli distribution in this paper can be described as follows. For ML estimation, the parameter estimate is obtained by partially differentiating the log of the likelihood function and equating the derivative to zero,

  ∂ ln L(θ) / ∂θ = 0,

to obtain the ML estimator θ̂_ML. The second derivative is then assessed to show that the resulting θ̂ truly maximizes the likelihood function.

For the Bayesian method, the parameter estimation is done through the following steps:
1. Form the likelihood function of the Bernoulli distribution as follows:
  f(x | θ) = ∏ᵢ₌₁ⁿ f(xᵢ | θ).
2. Calculate the joint probability distribution, obtained by multiplying the likelihood function and the prior distribution:
  f(x, θ) = f(x | θ) π(θ).
3. Calculate the marginal probability distribution function:
  f(x) = ∫ f(x, θ) dθ.
4. Calculate the posterior distribution by dividing the joint probability distribution function by the marginal function:
  π(θ | x) = f(x, θ) / f(x).

The Bayesian parameter estimate of θ is then produced as the mean of the posterior distribution.

After the parameter estimates of θ are obtained by the ML and Bayesian methods, the estimators are evaluated by assessing their bias, variance, and mean square error.

4. RESULT AND DISCUSSION
4.1. The ML Estimator of the Bernoulli Distribution Parameter (θ)
Let X₁, X₂, …, Xₙ be a Bernoulli(θ) distributed random sample, where the probability function of Xᵢ is

  f(xᵢ; θ) = θ^(xᵢ) (1 − θ)^(1−xᵢ), with xᵢ ∈ {0, 1}.

The likelihood function of the Bernoulli distribution is given by

  L(θ) = ∏ᵢ₌₁ⁿ f(xᵢ; θ) = ∏ᵢ₌₁ⁿ θ^(xᵢ) (1 − θ)^(1−xᵢ) = θ^(Σxᵢ) (1 − θ)^(n − Σxᵢ).   (4)

The natural logarithm of the likelihood function is then

  ln L(θ) = (Σᵢ₌₁ⁿ xᵢ) ln θ + (n − Σᵢ₌₁ⁿ xᵢ) ln(1 − θ).   (5)

The ML estimate of θ is obtained by differentiating equation (5) with respect to θ and equating the differential result to zero, i.e.

  ∂ ln L(θ) / ∂θ = (Σxᵢ)/θ − (n − Σxᵢ)/(1 − θ) = 0,

which gives (1 − θ) Σxᵢ = θ (n − Σxᵢ), i.e. Σxᵢ = nθ; then we obtain

  θ̂ = (1/n) Σᵢ₌₁ⁿ xᵢ = x̄.

To show that θ̂ is the value that maximizes the likelihood function L(θ), it must be confirmed that the second derivative of the log-likelihood function at θ = θ̂ is negative:

  ∂² ln L(θ) / ∂θ² = −(Σxᵢ)/θ² − (n − Σxᵢ)/(1 − θ)² < 0 for 0 < θ < 1.

Since θ̂ maximizes the likelihood function, we conclude that the ML estimator of θ is given by θ̂_ML = x̄.
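A minimal sketch of the estimator derived above (the sample data here are hypothetical, made up for illustration):

```python
import math

def bernoulli_mle(xs):
    """ML estimator of theta: the sample mean (proportion of successes)."""
    return sum(xs) / len(xs)

sample = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]   # hypothetical observations
theta_hat = bernoulli_mle(sample)          # 4 successes out of 10 -> 0.4

# Cross-check against a grid search of the log-likelihood, equation (5):
def loglik(t):
    return sum(x * math.log(t) + (1 - x) * math.log(1 - t) for x in sample)

grid = [i / 1000 for i in range(1, 1000)]  # interior points of (0, 1)
best = max(grid, key=loglik)               # grid maximizer of ln L(theta)
assert abs(best - theta_hat) < 1e-9        # the sample mean maximizes ln L
```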

4.2. The Bayesian Estimator of the Bernoulli Distribution Parameter (θ)
To estimate θ using the Bayesian method, it is necessary to choose the initial information about the parameter, called the prior distribution and denoted by π(θ), to be applied through the basis of the method, namely conditional probability. In this paper, the prior selection for the Bernoulli distribution refers to the form of its likelihood function. From equation (4) we have

  L(θ) = θ^(Σxᵢ) (1 − θ)^(n − Σxᵢ).

A distribution having a probability function of the same form as the above expression is the beta distribution, with density function

  π(θ) = θ^(a−1) (1 − θ)^(b−1) / B(a, b), 0 < θ < 1,

where a > 0, b > 0, and 1/B(a, b) is the factor required for the density function to integrate to one.

The prior distribution is combined with the sample distribution to produce a new distribution called the posterior distribution, denoted by π(θ | x). The posterior distribution is obtained by dividing the joint density distribution by the marginal distribution. The joint probability density function of (x, θ) is given by:

  f(x, θ) = f(x | θ) π(θ) = θ^(Σxᵢ + a − 1) (1 − θ)^(n − Σxᵢ + b − 1) / B(a, b),   (6)

and the marginal function of x is obtained as follows:

  f(x) = ∫₀¹ f(x, θ) dθ.

Using equation (6) we have

  f(x) = (1 / B(a, b)) ∫₀¹ θ^(Σxᵢ + a − 1) (1 − θ)^(n − Σxᵢ + b − 1) dθ = B(Σxᵢ + a, n − Σxᵢ + b) / B(a, b).   (7)

Then from equations (6) and (7) the posterior distribution can be written as follows:

  π(θ | x) = f(x, θ) / f(x) = θ^(Σxᵢ + a − 1) (1 − θ)^(n − Σxᵢ + b − 1) / B(Σxᵢ + a, n − Σxᵢ + b).   (8)

The posterior distribution expressed in equation (8) obviously follows a beta distribution as well, with parameters (Σxᵢ + a) and (n − Σxᵢ + b), i.e. θ | x ~ Beta(Σxᵢ + a, n − Σxᵢ + b).

Since the prior and posterior distributions of the Bernoulli parameter belong to the same family, namely the beta distribution, the beta distribution is called the conjugate prior of the Bernoulli distribution. The posterior mean is used as the estimate of θ in the Bayesian method. Using Proposition 3, the Bayesian estimator of the parameter θ is obtained as follows:

  θ̂_B = (Σᵢ₌₁ⁿ xᵢ + a) / (n + a + b).
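The conjugate update above reduces to simple arithmetic. The following sketch uses a Beta(1, 1) (uniform) prior and a hypothetical sample; the paper leaves a and b general, so these values are illustrative assumptions:

```python
def bayes_estimate(xs, a=1.0, b=1.0):
    """Posterior mean of theta under a Beta(a, b) prior, per equation (8)."""
    n, s = len(xs), sum(xs)
    post_a, post_b = s + a, n - s + b     # posterior is Beta(post_a, post_b)
    return (s + a) / (n + a + b), (post_a, post_b)

sample = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]   # hypothetical observations
theta_b, posterior = bayes_estimate(sample)
# 4 successes in 10 trials with a = b = 1: posterior is Beta(5, 7),
# posterior mean (4 + 1) / (10 + 1 + 1) = 5/12
```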

4.3. Evaluation of the Estimators' Properties
The ML and Bayesian methods yield different estimates of the Bernoulli distribution parameter. A good estimator has to meet the following properties:

1. Unbiasedness
An estimator is called unbiased if its expected value is equal to the estimated parameter, i.e. θ̂ is an unbiased estimator of θ if E(θ̂) = θ. The bias of an estimator is then given by:

  Bias(θ̂) = E(θ̂) − θ.   (9)

Let X₁, X₂, …, Xₙ be Bernoulli(θ) random sample observations. Since θ̂_ML = x̄ is the ML estimator of θ, its expected value is as follows:

  E(θ̂_ML) = E(x̄) = E((1/n) Σᵢ₌₁ⁿ Xᵢ) = (1/n) Σᵢ₌₁ⁿ E(Xᵢ) = (1/n)(nθ) = θ.   (10)

Since E(θ̂_ML) = θ, θ̂_ML is an unbiased estimator of θ.

Now consider the Bayesian estimator of θ, i.e. θ̂_B = (ΣXᵢ + a)/(n + a + b). The expected value of the Bayesian estimator is given by

  E(θ̂_B) = E((ΣXᵢ + a)/(n + a + b)) = (1/(n + a + b)) [E(ΣXᵢ) + a] = (nθ + a)/(n + a + b).   (11)

Since E(θ̂_B) ≠ θ, θ̂_B is a biased estimator of θ. The bias of θ̂_B is:

  Bias(θ̂_B) = E(θ̂_B) − θ = (nθ + a)/(n + a + b) − θ = (a − (a + b)θ)/(n + a + b).   (12)

Although θ̂_B is a biased estimator of θ, it can be shown that θ̂_B is asymptotically unbiased. The proof is given as follows:

  lim_{n→∞} E(θ̂_B) = lim_{n→∞} (nθ + a)/(n + a + b) = θ.   (13)

Since lim_{n→∞} E(θ̂_B) = θ, θ̂_B is an asymptotically unbiased estimator of θ.

2. Efficiency
The efficiency of an estimator is observed from its variance. Among unbiased estimators, the best one is the one with the smallest variance, because the variance of an estimator measures the spread of the estimator around its mean.

The variance of the ML estimator θ̂_ML = x̄ is:

  Var(θ̂_ML) = Var((1/n) ΣXᵢ) = (1/n²) Σᵢ₌₁ⁿ Var(Xᵢ) = (1/n²) n θ(1 − θ) = θ(1 − θ)/n.   (14)

While the variance of the Bayesian estimator θ̂_B is given by:

  Var(θ̂_B) = Var((ΣXᵢ + a)/(n + a + b)) = (1/(n + a + b)²) Var(ΣXᵢ) = (1/(n + a + b)²) Σᵢ₌₁ⁿ Var(Xᵢ).

Since Var(Xᵢ) = θ(1 − θ), we obtain

  Var(θ̂_B) = nθ(1 − θ)/(n + a + b)².   (15)

From equation (10), the ML estimator is unbiased, whereas from equations (11) and (12) the Bayesian estimator is biased. As a result, the efficiency of the two methods cannot be compared, because efficiency comparisons apply to unbiased estimators.

3. Consistency
The consistency of the estimators is evaluated from their mean square error (MSE). The MSE can be expressed as

  MSE(θ̂) = E(θ̂ − θ)² = Var(θ̂) + [Bias(θ̂)]².   (16)

If the sample size grows infinitely, a consistent estimator gives a perfect point estimate of θ. Mathematically, θ̂ is a consistent estimator if and only if

  MSE(θ̂) → 0 when n → ∞,

which means that both the bias and the variance approach 0 as n → ∞.

Substituting equations (10) and (14) into equation (16), the MSE of the ML estimator θ̂_ML is then

  MSE(θ̂_ML) = θ(1 − θ)/n + 0² = θ(1 − θ)/n.   (17)

For n → ∞, we have MSE(θ̂_ML) → 0.

In the same manner, by substituting equations (12) and (15), the MSE of the Bayesian estimator θ̂_B is:

  MSE(θ̂_B) = nθ(1 − θ)/(n + a + b)² + [(a − (a + b)θ)/(n + a + b)]².   (18)

For n → ∞, we have MSE(θ̂_B) → 0.

From equations (17) and (18), we can conclude that the ML and Bayesian estimators are consistent estimators of θ.

4.4. Empirical Comparison of the Properties of ML and Bayesian Estimators
To compare the ML and Bayesian estimators of θ, a Monte Carlo simulation using the R program was conducted. The simulation was performed by generating Bernoulli distributed data with θ = 0.1, 0.3, and 0.5 and eight different sample sizes, i.e. n = 20, 50, 100, 300, 500, 1000, 5000, and 10000. The simulation was repeated 1000 times for each combination of θ and n. The generated data were used to estimate the parameter θ using the two methods. Furthermore, the bias and MSE of both estimators were calculated using the formulas in equations (9) and (16), and the results are presented in Table 1.
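The paper ran its simulation in R; the sketch below reproduces the same procedure in Python under an assumed Beta(1, 1) prior (the paper does not report the a, b values it used), with fewer (θ, n) combinations for brevity:

```python
import random

def simulate(theta, n, reps=1000, a=1.0, b=1.0, seed=42):
    """Monte Carlo bias and MSE of the ML and Bayesian estimators of theta."""
    random.seed(seed)
    bias_ml = bias_bayes = mse_ml = mse_bayes = 0.0
    for _ in range(reps):
        s = sum(random.random() < theta for _ in range(n))  # number of successes
        ml = s / n                      # ML estimate: sample mean
        bayes = (s + a) / (n + a + b)   # Bayesian estimate: posterior mean
        bias_ml += ml - theta
        bias_bayes += bayes - theta
        mse_ml += (ml - theta) ** 2
        mse_bayes += (bayes - theta) ** 2
    return (bias_ml / reps, bias_bayes / reps, mse_ml / reps, mse_bayes / reps)

# (bias_ML, bias_Bayes, MSE_ML, MSE_Bayes) for theta = 0.3 at several n
results = {n: simulate(0.3, n) for n in (20, 100, 1000)}
```

As in Table 1, the MSE of both estimators shrinks toward 0 as the sample size grows.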

Table 1. The bias and MSE of ML and Bayesian estimators of θ

  θ      n       Bias (ML)   Bias (Bayesian)   MSE (ML)    MSE (Bayesian)
  0.1    20      0.001200    0.223084          0.031478    0.558149
  0.1    50      0.002180    0.036602          0.015264    0.041740
  0.1    100     0.000270    0.009328          0.007058    0.008843
  0.1    300     0.000413    0.001075          0.001901    0.002906
  0.1    500     0.000210    0.000364          0.001183    0.000551
  0.1    1000    0.000195    0.000091          0.000503    0.000128
  0.1    5000    0.000003    0.000003          0.000143    0.000004
  0.1    10000   0.000114    0.000001          0.000184    0.000002
  0.3    20      0.003100    0.536609          0.001652    0.493287
  0.3    50      0.001300    0.090000          0.003113    0.142692
  0.3    100     0.000630    0.021341          0.001583    0.067615
  0.3    300     0.000003    0.002205          0.000327    0.002307
  0.3    500     0.000312    0.000845          0.000111    0.000899
  0.3    1000    0.000545    0.000207          0.000444    0.000217
  0.3    5000    0.000384    0.000008          0.000403    0.000018
  0.3    10000   0.000023    0.000002          0.000013    0.000004
  0.5    20      0.001450    0.714234          0.023000    1.453419
  0.5    50      0.002260    0.097036          0.011566    0.147045
  0.5    100     0.003860    0.024341          0.008602    0.024833
  0.5    300     0.000143    0.002856          0.001792    0.003374
  0.5    500     0.000146    0.001004          0.001139    0.003040
  0.5    1000    0.000432    0.000252          0.000929    0.000253
  0.5    5000    0.000132    0.000009          0.000232    0.000056
  0.5    10000   0.000066    0.000002          0.000016    0.000003

Table 1 shows the bias and MSE values of the ML and Bayesian estimates for success probabilities θ = 0.1, 0.3, and 0.5. From the table it can be seen that the ML estimator produces smaller biases than the Bayesian estimator for small and moderate samples (i.e. n < 1000). However, when the sample size is equal to or larger than 1000, the biases of the Bayesian estimator are smaller than those of the ML estimator. Even though the bias values of the ML estimates change inconsistently across the sample sizes, analytically it has been proved that the ML estimator is unbiased. This differs from the bias values of the Bayesian estimator: for all the considered success probabilities, its bias values become smaller as the sample size increases, even though analytically the Bayesian estimator is a biased estimator. As a result, the efficiency of the two estimators cannot be compared. Therefore, to determine the better estimator we use the MSE of both estimators, because the MSE accounts for both the bias and the variance.

The MSE values of the ML and Bayesian estimators shown in Table 1 behave similarly: the MSE decreases toward 0 as the sample size increases. Thus, both estimators are consistent estimators, which corresponds to the results obtained analytically. Based on the simulation results in this study, for larger sample sizes the Bayesian estimator is better than the ML estimator, because its MSE is smaller. As shown in Table 1, when θ = 0.1 the MSE of the Bayesian estimator is smaller than that of the ML estimator for n = 500, 1000, 5000, and 10000; and when θ = 0.3 and 0.5, the MSE values of the Bayesian estimator are smaller for n = 1000, 5000, and 10000.

5. CONCLUSION
In this paper, we derived the ML and Bayesian estimators (the latter using a beta prior) of the Bernoulli distribution parameter. Analytically, we showed that the ML estimator is an unbiased estimator while the Bayesian estimator is a biased estimator of the parameter θ; however, the Bayesian estimator is asymptotically unbiased. Based on the simulation results, both the ML and Bayesian estimators are consistent estimators of θ, because both satisfy the consistency property, i.e. MSE(θ̂) → 0 when n → ∞. The simulation results also show that the Bayesian estimator using a beta prior is better than the ML estimator for large sample sizes (n ≥ 1000).

REFERENCES
[1] Bain, L.J. and Engelhardt, M. (1992). Introduction to Probability and Mathematical Statistics. Duxbury Press, California.
[2] Walpole, R.E. and Myers, R.H. (1995). Ilmu Peluang dan Statistika untuk Insinyur dan Ilmuwan. ITB, Bandung.
[3] Al-Kutubi, H.S. and Ibrahim, N.A. (2009). Bayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information. Malaysian Journal of Mathematical Sciences. 3(2):297-313.
[4] Nurlaila, D., Kusnandar, D., and Sulistianingsih, E. (2013). Perbandingan Metode Maximum Likelihood Estimation (MLE) dan Metode Bayes dalam Pendugaan Parameter Distribusi Eksponensial. Buletin Ilmiah Mat. Stat. dan Terapannya. 2(1):51-56.
[5] Fikhri, M., Yanuar, F., and Yudiantri, A. (2014). Pendugaan Parameter dari Distribusi Poisson dengan Menggunakan Metode Maximum Likelihood Estimation (MLE) dan Metode Bayes. Jurnal Matematika UNAND. 3(4):152-159.
[6] Singh, S.K., Singh, U., and Kumar, M. (2014). Estimation for the Parameter of Poisson-Exponential Distribution under Bayesian Paradigm. Journal of Data Science. 12:157-173.
[7] Gupta, I. (2017). Bayesian and E-Bayesian Method of Estimation of Parameter of Rayleigh Distribution - A Bayesian Approach under Linex Loss Function. International Journal of Statistics and Systems. 12(4):791-796.
[8] Box, G.E.P. and Tiao, G.C. (1973). Bayesian Inference in Statistical Analysis. Addison-Wesley Publishing Company, Philippines.
