**- DSP log - http://www.dsplog.com -**

Least Squares in Gaussian Noise – Maximum Likelihood

Posted By __Krishna Sankar__ On January 15, 2012 @ 9:45 am In __DSP__ | __No Comments__

From the previous posts on Linear Regression (using Batch Gradient Descent ^{[1]}, Stochastic Gradient Descent ^{[2]}, and the Closed Form Solution ^{[3]}), we discussed a couple of different ways to estimate the parameter vector in the least square error sense for a given training set. However, how does the least square error criterion fare when the training set is corrupted by noise? In this post, let us discuss the case where the training set is corrupted by Gaussian noise.

For the training set of $m$ examples, the system model is:

$y^{(i)} = \theta^T x^{(i)} + \eta^{(i)},\quad i = 1,\ldots,m$,

where,

$x^{(i)}$ is the input sequence,

$y^{(i)}$ is the output sequence,

$\theta$ is the parameter vector and

$\eta^{(i)}$ is the noise in the observations.

Let us assume that the noise terms $\eta^{(i)}$ are independent and identically distributed, following a Gaussian probability density with mean $0$ and variance $\sigma^2$.
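As a quick sketch of this system model (hypothetical numbers, numpy assumed), a training set corrupted by Gaussian noise can be generated as:

```python
import numpy as np

rng = np.random.default_rng(0)

m = 200                                # number of training examples
theta_true = np.array([2.0, -1.0])     # hypothetical parameter vector
sigma = 0.5                            # noise standard deviation

# Inputs x^(i): first entry fixed at 1 to carry the intercept term
X = np.column_stack([np.ones(m), rng.uniform(-1, 1, m)])

# Observations y^(i) = theta^T x^(i) + eta^(i), with eta^(i) ~ N(0, sigma^2)
eta = rng.normal(0.0, sigma, m)
y = X @ theta_true + eta
```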

The probability density function of the noise term can be written as,

$p(\eta^{(i)}) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(\eta^{(i)})^2}{2\sigma^2}\right)$.

This means that the probability of the output $y^{(i)}$ given $x^{(i)}$ and parameterised by $\theta$ is,

$p(y^{(i)}\mid x^{(i)};\theta) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(y^{(i)}-\theta^T x^{(i)})^2}{2\sigma^2}\right)$.

Let us write the **likelihood of** $\theta$, given all the observations of the input sequence $X$ and output $\vec{y}$, as

$L(\theta) = p(\vec{y}\mid X;\theta)$.

Given that all the observations are independent, the **likelihood of** $\theta$ is,

$L(\theta) = \prod_{i=1}^{m} p(y^{(i)}\mid x^{(i)};\theta) = \prod_{i=1}^{m}\dfrac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\dfrac{(y^{(i)}-\theta^T x^{(i)})^2}{2\sigma^2}\right)$.

Taking the logarithm on both sides, the **log-likelihood function is,**

$\ell(\theta) = \log L(\theta) = m\log\dfrac{1}{\sqrt{2\pi}\,\sigma} - \dfrac{1}{2\sigma^2}\sum_{i=1}^{m}\left(y^{(i)}-\theta^T x^{(i)}\right)^2$.
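The closed-form log-likelihood above can be sanity-checked numerically against a direct sum of per-observation Gaussian log densities — a minimal sketch with hypothetical data, numpy assumed:

```python
import numpy as np

rng = np.random.default_rng(1)
m, sigma = 100, 0.5
theta = np.array([2.0, -1.0])          # hypothetical parameter vector
X = np.column_stack([np.ones(m), rng.uniform(-1, 1, m)])
y = X @ theta + rng.normal(0.0, sigma, m)

resid = y - X @ theta

# Direct sum of the m per-observation Gaussian log densities
ll_direct = np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - resid**2 / (2 * sigma**2))

# Closed-form expression: m*log(1/(sqrt(2*pi)*sigma)) - (1/(2*sigma^2)) * sum(resid^2)
ll_formula = m * np.log(1.0 / (np.sqrt(2 * np.pi) * sigma)) - resid @ resid / (2 * sigma**2)

# The two computations agree
print(np.isclose(ll_direct, ll_formula))
```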

From the above expression, we can see that maximizing the likelihood function is the same as minimizing

$J(\theta) = \dfrac{1}{2}\sum_{i=1}^{m}\left(y^{(i)}-\theta^T x^{(i)}\right)^2$.

Recall: This is the same cost function that was minimized in the Least Squares solution ^{[1]}.
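This equivalence can be verified numerically — a minimal sketch (hypothetical data, numpy assumed) showing that the closed-form least squares estimate also maximizes the Gaussian log-likelihood:

```python
import numpy as np

rng = np.random.default_rng(2)
m, sigma = 500, 0.5
theta_true = np.array([2.0, -1.0])     # hypothetical parameter vector
X = np.column_stack([np.ones(m), rng.uniform(-1, 1, m)])
y = X @ theta_true + rng.normal(0.0, sigma, m)

def log_likelihood(theta):
    # ell(theta) = m*log(1/(sqrt(2*pi)*sigma)) - (1/(2*sigma^2)) * sum of squared residuals
    r = y - X @ theta
    return m * np.log(1.0 / (np.sqrt(2 * np.pi) * sigma)) - r @ r / (2 * sigma**2)

# Closed-form least squares solution: theta = (X^T X)^{-1} X^T y
theta_ls = np.linalg.solve(X.T @ X, X.T @ y)

# Perturbing theta_ls in any direction lowers the log-likelihood
for delta in [np.array([0.1, 0.0]), np.array([0.0, -0.1]), np.array([0.05, 0.05])]:
    assert log_likelihood(theta_ls) > log_likelihood(theta_ls + delta)
```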

**Summarizing:**

a) When the observations are corrupted by **independent Gaussian noise**, the **least squares solution** is the **Maximum Likelihood estimate** of the parameter vector $\theta$.

b) The noise variance $\sigma^2$ does not play a role in this minimization. However, if the noise variance of each observation is different, this needs to be factored in. We will discuss this in another post.
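Point (b) can also be illustrated with a small sketch (hypothetical data, numpy assumed): the gradient of the log-likelihood vanishes at the least squares solution for every choice of the common $\sigma$, so scaling $\sigma$ shifts the log-likelihood values but not the location of its maximum.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 300
theta_true = np.array([2.0, -1.0])     # hypothetical parameter vector
X = np.column_stack([np.ones(m), rng.uniform(-1, 1, m)])
y = X @ theta_true + rng.normal(0.0, 0.5, m)

# Least squares solution from the normal equations: (X^T X) theta = X^T y
theta_ls = np.linalg.solve(X.T @ X, X.T @ y)

def grad_log_likelihood(theta, sigma):
    # d ell / d theta = (1/sigma^2) * X^T (y - X theta); sigma only scales the gradient
    return X.T @ (y - X @ theta) / sigma**2

# The gradient vanishes at theta_ls regardless of the value of sigma
for sigma in [0.1, 1.0, 10.0]:
    assert np.allclose(grad_log_likelihood(theta_ls, sigma), 0.0, atol=1e-6)
```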

**Reference:** CS229 Lecture Notes 1, Chapter 3, Probabilistic Interpretation, Prof. Andrew Ng ^{[4]}

Article printed from DSP log: **http://www.dsplog.com**

URL to article: **http://www.dsplog.com/2012/01/15/least-squares-gaussian-noise-maximum-likelihood/**

URLs in this post:

[1] Batch Gradient descent: **http://www.dsplog.com/2011/10/29/batch-gradient-descent/**

[2] Stochastic Gradient Descent: **http://www.dsplog.com/2011/11/15/stochastic-gradient-descent/**

[3] Closed form solution: **http://www.dsplog.com/2011/12/04/closed-form-solution-linear-regression/**

[4] CS229 Lecture notes1, Chapter 3 Probabilistic Interpretation, Prof. Andrew Ng: **http://cs229.stanford.edu/notes/cs229-notes1.pdf**

Copyright © 2007-2012 dspLog.com. All rights reserved. This article may not be reused in any fashion without written permission from http://www.dspLog.com.