- DSP log - http://www.dsplog.com -

Straight line fit using least squares estimate

Posted By Krishna Sankar On July 15, 2007 @ 2:22 am In DSP | 7 Comments

Two points suffice for drawing a straight line. However we may be presented with a set of data points (more than two?) presumably forming a straight line. How can one use the available set of data points to draw a straight line?

A probable approach is to draw a straight line which hopefully minimizes the error between the observed data points and estimated straight line.

$err = \Sigma_{i=1}^N\left(y_i - \hat{y}_i\right)^2$ where $y_i$ is the observed data points and $\hat{y}_i$ is the points from estimated straight line.

To draw the estimated straight line $\hat{y}=mx+c$, we need to estimate the slope, $m$ and the constant, $c$.

Formulating as a matrix,

$\left[\begin{eqnarray} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{eqnarray}\right]=\left[\begin{eqnarray} x_1\ 1\\ x_2\ 1 \\ x_3\ 1 \\ \vdots \\ x_n\ 1 \end{eqnarray}\right]\left[\begin{eqnarray} m \\ c \end{eqnarray}\right]+\left[\begin{eqnarray} \eta_1\\ \eta_2 \\ \eta_3 \\ \vdots \\ \eta_n\end{eqnarray}\right]$

$\mathbf{Y} = \mathbf{X}\left[\begin{eqnarray} m \\ c \end{eqnarray}\right]+ \mathbf{N}$,

where,

${y}_i$ = $\mathbf{Y}$ is the set of observations is a matrix of dimension $[N \times 1]$ ,

${x}_i$ = $\mathbf{X}$ is the set of coefficients is a matrix of dimension $[N \times 2]$,

$\left[\begin{eqnarray} m \\ c \end{eqnarray}\right]$ is the slope and constant estimate of dimension $[2\times 1]$,

$\eta_i$ = $\mathbf{N}$is the noise is a matrix of dimension $[N\times1]$ .

The least square estimate of the straight line is,

$\left[\begin{eqnarray} m \\ c \end{eqnarray}\right]=(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}$.

A simple MATLAB code for least squares straight line fit is given below:

% Least Squares Estimate
rand(‘state’,100); % initializing the random number generation
y = [5:3:50]; % observations, y_i
y = y + 5*rand(size(y)); % y_i with noise added
x = 1:length(y); % the x co-ordinates

% Formulating in matrix for solving for least squares estimate
Y = y.’;
X = [x.' ones(1,length(x)).'];
alpha = inv(X’*X)*X’*Y; % solving for m and c

% constructing the straight line using the estimated slope and constant
yEst = alpha(1)*x + alpha(2);

close all
figure
plot(x,y,’r.’)
hold on
plot(x,yEst,’b')
legend(‘observations’, ‘estimated straight line’)
grid on
ylabel(‘observations’)
xlabel(‘x axis’)
title(‘least squares straight line fit’)

[1]

References: