(5 votes, average: 4.40 out of 5)

# Closed form solution for linear regression

by on December 4, 2011

In the previous post on Batch Gradient Descent and Stochastic Gradient Descent, we looked at two iterative methods for finding the parameter vector $\theta$ which minimizes the square of the error between the predicted value $h_{\theta}(x)$ and the actual output $y$ for all $j$ values in the training set.

A closed form solution for finding the parameter vector $\theta$ is possible, and in this post let us explore that. Ofcourse, I thank Prof. Andrew Ng for putting all these material available on public domain (Lecture Notes 1).

## Notations

Let’s revisit the notations.

$m$ be the number of training set (in our case top 50 articles),

$x$ be the input sequence (the page index),

$y$ be the output sequence (the page views for each page index)

$n$ be the number of features/parameters (=2 for our example).

The value of $(x^j,y^j)$ corresponds to the $j^{th}$ training set

The predicted the number of page views for a given page index using a hypothesis $h_{\theta}(x)$ defined as :

$\begin{array}{lll}h_{\theta}(x)&=&\theta{_0}x_0 + \theta{_1}x_1\\&=&\sum_{i=0}^{n-1}\theta{_i}x_i\\&=&\theta^Tx\\\end{array}$

where,

$x_1$ is the page index,

$x_0\ = \ 1$.

Formulating in matrix notations:

The input sequence is,

$X = $\begin{array}{mm}x_0^1 & x_1^1 \\x_0^2 & x_1^2\\ \vdots & \vdots \\x_0^m & x_1^m\end{array}$$ of dimension [m x n]

The measured values are,

$Y = $\begin{array}{m}y^1\\y^2\\\vdots \\y^m\end{array}$$ of dimension [m x 1].

The parameter vector is,

$\theta=$\begin{array}{m}\theta_0\\\theta_1\end{array}$$ of dimension [n x 1]

The hypothesis term is,

$H_\theta(X)=X\theta= $\begin{array}{mm}x_0^1 & x_1^1 \\x_0^2 & x_1^2\\ \vdots & \vdots \\x_0^m & x_1^m\end{array}$$\begin{array}{m}\theta_0\\\theta_1 \end{array}$$ of dimension [m x 1].

From the above,

$H_\theta(X)-Y=X\theta-Y= $\begin{array}{mm}h_\theta(x^1)-y^1\\h_\theta(x^2)-y^2\\\vdots\\h_\theta(x^m)-y^m\end{array}$$.

Recall :

Our goal is to find the parameter vector $\theta$ which minimizes the square of the error between the predicted value $h_{\theta}(x)$ and the actual output $y$ for all $j$ values in the training set i.e.

$\min_{\theta} \sum_{j=1}^m$h_{\theta}(x^j) - y^j$^2$.

From matrix algebra, we know that

$\sum_{j=1}^m$h_{\theta}(x^j) - y^j$^2=$$X\theta-Y$$^T$$X\theta-Y$$$.

So we can now go about to define the cost function $J$$\theta$$$ as,

$J$$\theta$$=\frac{1}{2}\sum_{j=1}^m$h_{\theta}(x^j) - y^j$^2 = \frac{1}{2}$$X\theta-Y$$^T$$X\theta-Y$$$.

To find the value of $\theta$ which minimizes $J$$\theta$$$, we can differentiate $J$$\theta$$$ with respect to $\theta$.

$\begin{array}{lll}\frac{\partial}{\partial\theta}J$$\theta$$& =& \frac{1}{2}\frac{\partial}{\partial\theta}$$X\theta-Y$$^T$$X\theta-Y$$\\&=&\frac{1}{2}\frac{\partial}{\partial\theta}$$\theta^TX^TX\theta - \theta^TX^TY -Y^TX\theta+Y^TY$$\\&=&$$X^TX\theta - X^TY$$\end{array}}$

To find the value of $\theta$ which minimizes $\theta$,  we set

$\frac{\partial}{\partial\theta}J(\theta)=0$,

$\begin{array}{lll}$$X^TX\theta - X^TY$$&=&0\end{array}$.

Solving,

$\Huge\begin{array}{lll}\theta & = & $$X^TX$$^{-1}X^TY\end{array}$

Note : (Update 7th Dec 2011)

As pointed by Mr. Andre KrishKevich, the above solution is same as the formula for liner least squares fit (linear least squares, least square in wiki)

## Matlab/Octave code snippet

clear ;
close all;
x = [1:50].';
y = [4554 3014 2171 1891 1593 1532 1416 1326 1297 1266 ...
1248 1052 951 936 918 797 743 665 662 652 ...
629 609 596 590 582 547 486 471 462 435 ...
424 403 400 386 386 384 384 383 370 365 ...
360 358 354 347 320 319 318 311 307 290 ].';

m = length(y); % store the number of training examples
x = [ ones(m,1) x]; % Add a column of ones to x
n = size(x,2); % number of features
theta_vec = inv(x'*x)*x'*y;

The computed $\theta$ values are

$\begin{array}{lll}\theta_0&=&1840.618\\\theta_1&=&-39.820\end{array}$.

Note :

a)

$\begin{array}{lll}\frac{\partial}{\partial\theta}\theta^TX^TX\theta & = & X^TX\theta+XX^T\theta\\ & =&2X^TX\theta\end{array}$

(Refer: Matrix calculus notes - University of Colorado)

b)

$\begin{array}{lll}-\frac{\partial}{\partial\theta}(\theta^TX^TY) &=&-\frac{\partial}{\partial\theta}tr(\theta^TX^TY) \\ & = & -\frac{\partial}{\partial\theta}tr(Y^TX\theta)\\&=&-X^TY\end{array}$

References

An Application of Supervised Learning – Autonomous Deriving (Video Lecture, Class2)

CS 229 Machine Learning Course Materials

Refer: Matrix calculus notes - University of Colorado

Matrix Calculus Wiki

D id you like this article? Make sure that you do not miss a new article by subscribing to RSS feed OR subscribing to e-mail newsletter. Note: Subscribing via e-mail entitles you to download the free e-Book on BER of BPSK/QPSK/16QAM/16PSK in AWGN.

(ایستگاه تبلیغات/istgahtablighat) October 15, 2014 at 7:51 pm

شرکت فناوری اطلاعات متحد با کادر تخصصی در زمینه طراحی و پشتیبانی سایت، در چند
سال فعالیت خود در این زمینه، معیار های اصلی مشتریان برای داشتن یک
وب سایت حرفه ای را در موارد زیر دانسته است:
طراحی با کیفیت
کارایی بالا
امنیت کدها و برنامه ها
سرعت لود بالا
امکان مدیریت کامل سایت
بهینه سازی و سئو
قالبهای استاندارد برای تبلت و موبایل
کاربر پسند بودن
هزینه مناسب
تحویل بموقع
پشتیبانی 24 ساعته
گروه همگام پیشرو متحد موارد فوق را سرلوحه کار خود قرار داده و با این نگرش،
سایت هایی در زمینه های زیر با مناسب ترین قیمت
و در کوتاه ترین زمان در اختیار
شما دوست عزیز قرار میدهد.

طراحی سایت
شرکتی
فروشگاهی
شخصی
آژانس هواپیمایی
کاتالوگ
خبری و …

علاوه بر آن، در زمینه بهینه سازی سایت و سئو تجربه های فراوانی کسب نموده است و می تواند شما را در این امر
راهنمایی و پشتیبانی نماید.

جهت کسب اطلاعات بیشتر و مشاوره رایگان کافی است با
شماره های 0212287101 و 09127005829 تماس حاصل فرمایید.

Motahed Information Technolkgy CO. in the wayy
of designing website and support it, try to hsve the main criteria for a professional
website.
the main criteria is :
- High performance
- Usser friendly
- Security codes and programs
- High qualiity design
- Optimization and Seo
- Low-cost

Motahed Information Technology CO. designing website with reasonable price and in the shortest
tiome in the following domains:
- News
- Catalog
- Agency
- Personal
- Shopping
Furthermore, in te field of website optimization and
Seo has vast experience and can support your website in this
master.

02122287101
09127005829

http://istgahtablighat.com/

%IstgahTablighat%

laser hair growth treatment south jersey October 2, 2014 at 5:51 am

In the league of hair care products, its risk free trial and consumer testimonials are the talk of
the town. However, genetics and male sex hormones do not
fully explain the exact mechanism that will cause hair loss to
start. Regrow Hair – Hair Regrowth – It is easy to Regrow Hair and even make it grow faster with Procerin if your body has not stopped Growing New Hair
and follicles altogether.

bodycon dress u haul September 28, 2014 at 12:16 am

Hervé Léger is master of bodycon black dresses, but here at Team LBD, we’re head over heels for our exclusive range of black body-con dresses which
are great for showing of your shoulders and midriff.
That procuring spot is certainly consumer’s heaven with
retail outlets standing up in the course of bouquets in addition to lane performing artists.
It is rightly said that anew fashion trend takes birth every day, which is promisingly true.

firehouse movers dallas September 27, 2014 at 11:21 am

In most homes, a classily designed LCD TV has become an urgent part
in every entertainment room. No matter whether you are looking to store furniture, apparels, documents or even automobiles, there is always a solution for you.
No trip to Miami would be complete without a shopping trip, and there’s
no shortage of options, with shopping areas catering for every style and budget.
However, if they are susceptible, then try to get a feel for how the
client thinks that the contractor has performed – even if he or she makes some criticisms then finish
by asking what good points the contractor has.
It’s tough for a property that has a cultural or time style that peaked 30
to 40 years earlier to be competitive with its contemporaries.
You can even present some gift vouchers of spa to take the
pleasure of relaxation on their birthday. While away for a longer period, drop the room temperature in 12 -16 degrees of Celsius.
They have taken up the steps to make the clients outsource to them.
There are a number of parameters which make it to the ideal cleaning system.
If this does not fix the problem, you may consider replacing the garage door opener with a new unit.

consumer reports best interstate moving companies September 25, 2014 at 7:01 am

Two weeks ago our employer announced that we would be moving our
office to a new location; this was great news considering our company had grown in number quite drastically
over the last year. Apple Macbook, Dell XPS 15, HP Pavillion dv7, Lenovo G560,
Sony Vaio. There are many affordable solutions for data storage for small and home-based businesses.
For this reason many people make an effort to be brokers of freight companies.

The rides include Hoodoo Run, Woo – Hoo Falls, Stingray Falls, Kiwi Curl, and more.
You can even present some gift vouchers of spa to take the pleasure
of relaxation on their birthday. Internet speed at Symphony is many times your present speed and also
storage boxes will help you to store as much toys as possible.

There are a number of parameters which make it to the ideal cleaning system.

You must know what the data corruption is and how it occurs.

Piano Movers Seattle Wa September 17, 2014 at 3:39 pm

You get a companion and a friend you can actually rely on in times
of trouble. Video Production improved presented impressing are
sure to appeal video companies’ middle and
old, large and small, traditional in a valuable and meaningful subject.
The proficient companies of movers and packers in Ghaziabad are always ready to offer all essential resettlement services at very reasonable
cost. * Storage facilities can be availed for short-term or long-term
needs. It is important for the crew to be aware of the handling
procedures when it comes to high tech gadgets and furniture.
This 30 day timeframe can be both good and bad for a seller.
This means that you can just open the door and leave the screen closed to enjoy the beautiful weather without
pesky bugs. The units send to the client are taken back to the location where the
containers are kept in secured environment. They will shift entire home
stuffs to the desired destination with any damage.

social media marketing campaign September 15, 2014 at 2:11 am

Arrest their attention by using rich colors and well defined graphics.
The final large expense SEO creates is tracking and analyzing data.
SEO (search engine optimization) is a term that is freely
bandied about by people who like to make the art of internet marketing seem mysterious.

kool deck paint September 13, 2014 at 12:48 pm

Islamorada tarpon fishing involves the fishing of an prehistoric rare fish
tarpon through the scientific name Magalops Atlanticus
which is a prehistoric fish considered to be one in every of
the diverse & interesting creatures across the country
which can be present in both side of Atlantic also referred to as as the tarpum tarpon or else the silver king due to it’s majestic appearance & colour which lays 15 million eggs.
Keep in mind the Cooper Mountain resort design their
activities. If you are a foreigner unlikely to return if allowed to leave, you will not be able
to post a bond.

marine freezer option for jeanneau 53 September 7, 2014 at 12:22 pm

Thus we became noobs on the water when we purchased our 1957 Sea King with
a 1964 Johnson motor. With the high temperature sensitivity of some perishable
goods, it’s easy to understand why so many companies rely on vacuum insulated panels such a Thermo – Cor to ensure
the quality of their products. You will catch red fish,
spotted sea trout, pompano, snook, mangrove snapper, Spanish mackerel, small black-tip sharks, small bonnet
sharks, sheephead & in cooler weather the odd king mackerel.

yeti fiberglass coleman marine coolers August 22, 2014 at 5:20 pm

Artificial baits include spoons, jigs, ribbon baits, rattler baits & streamer baits.

For instance at a time when anglers where sticking rigidly to the mantra of
3 to 5 milliliters of flavour, plus 1 to 2 milliliters of intense sweetener to
a pound of base mix or more, as measured by 4 large eggs or
6 medium eggs as a basis of measurement, I tried other approaches and
reaped big rewards against much more experienced anglers.
Some day you will have to divide your worms and start a new farm because of their rapid reproduction rate.

Cochonne baise May 9, 2014 at 7:34 am

Superbe ρoste une fois de plus

Margaretta July 9, 2014 at 6:10 pm

BION I’m imsrseped! Cool post!

Mehdi October 23, 2012 at 2:54 pm

hi

archit December 23, 2011 at 10:33 pm

i am a school student sir ….i m really sorry if i have been spamming ur blog sir…but sir i guess ur blog is amongst the best blogs on education …..
please watch the following videos on you-tube, vote for me and also ask all your near n dear ones to do the same.

If you face any problem in doing so can please go through the bullets below.

You just need to create an account on you-tube using your gmail account
Go to you-tube.com ( in another tab)
Click ‘Create Account’ {your account would be created}
Click the ‘Like’ button

My work is done!
Please do take the pains of liking the videos and ask as many people as you can, it would help me a lot.Please don’t forget to ask other to do the same.
Thank You
m really sorry ..for bothering u ..i hope u’ll understand

Krishna Sankar December 24, 2011 at 6:07 am

@archit: good luck !

Will Dwinnell December 11, 2011 at 7:17 pm

You use:

theta_vec = inv(x’*x)*x’*y;

While that is theoretically correct, why not just use:

theta_vec = x \ y;

Krishna Sankar December 12, 2011 at 5:04 am

@Will: Yes, I could have used that. Just that I preferred to use the expanded version.

And for reference, from Mathworks on mldivide \
http://www.mathworks.in/help/techdoc/ref/mldivide.html
“If the equation Ax = b does not have a solution (and A is not a square matrix), x = A\b returns a least squares solution — in other words, a solution that minimizes the length of the vector Ax – b, which is equal to norm(A*x – b).”

Andrei Krishkevich December 6, 2011 at 10:56 pm

Very nice post. Please correct me if I’m wrong, but it looks like you’re simply deriving the linear least square fit formula. It’s probably worth mentioning in the article.

Krishna Sankar December 7, 2011 at 4:41 am

@Andrei: Yup, it turns out to be the closed form for least squares fit. Recall, we are trying to minimize the square of the errors, which is indeed least squares. I will add a note on that…

linkin8834 December 4, 2011 at 6:48 pm

Hello, is it the same as the MATLAB function polyfit?