Derivative of a summation in order to minimize
I am asked to minimize $\sum^n_{i=1}(x_i - C)^2$ with respect to $C$ only, so I know I have to take the derivative with respect to $C$, set it equal to 0, and then solve. I have never worked with summations before and this is very new to me. I have been searching the web for how to proceed in these cases, but all I have found is long lists of summation theorems or explanations of very simple operations with summation notation. I did, however, find the answer to my question, which is as follows:
$$S = \sum^n_{i=1}(x_i - C)^2$$
$$\frac{\partial S}{\partial C} = \sum^n_{i=1} 2 (x_i - C)(-1) = -2 \sum^n_{i=1} (x_i - C)$$
$$\frac{\partial S}{\partial C} = 0 \implies \sum^n_{i=1} (x_i - C) = 0$$
$$ \sum^n_{i=1} x_i - \sum^n_{i=1} C = 0$$
$$ \sum^n_{i=1} x_i = \sum^n_{i=1} C = nC$$
$$C = \frac{\sum^n_{i=1} x_i}{n}$$
First step: I don't get what this guy is doing. When he takes the derivative, why does he put that $(-1)$ at the end? Is it because this is originally the square of a difference? That's the only thing I can think of.
Second step: where has the $-2$ gone? It just vanished.
Third step: I get this one.
Fourth step: where is this $nC$ coming from? What does it mean? I understand that in the end he has $\sum^n_{i=1} x_i = nC$, so he can isolate $C$ and get the result $C = \frac{\sum^n_{i=1} x_i}{n}$. But is there a reason why he replaces $\sum^n_{i=1} C$, rather than $\sum^n_{i=1} x_i$, with $nC$?
Thanks a lot. I'd really appreciate it if you could refer me to any page you know about summation that doesn't go through 1000 theorems and properties. Unfortunately I can't spend much time on summation, as I have many other topics to cover for my exam and this is just a small part of it. Cheers!
3 Answers
First step
Think of the sum as a function of $C$. To find a minimum or maximum of a function, we find its derivative and set it to 0. Because there are two terms inside the parentheses, we can't just apply the power rule $\frac{\partial}{\partial x} x^n = nx^{n-1}$; we apply the chain rule instead. The power rule gives the factor $2(x_i - C)$, and the derivative of the inner expression $x_i - C$ with respect to $C$ gives the factor $-1$. Together they produce the $-2$.
Second step
Let's denote by $S_1$ the sum that is left after factoring $-2$ out of the derivative, so the derivative is $-2S_1$. When is $-2S_1 = 0$? Only when $S_1 = 0$, so the $-2$ can be dropped.
Fourth step
$C$ is a constant that does not depend on $i$, so every term of the sum just adds another $C$. If the sum has $n$ terms, then we have added $C$ a total of $n$ times, which gives $nC$.
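As a small sketch of the first step, here is the chain rule applied to a single term, and then to a sum with only two terms (the choice $n = 2$ is just for illustration):
$$\frac{\partial}{\partial C}(x_1 - C)^2 = 2(x_1 - C)\cdot\frac{\partial}{\partial C}(x_1 - C) = 2(x_1 - C)(-1)$$
$$\frac{\partial}{\partial C}\left[(x_1 - C)^2 + (x_2 - C)^2\right] = -2(x_1 - C) - 2(x_2 - C) = -2\left[(x_1 - C) + (x_2 - C)\right]$$
The derivative of a sum is the sum of the derivatives, so the same pattern holds term by term for any $n$.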
Consider $f(C)=\sum_{i=1}^{n}(x_{i}-C)^{2}$, so by the chain rule the derivative of $f$ is:
$f'(C)=\sum_{i=1}^{n}2(x_{i}-C)(-1)$.
At a minimum of a differentiable function we need $f'(C)=0$, which holds when:
$0=-\sum_{i=1}^{n}(2x_{i}-2C)$
Multiplying both sides by $-1$ we get
$0=\sum_{i=1}^{n}(2x_{i}-2C)=2\sum_{i=1}^{n}x_{i}-2\sum_{i=1}^{n}C=2\sum_{i=1}^{n}x_{i}-2C\sum_{i=1}^{n}1=2\sum_{i=1}^{n}x_{i}-2Cn$.
Here $\sum_{i=1}^{n}1=n$, since $\sum_{i=1}^{n}a_{i}=a_{1}+\cdots+a_{n}$ by definition and we can take $a_{i}=1$ for all $i$.
So
$2Cn=2\sum_{i=1}^{n}x_{i}$.
Dividing by $2n$ we get $C=\frac{\sum_{i=1}^{n}x_{i}}{n}$.
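If it helps to see the result numerically, the following is a minimal Python sketch that checks the formula on made-up data. The list `xs` and the function names `sum_of_squares` and `derivative` are assumptions chosen for illustration; they are not part of the original problem.

```python
def sum_of_squares(xs, c):
    """Return f(c) = sum of (x_i - c)^2 over the data xs."""
    return sum((x - c) ** 2 for x in xs)


def derivative(xs, c):
    """Return f'(c) = -2 * sum of (x_i - c), the derivative found above."""
    return -2 * sum(x - c for x in xs)


xs = [2.0, 4.0, 9.0]       # hypothetical sample data, n = 3
mean = sum(xs) / len(xs)   # candidate minimizer C = (sum of x_i) / n

print("C =", mean)
print("f'(C) =", derivative(xs, mean))
print("f(C) =", sum_of_squares(xs, mean))
print("f(C - 0.5) =", sum_of_squares(xs, mean - 0.5))
print("f(C + 0.5) =", sum_of_squares(xs, mean + 0.5))
```

For this data the derivative vanishes at the mean, and moving $C$ half a unit in either direction makes the sum of squares larger, which is what a minimum should look like.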
Step 1: the $-1$ comes from the chain rule.
Step 2: We set the derivative equal to $0$, then divide both sides by $-2$. That's why the $-2$ goes away.
Step 4: $C + \cdots + C = nC$. ($C$ appears $n$ times on the left.)
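For a concrete feel, here is the same computation with three made-up values, $x_1 = 1$, $x_2 = 2$, $x_3 = 6$ (numbers chosen only for illustration):
$$\sum_{i=1}^{3}(x_i - C) = (1 - C) + (2 - C) + (6 - C) = 9 - 3C$$
$$9 - 3C = 0 \implies C = 3 = \frac{1 + 2 + 6}{3}$$
The $3C$ is exactly the $nC$ of Step 4 with $n = 3$, and the answer is the ordinary average of the three values.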