# Maths and Statistics

This post covers certain formulas useful for Deep Learning.

# Expected Value

• $\mathop{\mathbb{E}}(X) = \mu_X = \sum x.p$, where $x$ are the values of the random variable $X$ and $p$ their probabilities
• $\mathop{\mathbb{E}}(a) = a$, where $a$ is a non-random variable/constant
• Expected value of the product of two independent random variables is $\mathop{\mathbb{E}}(X.Y) = \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y) = \mu_X.\mu_Y$
• Expected value of a scaled variable is $\mathop{\mathbb{E}}(a.X) = a.\mathop{\mathbb{E}}(X)$
• Expected value of the product of correlated variables is $\mathop{\mathbb{E}}(X.Y) = \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y) + Cov(X,Y) = \mu_X.\mu_Y + Cov(X,Y)$
• Variables are correlated if the value of one of them, to some degree, determines or influences the other
• Covariance measures how much the variables are correlated
• Independent variables have zero covariance
• Expected value of the sum of variables (independent or not) is $\mathop{\mathbb{E}}(X+Y) = \mathop{\mathbb{E}}(X)+\mathop{\mathbb{E}}(Y) = \mu_X + \mu_Y$
• Linearity of Expectation, whether the variables are independent or not: $\mathop{\mathbb{E}}(a.X+b.Y+c) = a.\mathop{\mathbb{E}}(X) + b.\mathop{\mathbb{E}}(Y) + c = a.\mu_X + b.\mu_Y + c$
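
These identities are easy to sanity-check numerically; below is a minimal sketch with NumPy, where sample means stand in for expectations (so the product identity for independent variables holds only approximately, up to sampling error):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent random variables.
X = rng.normal(loc=2.0, scale=1.0, size=n)
Y = rng.normal(loc=-1.0, scale=3.0, size=n)

# E(X.Y) ~= E(X).E(Y) for independent variables.
print(np.mean(X * Y), np.mean(X) * np.mean(Y))

# Linearity of expectation: E(a.X + b.Y + c) = a.E(X) + b.E(Y) + c,
# which holds exactly for sample means (up to floating-point error).
a, b, c = 3.0, -2.0, 5.0
lhs = np.mean(a * X + b * Y + c)
rhs = a * np.mean(X) + b * np.mean(Y) + c
print(lhs, rhs)
```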

# Covariance

• Covariance of random variables $X$ and $Y$ is $Cov(X, Y) = \sigma(X,Y) = \sigma_{X,Y}$
• Zero if the variables are independent
• +ve if, as one increases, the other increases
• -ve if, as one increases, the other decreases
• $\sigma(X,Y) = \mathop{\mathbb{E}}[(X-\mathop{\mathbb{E}}(X))(Y-\mathop{\mathbb{E}}(Y))] = \mathop{\mathbb{E}}[(X-\mu_X)(Y-\mu_Y)]$
• Measures the joint variation of two random variables around their expected values. It gives the direction of the relationship but doesn’t indicate its strength.
• $\sigma(X,Y) = \frac{\sum(X-\bar{X})(Y-\bar{Y})}{n}$
• Relation to the expected value of the product
• $\mathop{\mathbb{E}}(X.Y) = \mu_X.\mu_Y + \sigma(X,Y)$
• $\sigma(X,Y) = \mathop{\mathbb{E}}(X.Y) - \mu_X.\mu_Y$; if the variables are independent then $\mathop{\mathbb{E}}(X.Y) = \mu_X.\mu_Y$, thus $\sigma(X,Y) = 0$
• The converse is not true: $\sigma(X,Y) = 0$ doesn’t mean that the variables are independent
• If $X$ is uniformly distributed in $[-1, 1]$, then $\mathop{\mathbb{E}}(X)=0$ and also $\mathop{\mathbb{E}}(X^3)=0$
• $\sigma(X,X^2) = \mathop{\mathbb{E}}(X.X^2) - \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(X^2)$
• $\sigma(X,X^2) = \mathop{\mathbb{E}}(X^3)-\mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(X^2) = 0 - 0.\mu_{X^2} = 0$
• Covariance is zero but the variables $X$ and $X^2$ are dependent
• Covariance is commutative
• $\sigma(X,Y)=\sigma(Y,X)$
• Covariance is invariant to the displacement of one or both variables
• $\sigma(X+h, Y+k)=\sigma(X, Y)$
• Covariance is scaled by the scales of $X$ and $Y$
• $\sigma(a.X, b.Y)= a.b.\sigma(X, Y)$
• Combining the displacement and scaling properties (affine transformation)
• $\sigma(a.X+h, b.Y+k)= a.b.\sigma(X, Y)$
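
A quick numerical check of the properties above, including the uncorrelated-but-dependent example with $X$ uniform in $[-1, 1]$ and $Y = X^2$ (a sketch using NumPy; sample covariances approximate the population values):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)   # correlated with X

def cov(u, v):
    """Population-style covariance: mean of (U - mu_U).(V - mu_V)."""
    return np.mean((u - u.mean()) * (v - v.mean()))

# Commutativity: Cov(X, Y) = Cov(Y, X)
print(cov(X, Y), cov(Y, X))

# Displacement invariance: Cov(X + h, Y + k) = Cov(X, Y)
h, k = 7.0, -3.0
print(cov(X + h, Y + k), cov(X, Y))

# Scaling: Cov(a.X, b.Y) = a.b.Cov(X, Y)
a, b = 2.0, -4.0
print(cov(a * X, b * Y), a * b * cov(X, Y))

# Uncorrelated but dependent: X uniform on [-1, 1], Y = X^2
U = rng.uniform(-1.0, 1.0, size=n)
print(cov(U, U ** 2))   # close to zero even though U determines U^2
```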

# Variance

• $Var(X) = \sigma^2(X) = \sum(x^2.p) - \mu_X^2$
• Special case of Covariance when both variables are the same
• $Var(X) = Cov(X,X) = \sigma(X,X) = \sigma^2(X) = \sigma_X^2$
• Variance measures how much the values of a random variable are spread out, i.e. how much they differ from one another
• $Var(X) = \mathop{\mathbb{E}}[(X-\mathop{\mathbb{E}}(X))^2] = \mathop{\mathbb{E}}[(X-\mu_X)^2]$
• Variance from Expected Values of $X$ and $X^2$
• $\sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mathop{\mathbb{E}}^2(X)$
• $\sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mu_X^2$
• Variance of a non-random variable is zero
• $\sigma^2(a)=0$
• Variance is invariant to the displacement
• $\sigma^2(X+h) = \sigma^2(X)$
• If variable is scaled by constant, the variance gets scaled by square of the constant
• $\sigma^2(a.X) = a^2 . \sigma^2(X)$
• Variance of Sum and Difference of two correlated random variables
• $\sigma^2(X+Y) = \sigma^2(X) + \sigma^2(Y) + 2.\sigma(X,Y)$
• $\sigma^2(X-Y) = \sigma^2(X) + \sigma^2(Y) - 2.\sigma(X,Y)$
• If the variables are independent, then $\sigma(X,Y)=0$ and
• $\sigma^2(X+Y) = \sigma^2(X) + \sigma^2(Y)$
• $\sigma^2(X-Y) = \sigma^2(X) + \sigma^2(Y)$
• Variance of the product of two correlated random variables
• $\sigma^2(X.Y) = \sigma(X^2, Y^2) + \mathop{\mathbb{E}}(X^2).\mathop{\mathbb{E}}(Y^2) - [\sigma(X,Y) + \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y)]^2$
• $\sigma^2(X.Y) = \sigma(X^2, Y^2) + \mu_{X^2}.\mu_{Y^2} - [\sigma(X,Y) + \mu_X.\mu_Y]^2$
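
The variance identities can be verified the same way (a NumPy sketch; `var` and `cov` here are population-style sample estimates, so the algebraic identities hold exactly up to floating-point error):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

X = rng.normal(loc=1.0, scale=2.0, size=n)
Y = 0.3 * X + rng.normal(size=n)       # correlated with X

var = lambda u: np.mean((u - u.mean()) ** 2)
cov = lambda u, v: np.mean((u - u.mean()) * (v - v.mean()))

# Var(X) = E(X^2) - E(X)^2
print(var(X), np.mean(X ** 2) - np.mean(X) ** 2)

# Displacement invariance and scaling: Var(X + h) = Var(X), Var(a.X) = a^2.Var(X)
print(var(X + 10.0), var(X))
print(var(3.0 * X), 9.0 * var(X))

# Sum of correlated variables: Var(X + Y) = Var(X) + Var(Y) + 2.Cov(X, Y)
print(var(X + Y), var(X) + var(Y) + 2.0 * cov(X, Y))
```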

# Mean Square Error (MSE) of an Estimator

$\begin{eqnarray*} MSE_\theta &=& \mathop{\mathbb{E}}(\hat\theta-\theta)^2 \\ &=& \mathop{\mathbb{E}}(\hat\theta^2 + \theta^2 - 2\theta\hat\theta) \\ &=& \mathop{\mathbb{E}}(\hat\theta^2) + \theta^2 - 2\theta\mathop{\mathbb{E}}(\hat\theta) \\ &=& \sigma^2(\hat\theta) + \mathop{\mathbb{E}}^2(\hat\theta) + \theta^2 - 2\theta\mathop{\mathbb{E}}(\hat\theta) \quad [\text{since } \sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mathop{\mathbb{E}}^2(X)] \\ &=& \sigma^2(\hat\theta) + [\mathop{\mathbb{E}}(\hat\theta)-\theta]^2 \\ &=& Var(\hat\theta) + (Bias~of~\hat\theta)^2 \end{eqnarray*}$
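
The decomposition can be illustrated by simulating a biased estimator, e.g. the $1/n$ (MLE) estimator of the variance of a normal distribution; the parameter values and sample sizes below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Estimate theta = sigma^2 of N(0, sigma^2) with the biased
# estimator (1/n) * sum (x_i - xbar)^2.
sigma2 = 4.0        # true parameter theta
n = 10              # sample size per experiment
trials = 200_000    # number of repeated experiments

samples = rng.normal(scale=np.sqrt(sigma2), size=(trials, n))
theta_hat = samples.var(axis=1)          # ddof=0: biased MLE of the variance

mse = np.mean((theta_hat - sigma2) ** 2)
variance = np.var(theta_hat)
bias_sq = (np.mean(theta_hat) - sigma2) ** 2

# The two sides of MSE = Var + Bias^2 agree (the identity is exact
# for sample moments, up to floating-point error).
print(mse, variance + bias_sq)
```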

References:

• https://www.odelama.com/data-analysis/Commonly-Used-Math-Formulas/

• http://people.missouristate.edu/songfengzheng/Teaching/MTH541/Lecture%20notes/evaluation.pdf
