# Maths and Statistics

This post covers certain formulas useful for Deep Learning.

# Expected Value

• $\mathop{\mathbb{E}}(X) = \mu_X = \sum x.p$, where $x$ are the values of the random variable $X$ and $p$ their probabilities
• $\mathop{\mathbb{E}}(a) = a$, where $a$ is a non-random variable/constant
• Expected value of the product of two independent random variables is $\mathop{\mathbb{E}}(X.Y) = \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y) = \mu_X.\mu_Y$
• Expected value of a scaled variable is $\mathop{\mathbb{E}}(a.X) = a.\mathop{\mathbb{E}}(X)$
• Expected value of the product of correlated variables is $\mathop{\mathbb{E}}(X.Y) = \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y) + Cov(X,Y) = \mu_X.\mu_Y + Cov(X,Y)$
• Variables are correlated if the value of one of them, to some degree, determines or influences the other
• Covariance measures how much the variables are correlated
• Independent variables have zero covariance
• Expected value of the sum of variables (independent or not) is $\mathop{\mathbb{E}}(X+Y) = \mathop{\mathbb{E}}(X)+\mathop{\mathbb{E}}(Y) = \mu_X + \mu_Y$
• Linearity of Expectation, whether the variables are independent or not: $\mathop{\mathbb{E}}(a.X+b.Y+c) = a.\mathop{\mathbb{E}}(X) + b.\mathop{\mathbb{E}}(Y) + c = a.\mu_X + b.\mu_Y + c$
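
These identities are easy to sanity-check numerically; below is a minimal sketch with NumPy, where sample means stand in for expectations (so the product identity for independent variables holds only approximately, up to sampling error):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Two independent random variables.
X = rng.normal(loc=2.0, scale=1.0, size=n)
Y = rng.normal(loc=-1.0, scale=3.0, size=n)

# E(X.Y) ~= E(X).E(Y) for independent variables.
print(np.mean(X * Y), np.mean(X) * np.mean(Y))

# Linearity of expectation: E(a.X + b.Y + c) = a.E(X) + b.E(Y) + c,
# which holds exactly for sample means (up to floating-point error).
a, b, c = 3.0, -2.0, 5.0
lhs = np.mean(a * X + b * Y + c)
rhs = a * np.mean(X) + b * np.mean(Y) + c
print(lhs, rhs)
```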

# Covariance

• Covariance of random variables $X$ and $Y$ is $Cov(X, Y) = \sigma(X,Y) = \sigma_{X,Y}$
• Zero if the variables are independent
• +ve if, as one increases, the other increases
• -ve if, as one increases, the other decreases
• $\sigma(X,Y) = \mathop{\mathbb{E}}[(X-\mathop{\mathbb{E}}(X))(Y-\mathop{\mathbb{E}}(Y))] = \mathop{\mathbb{E}}[(X-\mu_X)(Y-\mu_Y)]$
• Measures the joint variation of two random variables around their expected values. It gives the direction of the relationship but doesn’t indicate its strength.
• $\sigma(X,Y) = \frac{\sum(X-\bar{X})(Y-\bar{Y})}{n}$
• Relation to the expected value of the product
• $\mathop{\mathbb{E}}(X.Y) = \mu_X.\mu_Y + \sigma(X,Y)$
• $\sigma(X,Y) = \mathop{\mathbb{E}}(X.Y) - \mu_X.\mu_Y$; if the variables are independent then $\mathop{\mathbb{E}}(X.Y) = \mu_X.\mu_Y$, thus $\sigma(X,Y) = 0$
• The converse is not true: $\sigma(X,Y) = 0$ doesn’t mean that the variables are independent
• If $X$ is uniformly distributed in $[-1, 1]$, then $\mathop{\mathbb{E}}(X)=0$ and also $\mathop{\mathbb{E}}(X^3)=0$
• $\sigma(X,X^2) = \mathop{\mathbb{E}}(X.X^2) - \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(X^2)$
• $\sigma(X,X^2) = \mathop{\mathbb{E}}(X^3)-\mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(X^2) = 0 - 0.\mu_{X^2} = 0$
• Covariance is zero but the variables $X$ and $X^2$ are dependent
• Covariance is commutative
• $\sigma(X,Y)=\sigma(Y,X)$
• Covariance is invariant to the displacement of one or both variables
• $\sigma(X+h, Y+k)=\sigma(X, Y)$
• Covariance is scaled by the scales of $X$ and $Y$
• $\sigma(a.X, b.Y)= a.b.\sigma(X, Y)$
• Combining the displacement and scaling properties (affine transformation)
• $\sigma(a.X+h, b.Y+k)= a.b.\sigma(X, Y)$
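
A quick numerical check of the properties above, including the uncorrelated-but-dependent example with $X$ uniform in $[-1, 1]$ and $Y = X^2$ (a sketch using NumPy; sample covariances approximate the population values):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

X = rng.normal(size=n)
Y = 0.5 * X + rng.normal(size=n)   # correlated with X

def cov(u, v):
    """Population-style covariance: mean of (U - mu_U).(V - mu_V)."""
    return np.mean((u - u.mean()) * (v - v.mean()))

# Commutativity: Cov(X, Y) = Cov(Y, X)
print(cov(X, Y), cov(Y, X))

# Displacement invariance: Cov(X + h, Y + k) = Cov(X, Y)
h, k = 7.0, -3.0
print(cov(X + h, Y + k), cov(X, Y))

# Scaling: Cov(a.X, b.Y) = a.b.Cov(X, Y)
a, b = 2.0, -4.0
print(cov(a * X, b * Y), a * b * cov(X, Y))

# Uncorrelated but dependent: X uniform on [-1, 1], Y = X^2
U = rng.uniform(-1.0, 1.0, size=n)
print(cov(U, U ** 2))   # close to zero even though U determines U^2
```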

# Variance

• $Var(X) = \sigma^2(X) = \sum(x^2.p) - \mu_X^2$
• Special case of Covariance when both variables are the same
• $Var(X) = Cov(X,X) = \sigma(X,X) = \sigma^2(X) = \sigma_X^2$
• Variance measures how much the values of a random variable are spread out, i.e. how much they differ from one another
• $Var(X) = \mathop{\mathbb{E}}[(X-\mathop{\mathbb{E}}(X))^2] = \mathop{\mathbb{E}}[(X-\mu_X)^2]$
• Variance from Expected Values of $X$ and $X^2$
• $\sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mathop{\mathbb{E}}^2(X)$
• $\sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mu_X^2$
• Variance of a non-random variable is zero
• $\sigma^2(a)=0$
• Variance is invariant to the displacement
• $\sigma^2(X+h) = \sigma^2(X)$
• If variable is scaled by constant, the variance gets scaled by square of the constant
• $\sigma^2(a.X) = a^2 . \sigma^2(X)$
• Variance of Sum and Difference of two correlated random variables
• $\sigma^2(X+Y) = \sigma^2(X) + \sigma^2(Y) + 2.\sigma(X,Y)$
• $\sigma^2(X-Y) = \sigma^2(X) + \sigma^2(Y) - 2.\sigma(X,Y)$
• If the variables are independent, then $\sigma(X,Y)=0$ and
• $\sigma^2(X+Y) = \sigma^2(X) + \sigma^2(Y)$
• $\sigma^2(X-Y) = \sigma^2(X) + \sigma^2(Y)$
• Variance of the product of two correlated random variables
• $\sigma^2(X.Y) = \sigma(X^2, Y^2) + \mathop{\mathbb{E}}(X^2).\mathop{\mathbb{E}}(Y^2) - [\sigma(X,Y) + \mathop{\mathbb{E}}(X).\mathop{\mathbb{E}}(Y)]^2$
• $\sigma^2(X.Y) = \sigma(X^2, Y^2) + \mu_{X^2}.\mu_{Y^2} - [\sigma(X,Y) + \mu_X.\mu_Y]^2$
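
The variance identities can be verified the same way (a NumPy sketch; `var` and `cov` here are population-style sample estimates, so the algebraic identities hold exactly up to floating-point error):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000

X = rng.normal(loc=1.0, scale=2.0, size=n)
Y = 0.3 * X + rng.normal(size=n)       # correlated with X

var = lambda u: np.mean((u - u.mean()) ** 2)
cov = lambda u, v: np.mean((u - u.mean()) * (v - v.mean()))

# Var(X) = E(X^2) - E(X)^2
print(var(X), np.mean(X ** 2) - np.mean(X) ** 2)

# Displacement invariance and scaling: Var(X + h) = Var(X), Var(a.X) = a^2.Var(X)
print(var(X + 10.0), var(X))
print(var(3.0 * X), 9.0 * var(X))

# Sum of correlated variables: Var(X + Y) = Var(X) + Var(Y) + 2.Cov(X, Y)
print(var(X + Y), var(X) + var(Y) + 2.0 * cov(X, Y))
```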

# Mean Square Error (MSE) of an Estimator

$\begin{eqnarray*} MSE_\theta &=& \mathop{\mathbb{E}}(\hat\theta-\theta)^2 \\ &=& \mathop{\mathbb{E}}(\hat\theta^2 + \theta^2 - 2\theta\hat\theta) \\ &=& \mathop{\mathbb{E}}(\hat\theta^2) + \theta^2 - 2\theta\mathop{\mathbb{E}}(\hat\theta) \\ &=& \sigma^2(\hat\theta) + \mathop{\mathbb{E}}^2(\hat\theta) + \theta^2 - 2\theta\mathop{\mathbb{E}}(\hat\theta) \quad [\text{since } \sigma_X^2 = \mathop{\mathbb{E}}(X^2)-\mathop{\mathbb{E}}^2(X)] \\ &=& \sigma^2(\hat\theta) + [\mathop{\mathbb{E}}(\hat\theta)-\theta]^2 \\ &=& Var(\hat\theta) + (Bias~of~\hat\theta)^2 \end{eqnarray*}$
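
The decomposition can be illustrated by simulating a biased estimator, e.g. the $1/n$ (MLE) estimator of the variance of a normal distribution; the parameter values and sample sizes below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Estimate theta = sigma^2 of N(0, sigma^2) with the biased
# estimator (1/n) * sum (x_i - xbar)^2.
sigma2 = 4.0        # true parameter theta
n = 10              # sample size per experiment
trials = 200_000    # number of repeated experiments

samples = rng.normal(scale=np.sqrt(sigma2), size=(trials, n))
theta_hat = samples.var(axis=1)          # ddof=0: biased MLE of the variance

mse = np.mean((theta_hat - sigma2) ** 2)
variance = np.var(theta_hat)
bias_sq = (np.mean(theta_hat) - sigma2) ** 2

# The two sides of MSE = Var + Bias^2 agree (the identity is exact
# for sample moments, up to floating-point error).
print(mse, variance + bias_sq)
```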

References:

• https://www.odelama.com/data-analysis/Commonly-Used-Math-Formulas/

• http://people.missouristate.edu/songfengzheng/Teaching/MTH541/Lecture%20notes/evaluation.pdf
