# Skewed Distribution

This lesson introduces skewed distributions.

Sources:

- https://www.statisticshowto.com/probability-and-statistics/skewed-distribution/
- http://jse.amstat.org/v13n2/vonhippel.html
- https://www.statisticshowto.com/skewness/

# What is a Skewed Distribution?

If one tail is longer than another, the distribution is skewed.

- aka Asymmetric or Asymmetrical distributions

Symmetry means:

- One half of the distribution is a mirror image of the other half
- The tails are exactly the same, e.g. normal distribution

## Left-Skewed

Long left tail

- aka negatively-skewed distributions

- Long tail in the negative direction on the number line.
- The mean is to the left of the peak. This is the main idea behind “skewness”, which is technically a measure of how asymmetrically values are distributed around the mean.
- In most cases, the mean is also to the left of the median. This isn’t a reliable test for skewness though, as some distributions (e.g. many multimodal distributions) violate this rule. Treat it as a rule of thumb, not a set-in-stone one.

Box Plot (Left Skewed)

- Left Whisker is longer than right whisker

Histogram (Left Skewed)

## Right-Skewed

Long right tail

- aka positively-skewed distributions

- long tail in the positive direction on the number line. The mean is also to the right of the peak.

Histogram (Right Skewed)

Box Plot (Right Skewed)

- Right whisker is longer than left whisker.

Example:

- Numbers: $1, 2, 3$
  - Evenly spaced, with $2$ as the mean
- Adding a number to the far left: $-10,~ 1,~ 2,~ 3$
  - Left skewed
- Adding a value to the far right: $1,~ 2,~ 3,~ 10$
  - Right skewed
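
The effect in this example can be checked numerically. The sketch below uses only Python's standard `statistics` module; the dictionary and variable names are illustrative and not part of the lesson. It compares the mean with the median for each of the three number sets.

```python
from statistics import mean, median

# The three number sets from the example above.
datasets = {
    "symmetric": [1, 2, 3],
    "left skewed": [-10, 1, 2, 3],
    "right skewed": [1, 2, 3, 10],
}

for name, values in datasets.items():
    gap = mean(values) - median(values)
    # A negative gap means the mean sits left of the median,
    # a positive gap means it sits right of the median.
    print(f"{name}: mean={mean(values)}, median={median(values)}, mean-median={gap}")
```

The single extreme value drags the mean toward the long tail, while the median moves much less.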

# Exception

- Distribution from a 2002 General Social Survey. Respondents stated how many people older than 18 lived in their household.
- Right-skewed graph, but the mean is clearly to the left of the median.
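
To see how this can happen, here is a minimal sketch with made-up counts. The numbers in `counts` are hypothetical and are NOT the General Social Survey data; they only mimic the general shape (most households report two adults, a few report many). The skewness is computed as the third standardized moment.

```python
from statistics import mean, median

# Hypothetical counts: number of adults -> number of households (not real survey data).
counts = {1: 35, 2: 55, 3: 7, 4: 2, 5: 1}
adults = [value for value, count in counts.items() for _ in range(count)]

n = len(adults)
mu = mean(adults)
sigma = (sum((x - mu) ** 2 for x in adults) / n) ** 0.5

# Third standardized moment: positive means the right tail is longer.
skew = sum(((x - mu) / sigma) ** 3 for x in adults) / n

print(f"mean={mu}, median={median(adults)}, skewness={skew:.3f}")
```

With these counts the skewness is positive (right skewed), yet the mean falls below the median, because the many one-adult households outweigh the few large ones.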

# Compute Skewness

Skewness is a measure of the lack of symmetry in a distribution.

A standard normal distribution is perfectly symmetrical and has zero skew.

Other Zero-skewed distributions:

- T Distribution
- Uniform distribution
- Laplace distribution

Computation for various distributions (non-zero)

| Distribution | Equation |
| --- | --- |
| Bernoulli distribution | $\frac{1-2p}{\sqrt{p(1-p)}}$ |
| Beta distribution | $\frac{2(b-a)}{2+a+b}\sqrt{\frac{1+a+b}{ab}}$ |
| Binomial distribution | $\frac{1-2p}{\sqrt{np(1-p)}}$ |
| Chi square distribution | $2\sqrt{\frac{2}{r}}$ |
| F distribution | $\frac{2(2n+m-2)}{m-6}\sqrt{\frac{2(m-4)}{n(m+n-2)}}$ |
| Negative binomial | $\frac{2-p}{\sqrt{r(1-p)}}$ |
| Poisson distribution | $\nu^{-1/2}$ |
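
As a sanity check on one of these formulas, the sketch below computes the skewness of a binomial distribution directly from its probability mass function and compares it with the closed form $\frac{1-2p}{\sqrt{np(1-p)}}$. The function names and the values of `n` and `p` are illustrative choices, not part of the lesson.

```python
from math import comb, sqrt

def binomial_skewness_from_pmf(n, p):
    """Skewness computed from the binomial PMF as the third standardized moment."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mu = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mu) ** 2 * w for k, w in enumerate(pmf))
    third = sum((k - mu) ** 3 * w for k, w in enumerate(pmf))
    return third / var**1.5

def binomial_skewness_closed_form(n, p):
    """Closed-form binomial skewness: (1 - 2p) / sqrt(n p (1 - p))."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

n, p = 20, 0.2  # arbitrary example parameters
print(binomial_skewness_from_pmf(n, p))
print(binomial_skewness_closed_form(n, p))
```

Setting `n = 1` reduces the binomial case to the Bernoulli formula $\frac{1-2p}{\sqrt{p(1-p)}}$.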

## Calculation

- As a rule of thumb, the mean, mode and median can be used to figure out if you have a positively or negatively skewed distribution.
  - If the mean is greater than the mode, the distribution is positively skewed.
  - If the mean is less than the mode, the distribution is negatively skewed.
  - If the mean is greater than the median, the distribution is positively skewed.
  - If the mean is less than the median, the distribution is negatively skewed.
- Pearson Mode Skewness
  - $Skew = \frac{Mean - Mode}{Standard~ Deviation}$
- Alternative Pearson Mode Skewness (Median Skewness)
  - $Skew = 3 * \frac{Mean - Median}{Standard~ Deviation}$
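
The two Pearson coefficients can be sketched with Python's standard `statistics` module. The formulas do not say which standard deviation to use; this sketch assumes the sample standard deviation (`statistics.stdev`). The data is the first sample set from the Data part of this lesson, and the function names are illustrative.

```python
from statistics import mean, median, mode, stdev

def pearson_mode_skewness(values):
    """(mean - mode) / standard deviation, using the sample standard deviation."""
    return (mean(values) - mode(values)) / stdev(values)

def pearson_median_skewness(values):
    """3 * (mean - median) / standard deviation, using the sample standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

data = [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 7, 8]

print(pearson_mode_skewness(data))
print(pearson_median_skewness(data))
```

Both coefficients come out positive for this data, which agrees with the right skew seen in its histogram. The mode-based version is only meaningful when the data has a single clear mode.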
- Excel's `SKEW` function calculates the skewness of sample data
  - Excel uses the adjusted Fisher-Pearson standardized moment coefficient
  - $G = \frac{n}{(n-1)(n-2)} \sum \left(\frac{x_i - \bar{x}}{s}\right)^3$
  - $s$ is the sample standard deviation (`STDEV.S` in Excel)
- Excel's `SKEW.P` function calculates the skewness of population data
  - $S_k = \frac{1}{n} \sum \left(\frac{x_i-\mu}{\sigma}\right)^3$

Data

$X = \{1,1,~~ 2,2,2,2,2,~~ 3,3,3,3,~~ 5,5,~~ 7, 8\}$

- Plot Histogram
  - Insert -> Histogram
  - Right Click Data Area -> Format Data Series -> Bin Width 0.9
  - Right Skewed
- $SKEW.P() \implies 1.09899799$

$X = \{1, 2,~~ 3,3,~~ 5,5,5,~~ 7,7,7,7,~~ 8,8,8,8,8\}$

- Plot Histogram
  - Insert -> Histogram
  - Right Click Data Area -> Format Data Series -> Bin Width 0.9
  - Left Skewed
- $SKEW.P() \implies -0.704386$
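
The two `SKEW.P` results can be reproduced outside Excel. The sketch below implements the population formula $S_k$ and the adjusted Fisher-Pearson formula $G$ from the Calculation section in plain Python. The function names `skew_p` and `skew_sample` are illustrative stand-ins, not Excel or library APIs.

```python
from math import sqrt

def skew_p(values):
    """Population skewness: (1/n) * sum(((x - mu) / sigma) ** 3)."""
    n = len(values)
    mu = sum(values) / n
    sigma = sqrt(sum((x - mu) ** 2 for x in values) / n)
    return sum(((x - mu) / sigma) ** 3 for x in values) / n

def skew_sample(values):
    """Adjusted Fisher-Pearson: n / ((n-1)(n-2)) * sum(((x - x_bar) / s) ** 3)."""
    n = len(values)
    x_bar = sum(values) / n
    s = sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))  # sample std dev
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in values)

right_skewed = [1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 5, 5, 7, 8]
left_skewed = [1, 2, 3, 3, 5, 5, 5, 7, 7, 7, 7, 8, 8, 8, 8, 8]

# The lesson reports 1.09899799 and -0.704386 for SKEW.P on these two sets.
print(skew_p(right_skewed), skew_sample(right_skewed))
print(skew_p(left_skewed), skew_sample(left_skewed))
```

For the same data, the sample version is larger in magnitude than the population version by a factor of $\frac{\sqrt{n(n-1)}}{n-2}$, so the gap shrinks as $n$ grows.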