# Hypothesis


This post covers Hypothesis Testing.

# Hypothesis Testing

| Dependent Variable (measured on a scale from 1 to 10) | Sample Mean $\bar{X}$ (n=20) | Probability (Likely or Unlikely) |
|---|---|---|
| Student Engagement | $\bar{X}_E =$ Something | $p \sim 0.05$ |
| Student Learning | $\bar{X}_L =$ Something | $p \sim 0.10$ |

• It is difficult to decide on a threshold that separates likely from unlikely

# $\alpha$ Levels of Likelihood (Unlikelihood) - One Tailed

• If the probability of getting a particular sample mean is less than the chosen $\alpha$ level, that sample mean is considered unlikely to occur. Common levels are:
  • $\alpha = 0.05$ (5%)
  • $\alpha = 0.01$ (1%)
  • $\alpha = 0.001$ (0.1%)
• Equivalently, if a sample mean has a z-score greater than the corresponding $z^*$ (1.64, 2.33, 3.09), it is unlikely to occur

# Z-Critical Value

• If the probability of obtaining a particular sample mean is less than the alpha level, the sample mean falls in the tail of the distribution. This tail is called the critical region, and the z-value that marks its boundary is called the z-critical value.

• If the z-score of a sample mean is greater than the z-critical value, we have evidence that the sample statistic differs from that of the regular (untreated) population.

• If probability of critical region = alpha level = 0.05

  • z-critical value = 1.64

```python
from scipy import stats

alpha = 5/100
z = stats.norm.ppf(1 - alpha)  # one-tailed z-critical value
label = f'alpha={alpha} ({alpha*100}%), z={z:.2f}'
```

• If probability of critical region = alpha level = 0.01

  • z-critical value = 2.33

```python
alpha = 1/100
z = stats.norm.ppf(1 - alpha)  # one-tailed z-critical value
label = f'alpha={alpha} ({alpha*100}%), z={z:.2f}'
```

• If probability of critical region = alpha level = 0.001

  • z-critical value = 3.09

```python
alpha = 0.1/100
z = stats.norm.ppf(1 - alpha)  # one-tailed z-critical value
label = f'alpha={alpha} ({alpha*100}%), z={z:.2f}'
```

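• Example

  • Sample Mean = $\bar{X}$
  • $z = \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}$
  • Let $z=1.82$
  • $\bar{X}$ is significant at $p<0.05$, since $z$ is greater than 1.64 (the z-critical value for 0.05) but less than 2.33 (the z-critical value for 0.01)
  • $\bar{X}$ therefore falls in the red region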
• Example

| z-score | Significant at ($p <$) |
|---|---|
| 3.14 | 0.001 |
| 2.07 | 0.05 |
| 2.57 | 0.01 |
| 14.31 | 0.001 |
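
The lookup in the table above can be sketched in a few lines. This is a minimal illustration, not the post's own code: the helper name `strictest_significance` is hypothetical, and it uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm` so it runs without extra dependencies. It assumes a one-tailed test with the critical region in the right tail.

```python
from statistics import NormalDist

# One-tailed alpha levels used in this post.
ALPHA_LEVELS = (0.001, 0.01, 0.05)

def strictest_significance(z, alphas=ALPHA_LEVELS):
    """Return the smallest alpha whose one-tailed z-critical value is exceeded by z.

    Returns None when z does not reach even the most lenient level.
    """
    for alpha in sorted(alphas):  # try the strictest level first
        z_critical = NormalDist().inv_cdf(1 - alpha)
        if z > z_critical:
            return alpha
    return None

for z in (3.14, 2.07, 2.57, 14.31, 1.20):
    print(z, strictest_significance(z))
```

A z-score such as 1.20 falls short of every critical value, so the helper returns `None` for it.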

# Two-Tailed Critical Values

• Split the alpha level in half, with $\alpha/2$ in each tail

# Hypothesis

(Figure panels: One Tailed Test, One Tailed Test, Two Tailed Test)

• Two Outcomes

  • Sample Mean is outside the Critical Region
  • Sample Mean is inside the Critical Region
• $H_0$, Null Hypothesis

  • No significant difference between the current population parameters and what the new population parameters will be after the intervention
  • Sample Mean lies outside the critical region
  • $\mu \sim \mu_I$
• $H_a$ or $H_1$, Alternative Hypothesis

  • $\mu < \mu_I$
  • $\mu > \mu_I$
  • $\mu \ne \mu_I$
• Example

  • $H_0$: Most dogs have four legs (most = more than 50%)
  • $H_A$: Most dogs have fewer than four legs
  • Sample 10 dogs and find that all have four legs
    • Did we prove that the null hypothesis is true?
    • No
    • The sample gives us evidence that most dogs have four legs, but it does not prove it. We also did not prove the alternative hypothesis
    • We simply fail to reject the null hypothesis
  • Sample 10 dogs and find that 6 dogs have 3 legs
    • Is this evidence to reject the null hypothesis that most dogs have 4 legs?
    • Yes
    • Based on the sample, reject the null in favor of the alternative

| Z | $\alpha$-Level | Test |
|---|---|---|
| $\pm 1.64$ | 5% or 0.05 | One tailed |
| $\pm 2.33$ | 1% or 0.01 | One tailed |
| $\pm 3.09$ | 0.1% or 0.001 | One tailed |
| $\pm 1.96$ | 5% or 0.05 | Two tailed |
| $\pm 2.58$ | 1% or 0.01 | Two tailed |
| $\pm 3.29$ | 0.1% or 0.001 | Two tailed |

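
The critical values in this table can be recomputed rather than memorised. The sketch below is one way to do it, assuming the standard library's `statistics.NormalDist` as a stand-in for `scipy.stats.norm.ppf`. The helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed=False):
    """Positive z-critical value on the standard normal for a given alpha.

    For a two-tailed test the alpha level is split in half, so each tail
    holds alpha / 2 of the probability.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for two_tailed in (False, True):
    kind = "Two tailed" if two_tailed else "One tailed"
    for alpha in (0.05, 0.01, 0.001):
        print(f"{kind}  alpha={alpha:<5}  z = ±{z_critical(alpha, two_tailed):.2f}")
```

Rounded to two decimals, the values agree with the table. The one-tailed 5% value is closer to 1.645, which is why some references list it as 1.65.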
• Example

• EngagementLearning.csv

• $\mu = 7.47, \sigma = 2.413$

• Hypothesis Test

  • $H_0$ - no significant difference
    • The intervention does not make learners more engaged
    • It results in the same level of engagement
  • $H_1$ - significant difference
    • Makes learners more engaged, $\mu < \mu_I$
    • Makes learners less engaged, $\mu > \mu_I$
    • Changes how much learners are engaged, $\mu \ne \mu_I$
• Which Hypothesis Test to choose

  • $\mu < \mu_I$ - One Tailed Test (critical region in the right tail)
  • $\mu > \mu_I$ - One Tailed Test (critical region in the left tail)
  • $\mu \ne \mu_I$ - Two Tailed Test (critical region split between both tails)
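
The choice of alternative hypothesis determines where the critical region sits, and that can be written as a small decision helper. The sketch below is only illustrative: the function name `in_critical_region` and the labels `"greater"`, `"less"` and `"two-sided"` are assumptions, not names from the post. It uses `statistics.NormalDist` from the standard library instead of `scipy.stats.norm`.

```python
from statistics import NormalDist

def in_critical_region(z, alpha=0.05, alternative="two-sided"):
    """Return True if z falls in the critical region for the chosen alternative.

    alternative:
      "greater"   -> mu < mu_I, critical region in the right tail
      "less"      -> mu > mu_I, critical region in the left tail
      "two-sided" -> mu != mu_I, alpha split across both tails
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return z < std_normal.inv_cdf(alpha)
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.884  # an example z-score
for alternative in ("greater", "less", "two-sided"):
    print(alternative, in_critical_region(z, 0.05, alternative))
```

The same z-score can lead to different decisions: 1.884 exceeds the one-tailed critical value of about 1.64 but not the two-tailed value of 1.96, so the test has to be chosen before looking at the data.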
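• Two Tailed Test on Learning at 5% Level

  • z-critical values, $\pm 1.96$

```python
import numpy as np
from scipy import stats

# lower and upper z-critical values for a two-tailed test at alpha = 0.05
lb = round(stats.norm.ppf(0.025), 3) # -1.96
ub = round(stats.norm.ppf(0.025+.95), 3) # 1.96
```

```python
mu = 7.47
sigma = 2.413

n = 30
xbar = 8.3

# standard error of the mean
std_error = sigma / np.sqrt(n)

# z-score of the sample mean on the sampling distribution
z = (xbar - mu)/std_error # 1.884
```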
• At $\alpha = 0.05$, do we reject or fail to reject the null?

  • Fail to reject the null hypothesis: the z-score (1.884) is less than the z-critical value (1.96), so the sample mean lies outside the critical region
  • There is not enough evidence that the new population parameters will be significantly different from the current ones
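
Written out with the values from this example ($\mu = 7.47$, $\sigma = 2.413$, $n = 30$, $\bar{X} = 8.3$), the arithmetic behind the decision is roughly:

$z = \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}} = \frac{8.3 - 7.47}{\frac{2.413}{\sqrt{30}}} \approx \frac{0.83}{0.4406} \approx 1.884 < 1.96$
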
• The same test with a larger sample size, $n = 50$

```python
import numpy as np

mu = 7.47
sigma = 2.413

# larger sample size
n = 50
xbar = 8.3

std_error = sigma / np.sqrt(n)
# z-score of the sample mean on the sampling distribution
z = (xbar - mu)/std_error # 2.43
z
```

• At $\alpha = 0.05$, do we reject or fail to reject the null?

  • Reject the null hypothesis, $p < 0.05$: the z-score (2.43) is greater than the z-critical value (1.96)
  • There is enough evidence that the new population parameters will be significantly different from the current ones
• What is the probability of randomly selecting a sample of size 50 with a mean of at least 8.3 from the population?

```python
import numpy as np
from scipy import stats

mu = 7.47
sigma = 2.413

n = 50
xbar = 8.3
std_error = sigma / np.sqrt(n)
z = (xbar - mu)/std_error
print(z) # ~2.43

# upper-tail probability P(Z >= z)
p = round(1 - stats.norm.cdf(z), 3)
print(p) # 0.008
```
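
The probability above is an upper-tail (one-tailed) probability, while the test on this sample was two-tailed. The sketch below puts the two side by side using the same example values. It relies on `statistics.NormalDist` from the standard library rather than `scipy.stats`, and the variable names `p_one_tailed` and `p_two_tailed` are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 7.47, 2.413   # population parameters from the example
n, xbar = 50, 8.3         # sample size and sample mean from the example

std_error = sigma / sqrt(n)
z = (xbar - mu) / std_error

# Upper-tail probability: P(sample mean >= xbar) if the null is true.
p_one_tailed = 1 - NormalDist().cdf(z)
# Two-tailed p-value: a deviation at least this large in either direction.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
```

The two-tailed value is twice the one-tailed one, roughly 0.015, which is still below 0.05 and so agrees with the decision to reject the null.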


# Decision Errors

| | Reject $H_0$ | Retain $H_0$ |
|---|---|---|
| $H_0$ True | Statistical decision error: Type I Error | Correct |
| $H_0$ False | Correct | Statistical decision error: Type II Error |

• Example
  • $H_0$: The beverage is fine to drink now
  • $H_A$: The beverage is too hot to drink
  • A - You decide the beverage is fine to drink now, but it’s too hot and you burn your tongue
    • Retain $H_0$ when $H_0$ is false (Type II error)
  • B - You decide the beverage is fine to drink now, and it is
    • Retain $H_0$ when $H_0$ is true (correct)
  • C - You think the beverage is too hot, so you wait to drink it, but it’s actually fine now, and by the time you drink it, it’s too cold
    • Reject $H_0$ when $H_0$ is true (Type I error)
  • D - You think the beverage is too hot and indeed it is, so you wait to drink it and then it’s perfect
    • Reject $H_0$ when $H_0$ is false (correct)
• Example
  • $H_0$: It’s not going to rain
  • $H_A$: It will rain
  • A - It doesn’t rain
    • $H_0$ True
  • B - You didn’t bring your umbrella
    • Retain $H_0$
  • C - You bring your umbrella
    • Reject $H_0$
  • D - It rains
    • $H_0$ False
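
The decision-error table can also be expressed as a tiny lookup. The sketch below is illustrative only: `classify_decision` is a hypothetical helper, and the scenario encoding (whether $H_0$ was rejected, whether $H_0$ was actually true) is one possible way to represent the beverage example.

```python
def classify_decision(reject_h0: bool, h0_is_true: bool) -> str:
    """Map a (decision, truth) pair onto a cell of the decision-error table."""
    if reject_h0 and h0_is_true:
        return "Type I Error"    # rejected a true null
    if not reject_h0 and not h0_is_true:
        return "Type II Error"   # retained a false null
    return "Correct"

# Beverage example: H0 = "the beverage is fine to drink now".
# Each entry is (reject_h0, h0_is_true).
scenarios = {
    "A": (False, False),  # drank it (retain H0) but it was too hot (H0 false)
    "B": (False, True),   # drank it (retain H0) and it was fine (H0 true)
    "C": (True, True),    # waited (reject H0) although it was fine (H0 true)
    "D": (True, False),   # waited (reject H0) and it was too hot (H0 false)
}

for name, (reject_h0, h0_is_true) in scenarios.items():
    print(name, classify_decision(reject_h0, h0_is_true))
```

In the rain example the four statements map onto the same two inputs: A and D describe whether $H_0$ is true, B and C describe the decision.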

# Learning Example

$H_0: \mu_I = \mu$

$H_A: \mu_I \ne \mu$

$\mu = 7.47,~ \sigma = 2.41$

$n=30,~ \bar{X}=8.3, \mu_{new} =7.8$

Two-tailed Test $\alpha=0.05$

Ans: Retain $H_0$, and $H_0$ is true, so the decision is correct

```python
import numpy as np

mu = 7.47
std = 2.41

n = 30
xbar = 8.3
mu_new = 7.8

std_error = std/np.sqrt(n)

# z-score of the sample mean: drives the decision about H0
z = (xbar - mu)/std_error
z = round(z, 3)
print(z) # 1.886

print('z-critical at 0.05 is 1.96')
print('Retain Null')

# z-score of the new population mean: used to judge whether H0 is true
z = (mu_new - mu)/std_error
z = round(z, 3)
print(z) # 0.75
print('H0 is True since mu_new is not significantly different from mu since mu_new is outside critical region')

print('Retain H0 when H0 is True')
print('Correct')
```

• Sample Size = 50

```python
import numpy as np

mu = 7.47
std = 2.41

n = 50
xbar = 8.3
mu_new = 7.8

std_error = std/np.sqrt(n)

# z-score of the sample mean: drives the decision about H0
z = (xbar - mu)/std_error
z = round(z, 3)
print(z) # 2.435

print('z-critical at 0.05 is 1.96')
print('Reject Null')

# z-score of the new population mean: used to judge whether H0 is true
z = (mu_new - mu)/std_error
z = round(z, 3)
print(z) # 0.968
print('H0 is True since mu_new is not significantly different from mu since mu_new is outside critical region')

print('Reject H0 when H0 is True')
print('Type I Error')
```
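
The only thing that changes between the two runs is the sample size, which shrinks the standard error and pushes the same sample mean into the critical region. The sketch below looks for the smallest $n$ at which $\bar{X} = 8.3$ becomes significant in a two-tailed test at $\alpha = 0.05$. It is an illustration built on the example's numbers, using `statistics.NormalDist` from the standard library instead of `scipy.stats`. The names `z_score`, `threshold` and `min_n` are introduced here and do not come from the post.

```python
from math import ceil, sqrt
from statistics import NormalDist

mu, std = 7.47, 2.41   # population parameters from the example
xbar = 8.3             # sample mean from the example
alpha = 0.05

z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed, about 1.96

def z_score(n):
    """z-score of xbar on the sampling distribution for sample size n."""
    return (xbar - mu) / (std / sqrt(n))

# Solving z_score(n) > z_critical for n gives
# n > (z_critical * std / (xbar - mu)) ** 2
threshold = (z_critical * std / (xbar - mu)) ** 2
min_n = ceil(threshold)

print(f"threshold = {threshold:.2f}, smallest rejecting n = {min_n}")
for n in (30, min_n - 1, min_n, 50):
    print(n, round(z_score(n), 3), z_score(n) > z_critical)
```

Under these assumptions the decision flips just above $n = 30$, so with a fixed sample mean the outcome of the test is quite sensitive to how many observations were collected.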

