
In this blog, we will learn about hypothesis testing, the null and alternative hypotheses, the significance level and rejection region, statistical errors, the confidence level and the p-value, with some practical exercises on the null and alternative hypotheses.

Let us first understand what a hypothesis is and why we test it.

Hypothesis

A hypothesis is an idea, prediction or assumption that can be tested by an experiment.

If we say the petrol price in India is high, this is an assumption or a statement, but it is not testable until we have something to compare it with. But if we define ‘high’ as any price higher than Rs. 73.41, then it immediately becomes a hypothesis.

Now consider what cannot be a hypothesis. Suppose we compare the progress of two students, A and B, in a class before their assessment, and ask which of them will do better. Statistically this is an assumption, but there is no data to test it, so it cannot be the hypothesis of a statistical test.

Conversely, we may compare the progress of two students who have already passed that class, as we have data for both.

The assumption is called a hypothesis and the statistical tests used for this purpose are called statistical hypothesis tests.

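This assumption or hypothesis may or may not be true. Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses. Strictly speaking, a test never proves the null hypothesis: when this blog says we "accept" it, it means the data did not give us enough evidence to reject it.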

There are two hypotheses that are made:

• Null Hypothesis, denoted by H0, and
• Alternative Hypothesis, denoted by H1 or HA

The null hypothesis is the one to be tested, and the alternative hypothesis is its complement: it covers every case in which the null hypothesis is false.

Steps Involved In Hypothesis Testing

There are 4 steps involved in Hypothesis Testing:

• We formulate our null and alternative hypotheses
• Once the hypotheses have been formulated, we choose the right test for them
• We execute the test
• We make a decision, based on the result, to accept or reject the null hypothesis

The above steps are also called data-driven decision-making.

Example

Let us explain the above concepts and the four steps with the help of a simple example:

Suppose we flip a coin 50 times, and we assume that half the flips will result in heads and half in tails.

Here the null hypothesis would be: “result would be half heads and half tails“.

And the alternate hypothesis would be: “number of heads and tails would be very different“.

Now, when we actually execute the experiment and test the outcomes, we see that we get 30 heads and 20 tails.

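Whether we reject the null hypothesis depends on how unlikely this result would be if the coin were fair. A 30-20 split is only mildly unusual for 50 flips of a fair coin (a split at least this uneven happens by chance roughly one time in five), so at the usual 5% significance level we would not reject the null hypothesis. A much more lopsided result, such as 40 heads and 10 tails, would lead us to reject the null hypothesis and accept the alternative hypothesis.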
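One way to quantify how surprising 30 heads in 50 flips is under the null hypothesis is an exact binomial calculation. The sketch below is only an illustration: the helper name `binomial_two_sided_p` is made up for this example, and the quantity it returns is the p-value discussed later in this blog. Whether to reject then depends on comparing that value with the chosen significance level.

```python
from math import comb


def binomial_two_sided_p(heads: int, flips: int) -> float:
    """Exact two-sided p-value for a fair coin (probability of heads = 0.5).

    Hypothetical helper for illustration: it adds up the probabilities of all
    outcomes that are at least as far from the expected count as the observed one.
    """
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Every sequence of flips is equally likely for a fair coin, so we can
    # count sequences and divide by the total number of sequences (2 ** flips).
    extreme_sequences = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme_sequences / 2 ** flips


p_value = binomial_two_sided_p(30, 50)
print(f"p-value for 30 heads in 50 flips: {p_value:.3f}")

alpha = 0.05  # significance level, chosen before looking at the data
print("reject H0" if p_value < alpha else "do not reject H0")
```

The same function can be called with a more lopsided result, for example `binomial_two_sided_p(40, 50)`, to see how quickly the p-value shrinks.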

Hopefully you now have a clear idea of what a hypothesis is and what we mean by hypothesis testing.

If you want to understand why hypothesis testing works, you should first have an idea about the significance level and the rejection region.

So let’s jump right into the action.

Significance Level

Normally we aim at rejecting the null hypothesis if it is false. However, as with any test, there is a small chance that we could get it wrong and reject the null hypothesis that is true.

Hence, the significance level is the probability of rejecting the null hypothesis when it is true. It is denoted by α.

Typical values for α are 0.01, 0.05 and 0.1. It is a value that we select based on the certainty we need. 0.05 is the most commonly used value.

In other words, the significance level is a statistical way of demonstrating how confident you are in your conclusion. If you set a high alpha (0.1), then you’ll have a better chance at supporting your alternative hypothesis. However, you’ll also have a bigger chance of being wrong about your conclusion.

Example

Suppose we need to test whether a machine is working properly. We would expect the test to make few or no mistakes. As we want to be very precise, we should pick a low significance level such as 0.01, since a low level of significance means we reject the null hypothesis only when the evidence against it is very strong.

A packet of cookies from a famous brand contains 5 pieces per packet. If the machine drops 1 extra cookie in the packet, it will lead to the overall damage of the packaging. So, in certain situations, we need to be as accurate as possible. Here, we can keep the value of α as 0.01.

However, if we are analyzing humans or an organization, we would expect more random or uncertain behavior and hence a higher degree of error, so a higher significance level such as 0.05 or 0.1 is usually acceptable.

Now that we have some idea of hypothesis testing, let us look at its mechanics.

Mechanism of Hypothesis Testing

Suppose we want to analyze how students in a certain university are performing overall.

The dean of the university says that, on average, students score 75%. But we can’t simply accept this opinion, so we start testing.

Here H0 is: The population mean percentage is 75%

And H1/HA : The population mean percentage is not 75%

Therefore, H0 : μ = 75%

H1 : μ ≠ 75%

Now we perform the Z-test. The formula is:

Z = (x̅ – μ0) / (s / √n)

Here x̅ is the sample mean, μ0 is the hypothesized mean (75% in our example), ‘s’ is the sample standard deviation and ‘n’ is the sample size, so s / √n is the standard error.

Through this, we are standardizing the sample mean we got. So if the sample mean is close enough to the hypothesized mean, then Z will be close to 0.

In this case, we will accept the Null Hypothesis (As demonstrated in the image below)

Otherwise, we will reject it.
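As a minimal sketch of this standardization, the snippet below computes Z for the university example as (sample mean minus hypothesized mean) divided by the standard error. The sample mean (73.2%), sample standard deviation (8) and sample size (64) are made-up numbers used only for illustration; they are not real data about the dean's university.

```python
from math import sqrt


def z_statistic(sample_mean: float, hypothesized_mean: float,
                sample_std: float, n: int) -> float:
    """Standardize a sample mean: (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_std / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical sample: 64 students with a mean score of 73.2% and s = 8
z = z_statistic(sample_mean=73.2, hypothesized_mean=75.0, sample_std=8.0, n=64)
print(round(z, 2))  # -1.8
```

A value of Z near 0 means the sample mean is close to the hypothesized 75%; the further Z is from 0, the less compatible the sample is with the null hypothesis.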

You might be wondering when we reject the null hypothesis, that is, how big Z should be for us to reject it.

Here we will be using a two-sided or two-tailed test. A two-tailed test is used when the null hypothesis contains an equality sign (=) and the alternative hypothesis contains a "not equal" sign (≠).

“A two-tailed test is a test of a statistical hypothesis, where the region of rejection is on both sides of the sampling distribution and the region of acceptance is in the middle.”

When we calculate Z, we will get a value. If this value falls into the middle part, then we accept the null hypothesis, but if it falls outside, in the shaded region, then we reject the null hypothesis.

The shaded part in the above image is called the Rejection Region.

The cut-off value for the rejection region depends on the value of the Significance Level.

For instance, suppose the level of significance, α, is 0.05. We divide α by 2 and get 0.025 on the left side and 0.025 on the right side.

These are values we can look up in the z-table. When the area in each tail is 0.025, Z is 1.96. So we have 1.96 on the right side and -1.96 on the left side, as shown in the image below.

Therefore, if the value of Z we get from the test is lower than -1.96 or higher than 1.96, we reject the null hypothesis. Otherwise, we accept it.
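The cut-off can also be computed instead of read from a z-table. The sketch below uses `statistics.NormalDist` from the Python standard library to find the two-tailed critical value for a given α and then applies the decision rule described above. The function name `two_tailed_decision` and the Z values passed to it are assumptions made for this example.

```python
from statistics import NormalDist


def two_tailed_decision(z: float, alpha: float = 0.05):
    """Return (critical value, reject?) for a two-tailed Z-test."""
    # Put alpha / 2 in each tail: the critical value leaves alpha / 2 to its right
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return critical, abs(z) > critical


critical, reject = two_tailed_decision(-1.8)      # hypothetical Z inside the middle region
print(round(critical, 2), reject)                 # 1.96 False

critical, reject_big = two_tailed_decision(2.81)  # hypothetical Z in the rejection region
print(round(critical, 2), reject_big)             # 1.96 True
```

Lowering α to 0.01 moves the cut-off further out (to about 2.58), which makes the rejection region smaller.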

Here’s a summary of why we need the significance level and the rejection region.

The significance level and the rejection region are quite important in the process of hypothesis testing. The level of significance controls how often we are willing to reject a null hypothesis that is actually true. We choose it depending on how big a difference a possible error could make. On the other hand, the rejection region helps us decide whether or not to reject the null hypothesis.

Statistical Error

No hypothesis test is 100% accurate. There is always a chance of making an incorrect conclusion because the test is based on Probability. Hence while doing hypothesis testing, two types of errors are possible, Type I and Type II Error.

Type I Error: When the null hypothesis is true and you reject it, you make a type I error. This type of error is also known as False Positive.

Type II Error: When the null hypothesis is false and you accept it, you make a  type II error. This type of error is also known as False Negative.

The probability of committing a Type I error (False positive) is equal to the significance level α.

The probability of committing a Type II error (False negative) is denoted by β.
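The link between α and the Type I error rate can be checked with a small simulation. The sketch below repeatedly draws samples from a population in which the null hypothesis is true and counts how often a two-tailed Z-test rejects it anyway. The population mean (75), standard deviation (10), sample size, number of trials and random seed are all arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(alpha: float = 0.05, n: int = 30,
                      trials: int = 5000, seed: int = 0) -> float:
    """Estimate how often a two-tailed Z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    mu, sigma = 75.0, 10.0  # hypothetical population; H0: mean = 75 is true here
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / (sigma / sqrt(n))
        if abs(z) > critical:
            rejections += 1  # H0 is true, so every rejection is a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f} (alpha = 0.05)")
```

The observed rate should land close to α, though it will not match it exactly because the simulation uses a finite number of trials.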

Let’s understand this with the help of examples

Suppose we have to decide whether a person accused of a crime is guilty or not.

We define our Null Hypothesis and Alternate Hypothesis as:

H0 : Person is not guilty of the crime

H1 : Person is guilty of the crime.

In the above example, two cases can occur.

i) The person is judged guilty when they actually did not commit the crime, i.e., an innocent person is convicted. Here we commit a Type I error.

ii) The person is judged not guilty when they actually did commit the crime, i.e., a guilty person goes free. Here we commit a Type II error.

Let us understand this with the help of the table below.

| | H0 is true (person is not guilty) | H0 is false (person is guilty) |
|---|---|---|
| Reject the null hypothesis (judged guilty) | Type I error (false positive): the person is judged guilty when they actually did not commit the crime. | Correct decision |
| Accept the null hypothesis (judged not guilty) | Correct decision | Type II error (false negative): the person is judged not guilty when they actually did commit the crime. |

We can take another example, this time from medicine.

H0 : The treatment cures disease A

H1 : The treatment doesn’t cure disease A

For the above instance, the table would be:

| | H0 is true (treatment cures disease A) | H0 is false (treatment doesn’t cure disease A) |
|---|---|---|
| Reject the null hypothesis | Type I error (false positive): the treatment cured disease A for a person, yet the report says it doesn’t. | Correct decision |
| Accept the null hypothesis | Correct decision | Type II error (false negative): the treatment didn’t cure disease A for a person, yet the report says it does. |

I hope we are now pretty clear about the Statistical Error.

Now we will understand the concept of the p-value. But before moving further, we will understand what a point estimate and a confidence interval are.

An estimate is a value, computed from a sample, that approximates a population parameter. There are two types of estimates:

• Point Estimate
• Confidence Interval Estimate

Point Estimate

It is a single number. The point estimate is located exactly in the middle of the confidence interval. As we have seen in our earlier blog, the sample mean x̅ is a point estimate of the population mean μ. Likewise, the sample variance s² is a point estimate of the population variance σ². We always prefer an unbiased estimator, that is, one whose expected value equals the population parameter.

For example, suppose we are interested in the mean weight of 10-year-old girls living in the United States. Since it would be impractical to weigh all of them, we take a sample of 16 girls and find that their mean weight is 25 kg. This sample mean of 25 kg is a point estimate of the population mean.

On its own, however, this single number does not tell us how far it may be from the true population mean, so a point estimate alone is of limited use.

Confidence Interval

This, on the other hand, is an interval. A confidence interval provides much more information and is preferred when making inferences. The point estimate lies in the middle of the confidence interval.

Confidence Interval is the range within which we expect the population parameter to be.

If we say the average meal in India costs somewhere between Rs. 50 and Rs. 100, we have created a confidence interval around the point estimate. However, there is still some uncertainty left, which we measure with the level of confidence.

Taking the same example, we might say we are 90% confident that the population parameter lies between Rs. 50 and Rs. 100. However, we cannot be 100% confident unless we go through the entire population.

The Confidence level is denoted by 1 – α and is called the Confidence Level of Interval. α is a value between 0 and 1.

Example: if we say we are 90% confident that the parameter is inside the interval, α is 10%.

If we are 95% confident, α will be 5%.

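The confidence interval is calculated with the following formula:

Confidence interval = point estimate ± (reliability factor × standard error)

For a population mean with a known population standard deviation σ, this becomes x̅ ± z × σ / √n, where z is the z-table value that leaves α/2 in each tail (1.96 for a 95% confidence level).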

The common confidence levels are 90%, 95% and 99%, with respective alphas of 10%, 5% and 1%. In other words, α = 0.1, 0.05 and 0.01.
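To make the link between the confidence level, α and the width of the interval concrete, here is a small sketch of a z-based confidence interval for a mean, computed as the sample mean plus or minus a z-table value times σ / √n. It reuses the sample of 16 girls with a mean weight of 25 kg from the point estimate example, and it assumes a known population standard deviation of 4 kg, which is a made-up value that does not appear in the example.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, sigma: float, n: int,
                          confidence: float = 0.95):
    """Interval: sample_mean +/- z * sigma / sqrt(n), with alpha / 2 in each tail."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# sigma = 4.0 kg is an assumption made only for this illustration
low, high = z_confidence_interval(sample_mean=25.0, sigma=4.0, n=16)
print(round(low, 2), round(high, 2))  # 23.04 26.96

# A higher confidence level gives a wider interval
low99, high99 = z_confidence_interval(25.0, 4.0, 16, confidence=0.99)
print(round(low99, 2), round(high99, 2))
```

With a sample this small and an unknown population standard deviation, a t-based interval would normally be used instead; the z-based version is shown here only because it matches the z-table used throughout this blog.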

Let’s take one more example to make sure we have a firm grip on this concept.

I don’t know the age of the reader of this blog, but I am 95% confident that your age lies between 18 and 55 years, based on the fact that you are looking online for a statistics article. However, I don’t have much information to begin with, and I don’t know the age of any reader. Hence the wide interval.

So I am 95% confident that you are between 18 and 55 years old. Also, I’m 99% confident that you are between 10 and 70 years old and I am 100% confident that you are between 0 and 110 years old.

Finally, I’m 5% confident that you are 25 years old. This value lies inside our interval, but it is a very arbitrary number.

The above explanation is described by the chart below.

A 100% confidence interval is completely useless, as it includes all possible values for age.

An estimate of 25 years old is very precise, but a confidence level of 5% is too small for us to make any meaningful analysis.

Alright, we will now discuss the p-value.

p-Value

We know that the null hypothesis can be rejected at various levels of significance. What we have not yet found is the level of significance below which we can no longer reject it. This is what a new measure, called the p-value, gives us.

This is the most common way of testing a hypothesis. Instead of testing the hypothesis at predefined levels of significance, we find the smallest level of significance at which we can still reject the null hypothesis.

How Do We Calculate The p-Value?

When we test a hypothesis, we compute the test statistic Z. We then look up the value corresponding to Z in the z-table and use it to calculate p. If p is lower than the significance level chosen for the test, we reject the null hypothesis; otherwise we accept it.

We calculate p using the formula:

1-tailed test:  p = (1 – number from the table)

2-tailed test:  p = (1 – number from the table) * 2

Example

Suppose we are doing hypothesis testing with a level of significance of 0.05, and we get the value of Z as 2.81.

We look up Z = 2.81 in the z-table and get the value 0.9975.

For a 1-tailed test, we calculate the p-value as: p = 1 – 0.9975 = 0.0025

For a 2-tailed test, we calculate it as: p = (1 – 0.9975) * 2 = 0.005

Now we compare the value of p with alpha.

Since the p-value is below α = 0.05 in both cases (0.0025 < 0.05 and 0.005 < 0.05), we reject the null hypothesis.

And this is how we calculate the p-value for a 1-tailed test as well as a 2-tailed test.

The p-value is an extremely powerful measure, as it works for all distributions.
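The z-table lookup in the example can be reproduced in code. In this sketch, `NormalDist().cdf` from the Python standard library plays the role of the "number from the table", and the helper name `p_value_from_z` is made up for this illustration.

```python
from statistics import NormalDist


def p_value_from_z(z: float, two_tailed: bool = False) -> float:
    """p = 1 - (number from the table), doubled for a 2-tailed test."""
    table_value = NormalDist().cdf(abs(z))  # area to the left of |z|
    p = 1 - table_value
    return 2 * p if two_tailed else p


alpha = 0.05
p_one = p_value_from_z(2.81)
p_two = p_value_from_z(2.81, two_tailed=True)

print(round(p_one, 4), p_one < alpha)  # 0.0025 True
print(round(p_two, 4), p_two < alpha)  # 0.005 True
```

Both p-values are below 0.05, so the null hypothesis is rejected whether the test is 1-tailed or 2-tailed, which matches the hand calculation.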

Difference Between Z-Test and T-Test

We have already read about the T-test in our previous blog. You can refer to that blog from the link given at the end of this blog.

Both the Z-test and the T-test are part of hypothesis testing under the normal distribution.

| Z-test | T-test |
|---|---|
| The z-score is calculated with the formula: z = (x̅ – μ) / (σ / √n) | The t-score is calculated with the formula: t = (x̅ – μ) / (s / √n) |
| The z-score is used when we know the population standard deviation σ. | The t-score is used when we don’t know the population standard deviation σ. |
| When the sample size is above 30, we use the z-score. | When the sample size is below 30, we use the t-score. |
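As a rough sketch of the T-test side of this comparison, the snippet below computes a t-score for a small sample whose population standard deviation is unknown, so the sample standard deviation s is estimated from the data. The eight scores are invented for illustration, and the hypothesized mean of 75 is borrowed from the university example earlier in this blog.

```python
from math import sqrt
from statistics import fmean, stdev


def t_statistic(sample, hypothesized_mean: float) -> float:
    """t = (x_bar - mu) / (s / sqrt(n)), with s estimated from the sample."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (fmean(sample) - hypothesized_mean) / (s / sqrt(n))


# Hypothetical scores of 8 students (n < 30 and sigma unknown, so a t-score applies)
scores = [72, 78, 69, 81, 74, 70, 77, 75]
t = t_statistic(scores, hypothesized_mean=75.0)
print(round(t, 3))
```

Unlike the z-score, this value is compared against a t-table with n – 1 degrees of freedom (7 here) rather than the z-table.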

Now that we are almost done with the concept of hypothesis testing, let us look at some practical examples.

Q: State the null hypothesis, H0, and the alternative hypothesis, Ha, for the following statements:

1. Mean number of years Indians work before retiring is 40.
2. At most 60% of Indians vote in presidential elections.
3. Mean starting salary for ABC University graduates is at least Rs 300,000 per year.
4. 10 percent of high school seniors fail each month.
6. The mean number of cars a person owns in her lifetime is not more than 5.
7. About half of Indians prefer to live away from cities, given the choice.
8. Indians have a mean paid vacation each year of six weeks.
9. The chance of developing breast cancer is under 11% for women.
10. Private universities’ mean tuition fee is more than 200,000 per year.

Ans. A: H0: μ = 40; Ha: μ ≠ 40

B: H0: p ≤ 0.60; Ha: p > 0.60

C: H0: μ ≥ 300,000; Ha: μ < 300,000

D: H0: p = 0.1; Ha: p ≠ 0.1

E: H0: p = 0.7; Ha: p < 0.7

F: H0: μ ≤ 5; Ha: μ > 5

G: H0: p = 0.50; Ha: p ≠ 0.50

H: H0: μ = 6; Ha: μ ≠ 6

I: H0: p ≥ 0.11; Ha: p < 0.11

J: H0: μ ≤ 200,000; Ha: μ > 200,000

This brings us to the end of this blog. Hope this blog helped you in understanding the working of Hypothesis Testing. For any query or suggestion do drop us a comment below.

You can refer to our previous blog based on statistics.

Keep visiting our website AcadGild for more blogs on Data Science and Data Analytics.

Happy Learning:)

### Mitali Singh

Python|| Machine Learning|| Statistics|| Data Science
