Hypothesis Testing: Type 1 and Type 2 Errors

Ken Hoffman
Published in Analytics Vidhya
Jan 10, 2021

Introduction:

In hypothesis testing, the goal is to decide whether the data provide enough evidence to reject a default statement, the null hypothesis. For example, you might want to test whether a store’s marketing campaign is effective. Here the null hypothesis would be that the campaign had no effect, and you would compare a statistic, such as the average number of purchases per day, before and after the campaign.
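As a rough illustration of the marketing example, the sketch below runs a one-sided permutation test. The daily purchase counts are made up for this example, and the permutation test is just one possible choice of test, not necessarily the one a given analysis would use. The function name `permutation_p_value` and the threshold of 0.05 are likewise assumptions.

```python
import random

# Hypothetical daily purchase counts, invented purely for illustration.
before = [52, 48, 55, 50, 47, 53, 49, 51, 46, 54]
after = [58, 55, 60, 53, 57, 61, 54, 59, 56, 62]


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """One-sided p-value for H0: the campaign did not raise mean purchases."""
    rng = random.Random(seed)
    observed = sum(b) / len(b) - sum(a) / len(a)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the "before"/"after" labels are interchangeable.
        rng.shuffle(pooled)
        group_a = pooled[:len(a)]
        group_b = pooled[len(a):]
        diff = sum(group_b) / len(group_b) - sum(group_a) / len(group_a)
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


alpha = 0.05  # assumed threshold for rejecting H0
p_value = permutation_p_value(before, after)
reject_null = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```

With real data, the same decision rule applies: the null hypothesis is rejected only when the p-value falls below the chosen threshold.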

In some cases, however, researchers will reject the null hypothesis, or fail to reject it, when they shouldn’t have. Data Scientists refer to these errors as Type I (False Positive) and Type II (False Negative) errors.

Type I Errors:

When conducting hypothesis tests, there is always a chance of rejecting a null hypothesis that is actually true. The significance level, alpha (𝛼), is the threshold used to decide whether to reject the null hypothesis: you reject it when the p-value falls below 𝛼. (The confidence level is 1 − 𝛼.) 𝛼 is also the probability that you reject the null hypothesis when it is actually true. This scenario is referred to as a Type I error or False Positive.
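The link between alpha and the Type I error rate can be checked with a small simulation. The sketch below assumes a two-sided z-test with a known standard deviation and normally distributed data. The helper names (`z_test_rejects`, `type_i_error_rate`) and all parameter values are illustrative choices, not part of any particular library.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test with known sigma; True if H0: mean == mu0 is rejected."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def type_i_error_rate(alpha=0.05, n=30, trials=5_000, seed=1):
    """Fraction of simulated experiments that reject H0 even though it is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are drawn with mean 0, so H0: mean == 0 really is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

Every rejection in this simulation is a False Positive, and the observed rate should land close to the chosen alpha of 0.05.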

Type II Errors:

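A Type II error occurs when a Data Scientist fails to reject a null hypothesis that is actually false. These errors are also referred to as False Negatives. The probability of a Type II error is usually written as beta (𝛽), and 1 − 𝛽 is called the power of the test.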
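A Type II error rate can be estimated the same way, by simulating data for which the null hypothesis is false and counting how often the test fails to reject it. This is a minimal sketch assuming a two-sided z-test with known standard deviation. The true mean of 0.5, the sample size of 30, and the function names are hypothetical values chosen for illustration.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test with known sigma; True if H0: mean == mu0 is rejected."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def type_ii_error_rate(true_mean=0.5, alpha=0.05, n=30, trials=5_000, seed=2):
    """Fraction of simulated experiments that fail to reject a false H0."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # The data have mean true_mean != 0, so H0: mean == 0 is false.
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if not z_test_rejects(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            misses += 1
    return misses / trials


beta = type_ii_error_rate()
print(f"Observed Type II error rate: {beta:.3f}")
```

Each miss here is a False Negative. A larger sample or a larger true effect would make such misses less frequent.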

Minimizing Type I or Type II Errors:

Different situations call for Data Scientists to minimize one type of error over the other. For a fixed sample size, the two errors are inversely related to one another; reducing Type I errors will increase Type II errors and vice versa. Let’s go through some different scenarios and determine whether it is more important to reduce Type I errors or Type II errors:

  • Testing patients for Coronavirus.
  • A Credit Card company flagging suspicious activity among its customers.
  • A jury deciding whether someone is guilty of a felony.

In the first and second scenarios, you would want to limit the number of Type II errors. In the first scenario, where the null hypothesis is that the patient does not have the virus, a Type II error means telling an infected patient that they are healthy. Because of how contagious the virus is, it is better to diagnose a patient that doesn’t have Coronavirus with Coronavirus than the opposite. For the second scenario, it is better to falsely flag a legitimate customer for suspicious Credit Card activity than to miss someone who is, in fact, committing fraud.

In the third scenario, where the null hypothesis is that the defendant is innocent, a Type I error would be worse than a Type II error. A Type I error means that you would send an innocent man or woman to jail. At the same time, a Type II error is not exactly ideal either, as it means that the jury is letting a guilty man or woman get away with a felony.
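The trade-off between the two error types can be made concrete with a short calculation. The sketch below assumes a one-sided z-test with known standard deviation, for which the Type II error probability has a closed form. The effect size, sample size, and the function name `type_ii_error` are hypothetical choices for illustration.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Type II error probability for a one-sided z-test of
    H0: mean = 0 against a true mean of `effect` > 0."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under H0
    shift = effect * n ** 0.5 / sigma           # how far the true mean moves z
    return std_normal.cdf(critical_z - shift)


alphas = [0.10, 0.05, 0.01, 0.001]
betas = [type_ii_error(a, effect=0.5, sigma=1.0, n=30) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:<6} beta = {b:.3f}")
```

A strict threshold, as a jury might want, lowers the chance of a False Positive at the cost of more False Negatives, while a lenient threshold, as in virus screening, does the reverse. With the threshold held fixed, collecting more data is what reduces the Type II error probability.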

Conclusion:

When performing hypothesis tests, it is important to understand the difference between Type I and Type II errors so that you can determine which error should be limited based on the scenario. In certain situations, such as testing for viruses or diseases, it is more important to limit the number of False Negatives, while other situations, such as ones relating to the judicial system, call for limiting the number of False Positives.
