A statistical significance test determines whether the observed data deviate far enough from what the null hypothesis predicts to justify rejecting it. It helps assess whether an observed effect is likely genuine or could plausibly have arisen by chance.
In statistics, a hypothesis is a statement or assumption about a population parameter. Hypotheses are used to test the validity of claims or assumptions made about the population.
The null hypothesis (H0) states that there is no effect, no difference, or no change. It serves as the default assumption that the test seeks to challenge.
Example:
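As a hypothetical illustration (the symbols here are assumptions for the sake of the example, not notation taken from the text), suppose a claim says that a population mean \mu equals a reference value \mu_0. The null hypothesis could then be written as:

```latex
H_0 : \mu = \mu_0
```

In the one-sample t-test later in this section, the reference value \mu_0 is 0.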
The alternative hypothesis is a statement that indicates the presence of an effect or a difference. It contradicts the null hypothesis.
Example:
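As a hypothetical illustration (the symbols are assumptions for the example), let \mu be the true population mean and \mu_0 the reference value stated by the null hypothesis. A two-sided alternative hypothesis, and the two possible one-sided alternatives, could be written as:

```latex
H_1 : \mu \neq \mu_0 \quad \text{(two-sided)}
\qquad
H_1 : \mu > \mu_0 \ \text{ or } \ H_1 : \mu < \mu_0 \quad \text{(one-sided)}
```

Which form applies depends on whether the question is about a difference in any direction or in one specific direction.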
A two-tailed test checks for an effect in either direction: it tests whether the sample mean is significantly higher or significantly lower than the hypothesized population mean.
Example:
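One way to picture a two-tailed test is through its rejection regions. The sketch below is a minimal illustration, assuming a one-sample t-test with a hypothetical sample size of 30 and a significance level of 0.05; the helper `rejects_two_tailed` is a name introduced here for illustration only.

```python
from scipy import stats

alpha = 0.05      # significance level (assumed for this sketch)
n = 30            # hypothetical sample size
df = n - 1        # degrees of freedom for a one-sample t-test

# Two-tailed: alpha is split evenly between the lower and upper tails.
t_crit_two = stats.t.ppf(1 - alpha / 2, df)
# One-tailed (upper): all of alpha sits in a single tail.
t_crit_one = stats.t.ppf(1 - alpha, df)


def rejects_two_tailed(t_statistic):
    """Return True if the t-statistic falls in either rejection region."""
    return abs(t_statistic) > t_crit_two


print("Two-tailed critical value:", t_crit_two)
print("One-tailed critical value:", t_crit_one)
print(rejects_two_tailed(2.5), rejects_two_tailed(-2.5), rejects_two_tailed(1.0))
```

Because the two-tailed test splits alpha across both tails, its critical value is larger than the one-tailed value at the same alpha, so a given result needs to be more extreme to count as significant.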
The alpha value (α), also known as the significance level, is the threshold for rejecting the null hypothesis. It is the probability of a Type I error (rejecting a null hypothesis that is actually true) that you are willing to accept, and it is chosen before running the test. Common alpha values are 0.05, 0.01, and 0.10.
Example:
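The link between alpha and the Type I error rate can be checked with a small simulation. The sketch below is illustrative only: the seed, the number of experiments, and the sample size are arbitrary assumptions. It repeatedly draws samples from a population where the null hypothesis is true and counts how often the test rejects it anyway.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)   # arbitrary seed, for reproducibility
alpha = 0.05
n_experiments = 2000              # hypothetical number of repeated experiments
sample_size = 50                  # hypothetical sample size per experiment

# Each row is one experiment drawn from a population whose true mean is 0,
# so the null hypothesis (mean = 0) is true in every experiment.
samples = rng.normal(loc=0, scale=1, size=(n_experiments, sample_size))

# Run one t-test per row.
_, p_values = stats.ttest_1samp(samples, 0, axis=1)

# Fraction of experiments that wrongly reject the true null hypothesis.
type_i_error_rate = np.mean(p_values < alpha)
print("Observed Type I error rate:", type_i_error_rate)
```

The observed rate should land close to alpha, though it will vary somewhat with the seed and the number of experiments.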
The p-value is the probability of obtaining test results at least as extreme as the observed results, under the assumption that the null hypothesis is true. A p-value below α means the result is statistically significant at that level, so you reject the null hypothesis; a p-value at or above α means you fail to reject it. The p-value is not the probability that the null hypothesis is true.
Example:
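To make the definition concrete, the following sketch computes a two-sided p-value by hand for a one-sample t-test and cross-checks it against SciPy. The measurements and the reference value `mu_0` are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements, invented for this illustration.
sample = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7])
mu_0 = 5.0                              # value claimed by the null hypothesis

n = sample.size
sample_mean = sample.mean()
sample_std = sample.std(ddof=1)         # sample standard deviation (n - 1 denominator)

# t-statistic: distance of the sample mean from mu_0 in standard-error units.
t_statistic = (sample_mean - mu_0) / (sample_std / np.sqrt(n))

# Two-sided p-value: probability of a t-statistic at least this extreme
# in either direction, assuming the null hypothesis is true.
p_manual = 2 * stats.t.sf(abs(t_statistic), df=n - 1)

# Cross-check against SciPy's built-in one-sample t-test.
t_scipy, p_scipy = stats.ttest_1samp(sample, mu_0)
print("Manual:", t_statistic, p_manual)
print("SciPy: ", t_scipy, p_scipy)
```

The manual and SciPy values should agree up to floating-point rounding, which shows that the p-value is simply the tail area of the t-distribution beyond the observed statistic.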
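A t-test is used to determine whether a difference in means is statistically significant. A one-sample t-test compares the mean of a single sample with a known or hypothesized value, while a two-sample t-test compares the means of two groups. It's one of the most commonly used statistical tests.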
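For the two-group case, a minimal sketch using `scipy.stats.ttest_ind` might look like the following. The two groups are simulated and purely hypothetical; the choice of `equal_var=False` (Welch's variant, which does not assume equal variances) is an assumption made for this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)   # arbitrary seed, for reproducibility

# Hypothetical groups (e.g. a control and a treatment), simulated for illustration.
group_a = rng.normal(loc=50, scale=5, size=40)
group_b = rng.normal(loc=53, scale=5, size=40)

# Independent two-sample t-test; equal_var=False selects Welch's t-test.
t_statistic, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05
print("T-statistic:", t_statistic)
print("P-value:", p_value)
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Here the null hypothesis is that both groups come from populations with the same mean.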
Example: One-Sample T-Test
Let's conduct a one-sample t-test to determine if the mean of a sample is significantly different from a known value (e.g., population mean).
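Hypotheses:
Null hypothesis (H0): the mean of the population from which the sample is drawn equals the known value (0 in the code below).
Alternative hypothesis (H1): the mean of that population differs from the known value (a two-tailed test).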
Example with Python:
from scipy import stats
import numpy as np

# Generate random sample data
np.random.seed(0)
sample_data = np.random.normal(loc=0, scale=1, size=1000)  # Mean = 0, Std Dev = 1

# Population mean
population_mean = 0

# Perform one-sample t-test
t_statistic, p_value = stats.ttest_1samp(sample_data, population_mean)

# Define alpha value
alpha = 0.05

print("T-statistic:", t_statistic)
print("P-value:", p_value)

# Decision based on p-value and alpha
if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")
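Decision: If the p-value is less than alpha (0.05), reject the null hypothesis; otherwise, fail to reject it. Because the sample here is drawn from a normal distribution whose true mean equals the tested value of 0, the null hypothesis is true by construction, so the test should fail to reject it for most random seeds.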