Statistical Analysis and Plot

Anonymous

Run this code:

# The provided code, modified to report the standard deviation and number of observations per group.

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Try with group means closer and larger sample sizes
group1 = np.random.normal(loc=100, scale=10, size=300)  # Group 1: mean 100, std deviation 10, 300 samples
group2 = np.random.normal(loc=102, scale=10, size=300)  # Group 2: mean 102, std deviation 10, 300 samples

# Initialize variables to enter the loop
p_value = 1

# Conduct the analysis until the p-value is between 0.03 and 0.05 and CIs overlap
while not (0.03 <= p_value <= 0.05):
    group1 = np.random.normal(loc=100, scale=10, size=300)
    group2 = np.random.normal(loc=102, scale=10, size=300)
    mean1 = np.mean(group1)
    mean2 = np.mean(group2)
    ci_low1, ci_high1 = stats.t.interval(0.95, len(group1)-1, loc=mean1, scale=stats.sem(group1))
    ci_low2, ci_high2 = stats.t.interval(0.95, len(group2)-1, loc=mean2, scale=stats.sem(group2))
    t_stat, p_value = stats.ttest_ind(group1, group2)

# Calculate standard deviations
std_dev1 = np.std(group1, ddof=1)
std_dev2 = np.std(group2, ddof=1)

# Number of observations per group
n_obs1 = len(group1)
n_obs2 = len(group2)

# Print out the details including standard deviation and number of observations
print(f"Group 1 Mean: {mean1}")
print(f"Group 2 Mean: {mean2}")
print(f"Group 1 CI: ({ci_low1}, {ci_high1})")
print(f"Group 2 CI: ({ci_low2}, {ci_high2})")
print(f"Group 1 Standard Deviation: {std_dev1}")
print(f"Group 2 Standard Deviation: {std_dev2}")
print(f"Group 1 Observations: {n_obs1}")
print(f"Group 2 Observations: {n_obs2}")
print(f"T-Statistic: {t_stat}")
print(f"P-Value: {p_value}")

# Plot the means and the confidence intervals
plt.figure(figsize=(10,5))

# Group 1
plt.errorbar(1, mean1, yerr=[[mean1 - ci_low1], [ci_high1 - mean1]], fmt='o', label='Group 1', color='blue')

# Group 2
plt.errorbar(2, mean2, yerr=[[mean2 - ci_low2], [ci_high2 - mean2]], fmt='o', label='Group 2', color='orange')

# Aesthetics
plt.xticks([1, 2], ['Group 1', 'Group 2'])
plt.ylabel('Mean value')
plt.title('Comparison of two group means with 95% confidence intervals')
plt.legend()

# Setting the x-axis range
plt.xlim(0.5, 2.5)

plt.show()

Data Analyst

The analysis completed with the following results:

  • Group 1 Mean: 100.42
  • Group 2 Mean: 102.18
  • Group 1 95% CI: (99.29, 101.55)
  • Group 2 95% CI: (101.03, 103.33)
  • Group 1 Standard Deviation: 9.96
  • Group 2 Standard Deviation: 10.13
  • Group 1 Observations: 300
  • Group 2 Observations: 300
  • T-Statistic: -2.14
  • P-Value: 0.0325
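
As a rough cross-check, the t-statistic can be recomputed from the rounded summary values listed above. This is only a sketch: the helper name `pooled_t_statistic` is made up for illustration, it assumes the pooled-variance (equal variance) test that `stats.ttest_ind` runs by default, and it approximates the p-value with the standard normal tail, which should be reasonable at 598 degrees of freedom.

```python
import math
from statistics import NormalDist


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic for two independent samples with a pooled variance."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err


# Rounded summary values copied from the results list above.
t_stat = pooled_t_statistic(100.42, 9.96, 300, 102.18, 10.13, 300)

# Assumption: with 598 degrees of freedom the t distribution is close to
# the standard normal, so the normal tail serves as an approximate p-value.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.3f}, approximate p = {p_approx:.4f}")
```

The values should land close to the reported T-Statistic and P-Value; small differences are expected because the inputs are rounded and the tail is approximated.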

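The p-value of 0.0325 is below 0.05, so the difference between the two group means is statistically significant at the 5% level (it would not be at the 3% level). Two caveats apply. The two 95% confidence intervals still overlap slightly (Group 2's lower bound of 101.03 is below Group 1's upper bound of 101.55), so overlapping intervals do not by themselves mean the difference is non-significant. Also, the code redraws both samples until the p-value falls between 0.03 and 0.05, so this p-value was selected by the loop rather than obtained from a single random draw.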
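
The reported intervals overlap even though the test is significant. A minimal sketch of why, using the rounded values from the results list: the test depends on the standard error of the difference, which grows with the square root of the summed variances, not with the sum of the two interval half-widths. The normal critical value (about 1.96) is used here as an assumption in place of the exact t quantile.

```python
import math
from statistics import NormalDist

# Rounded values copied from the results list above.
mean1, mean2 = 100.42, 102.18
sd1, sd2, n = 9.96, 10.13, 300
ci1 = (99.29, 101.55)
ci2 = (101.03, 103.33)

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]

# Standard error of the difference: variances add, standard errors do not.
se_diff = math.sqrt(sd1**2 / n + sd2**2 / n)

# Assumption: normal critical value instead of the t quantile.
z = NormalDist().inv_cdf(0.975)
diff = mean2 - mean1
ci_diff = (diff - z * se_diff, diff + z * se_diff)

print(f"Individual CIs overlap: {intervals_overlap}")
print(f"Approximate 95% CI for the difference: ({ci_diff[0]:.2f}, {ci_diff[1]:.2f})")
```

If the interval for the difference excludes zero while the individual intervals overlap, the two statements are consistent: checking overlap is a stricter criterion than the two-sample test.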

The plot displaying the comparison of two group means with 95% confidence intervals was also generated.
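
Because the loop in the code keeps redrawing both groups until the p-value lands between 0.03 and 0.05, it can help to estimate how often a single draw would be accepted. The sketch below is an approximation under stated assumptions: it treats the test statistic as normal with unit variance (standard deviation known and equal to 10), and the helper `prob_abs_z_between` is a hypothetical name introduced only for this example.

```python
import math
from statistics import NormalDist

norm = NormalDist()

# Simulation settings taken from the code above.
true_diff, sd, n = 2.0, 10.0, 300
se_diff = sd * math.sqrt(2 / n)
shift = true_diff / se_diff  # expected magnitude of the test statistic


def prob_abs_z_between(lo, hi, shift):
    """P(lo <= |Z| <= hi) when Z is normal with mean `shift` and unit variance."""
    upper = norm.cdf(hi - shift) - norm.cdf(lo - shift)
    lower = norm.cdf(-lo - shift) - norm.cdf(-hi - shift)
    return upper + lower


# |z| thresholds matching two-sided p-values of 0.05 and 0.03.
z_05 = norm.inv_cdf(1 - 0.05 / 2)
z_03 = norm.inv_cdf(1 - 0.03 / 2)

accept_prob = prob_abs_z_between(z_05, z_03, shift)
expected_draws = 1 / accept_prob  # mean of a geometric distribution

print(f"Chance one draw is accepted: {accept_prob:.3f}")
print(f"Expected number of loop iterations: {expected_draws:.1f}")
```

Under these assumptions only a small share of draws is accepted, so the reported result is one of many possible outcomes and should be read as an illustration, not as a typical sample.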