Python - Jiří Benedikt

You can run Python code here! See below for Lean Six Sigma examples. Copy and paste the code into the Python window below.

Average, mode, median

Average

numbers = [12, 23, 35, 56, 52, 45]
mean = sum(numbers) / len(numbers)
print(mean)

Mode

from statistics import mode
numbers = [12, 56, 23, 35, 56, 52, 45]
# store the result under a different name so the mode() function is not overwritten
most_common = mode(numbers)
print(most_common)

Median

from statistics import median
numbers = [12, 23, 35, 56, 52, 45]
# store the result under a different name so the median() function is not overwritten
middle = median(numbers)
print(middle)

Histogram

Draws a histogram from data.

import numpy as np
import matplotlib.pyplot as plt

sample = [40,42,45,42,46,47,50,44,46.5,43.1,43,46,43.5,44,48,44.5,44,45,41,46,41,47.7,44,45,43.2,42,44,45,45,44]

plt.hist(sample, density=False, bins=10)
plt.show()

Normal distribution

Drawing bell curve – example of 3 bell curves

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

#x-axis ranges from -5 to 5 in steps of 0.001
x = np.arange(-5, 5, 0.001)

#define multiple normal distributions
plt.plot(x, norm.pdf(x, 0, 1), label='μ: 0, σ: 1', color='gold')
plt.plot(x, norm.pdf(x, 0, 1.5), label='μ:0, σ: 1.5', color='red')
plt.plot(x, norm.pdf(x, 0, 2), label='μ:0, σ: 2', color='pink')

#add legend to plot
plt.legend(title='Parameters')

#add axes labels and a title
plt.ylabel('Density')
plt.xlabel('x')
plt.title('Normal Distributions', fontsize=14)

plt.show()

Bell curves example – height of men and women

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

#x-axis ranges from 140 to 200 cm in steps of 0.001
x = np.arange(140, 200, 0.001)

#define multiple normal distributions
plt.plot(x, norm.pdf(x, 178.4, 7.6), label='men', color='blue')
plt.plot(x, norm.pdf(x, 164.7, 7.1), label='women', color='red')

#add legend to plot
plt.legend(title='Parameters')

#add axes labels and a title
plt.ylabel('Density')
plt.xlabel('height in cm')
plt.title('Normal Distributions', fontsize=14)

plt.show()
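
The same two bell curves can also answer questions with numbers instead of a picture. The sketch below uses only the standard library (statistics.NormalDist, Python 3.8 or newer) and the means and standard deviations from the plot above. The 190 cm threshold is a hypothetical value chosen for illustration.

```python
from statistics import NormalDist

# Parameters taken from the plot above (mean and standard deviation in cm)
men = NormalDist(mu=178.4, sigma=7.6)
women = NormalDist(mu=164.7, sigma=7.1)

threshold = 190  # hypothetical height in cm, chosen only for illustration

# Share of each group taller than the threshold = area under the curve to the right
share_men = 1 - men.cdf(threshold)
share_women = 1 - women.cdf(threshold)

print(f"men taller than {threshold} cm: {share_men:.2%}")
print(f"women taller than {threshold} cm: {share_women:.2%}")
```

The cdf method returns the area under the curve to the left of the threshold, so subtracting it from 1 gives the right-hand tail.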

Hypothesis testing

Binomial Test

A car manufacturer claims that no more than 10% of their cars are unsafe. 15 cars were inspected for safety and 3 were found to be unsafe. Test the manufacturer’s claim:

from scipy import stats

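# binom_test was removed in SciPy 1.12; binomtest (SciPy 1.7+) replaces it
b = stats.binomtest(3, n=15, p=0.1, alternative='greater')

# the result object carries the p-value in its pvalue attribute
print(b.pvalue)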
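
To see where the p-value comes from, the one-sided binomial test can be reproduced by hand: it is the probability of finding 3 or more unsafe cars among 15 if the true rate were exactly 10%. This is a minimal sketch using only the standard library. The helper name binom_upper_tail and the significance level of 0.05 are assumptions, since the example does not state a level.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 3 unsafe cars out of 15 inspected, claimed rate of at most 10%
p_value = binom_upper_tail(3, 15, 0.1)

alpha = 0.05  # assumed significance level, not given in the example

print(round(p_value, 4))
if p_value < alpha:
    print("Reject the manufacturer's claim")
else:
    print("Cannot reject the manufacturer's claim")
```

A p-value above alpha means the sample is not strong enough evidence against the claim. It does not prove the claim is true.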

One-sample T-Test

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_1samp.html#scipy.stats.ttest_1samp

The literature says that men of a fictional tribe of the Eagles have an average height of 175 cm.

An anthropologist who visited the tribe measured ten randomly selected men. The heights below (in cm) are what was measured.

Based on the measured sample, decide if the literature is right or not.

from scipy import stats

eagles = [153, 156, 156, 167, 166, 167, 168, 174, 175, 181]

result = stats.ttest_1samp(eagles, 175)

# the result holds both the t statistic and the two-sided p-value
print(result)
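
The t statistic reported by the test can be checked by hand: it is the distance between the sample mean and the claimed mean, measured in standard errors. The sketch below uses only the standard library. The critical value 2.262 is the standard two-sided table value for alpha = 0.05 and 9 degrees of freedom; the choice of alpha = 0.05 is an assumption, as the example does not state one.

```python
from math import sqrt
from statistics import mean, stdev

eagles = [153, 156, 156, 167, 166, 167, 168, 174, 175, 181]
claimed_mean = 175

n = len(eagles)
# standard error of the mean, using the sample standard deviation (n - 1 in the denominator)
standard_error = stdev(eagles) / sqrt(n)
t_stat = (mean(eagles) - claimed_mean) / standard_error

# two-sided critical value from a t table for alpha = 0.05 (assumed) and n - 1 = 9 degrees of freedom
t_critical = 2.262

print(round(t_stat, 3))
print("Reject the literature value" if abs(t_stat) > t_critical else "Cannot reject the literature value")
```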

Sign Test

The manager of the Bank of America branch in West Palm Beach, FL states
that the median number of savings account customers per day is 64.
A clerk from the same branch claims that it is more than 64.
The clerk collected the number of savings account customers per day for 10 random days.
Can we reject the branch manager’s claim at the 5% significance level (95% confidence)?

from scipy import stats

customer = [60,66,65,70,68,72,46,76,77,75]

expectedmedian = 64

lower=0

for i in customer:
    if i<expectedmedian:
        lower = lower + 1

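# binom_test was removed in SciPy 1.12; binomtest (SciPy 1.7+) replaces it
signtest = stats.binomtest(lower, n=len(customer), p=0.5, alternative='less')

# the result object carries the p-value in its pvalue attribute
print(signtest.pvalue)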
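
The sign test is simply a binomial test on the number of days below the claimed median, so it can also be written out with the standard library alone. In this sketch the helper name sign_test_less is hypothetical, and days exactly equal to the claimed median are dropped before counting. That is a common convention and an assumption here; the sample above contains no such days.

```python
from math import comb

def sign_test_less(values, hypothesized_median):
    """Return (count below, usable sample size, one-sided p-value)."""
    # Assumption: values equal to the hypothesized median carry no sign and are dropped
    usable = [v for v in values if v != hypothesized_median]
    below = sum(v < hypothesized_median for v in usable)
    n = len(usable)
    # P(X <= below) for X ~ Binomial(n, 0.5)
    p_value = sum(comb(n, i) for i in range(below + 1)) / 2**n
    return below, n, p_value

customer = [60, 66, 65, 70, 68, 72, 46, 76, 77, 75]
below, n, p_value = sign_test_less(customer, 64)

print(below, n, round(p_value, 4))
```

If the true median were 64, about half of the days should fall below it. A very small count of days below 64 is what would support the clerk's claim.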

Two-Sample T-Test

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html

The literature says that men of a fictional tribe of the Eagles have the same average height as the fictional tribe of the Bulls.
A researcher measured 10 random men from each tribe and wants to test whether the average heights of the two tribes are the same at the 5% significance level (alpha = 0.05, i.e. 95% confidence). Are they the same?

from scipy import stats

eagles = [153, 156, 156, 161, 166, 167, 168, 174, 175, 181]
bulls = [160, 165, 168, 170, 171, 174, 176, 181, 181, 183]

result = stats.ttest_ind(eagles, bulls)

# the result holds both the t statistic and the two-sided p-value
print(result)
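
By default, ttest_ind assumes both groups have the same variance and uses a pooled standard deviation. The sketch below reproduces that t statistic by hand with the standard library only. The critical value 2.101 is the standard two-sided table value for alpha = 0.05 and 18 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

eagles = [153, 156, 156, 161, 166, 167, 168, 174, 175, 181]
bulls = [160, 165, 168, 170, 171, 174, 176, 181, 181, 183]

n1, n2 = len(eagles), len(bulls)

# pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * variance(eagles) + (n2 - 1) * variance(bulls)) / (n1 + n2 - 2)

# standard error of the difference between the two sample means
standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(eagles) - mean(bulls)) / standard_error

# two-sided critical value from a t table for alpha = 0.05 and n1 + n2 - 2 = 18 degrees of freedom
t_critical = 2.101

print(round(t_stat, 3))
print("Heights differ" if abs(t_stat) > t_critical else "Cannot reject that the heights are the same")
```

If the equal-variance assumption is doubtful, ttest_ind accepts equal_var=False, which runs Welch's t-test instead.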

Correlation test

https://docs.scipy.org/doc/scipy/reference/stats.html

An HR manager wants to understand how new hires perceive the hiring process. They ran a survey with ratings from 0 = awful to 20 = wonderful experience. The manager wants to see whether the rating is correlated with the number of days to hire.

from scipy import stats

daystohire = [38,32,41,48,47,50,27,39,47,33,36,30]
satisfaction = [9,15,13,12,7,4,15,9,7,16,7,11]

# pearsonr returns the correlation coefficient and the two-sided p-value
k = stats.pearsonr(daystohire, satisfaction)
print(k)
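
The correlation coefficient itself can be computed by hand, which helps to show what it measures: how strongly the deviations of the two variables from their means move together. This is a minimal sketch with the standard library only; it reproduces the coefficient but not the p-value.

```python
from math import sqrt
from statistics import mean

daystohire = [38, 32, 41, 48, 47, 50, 27, 39, 47, 33, 36, 30]
satisfaction = [9, 15, 13, 12, 7, 4, 15, 9, 7, 16, 7, 11]

mean_x = mean(daystohire)
mean_y = mean(satisfaction)

# sum of products of deviations, and sums of squared deviations
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(daystohire, satisfaction))
sxx = sum((x - mean_x) ** 2 for x in daystohire)
syy = sum((y - mean_y) ** 2 for y in satisfaction)

# Pearson correlation coefficient, always between -1 and 1
r = sxy / sqrt(sxx * syy)

print(round(r, 3))
```

A negative coefficient means that longer hiring times tend to go with lower satisfaction ratings in this sample.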