Below are the differences between two revisions of the page.
python:first_course_statistics [2016/11/29 13:37] Beretta, Anna Letizia |
python:first_course_statistics [2017/09/26 08:54] (current version) Francesco Beretta [General instructions] |
||
---|---|---|---|
Line 5: | Line 5: | ||
* pandas [[http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe|dataframes]] | * pandas [[http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe|dataframes]] | ||
* [[http://matplotlib.org/api/pyplot_summary.html|matplotlib.pyplot]] | * [[http://matplotlib.org/api/pyplot_summary.html|matplotlib.pyplot]] | ||
+ | |||
+ | |||
+ | Get the data from [[http://people.stern.nyu.edu/jsimonof/Casebook/Data/ASCII/README.html|this site]]. | ||
+ | |||
Save your scripts in a folder inside the data folder, naming the script folder 'my_scripts' or whatever you like. If 'my_scripts' is set as your [[python:generic_features#get_the_current_working_directory_address|current working directory]], then the data files are available at the address '../[data file]', for instance: '../geyser1.TAB' | Save your scripts in a folder inside the data folder, naming the script folder 'my_scripts' or whatever you like. If 'my_scripts' is set as your [[python:generic_features#get_the_current_working_directory_address|current working directory]], then the data files are available at the address '../[data file]', for instance: '../geyser1.TAB' | ||
Line 35: | Line 39: | ||
import matplotlib.pyplot as plt | import matplotlib.pyplot as plt | ||
import pandas as pd | import pandas as pd | ||
- | gysr1_boxplot = pd.read_csv('...\geyser1.TAB', '\t') | + | gysr1_boxplot = pd.read_csv('.../geyser1.TAB', '\t') |
data_gysr1 = gysr1_boxplot['Interval'] | data_gysr1 = gysr1_boxplot['Interval'] | ||
plt.boxplot(data_gysr1) | plt.boxplot(data_gysr1) | ||
Line 55: | Line 59: | ||
import matplotlib.pyplot as plt | import matplotlib.pyplot as plt | ||
import pandas as pd | import pandas as pd | ||
- | geysr1_scatterplot = pd.read_csv('...\geyser1.TAB', '\t') | + | geysr1_scatterplot = pd.read_csv('.../geyser1.TAB', '\t') |
geysr1_data_Xax = geysr1_scatterplot['Duration'] | geysr1_data_Xax = geysr1_scatterplot['Duration'] | ||
geysr1_data_Yax = geysr1_scatterplot['Interval'] | geysr1_data_Yax = geysr1_scatterplot['Interval'] | ||
Line 291: | Line 295: | ||
===== Productivity versus quality in the assembly plant (p.29) ===== | ===== Productivity versus quality in the assembly plant (p.29) ===== | ||
+ | |||
+ | It worked the first time, but now it no longer works. Maybe a Windows error again?
<code python> | <code python> | ||
+ | #1 | ||
+ | import matplotlib.pyplot as plt | ||
+ | import pandas as pd | ||
+ | data_comparison = pd.read_csv('D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', '\t') | ||
+ | non_japanese = data_comparison.loc[data_comparison['QualNonJ']] | ||
+ | japanese = data_comparison.loc[data_comparison['QualJapn']] | ||
+ | plt.boxplot([non_japanese['Quality'],japanese['Quality']], labels= ['Non-japanese','Japanese']) | ||
+ | plt.show() | ||
+ | |||
+ | #2 | ||
import matplotlib.pyplot as plt | import matplotlib.pyplot as plt | ||
import pandas as pd | import pandas as pd | ||
Line 298: | Line 314: | ||
non_japanese = data_comparison.loc[data_comparison['ProdNonJ']] | non_japanese = data_comparison.loc[data_comparison['ProdNonJ']] | ||
japanese = data_comparison.loc[data_comparison['ProdJapn']] | japanese = data_comparison.loc[data_comparison['ProdJapn']] | ||
- | plt.boxplot([non_japanese['Quality'],japanese['Quality']], labels= ['Non-japanese','Japanese']) | + | plt.boxplot([non_japanese['Producti'],japanese['Producti']], labels= ['Non-japanese','Japanese']) |
plt.show() | plt.show() | ||
</code> | </code> | ||
- | |||
- | <code python> | ||
- | |||
- |
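
The revisions above switch the data path from backslashes to forward slashes and rely on the 'my_scripts' folder sitting inside the data folder. As a minimal sketch of that layout, and not the page author's own method, the example below builds a throwaway folder with a hypothetical file named sample.TAB (its values are invented for illustration and are not taken from geyser1.TAB), resolves the '../' path with pathlib, and passes the tab separator by keyword. The keyword form is an assumption about the reader's setup: to my understanding, pandas 2.x no longer accepts the separator as a second positional argument.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the real data folder; the rows below are
# made up for illustration and are NOT the contents of geyser1.TAB.
data_dir = Path(tempfile.mkdtemp())
scripts_dir = data_dir / "my_scripts"
scripts_dir.mkdir()
(data_dir / "sample.TAB").write_text(
    "Duration\tInterval\n4.4\t78\n3.9\t74\n4.0\t68\n"
)

# Seen from inside 'my_scripts', '..' points back at the data folder.
# pathlib joins the parts portably, so no backslash escaping is involved.
data_file = (scripts_dir / ".." / "sample.TAB").resolve()

# Pass the separator by keyword rather than as a second positional argument.
sample = pd.read_csv(data_file, sep="\t")
intervals = sample["Interval"]
print(intervals.describe())
```

The same `intervals` series could then be handed to `plt.boxplot` exactly as in the snippets shown in the diff.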