Below are the differences between two revisions of the page.
python:first_course_statistics [2016/10/28 11:54] Beretta, Anna Letizia |
python:first_course_statistics [2016/11/29 14:06] Beretta, Anna Letizia [Productivity versus quality in the assembly plant (p.29)] |
||
---|---|---|---|
Line 194: | Line 194: | ||
</code> | </code> | ||
+ | \\ | ||
+ | |||
+ | |||
+ | ====== The Performance of stock mutual funds (p. 21) ====== | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | \\ | ||
+ | |||
+ | ====== Predicting the sales and airplay of popular music (p. 23) ====== | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | \\ | ||
+ | |||
+ | ====== Another look at the "Old Faithful" geyser and adoption visas (p. 24) ====== | ||
+ | |||
+ | Modified the bins of both histograms: | ||
+ | The histogram is reliable for the "Old Faithful" geyser but not for the adoption rates: for the latter, the appearance of the histogram changes considerably when the bins are changed. | ||
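
The sensitivity to binning noted above can be checked numerically without plotting. The following is a minimal sketch, assuming equal-width bins; the helper `bin_counts` and the `sample` values are hypothetical and are not taken from the casebook data.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Assumes the values are not all identical (the bin width would be zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than in a new one.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts


# Hypothetical sample, NOT the geyser or adoption data.
sample = [1, 2, 2, 3, 3, 3, 4, 9, 10, 10, 11, 20]

# Print the bin counts for several bin numbers to compare the shapes.
for n in (2, 4, 8):
    print(n, bin_counts(sample, n))
```

If the overall shape (number of peaks, gaps, skew) stays the same across bin counts, the histogram can be considered stable; if it changes noticeably, conclusions drawn from a single binning should be treated with caution.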
+ | |||
+ | \\ | ||
+ | |||
+ | ====== Productivity versus quality in the assembly plant (p. 25) ====== | ||
Line 211: | Line 238: | ||
</code> | </code> | ||
+ | \\ | ||
+ | |||
+ | ===== Scatter Plot of PRODJAPN vs QUALJAPN (p. 27) ===== | ||
+ | |||
+ | <code Python> | ||
+ | import pandas as pd | ||
+ | import matplotlib.pyplot as plt | ||
+ | # Raw string keeps the Windows backslashes literal; the file is tab-separated | ||
+ | scatter_plot = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', sep='\t') | ||
+ | productivity_Y = scatter_plot['ProdJapn'] | ||
+ | quality_X = scatter_plot['QualJapn'] | ||
+ | # x = quality, y = productivity, matching the axis labels below | ||
+ | plt.scatter(quality_X, productivity_Y, color='r') | ||
+ | ax = plt.gca() | ||
+ | ax.set_xlabel('Assembly defects per 100 cars (Japanese origin)') | ||
+ | ax.set_ylabel('Hours per vehicle (Japanese origin)') | ||
+ | ax.set_title('Scatter Plot of PRODJAPN VS QUALJAPN') | ||
+ | plt.show() | ||
+ | </code> | ||
+ | |||
+ | |||
+ | ===== Scatter Plot of PRODNONJ vs QUALNONJ (p. 27) ===== | ||
+ | <code Python> | ||
+ | import pandas as pd | ||
+ | import matplotlib.pyplot as plt | ||
+ | # Raw string keeps the Windows backslashes literal; the file is tab-separated | ||
+ | scatter_plot = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', sep='\t') | ||
+ | productivity_Y = scatter_plot['ProdNonJ'] | ||
+ | quality_X = scatter_plot['QualNonJ'] | ||
+ | # x = quality, y = productivity, matching the axis labels below | ||
+ | plt.scatter(quality_X, productivity_Y, color='r') | ||
+ | ax = plt.gca() | ||
+ | ax.set_xlabel('Assembly defects per 100 cars (non-Japanese origin)') | ||
+ | ax.set_ylabel('Hours per vehicle (non-Japanese origin)') | ||
+ | ax.set_title('Scatter Plot of PRODNONJ VS QUALNONJ') | ||
+ | plt.show() | ||
+ | </code> | ||
+ | |||
+ | |||
+ | |||
+ | ===== Scatterplot of productivity VS quality (p. 28) ===== | ||
+ | <code python> | ||
+ | import pandas as pd | ||
+ | import matplotlib.pyplot as plt | ||
+ | # Raw string keeps the Windows backslashes literal; the file is tab-separated | ||
+ | scatter_plot = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', sep='\t') | ||
+ | productivity_Y = scatter_plot['Producti'] | ||
+ | quality_X = scatter_plot['Quality'] | ||
+ | # x = quality, y = productivity, matching the axis labels below | ||
+ | plt.scatter(quality_X, productivity_Y, color='r') | ||
+ | ax = plt.gca() | ||
+ | ax.set_xlabel('Assembly defects per 100 cars') | ||
+ | ax.set_ylabel('Hours per vehicle') | ||
+ | ax.set_title('Scatter Plot of PRODUCTIVITY VS QUALITY') | ||
+ | plt.show() | ||
+ | </code> | ||
+ | |||
+ | |||
+ | ===== Productivity versus quality in the assembly plant (p.29) ===== | ||
+ | |||
+ | This code ran the first time but now fails again. Possibly another Windows-related error? | ||
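
Windows paths contain backslashes, which Python treats as escape characters in ordinary string literals. Whether this is the cause of the error mentioned above is not established; the sketch below only shows how a plain and a raw string differ, using a hypothetical path chosen so that the effect is visible.

```python
# Hypothetical path for illustration only; it is not the path used on this page.
plain = 'C:\temp\new.TAB'   # '\t' becomes a tab and '\n' a newline
raw = r'C:\temp\new.TAB'    # raw string: every backslash is kept literally

# repr() makes the difference between the two strings visible.
print(repr(plain))
print(repr(raw))

# The plain string is shorter because each escape collapses to one character.
print(len(plain), len(raw))
```

Using a raw string (or forward slashes) for the file name, and passing the separator by keyword as `sep='\t'`, removes this class of problem from `pd.read_csv` calls.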
+ | |||
+ | <code python> | ||
+ | # 1: quality (assembly defects per 100 cars) by origin | ||
+ | import matplotlib.pyplot as plt | ||
+ | import pandas as pd | ||
+ | data_comparison = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', sep='\t') | ||
+ | # Assumption: QualNonJ and QualJapn hold the quality values of each group, | ||
+ | # padded with missing values, so they are plotted directly after dropna(). | ||
+ | # (.loc with these columns would treat the values as row labels and fail.) | ||
+ | non_japanese = data_comparison['QualNonJ'].dropna() | ||
+ | japanese = data_comparison['QualJapn'].dropna() | ||
+ | plt.boxplot([non_japanese, japanese], labels=['Non-Japanese', 'Japanese']) | ||
+ | plt.show() | ||
+ | |||
+ | # 2: productivity (hours per vehicle) by origin, same assumption as above | ||
+ | import matplotlib.pyplot as plt | ||
+ | import pandas as pd | ||
+ | data_comparison = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', sep='\t') | ||
+ | non_japanese = data_comparison['ProdNonJ'].dropna() | ||
+ | japanese = data_comparison['ProdJapn'].dropna() | ||
+ | plt.boxplot([non_japanese, japanese], labels=['Non-Japanese', 'Japanese']) | ||
+ | plt.show() | ||
+ | </code> |