python:first_course_statistics

Differences

Below are the differences between two revisions of the page.

python:first_course_statistics [2016/11/29 13:37]
Beretta, Anna Letizia
python:first_course_statistics [2017/09/26 08:54] (Current version)
Francesco Beretta [General instructions]
Line 5: Line 5:
   * pandas [[http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe|dataframes]]
   * [[http://matplotlib.org/api/pyplot_summary.html|matplotlib.pyplot]]
+
+
+Get the data from [[http://people.stern.nyu.edu/jsimonof/Casebook/Data/ASCII/README.html|this site]].
+
 
 Save your scripts in a folder inside the data folder, calling the script folder 'my_scripts' or whatever. If 'my_scripts' is set as your [[python:generic_features#get_the_current_working_directory_address|current working directory]], then the data files are available under this address '../[data file]', for instance: '../geyser1.TAB'
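The folder layout described above can be checked from within a script. The sketch below is only an illustration: it assumes the script runs with the script folder as its working directory, and `data_file_path` is a hypothetical helper name, not part of the course material.

```python
from pathlib import Path


def data_file_path(file_name, working_dir=None):
    """Return the path of a data file stored one level above the script folder.

    `working_dir` defaults to the current working directory, which is
    assumed to be the script folder (for example 'my_scripts').
    """
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    # '..' refers to the parent folder, i.e. the data folder in this layout.
    return base / ".." / file_name


if __name__ == "__main__":
    path = data_file_path("geyser1.TAB")
    print(path)           # path built from the current working directory
    print(path.exists())  # True only if the file is really there
```

If `path.exists()` prints `False`, the working directory is probably not the script folder, which is worth checking before suspecting the data file itself.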
Line 35: Line 39:
 import matplotlib.pyplot as plt
 import pandas as pd
-gysr1_boxplot = pd.read_csv('...\geyser1.TAB', '\t')
+gysr1_boxplot = pd.read_csv('.../geyser1.TAB', '\t')
 data_gysr1 = gysr1_boxplot['Interval']
 plt.boxplot(data_gysr1)
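The changed line swaps a backslash for a forward slash in the file path. As a hedged illustration of why that can matter: in an ordinary Python string a backslash may start an escape sequence, whereas forward slashes (or `pathlib`) are accepted on Windows as well. The sketch below writes a small made-up tab-separated file to a temporary folder, so the values and the `load_tab_file` helper are assumptions and not the real geyser data. It also passes the separator as the keyword `sep`, which newer pandas releases expect instead of a positional argument.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_tab_file(path):
    """Read a tab-separated file into a DataFrame."""
    # sep is given as a keyword argument rather than positionally.
    return pd.read_csv(path, sep="\t")


# Made-up sample values, only to keep the sketch self-contained.
sample_text = "Duration\tInterval\n4.4\t78\n3.9\t74\n1.8\t56\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # pathlib joins path parts with the right separator for the platform.
    sample_path = Path(tmp_dir) / "sample.TAB"
    sample_path.write_text(sample_text)
    sample = load_tab_file(sample_path)

intervals = sample["Interval"]
print(intervals.tolist())
```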
Line 55: Line 59:
 import matplotlib.pyplot as plt
 import pandas as pd
-geysr1_scatterplot = pd.read_csv('...\geyser1.TAB', '\t')
+geysr1_scatterplot = pd.read_csv('.../geyser1.TAB', '\t')
 geysr1_data_Xax = geysr1_scatterplot['Duration']
 geysr1_data_Yax = geysr1_scatterplot['Interval']
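For context, the two columns selected here feed a scatter plot. A minimal sketch of that step, using made-up values in place of the file (the numbers and the `make_scatter` helper are assumptions), could look like this:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def make_scatter(frame, x_column, y_column):
    """Draw a scatter plot of two columns and return the figure and axes."""
    fig, ax = plt.subplots()
    ax.scatter(frame[x_column], frame[y_column])
    # Label the axes with the column names so the plot is self-explanatory.
    ax.set_xlabel(x_column)
    ax.set_ylabel(y_column)
    return fig, ax


# Made-up values standing in for the geyser file.
frame = pd.DataFrame({"Duration": [4.4, 3.9, 1.8], "Interval": [78, 74, 56]})
fig, ax = make_scatter(frame, "Duration", "Interval")
```

To see the plot in a window, drop the `matplotlib.use("Agg")` line and call `plt.show()` at the end.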
Line 291: Line 295:
 
 ===== Productivity versus quality in the assembly plant (p.29) =====
+
+It worked the first time, but now it no longer works. Maybe a Windows error again?
 
 <code python>
+#1
+import matplotlib.pyplot as plt
+import pandas as pd
+data_comparison = pd.read_csv('D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\prdq.TAB', '\t')
+non_japanese = data_comparison.loc[data_comparison['QualNonJ']]
+japanese = data_comparison.loc[data_comparison['QualJapn']]
+plt.boxplot([non_japanese['Quality'],japanese['Quality']], labels= ['Non-japanese','Japanese'])
+plt.show()
+
+#2
 import matplotlib.pyplot as plt
 import pandas as pd
Line 298: Line 314:
 non_japanese = data_comparison.loc[data_comparison['ProdNonJ']]
 japanese = data_comparison.loc[data_comparison['ProdJapn']]
-plt.boxplot([non_japanese['Quality'],japanese['Quality']], labels= ['Non-japanese','Japanese'])
+plt.boxplot([non_japanese['Producti'],japanese['Producti']], labels= ['Non-japanese','Japanese'])
 plt.show()
 </code>
-
-<code python>
-
-
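Regarding the note that the script stopped working: one possible cause, offered as a guess rather than a diagnosis, is that `.loc[...]` interprets the values of a non-boolean column as row labels, which raises a `KeyError` whenever those values are not in the index. The sketch below assumes a layout that the source does not confirm, namely one numeric column per group (`QualNonJ`, `QualJapn`) with made-up values, and plots the columns directly. `boxplot_columns` is a hypothetical helper.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def boxplot_columns(frame, columns, labels):
    """Draw one box per column, ignoring missing values."""
    # dropna() matters when the groups have different numbers of rows.
    groups = [frame[column].dropna().to_numpy() for column in columns]
    fig, ax = plt.subplots()
    ax.boxplot(groups)
    # Boxes are drawn at positions 1..n, so the tick labels go there too.
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    return fig, ax, groups


# Made-up values; the real file layout is an assumption here.
frame = pd.DataFrame({
    "QualNonJ": [80.0, 95.0, 110.0, 70.0],
    "QualJapn": [50.0, 60.0, 55.0, None],
})
fig, ax, groups = boxplot_columns(
    frame, ["QualNonJ", "QualJapn"], ["Non-japanese", "Japanese"]
)
```

If the real file instead stores all values in one column plus a group indicator, a boolean mask such as `frame[frame["group"] == "Japanese"]` (with `group` again a hypothetical column name) would be the matching selection.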
python/first_course_statistics.1480423058.txt.gz · Last modified: 2016/11/29 13:37 by Beretta, Anna Letizia