python:first_course_statistics

Differences

Below are the differences between two revisions of the page.

python:first_course_statistics [2016/10/13 12:52]
Beretta, Anna Letizia [Boxplot (p. 6)]
python:first_course_statistics [2016/10/19 14:57]
Beretta, Anna Letizia
Line 13: Line 13:
  
 ===== Histogram (p.5) =====
- 
-FB: this script works fine! Do not delete it!
  
 <code python>
Line 52: Line 50:
 ===== ScatterPlot (p. 7) =====
  
-AB: Put face- and edgecolor to change both of them. You can also have to different color for the in- and outside of each point.
+AB: Set both facecolor and edgecolor to change both of them. You can also give each dot two different colors, one for the inside and one for the outline.
  
 <code python>
 import matplotlib.pyplot as plt
 import pandas as pd
-geysr1_scatterplot = pd.read_csv('D:\Python\Libri\A Casebook for a First Course in Statistics and Data Analysis Datasets\Data\Tab\geyser1.TAB', '\t')
+geysr1_scatterplot = pd.read_csv(r'...\geyser1.TAB', sep='\t')
 geysr1_data_Xax = geysr1_scatterplot['Duration']
 geysr1_data_Yax = geysr1_scatterplot['Interval']
Line 67: Line 65:
 plt.show()
 </code>
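The note about face and edge colors can be tried on its own. The following is a minimal sketch, assuming made-up values instead of the geyser file; the helper name `scatter_two_colors` and the sample numbers are hypothetical.

```python
import matplotlib.pyplot as plt


def scatter_two_colors(x, y, face='white', edge='black'):
    """Draw dots whose inside (face) and outline (edge) colors differ."""
    fig, ax = plt.subplots()
    points = ax.scatter(x, y, facecolors=face, edgecolors=edge)
    return fig, points


# Made-up values, for illustration only.
fig, points = scatter_two_colors([1.8, 2.9, 4.1, 4.5], [54, 62, 80, 85])
# plt.show() would display the figure in an interactive session.
```

Passing the same color for `face` and `edge` gives uniformly colored dots.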
 +
 +
 +\\
 +
 +
 +===== Descriptive statistics (p.9) =====
 +
 +Note: try different selections, e.g. the whole population, only the rows where 'Duration' <= 3, or the whole dataframe.
 +
 +[[http://pandas.pydata.org/pandas-docs/stable/basics.html#descriptive-statistics|doc]] – [[http://www.marsja.se/pandas-python-descriptive-statistics/|example]]
 +
 +<code python>
 +import pandas as pd
 +gysr1 = pd.read_csv('../geyser1.tab', sep='\t')  # tab-separated file
 +# summary statistics of 'Duration' for the rows where 'Duration' <= 3
 +print(gysr1.loc[gysr1['Duration'] <= 3, 'Duration'].describe())
 +</​code>​
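The selections suggested in the note can be compared side by side. This is a sketch on a small made-up dataframe (the values are illustrative and not taken from geyser1.tab; the name `demo` is hypothetical).

```python
import pandas as pd

# Made-up values for illustration; the real data come from geyser1.tab.
demo = pd.DataFrame({
    'Duration': [1.8, 2.0, 2.9, 3.0, 4.0, 4.5],
    'Interval': [54, 57, 62, 60, 78, 85],
})

whole_frame = demo.describe()               # every numeric column
whole_column = demo['Duration'].describe()  # one column, all rows
# one column, only the rows where 'Duration' <= 3
short_only = demo.loc[demo['Duration'] <= 3, 'Duration'].describe()

print(short_only)
```

Using `.loc[mask, column]` selects rows and column in a single step, which avoids chaining two indexing operations.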
 +
 +
 +\\
 +
 +
 +===== Boxplot (p.9) =====
 +
 +Selecting rows in a dataframe: [[http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-where-method-and-masking|doc]] / [[http://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas|example]]
 +
 +<code python>
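 +import matplotlib.pyplot as plt
 +import pandas as pd
 +gysr1 = pd.read_csv('../geyser1.tab', sep='\t')  # tab-separated file
 +# split the rows at Duration = 3 and compare the 'Interval' distributions
 +gysr1_inf3 = gysr1.loc[gysr1['Duration'] <= 3]
 +gysr1_sup3 = gysr1.loc[gysr1['Duration'] > 3]
 +# Matplotlib 3.9+ names the 'labels' parameter 'tick_labels'
 +plt.boxplot([gysr1_inf3['Interval'], gysr1_sup3['Interval']], labels=['inf3', 'sup3'])
 +plt.show()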
 +</​code>​
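As an alternative to building two separate dataframes, the same split can be expressed as a grouping key. This is a sketch on made-up values, not the geyser data; the names `demo` and `group` are hypothetical. The quartiles in the output correspond to the box and median line that the boxplot draws for each group.

```python
import pandas as pd

# Made-up values for illustration; the real data come from geyser1.tab.
demo = pd.DataFrame({
    'Duration': [1.8, 2.0, 2.9, 3.0, 4.0, 4.5],
    'Interval': [54, 57, 62, 60, 78, 85],
})

# Label each row by the side of the Duration = 3 threshold it falls on.
group = demo['Duration'].gt(3).map({False: 'inf3', True: 'sup3'})

# One row of summary statistics (count, mean, quartiles, ...) per group.
summary = demo.groupby(group)['Interval'].describe()
print(summary)
```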
 +
 +
 +\\
 +
  
 ====== International adoption rates (p.13) ======
  
 +===== Boxplot (p.14) =====
 +
 +<code python>
 +import matplotlib.pyplot as plt
 +import pandas as pd
 +# raw string so that the backslashes in the Windows path are kept literally
 +adopt_data = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\adopt.TAB', sep='\t')
 +adopt1 = adopt_data['Visa91']
 +plt.boxplot(adopt1)
 +ax = plt.gca()
 +ax.set_title('Box and Whisker Plot')
 +ax.set_xlabel('39 cases')
 +ax.set_ylabel('Number of visas in 1991')
 +plt.show()
 +</​code>​
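Windows paths such as the one above contain backslashes, which Python treats as the start of an escape sequence in ordinary string literals (for example `\a` is the bell character and `\t` is a tab). Below is a minimal sketch of the two usual ways to write such a path safely; the folder `C:\data` is hypothetical.

```python
# Hypothetical folder, used only to show the escaping rules.
plain = 'C:\\data\\adopt.TAB'   # ordinary string: every backslash doubled
raw = r'C:\data\adopt.TAB'      # raw string: backslashes are kept literally
broken = 'C:\\data\adopt.TAB'   # '\a' silently turns into the bell character

print(plain == raw)    # the two safe spellings give the same path
print(repr(broken))    # shows '\x07' where '\a' was written
```

The separator `'\t'` passed to `read_csv` is deliberately an ordinary string, because there the tab character itself is wanted.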
 +
 +
 +\\
 +
 +
 +===== Histogram (p.14) =====
 +
 +<code python>
 +import matplotlib.pyplot as plt
 +import pandas as pd
 +adopt_data = pd.read_csv(r'D:\Python\Libri\A_Casebook_for_a_First_Course_in_Statistics_and_Data_Analysis_Datasets\Data\Tab\adopt.TAB', sep='\t')
 +adopt1 = adopt_data['Visa91']
 +plt.hist(adopt1)
 +plt.show()
 +</​code>​
python/first_course_statistics.txt · Last modified: 2017/09/26 08:54 by Francesco Beretta