Output:
ANOVA: latitude and number of layers.
                            OLS Regression Results    
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.007
Model:                            OLS   Adj. R-squared:                  0.007
Method:                 Least Squares   F-statistic:                     918.7
Date:                Mon, 27 Jun 2016   Prob (F-statistic):               0.00
Time:                        09:37:58   Log-Likelihood:                -87460.
No. Observations:              384343   AIC:                         1.749e+05
Df Residuals:                  384339   BIC:                         1.750e+05
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0703      0.001     83.229      0.000         0.069     0.072
C(Latitude_areas)[T.north pole]        0.0942      0.002     42.193      0.000         0.090     0.099
C(Latitude_areas)[T.south equator]    -0.0185      0.001    -16.957      0.000        -0.021    -0.016
C(Latitude_areas)[T.south pole]       -0.0141      0.002     -8.092      0.000        -0.017    -0.011
==============================================================================
Omnibus:                   413227.763   Durbin-Watson:                   1.509
Prob(Omnibus):                  0.000   Jarque-Bera (JB):         25676301.688
Skew:                           5.639   Prob(JB):                         0.00
Kurtosis:                      41.421   Cond. No.                         5.57
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Latitude_areas
north equator    0.070320
north pole       0.164545
south equator    0.051816
south pole       0.056241
Name: NUMBER_LAYERS, dtype: float64
Latitude_areas
north equator    0.321145
north pole       0.485897
south equator    0.270035
south pole       0.270795
Name: NUMBER_LAYERS, dtype: float64
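The coefficient table and the first grouped listing above carry the same information. With the default treatment coding, the intercept is the mean of the reference group (here "north equator", the level missing from the coefficient rows) and every other coefficient is an offset from it. A minimal sketch of that arithmetic, using only numbers copied from the output above; the helper name `group_means` is hypothetical:

```python
# Coefficients copied from the one-way ANOVA summary above.
intercept = 0.0703  # mean of the reference group, "north equator"
offsets = {
    "north pole": 0.0942,
    "south equator": -0.0185,
    "south pole": -0.0141,
}


def group_means(intercept, offsets, reference="north equator"):
    """Rebuild per-group means from treatment-coded OLS coefficients."""
    means = {reference: intercept}
    for label, offset in offsets.items():
        means[label] = intercept + offset
    return means


means = group_means(intercept, offsets)
for label, value in sorted(means.items()):
    print(f"{label:<14} {value:.4f}")
```

Up to the rounding of the printed coefficients, the reconstructed values should agree with the group means listed above (for example about 0.1645 for the north pole).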
As the graph also shows, craters near the north pole have the highest mean number of layers. The next step investigates whether this relationship is influenced by crater size.
A two-way ANOVA is then performed to examine this possible association. Crater diameter is binned into size categories, from the smallest to the largest craters, and the latitude model is fitted separately within each category.
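The stratified procedure can be sketched without any statistics library: bin each crater by diameter, then compute a one-way ANOVA F-statistic for latitude area within each bin. This is only an illustration under stated assumptions, not the code that produced the output below. The bin edges are read off the category labels in the output, the left-closed intervals are an assumption, and the function names and toy records are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Bin edges taken from the category labels in the output (1-2, ..., 100-1165).
EDGES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 60, 100, 1165]


def size_category(diameter):
    """Return a label such as '10-20', or None if outside all bins.

    Intervals are assumed to be left-closed: lower <= diameter < upper.
    """
    i = bisect_right(EDGES, diameter) - 1
    if i < 0 or i >= len(EDGES) - 1:
        return None
    return f"{EDGES[i]}-{EDGES[i + 1]}"


def one_way_f(groups):
    """Return (F, df_between, df_within) for a dict {label: [values]}."""
    values = [v for g in groups.values() for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        mean = sum(g) / len(g)
        ss_between += len(g) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in g)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, k - 1, n - k


def stratified_f(records):
    """Run one_way_f separately inside each size category.

    Each record is (diameter, latitude_area, number_of_layers).
    """
    strata = defaultdict(lambda: defaultdict(list))
    for diameter, area, layers in records:
        category = size_category(diameter)
        if category is not None:
            strata[category][area].append(layers)
    return {cat: one_way_f(groups) for cat, groups in strata.items()}


# Hypothetical toy records, for illustration only (not real crater data).
toy_records = [
    (1.2, "north pole", 0), (1.4, "north pole", 1),
    (1.6, "north equator", 0), (1.8, "north equator", 0),
    (12.0, "north pole", 2), (15.0, "north pole", 1),
    (13.0, "north equator", 0), (18.0, "north equator", 1),
]

for category, (f_stat, df_b, df_w) in stratified_f(toy_records).items():
    print(f"size {category}: F({df_b}, {df_w}) = {f_stat:.3f}")
```

Comparing the per-category F-statistics and coefficients shows whether the latitude effect keeps the same direction and strength across crater sizes, which is the question the tables below address.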
Two-way ANOVA: number of layers vs latitude for crater category size 60-100
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.018
Model:                            OLS   Adj. R-squared:                  0.015
Method:                 Least Squares   F-statistic:                     6.117
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           0.000401
Time:                        09:38:21   Log-Likelihood:                -92.222
No. Observations:                1010   AIC:                             192.4
Df Residuals:                    1006   BIC:                             212.1
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0352      0.017      2.118      0.034         0.003     0.068
C(Latitude_areas)[T.north pole]        0.2426      0.065      3.746      0.000         0.116     0.370
C(Latitude_areas)[T.south equator]    -0.0112      0.020     -0.557      0.578        -0.051     0.028
C(Latitude_areas)[T.south pole]       -0.0352      0.025     -1.388      0.165        -0.085     0.015
==============================================================================
Omnibus:                     1686.978   Durbin-Watson:                   2.057
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           688671.761
Skew:                          10.838   Prob(JB):                         0.00
Kurtosis:                     129.074   Cond. No.                         9.17
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
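The AIC and BIC entries in each summary follow directly from the log-likelihood, the number of estimated parameters (intercept plus three latitude offsets, so 4) and the number of observations. A small sketch of the standard formulas, checked against the size 60-100 table above; the function names are hypothetical:

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Values copied from the size 60-100 summary above.
log_likelihood = -92.222
n_params = 4   # Df Model (3) + intercept
n_obs = 1010

print(f"AIC = {aic(log_likelihood, n_params):.1f}")
print(f"BIC = {bic(log_likelihood, n_params, n_obs):.1f}")
```

Both values should agree with the reported 192.4 and 212.1 up to rounding. Since every stratum is fitted to a different subset of craters, these criteria are not comparable across the size categories.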
Two-way ANOVA: number of layers vs latitude for crater category size 40-60
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.005
Model:                            OLS   Adj. R-squared:                  0.003
Method:                 Least Squares   F-statistic:                     3.300
Date:                Mon, 27 Jun 2016   Prob (F-statistic):             0.0196
Time:                        09:38:21   Log-Likelihood:                -1151.3
No. Observations:                2113   AIC:                             2311.
Df Residuals:                    2109   BIC:                             2333.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0994      0.018      5.497      0.000         0.064     0.135
C(Latitude_areas)[T.north pole]        0.0117      0.065      0.180      0.857        -0.115     0.139
C(Latitude_areas)[T.south equator]    -0.0311      0.022     -1.413      0.158        -0.074     0.012
C(Latitude_areas)[T.south pole]       -0.0829      0.027     -3.048      0.002        -0.136    -0.030
==============================================================================
Omnibus:                     2569.814   Durbin-Watson:                   2.021
Prob(Omnibus):                  0.000   Jarque-Bera (JB):           199459.206
Skew:                           6.680   Prob(JB):                         0.00
Kurtosis:                      48.684   Cond. No.                         8.44
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 20-40
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.022
Model:                            OLS   Adj. R-squared:                  0.022
Method:                 Least Squares   F-statistic:                     56.49
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           4.20e-36
Time:                        09:38:21   Log-Likelihood:                -8121.8
No. Observations:                7466   AIC:                         1.625e+04
Df Residuals:                    7462   BIC:                         1.628e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.3598      0.016     22.061      0.000         0.328     0.392
C(Latitude_areas)[T.north pole]        0.1660      0.054      3.069      0.002         0.060     0.272
C(Latitude_areas)[T.south equator]    -0.1625      0.020     -8.194      0.000        -0.201    -0.124
C(Latitude_areas)[T.south pole]       -0.2922      0.026    -11.279      0.000        -0.343    -0.241
==============================================================================
Omnibus:                     5084.510   Durbin-Watson:                   1.892
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            50669.218
Skew:                           3.324   Prob(JB):                         0.00
Kurtosis:                      13.894   Cond. No.                         7.77
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 10-20
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.032
Model:                            OLS   Adj. R-squared:                  0.031
Method:                 Least Squares   F-statistic:                     146.7
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           1.47e-93
Time:                        09:38:21   Log-Likelihood:                -15436.
No. Observations:               13487   AIC:                         3.088e+04
Df Residuals:                   13483   BIC:                         3.091e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.5440      0.012     43.974      0.000         0.520     0.568
C(Latitude_areas)[T.north pole]        0.3259      0.039      8.291      0.000         0.249     0.403
C(Latitude_areas)[T.south equator]    -0.2011      0.015    -13.173      0.000        -0.231    -0.171
C(Latitude_areas)[T.south pole]       -0.3151      0.021    -15.182      0.000        -0.356    -0.274
==============================================================================
Omnibus:                     4780.492   Durbin-Watson:                   1.933
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            13805.147
Skew:                           1.912   Prob(JB):                         0.00
Kurtosis:                       6.153   Cond. No.                         7.17
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 9-10
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.066
Model:                            OLS   Adj. R-squared:                  0.065
Method:                 Least Squares   F-statistic:                     64.48
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           2.80e-40
Time:                        09:38:21   Log-Likelihood:                -2647.3
No. Observations:                2735   AIC:                             5303.
Df Residuals:                    2731   BIC:                             5326.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.5887      0.022     26.672      0.000         0.545     0.632
C(Latitude_areas)[T.north pole]        0.4964      0.069      7.157      0.000         0.360     0.632
C(Latitude_areas)[T.south equator]    -0.2278      0.028     -8.184      0.000        -0.282    -0.173
C(Latitude_areas)[T.south pole]       -0.3248      0.039     -8.334      0.000        -0.401    -0.248
==============================================================================
Omnibus:                      522.056   Durbin-Watson:                   1.939
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              881.124
Skew:                           1.257   Prob(JB):                    4.64e-192
Kurtosis:                       4.190   Cond. No.                         6.75
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size  8-9
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.064
Model:                            OLS   Adj. R-squared:                  0.063
Method:                 Least Squares   F-statistic:                     77.29
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           2.32e-48
Time:                        09:38:21   Log-Likelihood:                -3188.9
No. Observations:                3404   AIC:                             6386.
Df Residuals:                    3400   BIC:                             6410.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.5571      0.019     29.220      0.000         0.520     0.595
C(Latitude_areas)[T.north pole]        0.4679      0.060      7.858      0.000         0.351     0.585
C(Latitude_areas)[T.south equator]    -0.2064      0.024     -8.587      0.000        -0.254    -0.159
C(Latitude_areas)[T.south pole]       -0.3271      0.035     -9.410      0.000        -0.395    -0.259
==============================================================================
Omnibus:                      687.254   Durbin-Watson:                   1.969
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1219.237
Skew:                           1.286   Prob(JB):                    1.76e-265
Kurtosis:                       4.407   Cond. No.                         6.69
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 7-8
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.077
Model:                            OLS   Adj. R-squared:                  0.076
Method:                 Least Squares   F-statistic:                     117.4
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           4.86e-73
Time:                        09:38:21   Log-Likelihood:                -3697.4
No. Observations:                4238   AIC:                             7403.
Df Residuals:                    4234   BIC:                             7428.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.5476      0.016     34.398      0.000         0.516     0.579
C(Latitude_areas)[T.north pole]        0.5038      0.047     10.814      0.000         0.413     0.595
C(Latitude_areas)[T.south equator]    -0.1986      0.020     -9.816      0.000        -0.238    -0.159
C(Latitude_areas)[T.south pole]       -0.2938      0.029    -10.227      0.000        -0.350    -0.237
==============================================================================
Omnibus:                      595.076   Durbin-Watson:                   1.916
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              872.538
Skew:                           1.054   Prob(JB):                    3.39e-190
Kurtosis:                       3.706   Cond. No.                         6.25
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 6-7
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.067
Model:                            OLS   Adj. R-squared:                  0.066
Method:                 Least Squares   F-statistic:                     130.4
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           1.29e-81
Time:                        09:38:21   Log-Likelihood:                -4715.2
No. Observations:                5467   AIC:                             9438.
Df Residuals:                    5463   BIC:                             9465.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.4986      0.014     36.486      0.000         0.472     0.525
C(Latitude_areas)[T.north pole]        0.5595      0.039     14.206      0.000         0.482     0.637
C(Latitude_areas)[T.south equator]    -0.1330      0.018     -7.573      0.000        -0.167    -0.099
C(Latitude_areas)[T.south pole]       -0.2045      0.025     -8.234      0.000        -0.253    -0.156
==============================================================================
Omnibus:                      657.172   Durbin-Watson:                   1.958
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              914.473
Skew:                           0.983   Prob(JB):                    2.66e-199
Kurtosis:                       3.384   Cond. No.                         6.04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 5-6
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.057
Model:                            OLS   Adj. R-squared:                  0.057
Method:                 Least Squares   F-statistic:                     149.3
Date:                Mon, 27 Jun 2016   Prob (F-statistic):           6.65e-94
Time:                        09:38:21   Log-Likelihood:                -6084.7
No. Observations:                7374   AIC:                         1.218e+04
Df Residuals:                    7370   BIC:                         1.221e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.4492      0.011     39.949      0.000         0.427     0.471
C(Latitude_areas)[T.north pole]        0.4614      0.030     15.523      0.000         0.403     0.520
C(Latitude_areas)[T.south equator]    -0.1156      0.015     -7.945      0.000        -0.144    -0.087
C(Latitude_areas)[T.south pole]       -0.1498      0.021     -7.167      0.000        -0.191    -0.109
==============================================================================
Omnibus:                      870.288   Durbin-Watson:                   1.916
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1206.941
Skew:                           0.976   Prob(JB):                    8.24e-263
Kurtosis:                       3.347   Cond. No.                         5.58
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 4-5
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.043
Model:                            OLS   Adj. R-squared:                  0.043
Method:                 Least Squares   F-statistic:                     170.5
Date:                Mon, 27 Jun 2016   Prob (F-statistic):          3.93e-108
Time:                        09:38:21   Log-Likelihood:                -8286.0
No. Observations:               11295   AIC:                         1.658e+04
Df Residuals:                   11291   BIC:                         1.661e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.3016      0.008     36.482      0.000         0.285     0.318
C(Latitude_areas)[T.north pole]        0.3731      0.021     17.603      0.000         0.332     0.415
C(Latitude_areas)[T.south equator]    -0.0699      0.011     -6.508      0.000        -0.091    -0.049
C(Latitude_areas)[T.south pole]       -0.1111      0.015     -7.192      0.000        -0.141    -0.081
==============================================================================
Omnibus:                     3199.588   Durbin-Watson:                   1.858
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             7339.115
Skew:                           1.624   Prob(JB):                         0.00
Kurtosis:                       5.246   Cond. No.                         5.44
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 3-4
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.031
Model:                            OLS   Adj. R-squared:                  0.031
Method:                 Least Squares   F-statistic:                     222.7
Date:                Mon, 27 Jun 2016   Prob (F-statistic):          3.28e-142
Time:                        09:38:21   Log-Likelihood:                -12117.
No. Observations:               20962   AIC:                         2.424e+04
Df Residuals:                   20958   BIC:                         2.427e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.1803      0.005     34.648      0.000         0.170     0.190
C(Latitude_areas)[T.north pole]        0.2508      0.012     21.488      0.000         0.228     0.274
C(Latitude_areas)[T.south equator]    -0.0420      0.007     -6.157      0.000        -0.055    -0.029
C(Latitude_areas)[T.south pole]       -0.0075      0.010     -0.774      0.439        -0.027     0.011
==============================================================================
Omnibus:                     9540.222   Durbin-Watson:                   1.827
Prob(Omnibus):                  0.000   Jarque-Bera (JB):            41239.822
Skew:                           2.295   Prob(JB):                         0.00
Kurtosis:                       8.113   Cond. No.                         5.01
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 2-3
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.055
Model:                            OLS   Adj. R-squared:                  0.055
Method:                 Least Squares   F-statistic:                     999.2
Date:                Mon, 27 Jun 2016   Prob (F-statistic):               0.00
Time:                        09:38:21   Log-Likelihood:                 40147.
No. Observations:               51769   AIC:                        -8.029e+04
Df Residuals:                   51765   BIC:                        -8.025e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0048      0.001      5.611      0.000         0.003     0.006
C(Latitude_areas)[T.north pole]        0.0955      0.002     49.640      0.000         0.092     0.099
C(Latitude_areas)[T.south equator]    -0.0045      0.001     -4.070      0.000        -0.007    -0.002
C(Latitude_areas)[T.south pole]       -0.0036      0.002     -2.184      0.029        -0.007    -0.000
==============================================================================
Omnibus:                    88171.936   Durbin-Watson:                   1.936
Prob(Omnibus):                  0.000   Jarque-Bera (JB):         64114241.623
Skew:                          12.032   Prob(JB):                         0.00
Kurtosis:                     173.716   Cond. No.                         5.04
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 1-2
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.022
Model:                            OLS   Adj. R-squared:                  0.022
Method:                 Least Squares   F-statistic:                     1888.
Date:                Mon, 27 Jun 2016   Prob (F-statistic):               0.00
Time:                        09:38:22   Log-Likelihood:             4.0129e+05
No. Observations:              252719   AIC:                        -8.026e+05
Df Residuals:                  252715   BIC:                        -8.025e+05
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0008      0.000      4.921      0.000         0.000     0.001
C(Latitude_areas)[T.north pole]        0.0327      0.000     71.096      0.000         0.032     0.034
C(Latitude_areas)[T.south equator]    -0.0008      0.000     -3.516      0.000        -0.001    -0.000
C(Latitude_areas)[T.south pole]        0.0004      0.000      1.168      0.243        -0.000     0.001
==============================================================================
Omnibus:                   596212.894   Durbin-Watson:                   1.872
Prob(Omnibus):                  0.000   Jarque-Bera (JB):       4728605013.725
Skew:                          24.132   Prob(JB):                         0.00
Kurtosis:                     671.381   Cond. No.                         5.69
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
Two-way ANOVA: number of layers vs latitude for crater category size 100-1165
                            OLS Regression Results                            
==============================================================================
Dep. Variable:          NUMBER_LAYERS   R-squared:                       0.009
Model:                            OLS   Adj. R-squared:                 -0.001
Method:                 Least Squares   F-statistic:                    0.8576
Date:                Mon, 27 Jun 2016   Prob (F-statistic):              0.463
Time:                        09:38:22   Log-Likelihood:                 439.43
No. Observations:                 304   AIC:                            -870.9
Df Residuals:                     300   BIC:                            -856.0
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
======================================================================================================
                                         coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------------------------------
Intercept                              0.0118      0.006      1.890      0.060        -0.000     0.024
C(Latitude_areas)[T.north pole]       -0.0118      0.026     -0.445      0.656        -0.064     0.040
C(Latitude_areas)[T.south equator]    -0.0118      0.008     -1.506      0.133        -0.027     0.004
C(Latitude_areas)[T.south pole]       -0.0118      0.009     -1.249      0.212        -0.030     0.007
==============================================================================
Omnibus:                      698.534   Durbin-Watson:                   2.025
Prob(Omnibus):                  0.000   Jarque-Bera (JB):          1108895.362
Skew:                          17.126   Prob(JB):                         0.00
Kurtosis:                     296.890   Cond. No.                         9.29
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
The two-way ANOVA, run here as a separate one-way ANOVA of latitude within each crater-size category, showed that the relationship between latitude and number of layers is preserved when the crater is small (size category 1-2: F = 1888, p < 0.001, with a clearly positive north pole coefficient), while it is lost when the crater is very large (size category 100-1165: F = 0.8576, p = 0.463, with no latitude coefficient significant).

In conclusion, crater dimension moderates the association between latitude and number of layers, so this variable has to be taken into account to avoid drawing misleading conclusions.
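
To make the "F-statistic" line of the summaries above more concrete, here is a minimal sketch of how a one-way ANOVA F value is computed inside each stratum. The layer counts are made up for illustration (they are not taken from marscrater_pds.csv), and `one_way_f` is a hypothetical helper written for this example, not a statsmodels function.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to a list of observations.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical layer counts per latitude area, for two size strata
strata = {
    "small craters": {
        "north equator": [1, 2, 3],
        "north pole": [6, 7, 8],
        "south equator": [2, 3, 4],
        "south pole": [3, 4, 5],
    },
    "large craters": {
        "north equator": [1, 2, 3],
        "north pole": [1, 2, 3],
        "south equator": [1, 2, 3],
        "south pole": [1, 2, 3],
    },
}

results = {name: one_way_f(groups) for name, groups in strata.items()}
for name, (f_value, df_between, df_within) in results.items():
    print(name, f_value, df_between, df_within)
```

In this toy data the group means differ in the first stratum and are identical in the second, so the F value is large in one and zero in the other. That is the same pattern, in exaggerated form, as the contrast between the two summaries above.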

Code :

# -*- coding: utf-8 -*-
import pandas
import statsmodels.formula.api as smf
import seaborn
import matplotlib.pyplot as plt

# Load the Mars crater dataset and report its size
data = pandas.read_csv('marscrater_pds.csv', low_memory=False)
print(len(data))          # number of observations
print(len(data.columns))  # number of variables

# Define the function for latitude categorisation
def latitude_categorisation_function(row):
    latitude = row['LATITUDE_CIRCLE_IMAGE']
    if -100 <= latitude < -50:
        return "south pole"
    elif -50 <= latitude < 0:
        return "south equator"
    elif 0 <= latitude < 50:
        return "north equator"
    elif 50 <= latitude <= 100:
        return "north pole"

# Define the function for longitude categorisation
def longitude_categorisation_function(row):
    longitude = row['LONGITUDE_CIRCLE_IMAGE']
    if -200 <= longitude < -100:
        return "1"
    elif -100 <= longitude < 0:
        return "2"
    elif 0 <= longitude < 100:
        return "3"
    elif 100 <= longitude <= 200:
        return "4"

# Categorise the latitude and the longitude
data['Latitude_areas'] = data.apply(latitude_categorisation_function, axis=1)
data['Latitude_areas'] = data['Latitude_areas'].astype('category')
data['Longitude_areas'] = data.apply(longitude_categorisation_function, axis=1)
data['Longitude_areas'] = data['Longitude_areas'].astype('category')

# ANOVA between latitude and number of layers
print("ANOVA: latitude and number of layers.")
anova_model_latitude_layers = smf.ols(formula='NUMBER_LAYERS ~ C(Latitude_areas)', data=data)
print(anova_model_latitude_layers.fit().summary())
# factorplot was renamed catplot in seaborn 0.9; kind="point" keeps the original point plot
seaborn.catplot(x="Latitude_areas", y="NUMBER_LAYERS", data=data, kind="point")
plt.xlabel("Latitude")
plt.ylabel("Number of layers")
# Comparison of means and standard deviations
mean_latitude_layers = data.groupby("Latitude_areas")['NUMBER_LAYERS'].mean()
print(mean_latitude_layers)
std_latitude_layers = data.groupby("Latitude_areas")['NUMBER_LAYERS'].std()
print(std_latitude_layers)

print ("As also highlighted by the graph, craters near the north pole have the highest number of layers. It is investigated now the possibility that this relationship might be influenced by the crater's dimension.")

print ("A two-way ANOVA is then performed, to elucidate this possible association. The crater dimension is categorised, from smaller to greater craters.")

# Define the function for crater diameter categorisation
def diameter_categorisation_function(row):
    diameter = row['DIAM_CIRCLE_IMAGE']
    if 0 <= diameter < 1:
        return "0-1"
    elif 1 <= diameter < 2:
        return "1-2"
    elif 2 <= diameter < 3:
        return "2-3"
    elif 3 <= diameter < 4:
        return "3-4"
    elif 4 <= diameter < 5:
        return "4-5"
    elif 5 <= diameter < 6:
        return "5-6"
    elif 6 <= diameter < 7:
        return "6-7"
    elif 7 <= diameter < 8:
        return "7-8"
    elif 8 <= diameter < 9:
        return "8-9"
    elif 9 <= diameter < 10:
        return "9-10"
    elif 10 <= diameter < 20:
        return "10-20"
    elif 20 <= diameter < 40:
        return "20-40"
    elif 40 <= diameter < 60:
        return "40-60"
    elif 60 <= diameter < 100:
        return "60-100"
    elif 100 <= diameter <= 1165:
        return "100-1165"

# Categorise the crater diameter
data['Crater_size_category'] = data.apply(diameter_categorisation_function, axis=1)
data['Crater_size_category'] = data['Crater_size_category'].astype('category')

### Two-way ANOVA
# Fit one latitude ANOVA per crater-size category and keep the fitted results
anova_model_latitude_layers_two_way_crater_size = []
for category in data['Crater_size_category'].unique():
    print("Two-way ANOVA: number of layers vs latitude for crater category size %s" % category)
    data_subset = data[data['Crater_size_category'] == category]
    anova_model_latitude_layers_two_way = smf.ols(formula='NUMBER_LAYERS ~ C(Latitude_areas)', data=data_subset).fit()
    anova_model_latitude_layers_two_way_crater_size.append(anova_model_latitude_layers_two_way)
    print(anova_model_latitude_layers_two_way.summary())

print("The two-way ANOVA revealed that the relationship between the latitude and the number of layers is preserved when the crater dimension is small, while it is lost when the crater is very big in size.")
print("At the end, this two-way ANOVA highlighted that the crater dimension plays an important role in the correlation between latitude and number of layers, and that this variable has to be taken into account to avoid drawing misleading conclusions.")
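
As a side note on the diameter categorisation in the listing above, the long if/elif chain can also be expressed as a lookup over bin edges. The following is a minimal sketch using the standard library `bisect` module. The names `DIAMETER_EDGES`, `DIAMETER_LABELS`, `MAX_DIAMETER` and `diameter_label` are hypothetical and introduced only for this example; the bins mirror those used in the listing.

```python
from bisect import bisect_right

# Lower edge of each bin, in ascending order (same bins as the listing above)
DIAMETER_EDGES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 40, 60, 100]
DIAMETER_LABELS = [
    "0-1", "1-2", "2-3", "3-4", "4-5", "5-6", "6-7", "7-8", "8-9",
    "9-10", "10-20", "20-40", "40-60", "60-100", "100-1165",
]
MAX_DIAMETER = 1165  # upper bound of the last bin, inclusive


def diameter_label(diameter):
    """Return the size-category label, or None if the value is out of range."""
    # Written as a chained comparison so that NaN also falls through to None
    if not (DIAMETER_EDGES[0] <= diameter <= MAX_DIAMETER):
        return None
    # Index of the last edge that is <= diameter
    index = bisect_right(DIAMETER_EDGES, diameter) - 1
    return DIAMETER_LABELS[index]


for value in (0.5, 1, 8.5, 99.9, 100, 1165, 2000):
    print(value, diameter_label(value))
```

Under these assumptions, something like `data['DIAM_CIRCLE_IMAGE'].map(diameter_label)` could stand in for the row-wise `apply` call, with the advantage that changing the bins only means editing the two lists.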