Ce forum est le 3ième dans lequel je pose mon problème car malheureusement dans mon pays il n'existe pas de spécialiste de R .T out ausi ce travail est très important pour moi car je viens de passer ces 10 derniers mois à apprendre le langage et le code R. Je vous supplie de m'apporter votre aide et votre expertise sur le sujet. Merci a tous.
Code : Tout sélectionner
> base<-read.csv("FX.csv",sep=";",dec=",",header=T)
> attach(base)
### DISCRETISATION DES VARIABLES QUANTITATIVES ###
> breakAGE=c(22,35,45,53,max(AGE))
> AGE.d=cut(AGE,breaks=breakAGE,include.lowest=T)
> breakNBP=c(1,4,8,max(NBP))
> NBP.d=cut(NBP,breaks=breakNBP,include.lowest=T)
> breakIMP=c(0,20600,29400,37600,max(IMP))
> IMP.d=cut(IMP,breaks=breakIMP,include.lowest=T)
> breakCRD=c(0,720000,2000000,5300000,max(CRD))
> CRD.d=cut(CRD,breaks=breakCRD,include.lowest=T)
> breakNPR=c(1,4,8,max(NPR))
> NPR.d=cut(NPR,breaks=breakNPR,include.lowest=T)
> breakVG=c(0,650000,1808000,4272000,max(VG))
> VG.d=cut(VG,breaks=breakVG,include.lowest=T)
> breakDD=c(0,12,24,48,max(DD))
> DD.d=cut(DD,breaks=breakDD,include.lowest=T)
> base2<-cbind(base,AGE.d,NBP.d,IMP.d,CRD.d,NPR.d,VG.d,DD.d)
> detach(base)
> attach(base2)
The following objects are masked _by_ .GlobalEnv:
AGE.d, CRD.d, DD.d, IMP.d, NBP.d, NPR.d, VG.d
### ECHANTILLONNAGE DE MA BASE DISCRETISER (base3) ###
> base3<-subset(base2,select=-c(AGE,NBP,IMP,CRD,NPR,VG,DD,TxC))
> tail(base3)
SEX SEC ST AGE.d NBP.d IMP.d CRD.d NPR.d
795 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
796 M COMMERCE BON (35,45] [1,4] [0,2.06e+04] (2e+06,5.3e+06] [1,4]
797 M SERVICES BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
798 M COMMERCE BON [22,35] [1,4] [0,2.06e+04] (7.2e+05,2e+06] [1,4]
799 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] (8,12]
800 F COMMERCE BON [22,35] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
VG.d DD.d
795 [0,6.5e+05] (12,24]
796 (4.27e+06,6.78e+07] [0,12]
797 (6.5e+05,1.81e+06] [0,12]
798 [0,6.5e+05] [0,12]
799 (1.81e+06,4.27e+06] [0,12]
800 (1.81e+06,4.27e+06] [0,12]
> d=sort(sample(nrow(base3),nrow(base3)*0.65))
> appren<-base3[d,]
> test<-base3[-d,]
> tail(appren)
SEX SEC ST AGE.d NBP.d IMP.d CRD.d NPR.d
795 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
796 M COMMERCE BON (35,45] [1,4] [0,2.06e+04] (2e+06,5.3e+06] [1,4]
797 M SERVICES BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
798 M COMMERCE BON [22,35] [1,4] [0,2.06e+04] (7.2e+05,2e+06] [1,4]
799 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] (8,12]
800 F COMMERCE BON [22,35] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
VG.d DD.d
795 [0,6.5e+05] (12,24]
796 (4.27e+06,6.78e+07] [0,12]
797 (6.5e+05,1.81e+06] [0,12]
798 [0,6.5e+05] [0,12]
799 (1.81e+06,4.27e+06] [0,12]
800 (1.81e+06,4.27e+06] [0,12]
> summary(appren)
SEX SEC ST AGE.d NBP.d
F:181 AGRI-CACAO : 95 BON :462 [22,35]:129 [1,4] :476
M:339 AGRICULTURE: 20 MAUVAIS: 58 (35,45]:202 (4,8] : 33
COMMERCE :296 (45,53]:122 (8,13]: 11
MANIFACTURE: 22 (53,76]: 67
SERVICES : 87
IMP.d CRD.d NPR.d
[0,2.06e+04] :490 [0,7.2e+05] :179 [1,4] :379
(2.06e+04,2.94e+04]: 12 (7.2e+05,2e+06] :171 (4,8] :109
(2.94e+04,3.76e+04]: 11 (2e+06,5.3e+06] :106 (8,12]: 32
(3.76e+04,4.29e+04]: 7 (5.3e+06,1.65e+07]: 64
VG.d DD.d
[0,6.5e+05] :171 [0,12] :464
(6.5e+05,1.81e+06] :146 (12,24]: 36
(1.81e+06,4.27e+06]:110 (24,48]: 15
(4.27e+06,6.78e+07]: 93 (48,78]: 5
### SORTIE DU MODEL GENERAL ###
> model<-glm(ST~SEX+SEC+AGE.d+NBP.d+IMP.d+CRD.d+NPR.d+VG.d+DD.d,data=appren,family=binomial)
Warning message:
glm.fit: des probabilités ont été ajustées numériquement à 0 ou 1
> summary(model)
Call:
glm(formula = ST ~ SEX + SEC + AGE.d + NBP.d + IMP.d + CRD.d +
NPR.d + VG.d + DD.d, family = binomial, data = appren)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.60945 -0.14466 -0.00003 0.00000 2.83347
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.064e+01 4.372e+03 -0.005 0.99623
SEXM 1.669e-02 6.217e-01 0.027 0.97858
SECAGRICULTURE 1.239e+00 7.225e+03 0.000 0.99986
SECCOMMERCE 1.979e+01 4.372e+03 0.005 0.99639
SECMANIFACTURE -3.737e-02 9.328e+03 0.000 1.00000
SECSERVICES 1.954e+01 4.372e+03 0.004 0.99643
AGE.d(35,45] -1.031e+00 6.887e-01 -1.497 0.13433
AGE.d(45,53] -1.409e+00 9.958e-01 -1.415 0.15698
AGE.d(53,76] -9.958e-01 1.162e+00 -0.857 0.39159
NBP.d(4,8] 3.210e+00 9.813e-01 3.271 0.00107 **
NBP.d(8,13] 4.012e+01 1.151e+04 0.003 0.99722
IMP.d(2.06e+04,2.94e+04] 4.012e+01 1.125e+04 0.004 0.99715
IMP.d(2.94e+04,3.76e+04] 3.893e+01 1.203e+04 0.003 0.99742
IMP.d(3.76e+04,4.29e+04] 3.988e+01 1.529e+04 0.003 0.99792
CRD.d(7.2e+05,2e+06] -2.409e+00 1.122e+00 -2.148 0.03171 *
CRD.d(2e+06,5.3e+06] -4.577e-01 7.377e-01 -0.620 0.53494
CRD.d(5.3e+06,1.65e+07] -1.950e+01 4.990e+03 -0.004 0.99688
NPR.d(4,8] 1.967e-02 8.804e-01 0.022 0.98218
NPR.d(8,12] -1.770e+01 6.976e+03 -0.003 0.99798
VG.d(6.5e+05,1.81e+06] -2.175e+00 1.093e+00 -1.991 0.04651 *
VG.d(1.81e+06,4.27e+06] -6.829e-01 7.411e-01 -0.921 0.35683
VG.d(4.27e+06,6.78e+07] -1.936e+01 4.307e+03 -0.004 0.99641
DD.d(12,24] 5.380e-01 9.545e-01 0.564 0.57299
DD.d(24,48] -1.878e+01 7.153e+03 -0.003 0.99791
DD.d(48,78] -1.826e+01 1.908e+04 -0.001 0.99924
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 363.709 on 519 degrees of freedom
Residual deviance: 83.801 on 495 degrees of freedom
AIC: 133.8
Number of Fisher Scoring iterations: 21
### SELECTION DES VARIABLES ###
>
> local({pkg <- select.list(sort(.packages(all.available = TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
> model.trivial<-glm(ST~1,data=appren,family=binomial)
> str_constant<-"~1"
> str_all<-"~SEX+SEC+AGE.d+NBP.d+IMP.d+CRD.d+NPR.d+VG.d+DD.d"
> model.both<-stepAIC(model.trivial,scope=list(lower=str_constant,upper=str_all),trace=T,data=appren,direction="both")
Start: AIC=365.71
ST ~ 1
Df Deviance AIC
+ IMP.d 3 214.65 222.65
+ NBP.d 2 230.65 236.65
+ VG.d 3 296.00 304.00
+ CRD.d 3 298.40 306.40
+ SEX 1 340.84 344.84
+ SEC 4 335.05 345.05
+ AGE.d 3 343.54 351.54
+ NPR.d 2 351.37 357.37
<none> 363.71 365.71
+ DD.d 3 361.92 369.92
Step: AIC=222.65
ST ~ IMP.d
Df Deviance AIC
+ NBP.d 2 136.14 148.14
+ VG.d 3 186.10 200.10
+ SEC 4 193.89 209.89
+ CRD.d 3 198.07 212.07
+ SEX 1 204.96 214.96
+ AGE.d 3 206.45 220.45
<none> 214.65 222.65
+ NPR.d 2 210.68 222.68
+ DD.d 3 213.48 227.48
- IMP.d 3 363.71 365.71
Step: AIC=148.14
ST ~ IMP.d + NBP.d
Df Deviance AIC
+ VG.d 3 113.56 131.56
+ CRD.d 3 115.44 133.44
+ SEC 4 123.99 143.99
+ AGE.d 3 128.46 146.46
<none> 136.14 148.14
+ SEX 1 135.45 149.45
+ NPR.d 2 133.86 149.86
+ DD.d 3 133.94 151.94
- NBP.d 2 214.65 222.65
- IMP.d 3 230.65 236.65
Step: AIC=131.56
ST ~ IMP.d + NBP.d + VG.d
Df Deviance AIC
+ SEC 4 101.86 127.86
+ CRD.d 3 104.10 128.10
<none> 113.56 131.56
+ SEX 1 113.19 133.19
+ AGE.d 3 109.20 133.20
+ NPR.d 2 112.29 134.29
+ DD.d 3 111.03 135.03
- VG.d 3 136.14 148.14
- IMP.d 3 181.38 193.38
- NBP.d 2 186.10 200.10
Step: AIC=127.86
ST ~ IMP.d + NBP.d + VG.d + SEC
Df Deviance AIC
+ CRD.d 3 90.284 122.28
<none> 101.863 127.86
+ NPR.d 2 99.404 129.40
+ SEX 1 101.854 129.85
+ AGE.d 3 98.204 130.20
- SEC 4 113.556 131.56
+ DD.d 3 100.072 132.07
- VG.d 3 123.992 143.99
- NBP.d 2 158.612 180.61
- IMP.d 3 161.631 181.63
Step: AIC=122.28
ST ~ IMP.d + NBP.d + VG.d + SEC + CRD.d
Df Deviance AIC
<none> 90.284 122.28
+ SEX 1 90.147 124.15
+ AGE.d 3 86.774 124.77
+ NPR.d 2 89.028 125.03
+ DD.d 3 88.750 126.75
- VG.d 3 101.165 127.17
- CRD.d 3 101.863 127.86
- SEC 4 104.104 128.10
- IMP.d 3 133.450 159.45
- NBP.d 2 150.320 178.32
There were 41 warnings (use warnings() to see them)
### RESUME DE LA SELECTION DES VARIABLE ###
> summary(model.both)
Call:
glm(formula = ST ~ IMP.d + NBP.d + VG.d + SEC + CRD.d, family = binomial,
data = appren)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.81267 -0.15696 -0.00005 0.00000 2.76255
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -20.4112 2700.0209 -0.008 0.993968
IMP.d(2.06e+04,2.94e+04] 37.5305 6867.3627 0.005 0.995640
IMP.d(2.94e+04,3.76e+04] 37.6881 7289.5247 0.005 0.995875
IMP.d(3.76e+04,4.29e+04] 37.8871 9253.0925 0.004 0.996733
NBP.d(4,8] 2.9167 0.7975 3.657 0.000255 ***
NBP.d(8,13] 25.8031 6413.9268 0.004 0.996790
VG.d(6.5e+05,1.81e+06] -2.3049 1.0925 -2.110 0.034878 *
VG.d(1.81e+06,4.27e+06] -0.8064 0.7296 -1.105 0.269044
VG.d(4.27e+06,6.78e+07] -18.4180 2697.2488 -0.007 0.994552
SECAGRICULTURE 1.2517 4409.3888 0.000 0.999773
SECCOMMERCE 18.9224 2700.0209 0.007 0.994408
SECMANIFACTURE 0.5163 5870.9554 0.000 0.999930
SECSERVICES 18.8326 2700.0209 0.007 0.994435
CRD.d(7.2e+05,2e+06] -2.4847 1.0983 -2.262 0.023673 *
CRD.d(2e+06,5.3e+06] -0.5969 0.6794 -0.879 0.379654
CRD.d(5.3e+06,1.65e+07] -18.6500 3181.2691 -0.006 0.995322
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 363.709 on 519 degrees of freedom
Residual deviance: 90.284 on 504 degrees of freedom
AIC: 122.28
Number of Fisher Scoring iterations: 20
### Determination du model final###
> model.final<-glm(ST ~ IMP.d + NBP.d + VG.d + SEC + CRD.d, family = binomial,data = appren)
Warning message:
glm.fit: des probabilités ont été ajustées numériquement à 0 ou 1
> summary(model.final)
Call:
glm(formula = ST ~ IMP.d + NBP.d + VG.d + SEC + CRD.d, family = binomial,
data = appren)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.81267 -0.15696 -0.00005 0.00000 2.76255
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -20.4112 2700.0209 -0.008 0.993968
IMP.d(2.06e+04,2.94e+04] 37.5305 6867.3627 0.005 0.995640
IMP.d(2.94e+04,3.76e+04] 37.6881 7289.5247 0.005 0.995875
IMP.d(3.76e+04,4.29e+04] 37.8871 9253.0925 0.004 0.996733
NBP.d(4,8] 2.9167 0.7975 3.657 0.000255 ***
NBP.d(8,13] 25.8031 6413.9268 0.004 0.996790
VG.d(6.5e+05,1.81e+06] -2.3049 1.0925 -2.110 0.034878 *
VG.d(1.81e+06,4.27e+06] -0.8064 0.7296 -1.105 0.269044
VG.d(4.27e+06,6.78e+07] -18.4180 2697.2488 -0.007 0.994552
SECAGRICULTURE 1.2517 4409.3888 0.000 0.999773
SECCOMMERCE 18.9224 2700.0209 0.007 0.994408
SECMANIFACTURE 0.5163 5870.9554 0.000 0.999930
SECSERVICES 18.8326 2700.0209 0.007 0.994435
CRD.d(7.2e+05,2e+06] -2.4847 1.0983 -2.262 0.023673 *
CRD.d(2e+06,5.3e+06] -0.5969 0.6794 -0.879 0.379654
CRD.d(5.3e+06,1.65e+07] -18.6500 3181.2691 -0.006 0.995322
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 363.709 on 519 degrees of freedom
Residual deviance: 90.284 on 504 degrees of freedom
AIC: 122.28
Number of Fisher Scoring iterations: 20
### Determination odds-ratios ###
> exp(cbind(OR=coef(model.final),confint(model.final)))
Waiting for profiling to be done...
OR 2.5 % 97.5 %
(Intercept) 1.366283e-09 0.000000e+00 2.843411e+40
IMP.d(2.06e+04,2.94e+04] 1.992025e+16 2.154332e-81 2.048024e+152
IMP.d(2.94e+04,3.76e+04] 2.332099e+16 1.628742e-59 3.369592e+88
IMP.d(3.76e+04,4.29e+04] 2.845422e+16 2.140137e-229 NA
NBP.d(4,8] 1.847928e+01 4.182520e+00 1.003902e+02
NBP.d(8,13] 1.607467e+11 9.542125e-116 Inf
VG.d(6.5e+05,1.81e+06] 9.977251e-02 5.186268e-03 5.836345e-01
VG.d(1.81e+06,4.27e+06] 4.464625e-01 9.032385e-02 1.727507e+00
VG.d(4.27e+06,6.78e+07] 1.002731e-08 NA 9.023988e+66
SECAGRICULTURE 3.496449e+00 7.872504e-56 1.552893e+56
SECCOMMERCE 1.651635e+08 1.536970e-67 NA
SECMANIFACTURE 1.675824e+00 1.089431e-72 1.260262e+74
SECSERVICES 1.509736e+08 4.494038e-64 NA
CRD.d(7.2e+05,2e+06] 8.335133e-02 4.303995e-03 4.938711e-01
CRD.d(2e+06,5.3e+06] 5.505081e-01 1.307285e-01 1.985370e+00
CRD.d(5.3e+06,1.65e+07] 7.951053e-09 NA 2.307151e+74
There were 50 or more warnings (use warnings() to see the first 50)
### Determination ROC et AUC echantillon d'apprentissage ###
> appren.p<-cbind(appren,predict(model.final,newdata=appren,type="response")
+ )
> tail(appren.p)
SEX SEC ST AGE.d NBP.d IMP.d CRD.d NPR.d
795 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
796 M COMMERCE BON (35,45] [1,4] [0,2.06e+04] (2e+06,5.3e+06] [1,4]
797 M SERVICES BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
798 M COMMERCE BON [22,35] [1,4] [0,2.06e+04] (7.2e+05,2e+06] [1,4]
799 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] (8,12]
800 F COMMERCE BON [22,35] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4]
VG.d DD.d
795 [0,6.5e+05] (12,24]
796 (4.27e+06,6.78e+07] [0,12]
797 (6.5e+05,1.81e+06] [0,12]
798 [0,6.5e+05] [0,12]
799 (1.81e+06,4.27e+06] [0,12]
800 (1.81e+06,4.27e+06] [0,12]
predict(model.final, newdata = appren, type = "response")
795 1.841132e-01
796 1.245670e-09
797 2.016533e-02
798 1.846183e-02
799 9.152752e-02
800 9.152752e-02
> summary(appren.p)
SEX SEC ST AGE.d NBP.d
F:181 AGRI-CACAO : 95 BON :462 [22,35]:129 [1,4] :476
M:339 AGRICULTURE: 20 MAUVAIS: 58 (35,45]:202 (4,8] : 33
COMMERCE :296 (45,53]:122 (8,13]: 11
MANIFACTURE: 22 (53,76]: 67
SERVICES : 87
IMP.d CRD.d NPR.d
[0,2.06e+04] :490 [0,7.2e+05] :179 [1,4] :379
(2.06e+04,2.94e+04]: 12 (7.2e+05,2e+06] :171 (4,8] :109
(2.94e+04,3.76e+04]: 11 (2e+06,5.3e+06] :106 (8,12]: 32
(3.76e+04,4.29e+04]: 7 (5.3e+06,1.65e+07]: 64
VG.d DD.d
[0,6.5e+05] :171 [0,12] :464
(6.5e+05,1.81e+06] :146 (12,24]: 36
(1.81e+06,4.27e+06]:110 (24,48]: 15
(4.27e+06,6.78e+07]: 93 (48,78]: 5
predict(model.final, newdata = appren, type = "response")
Min. :0.000000
1st Qu.:0.000000
Median :0.001873
Mean :0.111538
3rd Qu.:0.048252
> pred.proba<-predict(model.final,newdata=appren,type="response")
> appren.p<-cbind(appren,pred.proba)
> appren.p<-cbind(appren,predict(model.final,newdata=appren,type="response",se=T)
+ )
> tail(appren.p)
SEX SEC ST AGE.d NBP.d IMP.d CRD.d NPR.d VG.d DD.d fit
795 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4] [0,6.5e+05] (12,24] 1.841132e-01
796 M COMMERCE BON (35,45] [1,4] [0,2.06e+04] (2e+06,5.3e+06] [1,4] (4.27e+06,6.78e+07] [0,12] 1.245670e-09
797 M SERVICES BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4] (6.5e+05,1.81e+06] [0,12] 2.016533e-02
798 M COMMERCE BON [22,35] [1,4] [0,2.06e+04] (7.2e+05,2e+06] [1,4] [0,6.5e+05] [0,12] 1.846183e-02
799 F COMMERCE BON (45,53] [1,4] [0,2.06e+04] [0,7.2e+05] (8,12] (1.81e+06,4.27e+06] [0,12] 9.152752e-02
800 F COMMERCE BON [22,35] [1,4] [0,2.06e+04] [0,7.2e+05] [1,4] (1.81e+06,4.27e+06] [0,12] 9.152752e-02
se.fit residual.scale
795 6.336826e-02 1
796 3.359881e-06 1
797 2.290578e-02 1
798 1.971383e-02 1
799 5.925215e-02 1
800 5.925215e-02 1
> appren.p2<-cbind(appren.p,pred.ST=factor(ifelse(appren.p$fit>0.5,"Presence","Absence")))
> matrix<-table(appren.p2$pred.ST,appren.p2$ST)
> matrix
BON MAUVAIS
Absence 459 13
Presence 3 45
> Tx_err<-(matrix[2,1]+matrix[1,2])/sum(matrix)
> Tx_err
[1] 0.03076923
> local({pkg <- select.list(sort(.packages(all.available = TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
Le chargement a nécessité le package : gplots
Attachement du package : ‘gplots’
The following object is masked from ‘package:stats’:
lowess
> local({pkg <- select.list(sort(.packages(all.available = TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
> local({pkg <- select.list(sort(.packages(all.available = TRUE)),graphics=TRUE)
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
> attach(appren.p2)
The following objects are masked _by_ .GlobalEnv:
AGE.d, CRD.d, DD.d, IMP.d, NBP.d, NPR.d, VG.d
The following objects are masked from base2:
AGE.d, CRD.d, DD.d, IMP.d, NBP.d, NPR.d, SEC, SEX, ST, VG.d
> pred.appren=prediction(appren.p$fit,appren.p$ST)
> perf.appren=performance(pred.appren,"tpr","fpr")
> plot(perf.appren)
> pred.appren<-performance(pred.appren,"auc")
> pred.appren@y.values[[1]]
[1] 0.9835983
### Determination ROC et AUC echantillon TEST###
> test.p<-cbind(test,predict(model.final,newdata=test,type="response",se=T)
+ )
> test.p2<-cbind(test.p,pred.ST=factor(ifelse(test.p$fit>0.5,"Presence","Absence")))
> matrix<-table(test.p2$pred.ST,test.p2$ST)
> matrix
BON MAUVAIS
Absence 242 12
Presence 0 26
> Tx_err<-(matrix[2,1]+matrix[1,2])/sum(matrix
+ )
> Tx_err
[1] 0.04285714
> pred.test=prediction(test.p$fit,test.p$ST)
> perf.test=performance(pred.test,"tpr","fpr")
> plot(perf.test)
> pred.test<-performance(pred.test,"auc")
> pred.test@y.values[[1]]
[1] 0.8985428
### DETERMINATION DE LA P.VALUE ###
> par(mfrow=c(1,1))
> plot(rstudent(model.final),type="p",cex=0.5,ylab="Résidus Studentisés",col="black",ylim=c(-3,3))
> abline(h=c(-2,2),col="red")
> chi2<-model.final$null.deviance-model.fina$deviance
Erreur : objet 'model.fina' introuvable
> chi2<-model.final$null.deviance-model.final$deviance
> ddl<-model.final$df.null-model.final$df.residual
> pvalue<-pchisq(chi2,ddl,lower.tail=F)
> pvalue
[1] 1.812119e-49