All subset regression tests all possible subsets of the set of potential independent variables. If there are k potential independent variables (besides the constant), then there are \(2^{k}\) distinct subsets of them to be tested. For example, with 10 candidate independent variables the number of subsets to be tested is \(2^{10} = 1024\), and with 20 candidate variables it is \(2^{20}\), which is more than one million.
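As a quick arithmetic check of these counts, the following base R lines compute \(2^{k}\) for a few values of k. The helper name `n_subsets` is introduced here for illustration only.

```r
# Number of distinct subsets of k candidate predictors (including the empty subset)
n_subsets <- function(k) 2^k

n_subsets(c(4, 10, 20))

# The same count, obtained by summing the number of subsets of each size 0..k
sum(choose(10, 0:10))
```

With the four predictors used in the example that follows, \(2^{4} = 16\); the table lists 15 models because the intercept-only model (the empty subset) is not shown.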
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_all_subset(model)
## Index N Predictors R-Square Adj. R-Square Mallow's Cp
## 1 1 1 wt 0.753 0.745 12.4809
## 2 2 1 disp 0.718 0.709 18.1296
## 3 3 1 hp 0.602 0.589 37.1126
## 4 4 1 qsec 0.175 0.148 107.0696
## 5 5 2 hp wt 0.827 0.815 2.369
## 6 6 2 wt qsec 0.826 0.814 2.4295
## 7 7 2 disp wt 0.781 0.766 9.8791
## 8 8 2 disp hp 0.748 0.731 15.2331
## 9 9 2 disp qsec 0.722 0.702 19.6028
## 10 10 2 hp qsec 0.637 0.612 33.4722
## 11 11 3 hp wt qsec 0.835 0.817 3.0617
## 12 12 3 disp hp wt 0.827 0.808 4.3607
## 13 13 3 disp wt qsec 0.826 0.808 4.4293
## 14 14 3 disp hp qsec 0.754 0.728 16.2578
## 15 15 4 disp hp wt qsec 0.835 0.811 5
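To make the table less of a black box, here is a minimal base R sketch that enumerates the same 15 subsets and computes R-square, adjusted R-square and Mallows' Cp for each. It does not call olsrr. The helper `fit_subset` is hypothetical, and the Cp formula used (residual sum of squares divided by the full-model MSE, minus n, plus twice the number of coefficients) is an assumption about what the table reports, not a copy of the package's code.

```r
predictors <- c("disp", "hp", "wt", "qsec")
n <- nrow(mtcars)

# Error variance estimate from the full model, used as the Cp reference
full_fit <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
mse_full <- sum(resid(full_fit)^2) / df.residual(full_fit)

# All non-empty subsets, grouped by size
subsets <- unlist(
  lapply(seq_along(predictors), function(m) combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# Hypothetical helper: fit one subset and collect its criteria
fit_subset <- function(vars) {
  fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
  s <- summary(fit)
  p <- length(vars) + 1  # number of coefficients, including the intercept
  data.frame(
    n_pred = length(vars),
    predictors = paste(vars, collapse = " "),
    rsquare = s$r.squared,
    adj_rsquare = s$adj.r.squared,
    cp = sum(resid(fit)^2) / mse_full - n + 2 * p
  )
}

all_fits <- do.call(rbind, lapply(subsets, fit_subset))

# Order as in the table: by subset size, then by decreasing R-square
all_fits <- all_fits[order(all_fits$n_pred, -all_fits$rsquare), ]
rownames(all_fits) <- NULL
all_fits
```

By construction the full model has Cp equal to its number of coefficients, which is why the last row of the table shows 5.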
The plot method displays a panel of fit criteria across all the subset models.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
k <- ols_all_subset(model)
plot(k)
Select the subset of predictors that does best at meeting some well-defined objective criterion, such as having the largest \(R^{2}\) value or the smallest MSE, Mallows' Cp, or AIC.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_best_subset(model)
## Best Subsets Regression
## ------------------------------
## Model Index Predictors
## ------------------------------
## 1 wt
## 2 hp wt
## 3 hp wt qsec
## 4 disp hp wt qsec
## ------------------------------
##
## Subsets Regression Summary
## -------------------------------------------------------------------------------------------------------------------------------
## Model R-Square Adj. R-Square Pred R-Square C(p) AIC SBIC SBC MSEP FPE HSP APC
## -------------------------------------------------------------------------------------------------------------------------------
## 1 0.7530 0.7450 0.7087 12.4809 166.0294 74.2916 170.4266 9.8972 9.8572 0.3199 0.2801
## 2 0.8270 0.8150 0.7811 2.3690 156.6523 66.5755 162.5153 7.4314 7.3563 0.2402 0.2090
## 3 0.8350 0.8170 0.782 3.0617 157.1426 67.7238 164.4713 7.6140 7.4756 0.2461 0.2124
## 4 0.8350 0.8110 0.771 5.0000 159.0696 70.0408 167.8640 8.1810 7.9497 0.2644 0.2259
## -------------------------------------------------------------------------------------------------------------------------------
## AIC: Akaike Information Criteria
## SBIC: Sawa's Bayesian Information Criteria
## SBC: Schwarz Bayesian Criteria
## MSEP: Estimated error of prediction, assuming multivariate normality
## FPE: Final Prediction Error
## HSP: Hocking's Sp
## APC: Amemiya Prediction Criteria
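The idea of keeping one best model per subset size can be sketched in base R as follows. The sketch ranks subsets of each size by R-square (for a fixed size this ordering agrees with adjusted R-square, Cp and AIC, since all of them depend only on the residual sum of squares) and then reports AIC from `stats::AIC` so that sizes can be compared. The helper `best_of_size` is hypothetical, and whether olsrr ranks within each size in exactly this way is an assumption.

```r
predictors <- c("disp", "hp", "wt", "qsec")

# For one subset size m, fit every subset and keep the one with the largest R-square
best_of_size <- function(m) {
  candidates <- combn(predictors, m, simplify = FALSE)
  fits <- lapply(candidates, function(vars) {
    lm(reformulate(vars, response = "mpg"), data = mtcars)
  })
  r2 <- vapply(fits, function(f) summary(f)$r.squared, numeric(1))
  best <- which.max(r2)
  data.frame(
    size = m,
    predictors = paste(candidates[[best]], collapse = " "),
    rsquare = r2[best],
    aic = AIC(fits[[best]])
  )
}

best_subsets <- do.call(rbind, lapply(seq_along(predictors), best_of_size))
best_subsets

# Across sizes, one possible choice is the model with the smallest AIC
best_subsets$predictors[which.min(best_subsets$aic)]
```

Different criteria can favour different sizes, so the final choice across sizes remains a judgment call rather than something the sketch decides.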
The plot method displays a panel of fit criteria for the best subset of each size.
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
k <- ols_best_subset(model)
plot(k)
Build a regression model from a set of candidate predictor variables by entering predictors based on p-values, in a stepwise manner, until no remaining variable qualifies for entry. The model passed in should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward(model)
## We are selecting variables based on p value...
## 1 variable(s) added....
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## No more variables satisfy the condition of penter: 0.3
## Forward Selection Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## ------------------------------------------------------------------------------
## Selection Summary
## ------------------------------------------------------------------------------
## Step Variable Entered R-Square Adj. R-Square C(p) AIC RMSE
## ------------------------------------------------------------------------------
## 1 liver_test 0.455 0.444 62.5119 771.8753 296.2992
## 2 alc_heavy 0.567 0.550 41.3681 761.4394 266.6484
## 3 enzyme_test 0.659 0.639 24.3379 750.5089 238.9145
## 4 pindex 0.750 0.730 7.5373 735.7146 206.5835
## 5 bcs 0.781 0.758 3.1925 730.6204 195.4544
## ------------------------------------------------------------------------------
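For readers who want to see the mechanics, forward selection on p-values can be approximated with `add1()` from base R. This is a sketch under stated assumptions, not olsrr's implementation: the function name `forward_p` is hypothetical, the entry threshold `penter = 0.3` mirrors the value printed in the output above, and the example uses `mtcars` instead of `surgical` so that it runs without extra packages.

```r
# Hypothetical helper: forward selection using the partial F-test p-value
forward_p <- function(data, response, candidates, penter = 0.3) {
  selected <- character(0)
  # Start from the intercept-only model
  fit <- lm(reformulate("1", response = response), data = data)
  repeat {
    if (all(candidates %in% selected)) break
    # p-value for adding each remaining candidate on its own
    tests <- add1(fit, scope = reformulate(candidates), test = "F")
    tests <- tests[rownames(tests) != "<none>", , drop = FALSE]
    best <- which.min(tests[["Pr(>F)"]])
    # Stop when even the best candidate fails the entry threshold
    if (tests[["Pr(>F)"]][best] > penter) break
    term <- rownames(tests)[best]
    selected <- c(selected, term)
    fit <- update(fit, as.formula(paste(". ~ . +", term)))
  }
  list(selected = selected, fit = fit)
}

res <- forward_p(mtcars, "mpg", c("disp", "hp", "wt", "qsec"))
res$selected
```

Variables are never removed once entered, which is why a term such as liver_test can end up with a large p-value in the final model of the surgical example.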
model <- lm(y ~ ., data = surgical)
k <- ols_step_forward(model)
## We are selecting variables based on p value...
## 1 variable(s) added....
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## No more variables satisfy the condition of penter: 0.3
plot(k)
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward(model, details = TRUE)
## We are selecting variables based on p value...
## 1 variable(s) added....
## Variable Selection Procedure
## Dependent Variable: y
##
## Forward Selection: Step 1
##
## Variable liver_test Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.675 RMSE 296.299
## R-Squared 0.455 Coef. Var 42.202
## Adj. R-Squared 0.444 MSE 87793.232
## Pred R-Squared 0.386 MAE 212.857
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 3804272.477 1 3804272.477 43.332 0.0000
## Residual 4565248.060 52 87793.232
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## -------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -------------------------------------------------------------------------------------------
## (Intercept) 15.191 111.869 0.136 0.893 -209.290 239.671
## liver_test 250.305 38.025 0.674 6.583 0.000 174.003 326.607
## -------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Forward Selection: Step 2
##
## Variable alc_heavy Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.753 RMSE 266.648
## R-Squared 0.567 Coef. Var 37.979
## Adj. R-Squared 0.55 MSE 71101.387
## Pred R-Squared 0.487 MAE 187.393
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 4743349.776 2 2371674.888 33.356 0.0000
## Residual 3626170.761 51 71101.387
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## --------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## --------------------------------------------------------------------------------------------
## (Intercept) -5.069 100.828 -0.050 0.960 -207.490 197.352
## liver_test 234.597 34.491 0.632 6.802 0.000 165.353 303.841
## alc_heavy 342.183 94.156 0.338 3.634 0.001 153.157 531.208
## --------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Forward Selection: Step 3
##
## Variable enzyme_test Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.812 RMSE 238.914
## R-Squared 0.659 Coef. Var 34.029
## Adj. R-Squared 0.639 MSE 57080.128
## Pred R-Squared 0.567 MAE 170.603
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 5515514.136 3 1838504.712 32.209 0.0000
## Residual 2854006.401 50 57080.128
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ---------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ---------------------------------------------------------------------------------------------
## (Intercept) -344.559 129.156 -2.668 0.010 -603.976 -85.141
## liver_test 183.844 33.845 0.495 5.432 0.000 115.865 251.823
## alc_heavy 319.662 84.585 0.315 3.779 0.000 149.769 489.555
## enzyme_test 6.263 1.703 0.335 3.678 0.001 2.843 9.683
## ---------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Forward Selection: Step 4
##
## Variable pindex Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.866 RMSE 206.584
## R-Squared 0.75 Coef. Var 29.424
## Adj. R-Squared 0.73 MSE 42676.744
## Pred R-Squared 0.669 MAE 146.473
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6278360.060 4 1569590.015 36.779 0.0000
## Residual 2091160.477 49 42676.744
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## -----------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -----------------------------------------------------------------------------------------------
## (Intercept) -789.012 153.372 -5.144 0.000 -1097.226 -480.799
## liver_test 125.474 32.358 0.338 3.878 0.000 60.448 190.499
## alc_heavy 359.875 73.754 0.355 4.879 0.000 211.660 508.089
## enzyme_test 7.548 1.503 0.404 5.020 0.000 4.527 10.569
## pindex 7.876 1.863 0.335 4.228 0.000 4.133 11.620
## -----------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Forward Selection: Step 5
##
## Variable bcs Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 195.454
## R-Squared 0.781 Coef. Var 27.839
## Adj. R-Squared 0.758 MSE 38202.426
## Pred R-Squared 0.7 MAE 137.656
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6535804.090 5 1307160.818 34.217 0.0000
## Residual 1833716.447 48 38202.426
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746
## liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779
## alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878
## enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077
## pindex 8.924 1.808 0.38 4.935 0.000 5.288 12.559
## bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230
## ------------------------------------------------------------------------------------------------
## No more variables satisfy the condition of penter: 0.3
## Forward Selection Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## ------------------------------------------------------------------------------
## Selection Summary
## ------------------------------------------------------------------------------
## Step Variable Entered R-Square Adj. R-Square C(p) AIC RMSE
## ------------------------------------------------------------------------------
## 1 liver_test 0.455 0.444 62.5119 771.8753 296.2992
## 2 alc_heavy 0.567 0.550 41.3681 761.4394 266.6484
## 3 enzyme_test 0.659 0.639 24.3379 750.5089 238.9145
## 4 pindex 0.750 0.730 7.5373 735.7146 206.5835
## 5 bcs 0.781 0.758 3.1925 730.6204 195.4544
## ------------------------------------------------------------------------------
Build a regression model from a set of candidate predictor variables by removing predictors based on p-values, in a stepwise manner, until no remaining variable qualifies for removal. The model passed in should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
ols_step_backward(model)
## We are eliminating variables based on p value...
## No more variables satisfy the condition of prem: 0.3
## Backward Elimination Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## --------------------------------------------------------------------------
## Elimination Summary
## --------------------------------------------------------------------------
## Step Variable Removed R-Square Adj. R-Square C(p) AIC RMSE
## --------------------------------------------------------------------------
## 1 alc_mod 0.782 0.749 7.0141 734.4068 199.2637
## 2 gender 0.781 0.754 5.0870 732.4942 197.2921
## 3 age 0.781 0.758 3.1925 730.6204 195.4544
## --------------------------------------------------------------------------
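Backward elimination can be sketched in the same spirit with `drop1()` from base R. Again this is only an illustration under assumptions: `backward_p` is a hypothetical name, the removal threshold `prem = 0.3` mirrors the value printed in the output above, and `mtcars` stands in for `surgical` so the code runs with base R alone.

```r
# Hypothetical helper: backward elimination using the partial F-test p-value
backward_p <- function(data, response, candidates, prem = 0.3) {
  removed <- character(0)
  # Start from the model containing every candidate predictor
  fit <- lm(reformulate(candidates, response = response), data = data)
  repeat {
    if (length(attr(terms(fit), "term.labels")) == 0) break
    # p-value for dropping each term currently in the model
    tests <- drop1(fit, test = "F")
    tests <- tests[rownames(tests) != "<none>", , drop = FALSE]
    worst <- which.max(tests[["Pr(>F)"]])
    # Stop when every remaining term is significant at the removal threshold
    if (tests[["Pr(>F)"]][worst] <= prem) break
    term <- rownames(tests)[worst]
    removed <- c(removed, term)
    fit <- update(fit, as.formula(paste(". ~ . -", term)))
  }
  list(removed = removed, fit = fit)
}

res <- backward_p(mtcars, "mpg", c("disp", "hp", "wt", "qsec"))
res$removed
```

The least significant term is removed one at a time and the model is refitted after each removal, so p-values are always judged against the current model.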
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward(model)
## We are eliminating variables based on p value...
## No more variables satisfy the condition of prem: 0.3
plot(k)
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
ols_step_backward(model, details = TRUE)
## We are eliminating variables based on p value...
## Backward Elimination: Step 1
##
## Variable alc_mod Removed
##
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 199.264
## R-Squared 0.782 Coef. Var 28.381
## Adj. R-Squared 0.749 MSE 39706.04
## Pred R-Squared 0.678 MAE 137.053
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6543042.709 7 934720.387 23.541 0.0000
## Residual 1826477.828 46 39706.040
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1145.971 238.536 -4.804 0.000 -1626.119 -665.822
## bcs 62.274 24.187 0.251 2.575 0.013 13.589 110.959
## pindex 8.987 1.850 0.382 4.857 0.000 5.262 12.711
## enzyme_test 9.875 1.720 0.528 5.743 0.000 6.414 13.337
## liver_test 50.763 44.379 0.137 1.144 0.259 -38.567 140.093
## age -0.911 2.599 -0.025 -0.351 0.728 -6.142 4.320
## gender 15.786 57.840 0.02 0.273 0.786 -100.639 132.212
## alc_heavy 315.854 73.849 0.312 4.277 0.000 167.202 464.505
## ------------------------------------------------------------------------------------------------
##
##
## Backward Elimination: Step 2
##
## Variable gender Removed
##
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 197.292
## R-Squared 0.781 Coef. Var 28.101
## Adj. R-Squared 0.754 MSE 38924.162
## Pred R-Squared 0.692 MAE 138.16
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6540084.920 6 1090014.153 28.004 0.0000
## Residual 1829435.617 47 38924.162
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1143.080 235.943 -4.845 0.000 -1617.737 -668.424
## bcs 61.424 23.748 0.248 2.586 0.013 13.649 109.199
## pindex 8.974 1.832 0.382 4.900 0.000 5.290 12.659
## enzyme_test 9.852 1.700 0.527 5.794 0.000 6.431 13.273
## liver_test 54.053 42.288 0.146 1.278 0.207 -31.019 139.125
## age -0.850 2.563 -0.024 -0.332 0.742 -6.007 4.307
## alc_heavy 314.585 72.974 0.31 4.311 0.000 167.781 461.390
## ------------------------------------------------------------------------------------------------
##
##
## Backward Elimination: Step 3
##
## Variable age Removed
##
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 195.454
## R-Squared 0.781 Coef. Var 27.839
## Adj. R-Squared 0.758 MSE 38202.426
## Pred R-Squared 0.7 MAE 137.656
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6535804.090 5 1307160.818 34.217 0.0000
## Residual 1833716.447 48 38202.426
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746
## bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230
## pindex 8.924 1.808 0.38 4.935 0.000 5.288 12.559
## enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077
## liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779
## alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878
## ------------------------------------------------------------------------------------------------
## No more variables satisfy the condition of prem: 0.3
## Backward Elimination Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## --------------------------------------------------------------------------
## Elimination Summary
## --------------------------------------------------------------------------
## Step Variable Removed R-Square Adj. R-Square C(p) AIC RMSE
## --------------------------------------------------------------------------
## 1 alc_mod 0.782 0.749 7.0141 734.4068 199.2637
## 2 gender 0.781 0.754 5.0870 732.4942 197.2921
## 3 age 0.781 0.758 3.1925 730.6204 195.4544
## --------------------------------------------------------------------------
Build a regression model from a set of candidate predictor variables by entering and removing predictors based on p-values, in a stepwise manner, until no variable qualifies for entry or removal. The model passed in should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise regression
model <- lm(y ~ ., data = surgical)
ols_stepwise(model)
## We are selecting variables based on p value...
## 1 variable(s) added....
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## No more variables to be added or removed.
## Stepwise Selection Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## ------------------------------------------------------------------------------------------
## Stepwise Selection Summary
## ------------------------------------------------------------------------------------------
## Step Variable Added/Removed R-Square Adj. R-Square C(p) AIC RMSE
## ------------------------------------------------------------------------------------------
## 1 liver_test addition 0.455 0.444 62.5119 771.8753 296.2992
## 2 alc_heavy addition 0.567 0.550 41.3681 761.4394 266.6484
## 3 enzyme_test addition 0.659 0.639 24.3379 750.5089 238.9145
## 4 pindex addition 0.750 0.730 7.5373 735.7146 206.5835
## 5 bcs addition 0.781 0.758 3.1925 730.6204 195.4544
## ------------------------------------------------------------------------------------------
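Combining the two directions gives a sketch of stepwise selection: each pass first tries to enter the best candidate, then tries to remove the weakest term. The function name `stepwise_p`, the thresholds `penter = 0.1` and `prem = 0.3`, and the `max_steps` safeguard are all assumptions made for this illustration; it uses base R's `add1()` and `drop1()` on `mtcars` and is not olsrr's implementation.

```r
# Hypothetical helper: stepwise selection with entry and removal thresholds
stepwise_p <- function(data, response, candidates,
                       penter = 0.1, prem = 0.3, max_steps = 50) {
  fit <- lm(reformulate("1", response = response), data = data)
  history <- character(0)
  for (step in seq_len(max_steps)) {
    changed <- FALSE
    current <- attr(terms(fit), "term.labels")
    # Entry: add the candidate with the smallest p-value, if it passes penter
    if (!all(candidates %in% current)) {
      add_tests <- add1(fit, scope = reformulate(candidates), test = "F")
      add_tests <- add_tests[rownames(add_tests) != "<none>", , drop = FALSE]
      best <- which.min(add_tests[["Pr(>F)"]])
      if (add_tests[["Pr(>F)"]][best] <= penter) {
        term <- rownames(add_tests)[best]
        fit <- update(fit, as.formula(paste(". ~ . +", term)))
        history <- c(history, paste("add", term))
        changed <- TRUE
      }
    }
    # Removal: drop the term with the largest p-value, if it exceeds prem
    current <- attr(terms(fit), "term.labels")
    if (length(current) > 0) {
      drop_tests <- drop1(fit, test = "F")
      drop_tests <- drop_tests[rownames(drop_tests) != "<none>", , drop = FALSE]
      worst <- which.max(drop_tests[["Pr(>F)"]])
      if (drop_tests[["Pr(>F)"]][worst] > prem) {
        term <- rownames(drop_tests)[worst]
        fit <- update(fit, as.formula(paste(". ~ . -", term)))
        history <- c(history, paste("remove", term))
        changed <- TRUE
      }
    }
    # Stop once a full pass neither adds nor removes a term
    if (!changed) break
  }
  list(fit = fit, history = history)
}

res <- stepwise_p(mtcars, "mpg", c("disp", "hp", "wt", "qsec"))
res$history
```

Keeping the entry threshold no larger than the removal threshold avoids a variable being added and immediately dropped again; the step cap is only a guard against cycling.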
model <- lm(y ~ ., data = surgical)
k <- ols_stepwise(model)
## We are selecting variables based on p value...
## 1 variable(s) added....
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## 1 variable(s) added...
## No more variables to be added or removed.
plot(k)
# stepwise regression
model <- lm(y ~ ., data = surgical)
ols_stepwise(model, details = TRUE)
## We are selecting variables based on p value...
## 1 variable(s) added....
## Variable Selection Procedure
## Dependent Variable: y
##
## Stepwise Selection: Step 1
##
## Variable liver_test Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.675 RMSE 296.299
## R-Squared 0.455 Coef. Var 42.202
## Adj. R-Squared 0.444 MSE 87793.232
## Pred R-Squared 0.386 MAE 212.857
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 3804272.477 1 3804272.477 43.332 0.0000
## Residual 4565248.060 52 87793.232
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## -------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -------------------------------------------------------------------------------------------
## (Intercept) 15.191 111.869 0.136 0.893 -209.290 239.671
## liver_test 250.305 38.025 0.674 6.583 0.000 174.003 326.607
## -------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Stepwise Selection: Step 2
##
## Variable alc_heavy Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.753 RMSE 266.648
## R-Squared 0.567 Coef. Var 37.979
## Adj. R-Squared 0.55 MSE 71101.387
## Pred R-Squared 0.487 MAE 187.393
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 4743349.776 2 2371674.888 33.356 0.0000
## Residual 3626170.761 51 71101.387
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## --------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## --------------------------------------------------------------------------------------------
## (Intercept) -5.069 100.828 -0.050 0.960 -207.490 197.352
## liver_test 234.597 34.491 0.632 6.802 0.000 165.353 303.841
## alc_heavy 342.183 94.156 0.338 3.634 0.001 153.157 531.208
## --------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Stepwise Selection: Step 3
##
## Variable enzyme_test Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.812 RMSE 238.914
## R-Squared 0.659 Coef. Var 34.029
## Adj. R-Squared 0.639 MSE 57080.128
## Pred R-Squared 0.567 MAE 170.603
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 5515514.136 3 1838504.712 32.209 0.0000
## Residual 2854006.401 50 57080.128
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ---------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ---------------------------------------------------------------------------------------------
## (Intercept) -344.559 129.156 -2.668 0.010 -603.976 -85.141
## liver_test 183.844 33.845 0.495 5.432 0.000 115.865 251.823
## alc_heavy 319.662 84.585 0.315 3.779 0.000 149.769 489.555
## enzyme_test 6.263 1.703 0.335 3.678 0.001 2.843 9.683
## ---------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Stepwise Selection: Step 4
##
## Variable pindex Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.866 RMSE 206.584
## R-Squared 0.75 Coef. Var 29.424
## Adj. R-Squared 0.73 MSE 42676.744
## Pred R-Squared 0.669 MAE 146.473
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6278360.060 4 1569590.015 36.779 0.0000
## Residual 2091160.477 49 42676.744
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## -----------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## -----------------------------------------------------------------------------------------------
## (Intercept) -789.012 153.372 -5.144 0.000 -1097.226 -480.799
## liver_test 125.474 32.358 0.338 3.878 0.000 60.448 190.499
## alc_heavy 359.875 73.754 0.355 4.879 0.000 211.660 508.089
## enzyme_test 7.548 1.503 0.404 5.020 0.000 4.527 10.569
## pindex 7.876 1.863 0.335 4.228 0.000 4.133 11.620
## -----------------------------------------------------------------------------------------------
## 1 variable(s) added...
## Stepwise Selection: Step 5
##
## Variable bcs Entered
##
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 195.454
## R-Squared 0.781 Coef. Var 27.839
## Adj. R-Squared 0.758 MSE 38202.426
## Pred R-Squared 0.7 MAE 137.656
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6535804.090 5 1307160.818 34.217 0.0000
## Residual 1833716.447 48 38202.426
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746
## liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779
## alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878
## enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077
## pindex 8.924 1.808 0.38 4.935 0.000 5.288 12.559
## bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230
## ------------------------------------------------------------------------------------------------
## No more variables to be added or removed.
## Stepwise Selection Method
##
## Candidate Terms:
##
## 1 . bcs
## 2 . pindex
## 3 . enzyme_test
## 4 . liver_test
## 5 . age
## 6 . gender
## 7 . alc_mod
## 8 . alc_heavy
##
## ------------------------------------------------------------------------------------------
## Stepwise Selection Summary
## ------------------------------------------------------------------------------------------
## Step Variable Added/Removed R-Square Adj. R-Square C(p) AIC RMSE
## ------------------------------------------------------------------------------------------
## 1 liver_test addition 0.455 0.444 62.5119 771.8753 296.2992
## 2 alc_heavy addition 0.567 0.550 41.3681 761.4394 266.6484
## 3 enzyme_test addition 0.659 0.639 24.3379 750.5089 238.9145
## 4 pindex addition 0.750 0.730 7.5373 735.7146 206.5835
## 5 bcs addition 0.781 0.758 3.1925 730.6204 195.4544
## ------------------------------------------------------------------------------------------
Build a regression model from a set of candidate predictor variables by entering predictors based on the Akaike Information Criterion (AIC), in a stepwise manner, until no remaining variable qualifies for entry. The model passed in should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise aic forward regression
model <- lm(y ~ ., data = surgical)
ols_stepaic_forward(model)
## ---------------------------------------------------------------------------
## Variable AIC Sum Sq RSS R-Sq Adj. R-Sq
## ---------------------------------------------------------------------------
## liver_test 771.8753 3804272.477 4565248.06 0.455 0.444
## alc_heavy 761.4394 4743349.776 3626170.761 0.567 0.55
## enzyme_test 750.5089 5515514.136 2854006.401 0.659 0.639
## pindex 735.7146 6278360.06 2091160.477 0.75 0.73
## bcs 730.6204 6535804.09 1833716.447 0.781 0.758
## ---------------------------------------------------------------------------
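AIC-based forward selection can be sketched without p-values at all: at each step, fit every model that adds one remaining candidate, and keep the addition only if it lowers the AIC. The function name `forward_aic` is hypothetical, the stopping rule (stop when no addition lowers the AIC) is an assumption about the procedure, and `mtcars` is used in place of `surgical` so the example needs only base R.

```r
# Hypothetical helper: forward selection that minimises stats::AIC
forward_aic <- function(data, response, candidates) {
  selected <- character(0)
  # AIC of the intercept-only model is the starting reference
  best_aic <- AIC(lm(reformulate("1", response = response), data = data))
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # AIC of each model obtained by adding one remaining candidate
    aics <- vapply(remaining, function(term) {
      AIC(lm(reformulate(c(selected, term), response = response), data = data))
    }, numeric(1))
    # Stop when no single addition improves on the current AIC
    if (min(aics) >= best_aic) break
    selected <- c(selected, names(aics)[which.min(aics)])
    best_aic <- min(aics)
  }
  list(selected = selected, aic = best_aic)
}

res <- forward_aic(mtcars, "mpg", c("disp", "hp", "wt", "qsec"))
res$selected
res$aic
```

Because AIC penalises each extra coefficient, the procedure can stop earlier than p-value-based forward selection with a generous entry threshold.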
model <- lm(y ~ ., data = surgical)
k <- ols_stepaic_forward(model)
plot(k)
# stepwise aic forward regression
model <- lm(y ~ ., data = surgical)
ols_stepaic_forward(model, details = TRUE)
## Step 0: AIC = 802.606
## y ~ 1
##
## ---------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## ---------------------------------------------------------------------------------
## liver_test 1 771.8753 3804272.477 4565248.06 0.455 0.444
## enzyme_test 1 782.6289 2798309.881 5571210.656 0.334 0.322
## pindex 1 794.0997 1479766.753 6889753.784 0.177 0.161
## alc_heavy 1 794.3008 1454057.255 6915463.282 0.174 0.158
## bcs 1 797.6971 1005151.658 7364368.879 0.12 0.103
## alc_mod 1 802.8282 271062.33 8098458.207 0.032 0.014
## gender 1 802.9564 251808.57 8117711.967 0.03 0.011
## age 1 803.8336 118862.559 8250657.978 0.014 -0.005
## ---------------------------------------------------------------------------------
##
##
## Step 1 : AIC = 771.8753
## y ~ liver_test
##
## -------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## -------------------------------------------------------------------------------
## alc_heavy 1 761.4394 939077.299 3626170.761 0.567 0.55
## enzyme_test 1 762.077 896004.331 3669243.729 0.562 0.544
## pindex 1 770.3869 285591.786 4279656.274 0.489 0.469
## alc_mod 1 771.1412 225396.238 4339851.822 0.481 0.461
## gender 1 773.8024 6162.222 4559085.838 0.455 0.434
## age 1 773.8312 3726.297 4561521.763 0.455 0.434
## bcs 1 773.8672 685.255 4564562.805 0.455 0.433
## -------------------------------------------------------------------------------
##
##
## Step 2 : AIC = 761.4394
## y ~ liver_test + alc_heavy
##
## -------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## -------------------------------------------------------------------------------
## enzyme_test 1 750.5089 772164.36 2854006.401 0.659 0.639
## pindex 1 756.125 459358.635 3166812.126 0.622 0.599
## bcs 1 763.0628 25195.588 3600975.173 0.57 0.544
## age 1 763.11 22048.109 3604122.652 0.569 0.544
## alc_mod 1 763.4277 784.551 3625386.21 0.567 0.541
## gender 1 763.4328 443.344 3625727.417 0.567 0.541
## -------------------------------------------------------------------------------
##
##
## Step 3 : AIC = 750.5089
## y ~ liver_test + alc_heavy + enzyme_test
##
## --------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------
## pindex 1 735.7146 762845.924 2091160.477 0.75 0.73
## bcs 1 750.7818 89836.308 2764170.093 0.67 0.643
## alc_mod 1 752.4027 5607.57 2848398.831 0.66 0.632
## age 1 752.4162 4896.081 2849110.32 0.66 0.632
## gender 1 752.5088 5.958 2854000.443 0.659 0.631
## --------------------------------------------------------------------------------
##
##
## Step 4 : AIC = 735.7146
## y ~ liver_test + alc_heavy + enzyme_test + pindex
##
## -------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## -------------------------------------------------------------------------------
## bcs 1 730.6204 257444.03 1833716.447 0.781 0.758
## age 1 737.6804 1325.881 2089834.596 0.75 0.724
## gender 1 737.7123 90.187 2091070.29 0.75 0.724
## alc_mod 1 737.7131 60.62 2091099.857 0.75 0.724
## -------------------------------------------------------------------------------
##
##
## Step 5 : AIC = 730.6204
## y ~ liver_test + alc_heavy + enzyme_test + pindex + bcs
##
## ------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## ------------------------------------------------------------------------------
## age 1 732.4942 4280.83 1829435.617 0.781 0.754
## gender 1 732.5509 2360.288 1831356.159 0.781 0.753
## alc_mod 1 732.614 216.992 1833499.455 0.781 0.753
## ------------------------------------------------------------------------------
## No more variables to be added.
## Model Summary
## -----------------------------------------------------------------
## R 0.884 RMSE 195.454
## R-Squared 0.781 Coef. Var 27.839
## Adj. R-Squared 0.758 MSE 38202.426
## Pred R-Squared 0.7 MAE 137.656
## -----------------------------------------------------------------
## RMSE: Root Mean Square Error
## MSE: Mean Square Error
## MAE: Mean Absolute Error
##
## ANOVA
## -------------------------------------------------------------------
## Sum of
## Squares DF Mean Square F Sig.
## -------------------------------------------------------------------
## Regression 6535804.090 5 1307160.818 34.217 0.0000
## Residual 1833716.447 48 38202.426
## Total 8369520.537 53
## -------------------------------------------------------------------
##
## Parameter Estimates
## ------------------------------------------------------------------------------------------------
## model Beta Std. Error Std. Beta t Sig lower upper
## ------------------------------------------------------------------------------------------------
## (Intercept) -1178.330 208.682 -5.647 0.000 -1597.914 -758.746
## liver_test 58.064 40.144 0.156 1.446 0.155 -22.652 138.779
## alc_heavy 317.848 71.634 0.314 4.437 0.000 173.818 461.878
## enzyme_test 9.748 1.656 0.521 5.887 0.000 6.419 13.077
## pindex 8.924 1.808 0.38 4.935 0.000 5.288 12.559
## bcs 59.864 23.060 0.241 2.596 0.012 13.498 106.230
## ------------------------------------------------------------------------------------------------
## ---------------------------------------------------------------------------
## Variable AIC Sum Sq RSS R-Sq Adj. R-Sq
## ---------------------------------------------------------------------------
## liver_test 771.8753 3804272.477 4565248.06 0.455 0.444
## alc_heavy 761.4394 4743349.776 3626170.761 0.567 0.55
## enzyme_test 750.5089 5515514.136 2854006.401 0.659 0.639
## pindex 735.7146 6278360.06 2091160.477 0.75 0.73
## bcs 730.6204 6535804.09 1833716.447 0.781 0.758
## ---------------------------------------------------------------------------
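The forward procedure traced above can be sketched in base R. This is an illustrative loop, not the package's implementation: it uses the built-in mtcars data so that it runs without extra packages, and the function name forward_aic is hypothetical. At each step it fits every one-variable extension of the current model, keeps the one with the lowest AIC, and stops when no extension lowers the AIC.

```r
# Illustrative forward selection by AIC (hypothetical helper, base R only).
forward_aic <- function(response, candidates, data) {
  selected <- character(0)
  # Start from the intercept-only model
  current <- lm(reformulate("1", response = response), data = data)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # AIC of each model obtained by adding one remaining candidate
    trial_aic <- sapply(remaining, function(v) {
      AIC(lm(reformulate(c(selected, v), response = response), data = data))
    })
    # Stop when the best addition does not lower the AIC
    if (min(trial_aic) >= AIC(current)) break
    best <- names(which.min(trial_aic))
    selected <- c(selected, best)
    current <- lm(reformulate(selected, response = response), data = data)
  }
  list(predictors = selected, model = current)
}

fit <- forward_aic("mpg", c("disp", "hp", "wt", "qsec"), mtcars)
print(fit$predictors)
print(AIC(fit$model))
```

The stopping rule mirrors the last step of the detailed output, where every remaining candidate would raise the AIC and the procedure reports that no more variables are to be added.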
Build a regression model from a set of candidate predictor variables by removing predictors one at a time based on the Akaike Information Criterion (AIC), stopping when no removal lowers the AIC. The model passed to the function should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise aic backward regression
model <- lm(y ~ ., data = surgical)
k <- ols_stepaic_backward(model)
k
##
##
## Backward Elimination Summary
## -------------------------------------------------------------------------
## Variable AIC RSS Sum Sq R-Sq Adj. R-Sq
## -------------------------------------------------------------------------
## Full Model 736.39 1825905.713 6543614.824 0.782 0.743
## alc_mod 734.407 1826477.828 6543042.709 0.782 0.749
## gender 732.494 1829435.617 6540084.920 0.781 0.754
## age 730.62 1833716.447 6535804.090 0.781 0.758
## -------------------------------------------------------------------------
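As a cross-check that does not depend on the package, backward elimination by AIC can also be run with base R's step(). The sketch below uses the built-in mtcars data as a stand-in. Note that step() ranks models with extractAIC(), which for a linear model differs from AIC() only by a constant when the number of observations is fixed, so the order in which variables are dropped is the same even though the printed AIC values differ.

```r
# Backward elimination by AIC with base R (stand-in data: mtcars)
full <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# trace = 0 suppresses the step-by-step printout
reduced <- step(full, direction = "backward", trace = 0)

print(formula(reduced))  # the model that remains after elimination
print(reduced$anova)     # one row per elimination step
```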
### Plot
model <- lm(y ~ ., data = surgical)
k <- ols_stepaic_backward(model)
plot(k)
# stepwise aic backward regression
model <- lm(y ~ ., data = surgical)
ols_stepaic_backward(model, details = TRUE)
## Step 0: AIC = 736.39
## y ~ bcs + pindex + enzyme_test + liver_test + age + gender + alc_mod + alc_heavy
##
## --------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------
## alc_mod 1 734.407 572.115 1826477.828 0.782 0.749
## gender 1 734.478 2990.338 1828896.051 0.781 0.748
## age 1 734.544 5231.108 1831136.821 0.781 0.748
## liver_test 1 735.878 51016.156 1876921.869 0.776 0.742
## bcs 1 741.677 263780.393 2089686.106 0.75 0.712
## alc_heavy 1 749.21 576636.222 2402541.935 0.713 0.669
## pindex 1 756.624 930187.311 2756093.024 0.671 0.621
## enzyme_test 1 763.557 1307756.931 3133662.644 0.626 0.569
## --------------------------------------------------------------------------------
##
## Step 1 : AIC = 734.407
## y ~ bcs + pindex + enzyme_test + liver_test + age + gender + alc_heavy
##
## --------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------
## gender 1 732.494 2957.789 1829435.617 0.781 0.754
## age 1 732.551 4878.331 1831356.159 0.781 0.753
## liver_test 1 733.921 51951.343 1878429.171 0.776 0.747
## bcs 1 739.677 263219.094 2089696.922 0.75 0.718
## alc_heavy 1 750.486 726328.685 2552806.513 0.695 0.656
## pindex 1 754.759 936543.762 2763021.59 0.67 0.628
## enzyme_test 1 761.596 1309433.006 3135910.834 0.625 0.577
## --------------------------------------------------------------------------------
##
## Step 2 : AIC = 732.494
## y ~ bcs + pindex + enzyme_test + liver_test + age + alc_heavy
##
## --------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------
## age 1 730.62 4280.83 1833716.447 0.781 0.758
## liver_test 1 732.34 63596.19 1893031.807 0.774 0.75
## bcs 1 737.68 260398.979 2089834.596 0.75 0.724
## alc_heavy 1 748.486 723371.473 2552807.09 0.695 0.663
## pindex 1 752.777 934511.071 2763946.688 0.67 0.635
## enzyme_test 1 759.596 1306482.666 3135918.283 0.625 0.586
## --------------------------------------------------------------------------------
##
## Step 3 : AIC = 730.62
## y ~ bcs + pindex + enzyme_test + liver_test + alc_heavy
##
## --------------------------------------------------------------------------------
## Variable DF AIC Sum Sq RSS R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------
## liver_test 1 730.924 79919.825 1913636.272 0.771 0.753
## bcs 1 735.715 257444.03 2091160.477 0.75 0.73
## alc_heavy 1 747.181 752122.827 2585839.274 0.691 0.666
## pindex 1 750.782 930453.646 2764170.093 0.67 0.643
## enzyme_test 1 757.971 1324076.125 3157792.572 0.623 0.592
## --------------------------------------------------------------------------------
## No more variables to be removed.
##
##
## Backward Elimination Summary
## -------------------------------------------------------------------------
## Variable AIC RSS Sum Sq R-Sq Adj. R-Sq
## -------------------------------------------------------------------------
## Full Model 736.39 1825905.713 6543614.824 0.782 0.743
## alc_mod 734.407 1826477.828 6543042.709 0.782 0.749
## gender 732.494 1829435.617 6540084.920 0.781 0.754
## age 730.62 1833716.447 6535804.090 0.781 0.758
## -------------------------------------------------------------------------
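The R-Sq and Adj. R-Sq columns of the summary follow from the RSS column. In each row, RSS and Sum Sq add up to the total sum of squares (8369520.537 in the ANOVA table), so R-Sq = 1 - RSS/TSS and Adj. R-Sq = 1 - (1 - R-Sq)(n - 1)/(n - k - 1), with k the number of predictors. A short check, assuming n = 54 as implied by the 53 total degrees of freedom; the helper names are hypothetical:

```r
tss <- 8369520.537  # total sum of squares from the ANOVA table
n   <- 54           # assumption: Total DF = 53 implies 54 observations

# Hypothetical helpers for the two fit measures
r2_from_rss <- function(rss, tss) 1 - rss / tss
adj_r2      <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Full model: 8 predictors
full_r2  <- r2_from_rss(1825905.713, tss)
full_adj <- adj_r2(full_r2, n, k = 8)

# Final model after removing alc_mod, gender and age: 5 predictors
final_r2  <- r2_from_rss(1833716.447, tss)
final_adj <- adj_r2(final_r2, n, k = 5)

print(round(c(full_r2, full_adj, final_r2, final_adj), 3))
```

R-Sq barely changes across the three removals while Adj. R-Sq rises, because the adjusted measure penalizes the extra predictors.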
Build a regression model from a set of candidate predictor variables by entering and removing predictors based on the Akaike Information Criterion (AIC), in a stepwise manner, stopping when no addition or removal lowers the AIC. The model passed to the function should include all the candidate predictor variables. If details is set to TRUE, each step is displayed.
# stepwise aic regression
model <- lm(y ~ ., data = surgical)
ols_stepaic_both(model)
## No more variables to be added or removed.
##
##
## Stepwise Summary
## --------------------------------------------------------------------------------------
## Variable Method AIC RSS Sum Sq R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------------
## liver_test addition 771.875 4565248.060 3804272.477 0.455 0.444
## alc_heavy addition 761.439 3626170.761 4743349.776 0.567 0.55
## enzyme_test addition 750.509 2854006.401 5515514.136 0.659 0.639
## pindex addition 735.715 2091160.477 6278360.060 0.75 0.73
## bcs addition 730.62 1833716.447 6535804.090 0.781 0.758
## --------------------------------------------------------------------------------------
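For comparison, the same add-or-remove search can be sketched with base R's step() and direction = "both". The example uses the built-in mtcars data as a stand-in and starts from the intercept-only model; the scope argument names the smallest and largest models the search may visit. This illustrates the technique and is not a description of how the package implements it.

```r
# Stepwise (add or remove) selection by AIC with base R (stand-in data: mtcars)
null <- lm(mpg ~ 1, data = mtcars)

both <- step(
  null,
  scope = list(lower = ~ 1, upper = ~ disp + hp + wt + qsec),
  direction = "both",
  trace = 0  # suppress the step-by-step printout
)

print(formula(both))
print(both$anova)  # one row per addition or removal
```

In the summary above every step is an addition; a removal appears only when dropping an already entered variable lowers the AIC more than any available addition.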
model <- lm(y ~ ., data = surgical)
k <- ols_stepaic_both(model)
## No more variables to be added or removed.
plot(k)
# stepwise aic regression
model <- lm(y ~ ., data = surgical)
ols_stepaic_both(model, details = TRUE)
## Step 0: AIC = 802.606
## y ~ 1
##
##
##
## Step 1 : AIC = 771.8753
## y ~ liver_test
##
##
##
## Step 2 : AIC = 761.4394
## y ~ liver_test + alc_heavy
##
##
##
## Step 3 : AIC = 750.5089
## y ~ liver_test + alc_heavy + enzyme_test
##
##
##
## Step 4 : AIC = 735.7146
## y ~ liver_test + alc_heavy + enzyme_test + pindex
##
##
##
## Step 5 : AIC = 730.6204
## y ~ liver_test + alc_heavy + enzyme_test + pindex + bcs
## No more variables to be added or removed.
##
##
## Stepwise Summary
## --------------------------------------------------------------------------------------
## Variable Method AIC RSS Sum Sq R-Sq Adj. R-Sq
## --------------------------------------------------------------------------------------
## liver_test addition 771.875 4565248.060 3804272.477 0.455 0.444
## alc_heavy addition 761.439 3626170.761 4743349.776 0.567 0.55
## enzyme_test addition 750.509 2854006.401 5515514.136 0.659 0.639
## pindex addition 735.715 2091160.477 6278360.060 0.75 0.73
## bcs addition 730.62 1833716.447 6535804.090 0.781 0.758
## --------------------------------------------------------------------------------------