Fitting Multinomial Logistic Regression Models to Large Data Sets with the Divide and Recombine Approach
drglm_multinom.Rmd
Using the function drglm.multinom(), multinomial logistic regression models can be fitted to large data sets.
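The divide and recombine idea named in the title can be illustrated with a minimal sketch: split the rows into k chunks, fit a multinomial model to each chunk, and pool the chunk-level estimates. The sketch below fits each chunk with nnet::multinom() and pools the coefficients by per-coefficient inverse-variance weighting. Both choices are assumptions made for illustration, as are the toy data and the names toy, chunk_id, fits, pooled_est and pooled_se; the recombination rule that drglm.multinom() applies internally may differ.

```r
library(nnet)  # multinom(); nnet is a recommended package shipped with R

set.seed(1)
n <- 3000
k <- 3  # number of chunks (illustrative value)

# Toy data: the response is generated independently of the predictors
toy <- data.frame(
  x = rnorm(n),
  g = factor(sample(c("0", "1"), n, replace = TRUE)),
  y = factor(sample(c("0", "1", "2"), n, replace = TRUE))
)

# Divide: assign the rows to k chunks of equal size
chunk_id <- rep(seq_len(k), length.out = n)

# Fit one multinomial model per chunk and keep its summary,
# which holds the coefficient and standard error matrices
fits <- lapply(split(toy, chunk_id), function(d) {
  summary(multinom(y ~ x + g, data = d, trace = FALSE))
})

# Recombine (assumed rule): weight each chunk estimate by 1 / SE^2
w     <- lapply(fits, function(s) 1 / s$standard.errors^2)
w_sum <- Reduce(`+`, w)
pooled_est <- Reduce(`+`, Map(function(s, wk) wk * s$coefficients, fits, w)) / w_sum
pooled_se  <- sqrt(1 / w_sum)

pooled_est
pooled_se
```

Because the weights add up across chunks, each pooled standard error is smaller than the corresponding standard error from any single chunk, which is the reason for recombining rather than reporting one chunk's fit.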
#Generating a Data Set
set.seed(123)
#Number of rows to be generated
n <- 1000000
#creating dataset
dataset <- data.frame(
  Var_1 = round(rnorm(n, mean = 50, sd = 10)),
  Var_2 = round(rnorm(n, mean = 7.5, sd = 2.1)),
  Var_3 = as.factor(sample(c("0", "1"), n, replace = TRUE)),
  Var_4 = as.factor(sample(c("0", "1", "2"), n, replace = TRUE)),
  Var_5 = as.factor(sample(0:15, n, replace = TRUE)),
  Var_6 = round(rnorm(n, mean = 60, sd = 5))
)
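This data set contains six variables. Three of them (Var_1, Var_2 and Var_6) are numeric, generated from normal distributions and rounded to whole numbers. The other three are categorical factors: Var_3 with two levels, Var_4 with three levels, and Var_5 with 16 levels (0 to 15). We fit a multinomial logistic regression model to this data set below.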
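The variable types can be checked directly. The sketch below rebuilds the same columns with a much smaller number of rows (an assumption made only to keep the check quick; n_small, small, col_class and n_levels are illustrative names) and reports the class of each column and the number of levels of each factor.

```r
set.seed(123)
n_small <- 1000  # illustrative; the data set above uses n = 1000000

small <- data.frame(
  Var_1 = round(rnorm(n_small, mean = 50, sd = 10)),
  Var_2 = round(rnorm(n_small, mean = 7.5, sd = 2.1)),
  Var_3 = as.factor(sample(c("0", "1"), n_small, replace = TRUE)),
  Var_4 = as.factor(sample(c("0", "1", "2"), n_small, replace = TRUE)),
  Var_5 = as.factor(sample(0:15, n_small, replace = TRUE)),
  Var_6 = round(rnorm(n_small, mean = 60, sd = 5))
)

# Class of each column: "numeric" for the rounded normal draws, "factor" otherwise
col_class <- sapply(small, class)

# Number of levels of each factor column
n_levels <- sapply(small[col_class == "factor"], nlevels)

col_class
n_levels
```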
Fitting Multinomial Logistic Regression Model
Now we fit a multinomial logistic regression model to the data set, treating Var_4 as the response variable and all other variables as predictors.
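For reference, a multinomial logit model for a three-level response is usually written relative to a baseline category. The notation below is not taken from the vignette; it assumes that the first factor level, "0", serves as the baseline, which is the default behaviour of nnet::multinom().

```latex
\log \frac{P(\mathrm{Var\_4} = j \mid x)}{P(\mathrm{Var\_4} = 0 \mid x)} = x^{\top} \beta_j,
\qquad j = 1, 2
```

Under that reading, the model has two coefficient vectors, one for level "1" and one for level "2", each containing an intercept and one coefficient per predictor column.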
# k = 10: the data are fitted in ten chunks (one trace block per chunk below)
mmodel <- drglm::drglm.multinom(Var_4 ~ Var_1 + Var_2 + Var_3 + Var_5 + Var_6,
                                data = dataset, k = 10)
## # weights: 63 (40 variable)
## initial value 109861.228867
## final value 109861.228162
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109842.503510
## iter 20 value 109840.273128
## final value 109838.002508
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109850.296686
## iter 20 value 109846.528490
## final value 109842.945823
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109847.393856
## iter 20 value 109841.079169
## final value 109840.175418
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109842.805655
## iter 20 value 109840.979230
## iter 30 value 109838.911934
## final value 109838.864166
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109841.472994
## iter 20 value 109839.598647
## final value 109837.733262
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109851.271296
## iter 20 value 109846.660324
## iter 30 value 109839.769091
## iter 40 value 109838.903624
## iter 40 value 109838.903182
## iter 40 value 109838.903178
## final value 109838.903178
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109840.806578
## iter 20 value 109837.263429
## final value 109834.528438
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109850.031314
## iter 20 value 109849.169972
## final value 109846.685488
## converged
## # weights: 63 (40 variable)
## initial value 109861.228867
## iter 10 value 109848.501910
## iter 20 value 109846.077070
## final value 109845.048526
## converged
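The trace above can be cross-checked with a little arithmetic. There are ten blocks, one per chunk, and every block starts from the same initial value. If a fit starts with all weights at zero, each of the three classes has probability 1/3, so the initial negative log-likelihood of a chunk with m rows is m log(3). The sketch below assumes equal chunks of n / k rows and also reproduces the "63 (40 variable)" weight count from the 20 coefficients per class; the variable names are illustrative, and the reading of the 63 slots is an interpretation rather than something stated in the source.

```r
n <- 1000000  # rows in the full data set
k <- 10       # number of chunks passed to drglm.multinom()
m <- n / k    # assumed rows per chunk

# Negative log-likelihood when all three classes have probability 1/3
initial_value <- m * log(3)
round(initial_value, 6)

# Coefficients per class: intercept, Var_1, Var_2, one dummy for Var_3,
# fifteen dummies for Var_5, and Var_6
coef_per_class <- 1 + 1 + 1 + 1 + 15 + 1

# Free parameters: two non-baseline classes
free_weights <- 2 * coef_per_class

# Total slots reported in the trace, read here as (coefficients + 1) per class
total_weights <- 3 * (coef_per_class + 1)

c(free = free_weights, total = total_weights)
```

The computed initial value agrees with the 109861.228867 printed at the start of every block, which supports the reading that each chunk holds 100000 rows.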
#Output
print(mmodel)
## Estimate.1 Estimate.2 standard error.1 standard error.2
## (Intercept) 4.081904e-02 2.071676e-03 0.0344340641 0.0344561368
## Var_1 -9.984185e-05 1.415146e-05 0.0002448509 0.0002449696
## Var_2 1.402186e-03 2.012445e-04 0.0011559414 0.0011565234
## Var_31 -1.835696e-03 -5.230905e-05 0.0048983192 0.0049007854
## Var_51 -2.570995e-03 6.045345e-03 0.0138744940 0.0138774417
## Var_52 2.589983e-03 7.659461e-03 0.0138717808 0.0138809960
## Var_53 -4.951806e-03 -1.604007e-02 0.0138678622 0.0139049681
## Var_54 1.456459e-03 1.530690e-02 0.0138888131 0.0138830238
## Var_55 -2.225580e-02 -2.838295e-02 0.0138490644 0.0138778135
## Var_56 -1.001576e-02 -1.472764e-02 0.0138454752 0.0138710823
## Var_57 3.229535e-03 -1.157117e-03 0.0138747222 0.0139001413
## Var_58 2.181392e-05 -1.234939e-03 0.0138865421 0.0139071506
## Var_59 -1.823170e-02 -1.626911e-02 0.0138691698 0.0138837951
## Var_510 -1.050656e-02 -1.295762e-02 0.0138664395 0.0138884550
## Var_511 -1.114918e-02 6.444328e-03 0.0138833413 0.0138709237
## Var_512 -5.482693e-03 1.265131e-03 0.0138716161 0.0138773671
## Var_513 -1.979504e-02 -2.113650e-02 0.0138717368 0.0138919857
## Var_514 -3.300604e-02 -1.611510e-02 0.0138574025 0.0138463110
## Var_515 -8.855361e-03 3.537469e-03 0.0138783001 0.0138751272
## Var_6 -6.124825e-04 -1.379973e-05 0.0004887467 0.0004889809
## z value.1 z value.2 Pr(>|z|).1 Pr(>|z|).2 95% lower CI.1
## (Intercept) 1.185426192 0.06012503 0.23584898 0.95205606 -0.0266704840
## Var_1 -0.407765880 0.05776822 0.68344557 0.95393325 -0.0005797408
## Var_2 1.213025485 0.17400812 0.22512008 0.86185908 -0.0008634171
## Var_31 -0.374760305 -0.01067361 0.70783874 0.99148386 -0.0114362249
## Var_51 -0.185303723 0.43562392 0.85299082 0.66310961 -0.0297645039
## Var_52 0.186708754 0.55179480 0.85188899 0.58108895 -0.0245982079
## Var_53 -0.357070594 -1.15354944 0.72103896 0.24868494 -0.0321323162
## Var_54 0.104865617 1.10256256 0.91648244 0.27021717 -0.0257651146
## Var_55 -1.607025597 -2.04520340 0.10804875 0.04083481 -0.0493994683
## Var_56 -0.723395604 -1.06175109 0.46943687 0.28834870 -0.0371523886
## Var_57 0.232763905 -0.08324495 0.81594474 0.93365677 -0.0239644213
## Var_58 0.001570867 -0.08879882 0.99874663 0.92924180 -0.0271953084
## Var_59 -1.314548921 -1.17180582 0.18866155 0.24127502 -0.0454147755
## Var_510 -0.757696897 -0.93297782 0.44863246 0.35083142 -0.0376842803
## Var_511 -0.803061741 0.46459254 0.42193905 0.64222327 -0.0383600292
## Var_512 -0.395245419 0.09116504 0.69266178 0.92736145 -0.0326705607
## Var_513 -1.427005454 -1.52148900 0.15357832 0.12813717 -0.0469831484
## Var_514 -2.381834365 -1.16385533 0.01722664 0.24448265 -0.0601660472
## Var_515 -0.638072450 0.25495039 0.52342652 0.79876141 -0.0360563293
## Var_6 -1.253169669 -0.02822140 0.21014397 0.97748557 -0.0015704084
## 95% lower CI.2 95% upper CI.1 95% upper CI.2
## (Intercept) -0.0654611110 0.1083085670 0.0696044634
## Var_1 -0.0004659801 0.0003800571 0.0004942831
## Var_2 -0.0020654997 0.0036677897 0.0024679886
## Var_31 -0.0096576719 0.0077648337 0.0095530538
## Var_51 -0.0211539403 0.0246225131 0.0332446313
## Var_52 -0.0195467909 0.0297781737 0.0348657137
## Var_53 -0.0432933050 0.0222287046 0.0112131685
## Var_54 -0.0119033243 0.0286780325 0.0425171290
## Var_55 -0.0555829659 0.0048878664 -0.0011829367
## Var_56 -0.0419144585 0.0171208768 0.0124591850
## Var_57 -0.0284008929 0.0304234903 0.0260866597
## Var_58 -0.0284924528 0.0272389363 0.0260225757
## Var_59 -0.0434808504 0.0089513711 0.0109426265
## Var_510 -0.0401784920 0.0166711639 0.0142632510
## Var_511 -0.0207421833 0.0160616687 0.0336308387
## Var_512 -0.0259340090 0.0217051753 0.0284642705
## Var_513 -0.0483642950 0.0073930604 0.0060912882
## Var_514 -0.0432533737 -0.0058460277 0.0110231681
## Var_515 -0.0236572804 0.0183456074 0.0307322187
## Var_6 -0.0009721847 0.0003454434 0.0009445853
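The derived columns in the printed table are consistent with the usual Wald formulas: z is the estimate divided by its standard error, the p-value is two-sided under a standard normal distribution, and the 95% interval is the estimate plus or minus about 1.96 standard errors. The sketch below recomputes these quantities for the (Intercept) row of the first equation, using the estimate and standard error copied from the output above. That drglm computes the columns in exactly this way is an assumption; the check only shows that the numbers agree.

```r
est <- 4.081904e-02  # Estimate.1, (Intercept), copied from the table
se  <- 0.0344340641  # standard error.1, (Intercept), copied from the table

z     <- est / se                            # Wald z statistic
p_val <- 2 * pnorm(-abs(z))                  # two-sided normal p-value
ci    <- est + c(-1, 1) * qnorm(0.975) * se  # 95% confidence interval

round(c(z = z, p = p_val, lower = ci[1], upper = ci[2]), 6)
```

The same three lines can be applied to any other row. In this output almost all intervals contain zero, as expected for a response that was simulated independently of the predictors.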