Problems with k-NN regression in R
I am trying to run knnreg from the caret package. For some reason, this training set works:
> summary(train1)
       v1               v2               v3
 13     : 10474   1      :  6435   7      :  8929
 10     : 10315   2      :  6435   6      :  8895
 4      : 10272   3      :  6435   9      :  8892
 1      : 10244   4      :  6435   10     :  8892
 2      : 10238   7      :  6435   15     :  8874
 24     : 10228   8      :  6435   40     :  8870
 (Other):359799   (Other):382960   (Other):368218
while this one won't work:
> summary(train2)
       v1               v2               v3                  v4
 13     : 10474   1      :  6436   7      :  8929   christmas   :  5946
 10     : 10315   2      :  6436   6      :  8895   labor day   :  8861
 4      : 10272   3      :  6438   9      :  8892   none        :391909
 1      : 10244   4      :  6435   10     :  8892   super bowl  :  8895
 2      : 10238   7      :  6435   15     :  8874   thanksgiving:  5959
 24     : 10228   8      :  6435   40     :  8870
 (Other):359799   (Other):382960   (Other):368218
Here is the target vector:
> summary(target)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   -499     200     712    1980   20210  693100
The error occurs during the prediction phase:
> fit <- knnreg(train2, target, k = 2)
> prediction <- predict(fit, newdata = test)
Error in knnregTrain(train = list(v1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,  :
  NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning messages:
1: In knnregTrain(train = list(v1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,  :
  NAs introduced by coercion
2: In knnregTrain(train = list(v1 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L,  :
  NAs introduced by coercion
And this is the test set:
> summary(test)
       v1               v2               v3                  v4
 13     :  2836   1      :  1755   51     :  3002   christmas   :  2988
 4      :  2803   2      :  1755   49     :  2989   labor day   :     0
 19     :  2799   3      :  1755   52     :  2988   none        :106136
 2      :  2797   4      :  1755   50     :  2986   super bowl  :  2964
 27     :  2791   7      :  1755   6      :  2984   thanksgiving:  2976
 24     :  2790   8      :  1755   47     :  2976
 (Other): 98248   (Other):104534   (Other): 97139
What am I missing?
Edit: switching v4 to a set of labels '1', '2', ... fixes the problem. Does the algorithm treat the features as numerical even though they're factors?
I realized that knnreg can only receive numerical values. When I trained the model with train1, it treated the values as numerical (when in fact they are categorical). train2 returns an error because v4 is not numerical, and knnreg can't convert it to numerical either.
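To illustrate the behaviour described above, here is a minimal base-R sketch. It does not call knnreg itself. It assumes the failure comes from a character-to-numeric coercion of the factor labels, which matches the "NAs introduced by coercion" warning. The toy data frame and the names `df`, `v1_num`, `v4_num`, `v4_dummies` and `x` are made up for the example. The last step shows one possible workaround: expanding v4 into 0/1 indicator columns with `model.matrix`.

```r
# Toy data (hypothetical values); v4 mimics the holiday factor from the question.
df <- data.frame(
  v1 = factor(c("13", "10", "4", "1")),
  v4 = factor(c("none", "christmas", "none", "super bowl"))
)

# Labels that look like numbers survive the coercion...
v1_num <- as.numeric(as.character(df$v1))

# ...while text labels turn into NA (the warning is suppressed here on purpose).
v4_num <- suppressWarnings(as.numeric(as.character(df$v4)))

# Possible workaround: one 0/1 indicator column per level of v4.
v4_dummies <- model.matrix(~ v4 - 1, data = df)

# All-numeric predictor matrix with no missing values.
x <- cbind(v1 = v1_num, v4_dummies)
```

A matrix like `x` contains only numbers, so no coercion to NA takes place. If you encode the training and test sets this way, make sure both use the same factor levels so the indicator columns line up. Note also that converting v1, v2 and v3 to numbers makes the distance computation treat them as ordered quantities, which may not be what you want for categorical IDs.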