
Accepted answer

In addition to what you have already tried, you can also see if the following helps:

import numpy as np
features = np.nan_to_num(features)  # replaces NaN with 0, +/-inf with large finite floats
rewards = np.nan_to_num(rewards)

This replaces all NaN entries in your arrays with 0 (and any infinities with very large finite values), and should at least make your algorithm run, unless the error occurs somewhere internal to the algorithm. Make sure there aren't too many non-numeric entries in your data, as setting them all to 0 may introduce strange biases into your estimates.
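For illustration, here is a minimal sketch of what np.nan_to_num does with its default settings (the input array is made up):

import numpy as np

a = np.array([1.0, np.nan, np.inf, -np.inf])
print(np.nan_to_num(a))
# NaN -> 0.0, inf -> ~1.8e308, -inf -> ~-1.8e308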

If that is not the case, and you are using weights='distance', then check whether any of the training samples are identical. Duplicate points are at distance zero from each other, which causes a division by zero in the inverse-distance weighting.
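A quick way to check for duplicate rows, assuming your training data is a 2-D array called X_train (the name is just an example):

import numpy as np

# count distinct rows; fewer unique rows than total rows means duplicates exist
n_unique = np.unique(X_train, axis=0).shape[0]
print("duplicates present:", n_unique < X_train.shape[0])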

If inverse distances are the cause of the division by zero, you can circumvent it by supplying your own weight function, e.g.

def better_inv_dist(dist):
    # shift distances by a positive constant so the weight stays finite at dist == 0
    c = 1.
    return 1. / (c + dist)

and then pass it as the weights parameter. You may need to adapt the constant c to the scale of your distances, but in any case it avoids division by zero as long as c > 0.
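For example, plugging the custom weight function into scikit-learn's KNeighborsRegressor (features and rewards are the arrays from above; n_neighbors=5 is just an example setting):

from sklearn.neighbors import KNeighborsRegressor

# the callable receives an array of distances and must return an array of weights
knn = KNeighborsRegressor(n_neighbors=5, weights=better_inv_dist)
knn.fit(features, rewards)
predictions = knn.predict(features)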


I ran into the same problem with KNN regression in scikit-learn. I was using weights='distance', which led to infinite values while computing the predictions (but not while fitting the KNN model, i.e. building the KD tree or ball tree). I switched to weights='uniform' and the program ran to completion correctly, indicating that the supplied weight function was the problem. If you want distance-based weights, supply a custom weight function that doesn't explode to infinity at zero distance, as indicated in eickenberg's answer.
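As a minimal sketch of that workaround (the model name and settings are just examples):

from sklearn.neighbors import KNeighborsRegressor

# uniform weights never divide by a distance, so zero distances are harmless
knn = KNeighborsRegressor(n_neighbors=5, weights='uniform')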
