[BUG] ValueError: Found array with 0 sample(s)  #742

Closed
@allenyllee

Describe the bug

When SVMSMOTE is applied to a dataset whose minority class has very few samples (for example, fewer than 10), it raises ValueError: Found array with 0 sample(s) (shape=(0, 600)) while a minimum of 1 is required. The reproduction below uses 3 features, so it reports shape=(0, 3) instead.
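
The condition described above (a class with only a handful of samples) can be checked before resampling. The following is a minimal sketch, not a fix for the bug: `find_small_classes` is a hypothetical helper, and its default threshold of 10 only mirrors the "< 10" mentioned in this report; it is not a limit documented by imbalanced-learn.

```python
from collections import Counter


def find_small_classes(y, min_samples=10):
    """Return {label: count} for every class with fewer than min_samples rows.

    Hypothetical helper for illustration. The threshold is an assumption
    taken from the "< 10" in this report, not a documented library limit.
    """
    counts = Counter(y)
    return {label: n for label, n in counts.items() if n < min_samples}


# Class counts copied from the "Actual Results" output of this report.
y_example = [2] * 544 + [1] * 451 + [0] * 5

small = find_small_classes(y_example)
print(small)  # prints {0: 5}
```

A non-empty result flags a dataset that may trigger the error; whether it actually does likely depends on more than the raw class count.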

Steps/Code to Reproduce

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SVMSMOTE

X, y = make_classification(n_classes=3, class_sep=0,
            weights=[0.004, 0.451, 0.545], n_informative=3, n_redundant=0, flip_y=0,
            n_features=3, n_clusters_per_class=2, n_samples=1000, random_state=10)
print('Original dataset shape %s' % Counter(y))


sm = SVMSMOTE(random_state=42, k_neighbors=4)
X_res, y_res = sm.fit_resample(X, y)
print('Resampled dataset shape %s' % Counter(y_res))

Expected Results

fit_resample completes without raising an error and returns the resampled dataset.

Actual Results

Original dataset shape Counter({2: 544, 1: 451, 0: 5})

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-78-8f5d2308c2bd> in <module>()
     10 
     11 sm = SVMSMOTE(random_state=42, k_neighbors=4)
---> 12 X_res, y_res = sm.fit_resample(X, y)
     13 print('Resampled dataset shape %s' % Counter(y_res))

~/anaconda3/lib/python3.6/site-packages/imblearn/base.py in fit_resample(self, X, y)
     82             self.sampling_strategy, y, self._sampling_type)
     83 
---> 84         output = self._fit_resample(X, y)
     85 
     86         if binarize_y:

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _fit_resample(self, X, y)
    530     def _fit_resample(self, X, y):
    531         # print("_fit_resample X shape", X.shape)
--> 532         return self._sample(X, y)
    533 
    534     def _sample(self, X, y):

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _sample(self, X, y)
    569 
    570             danger_bool = self._in_danger_noise(
--> 571                 self.nn_m_, support_vector, class_sample, y, kind='danger')
    572             safety_bool = np.logical_not(danger_bool)
    573 

~/anaconda3/lib/python3.6/site-packages/imblearn/over_sampling/_smote.py in _in_danger_noise(self, nn_estimator, samples, target_class, y, kind)
    213         # print("kind", kind)
    214         # print("_in_danger_noise samples shape", samples.shape)
--> 215         x = nn_estimator.kneighbors(samples, return_distance=False)[:, 1:]
    216         # print("x", x)
    217         nn_label = (y[x] != target_class).astype(int)

~/anaconda3/lib/python3.6/site-packages/sklearn/neighbors/base.py in kneighbors(self, X, n_neighbors, return_distance)
    400         if X is not None:
    401             query_is_train = False
--> 402             X = check_array(X, accept_sparse='csr')
    403         else:
    404             query_is_train = True

~/anaconda3/lib/python3.6/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    548                              " minimum of %d is required%s."
    549                              % (n_samples, array.shape, ensure_min_samples,
--> 550                                 context))
    551 
    552     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0, 3)) while a minimum of 1 is required.
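
The last frames of the traceback show `kneighbors` being called with an array that has zero rows. The sketch below reproduces that message in isolation, using only scikit-learn and random data. It assumes the `samples` argument passed to `_in_danger_noise` was empty when the error was raised, which the shape=(0, 3) in the message suggests; why it is empty is not established by this report.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Arbitrary training data with three features, matching the reproduction above.
rng = np.random.RandomState(0)
X_train = rng.rand(20, 3)

nn = NearestNeighbors(n_neighbors=5).fit(X_train)

# A query with zero rows, like the array reported in the traceback.
empty_query = np.empty((0, 3))

try:
    nn.kneighbors(empty_query, return_distance=False)
    error_message = None
except ValueError as exc:
    # Input validation rejects the empty query before any neighbor search runs.
    error_message = str(exc)

print(error_message)
```

If the assumption holds, a guard that skips the neighbor query when the array is empty would avoid the exception, though the sampler would still need a sensible fallback for that class.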

Versions

System:
python: 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
executable: /home/allenyl/anaconda3/bin/python
machine: Linux-4.15.0-112-generic-x86_64-with-debian-buster-sid

Python deps:
pip: 19.2.2
setuptools: 41.0.1
sklearn: 0.21.3
numpy: 1.15.1
scipy: 1.4.1
Cython: 0.28.2
pandas: 0.24.1
