Stop using GridSearchCV or RandomizedSearchCV

Gia Huy (CisMine)
5 min read · Jun 7, 2023


Have you ever had the same thought as me: that GridSearchCV or RandomizedSearchCV takes too long to run? In this post I’ll show you a library that is faster and, for a first pass, more useful than either of them.

Why not GridSearchCV or RandomizedSearchCV?

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV

The worst thing about these methods is how long they take: they search for the best hyperparameters by looping over every candidate configuration (or a random sample of them) and cross-validating each one, so the number of model fits multiplies quickly, as the sketch below illustrates. I’d say they’re not bad, just not efficient enough when you’re still choosing an algorithm, so I’ll show you another way to find a promising one.
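A minimal sketch of that cost, with an illustrative random-forest grid (the grid values and cv=5 are my own assumptions, chosen only to make the arithmetic concrete):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# An illustrative grid: 4 * 3 * 3 = 36 hyperparameter combinations
param_grid = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [None, 10, 30],
    "min_samples_split": [2, 5, 10],
}

# With cv=5, GridSearchCV fits 36 * 5 = 180 models -- and that is for a
# single algorithm; comparing several algorithms multiplies the cost again.
search = GridSearchCV(RandomForestClassifier(), param_grid, cv=5, n_jobs=-1)
# search.fit(X_train, y_train)  # assumes X_train, y_train already exist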

Lazypredict

Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning.

pip install lazypredict

The best thing about lazypredict is its speed: it takes far less time to find an algorithm that suits your dataset, with the caveat that every model is fit with its default parameters.

How to use

Just a quick reminder: you don’t need to import any of the algorithms, or install them individually, in order to use lazypredict; it fits each supported model for you.

Classification

from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load a small binary-classification dataset
data = load_breast_cancer()
X = data.data
y = data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

# Fit every supported classifier with its default parameters
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

print(models)


| Model | Accuracy | Balanced Accuracy | ROC AUC | F1 Score | Time Taken |
|:-------------------------------|-----------:|--------------------:|----------:|-----------:|-------------:|
| LinearSVC | 0.989474 | 0.987544 | 0.987544 | 0.989462 | 0.0150008 |
| SGDClassifier | 0.989474 | 0.987544 | 0.987544 | 0.989462 | 0.0109992 |
| MLPClassifier | 0.985965 | 0.986904 | 0.986904 | 0.985994 | 0.426 |
| Perceptron | 0.985965 | 0.984797 | 0.984797 | 0.985965 | 0.0120046 |
| LogisticRegression | 0.985965 | 0.98269 | 0.98269 | 0.985934 | 0.0200036 |
| LogisticRegressionCV | 0.985965 | 0.98269 | 0.98269 | 0.985934 | 0.262997 |
| SVC | 0.982456 | 0.979942 | 0.979942 | 0.982437 | 0.0140011 |
| CalibratedClassifierCV | 0.982456 | 0.975728 | 0.975728 | 0.982357 | 0.0350015 |
| PassiveAggressiveClassifier | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0130005 |
| LabelPropagation | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0429988 |
| LabelSpreading | 0.975439 | 0.974448 | 0.974448 | 0.975464 | 0.0310006 |
| RandomForestClassifier | 0.97193 | 0.969594 | 0.969594 | 0.97193 | 0.033 |
| GradientBoostingClassifier | 0.97193 | 0.967486 | 0.967486 | 0.971869 | 0.166998 |
| QuadraticDiscriminantAnalysis | 0.964912 | 0.966206 | 0.966206 | 0.965052 | 0.0119994 |
| HistGradientBoostingClassifier | 0.968421 | 0.964739 | 0.964739 | 0.968387 | 0.682003 |
| RidgeClassifierCV | 0.97193 | 0.963272 | 0.963272 | 0.971736 | 0.0130029 |
| RidgeClassifier | 0.968421 | 0.960525 | 0.960525 | 0.968242 | 0.0119977 |
| AdaBoostClassifier | 0.961404 | 0.959245 | 0.959245 | 0.961444 | 0.204998 |
| ExtraTreesClassifier | 0.961404 | 0.957138 | 0.957138 | 0.961362 | 0.0270066 |
| KNeighborsClassifier | 0.961404 | 0.95503 | 0.95503 | 0.961276 | 0.0560005 |
| BaggingClassifier | 0.947368 | 0.954577 | 0.954577 | 0.947882 | 0.0559971 |
| BernoulliNB | 0.950877 | 0.951003 | 0.951003 | 0.951072 | 0.0169988 |
| LinearDiscriminantAnalysis | 0.961404 | 0.950816 | 0.950816 | 0.961089 | 0.0199995 |
| GaussianNB | 0.954386 | 0.949536 | 0.949536 | 0.954337 | 0.0139935 |
| NuSVC | 0.954386 | 0.943215 | 0.943215 | 0.954014 | 0.019989 |
| DecisionTreeClassifier | 0.936842 | 0.933693 | 0.933693 | 0.936971 | 0.0170023 |
| NearestCentroid | 0.947368 | 0.933506 | 0.933506 | 0.946801 | 0.0160074 |
| ExtraTreeClassifier | 0.922807 | 0.912168 | 0.912168 | 0.922462 | 0.0109999 |
| CheckingClassifier | 0.361404 | 0.5 | 0.5 | 0.191879 | 0.0170043 |
| DummyClassifier | 0.512281 | 0.489598 | 0.489598 | 0.518924 | 0.0119965 |
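lazypredict and grid search are not mutually exclusive: a practical workflow is to shortlist the leaders from the table above and tune only those. A minimal sketch, reusing X_train and y_train from the snippet above and an illustrative C grid for LinearSVC (the leaderboard winner here):

from sklearn.svm import LinearSVC
from sklearn.model_selection import GridSearchCV

# Tune only the model that topped the lazypredict leaderboard,
# instead of grid-searching every algorithm from the start.
param_grid = {"C": [0.01, 0.1, 1, 10]}  # illustrative values
search = GridSearchCV(LinearSVC(), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_, search.best_score_)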

Regression

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

# Note: load_boston was removed in scikit-learn 1.2, so this snippet
# requires an older version (see the alternative after the table below)
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)
X = X.astype(np.float32)

# Hold out the last 10% of rows as a test set
offset = int(X.shape[0] * 0.9)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

# Fit every supported regressor with its default parameters
reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

print(models)


| Model | Adjusted R-Squared | R-Squared | RMSE | Time Taken |
|:------------------------------|-------------------:|----------:|------:|-----------:|
| SVR | 0.83 | 0.88 | 2.62 | 0.01 |
| BaggingRegressor | 0.83 | 0.88 | 2.63 | 0.03 |
| NuSVR | 0.82 | 0.86 | 2.76 | 0.03 |
| RandomForestRegressor | 0.81 | 0.86 | 2.78 | 0.21 |
| XGBRegressor | 0.81 | 0.86 | 2.79 | 0.06 |
| GradientBoostingRegressor | 0.81 | 0.86 | 2.84 | 0.11 |
| ExtraTreesRegressor | 0.79 | 0.84 | 2.98 | 0.12 |
| AdaBoostRegressor | 0.78 | 0.83 | 3.04 | 0.07 |
| HistGradientBoostingRegressor | 0.77 | 0.83 | 3.06 | 0.17 |
| PoissonRegressor | 0.77 | 0.83 | 3.11 | 0.01 |
| LGBMRegressor | 0.77 | 0.83 | 3.11 | 0.07 |
| KNeighborsRegressor | 0.77 | 0.83 | 3.12 | 0.01 |
| DecisionTreeRegressor | 0.65 | 0.74 | 3.79 | 0.01 |
| MLPRegressor | 0.65 | 0.74 | 3.80 | 1.63 |
| HuberRegressor | 0.64 | 0.74 | 3.84 | 0.01 |
| GammaRegressor | 0.64 | 0.73 | 3.88 | 0.01 |
| LinearSVR | 0.62 | 0.72 | 3.96 | 0.01 |
| RidgeCV | 0.62 | 0.72 | 3.97 | 0.01 |
| BayesianRidge | 0.62 | 0.72 | 3.97 | 0.01 |
| Ridge | 0.62 | 0.72 | 3.97 | 0.01 |
| TransformedTargetRegressor | 0.62 | 0.72 | 3.97 | 0.01 |
| LinearRegression | 0.62 | 0.72 | 3.97 | 0.01 |
| ElasticNetCV | 0.62 | 0.72 | 3.98 | 0.04 |
| LassoCV | 0.62 | 0.72 | 3.98 | 0.06 |
| LassoLarsIC | 0.62 | 0.72 | 3.98 | 0.01 |
| LassoLarsCV | 0.62 | 0.72 | 3.98 | 0.02 |
| Lars | 0.61 | 0.72 | 3.99 | 0.01 |
| LarsCV | 0.61 | 0.71 | 4.02 | 0.04 |
| SGDRegressor | 0.60 | 0.70 | 4.07 | 0.01 |
| TweedieRegressor | 0.59 | 0.70 | 4.12 | 0.01 |
| GeneralizedLinearRegressor | 0.59 | 0.70 | 4.12 | 0.01 |
| ElasticNet | 0.58 | 0.69 | 4.16 | 0.01 |
| Lasso | 0.54 | 0.66 | 4.35 | 0.02 |
| RANSACRegressor | 0.53 | 0.65 | 4.41 | 0.04 |
| OrthogonalMatchingPursuitCV | 0.45 | 0.59 | 4.78 | 0.02 |
| PassiveAggressiveRegressor | 0.37 | 0.54 | 5.09 | 0.01 |
| GaussianProcessRegressor | 0.23 | 0.43 | 5.65 | 0.03 |
| OrthogonalMatchingPursuit | 0.16 | 0.38 | 5.89 | 0.01 |
| ExtraTreeRegressor | 0.08 | 0.32 | 6.17 | 0.01 |
| DummyRegressor | -0.38 | -0.02 | 7.56 | 0.01 |
| LassoLars | -0.38 | -0.02 | 7.56 | 0.01 |
| KernelRidge | -11.50 | -8.25 | 22.74 | 0.01 |
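One caveat before wrapping up: load_boston was removed in scikit-learn 1.2, so the regression snippet above only runs on older versions. Here is a minimal sketch of the same workflow on load_diabetes, which still ships with scikit-learn (the split parameters are my own choice, and the scores will of course differ from the table above):

from lazypredict.Supervised import LazyRegressor
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# load_diabetes still ships with current scikit-learn, unlike load_boston
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=13)

reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)
print(models)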

Conclusion

Thank you for reading this post; if you enjoyed it, check out the others in the series “Stop using that if you know this”.

Reference

https://pypi.org/project/lazypredict/


My name is Huy Gia. I am currently pursuing a B.Sc. degree, and I am interested in deep learning for computer vision and parallel programming with CUDA.