How many of you have seen this error while building your machine learning models using “sklearn”?
I bet most of us! At least in the initial days.
This error occurs when dealing with categorical (string) variables. In sklearn, you are required to convert these categories in the numerical format.
In order to do this conversion, we use several pre-processing methods like “label encoding”, “one hot encoding” and others.
In this article, I will discuss a recently open sourced library ” CatBoost” developed and contributed by Yandex. CatBoost can use categorical features directly and is scalable in nature.
“This is the first Russian machine learning technology that’s an open source,” said Mikhail Bilenko, Yandex’s head of machine intelligence and research.
P.S. You can also read this article written by me before “How to deal with categorical variables?“.
- What is CatBoost?
- Advantages of CatBoost library
- CatBoost in comparison to other boosting algorithms
- Installing CatBoost
- Solving ML …