Yandex open sources CatBoost machine learning library

Russian search engine creator Yandex has joined the ranks of Google, Amazon, and Microsoft by releasing its own open source machine learning library, CatBoost.

The Apache-licensed CatBoost is for “open-source gradient boosting on decision trees,” according to its GitHub repository’s README. It classifies and ranks data using a collection of decision-making mechanisms, or “learners,” rather than a single one. In gradient boosting, learners are added one at a time, each trained to correct the errors left by those before it, and their weighted outputs are combined into the final prediction. By combining many learners, CatBoost can yield better results than systems that rely on an individual learner.
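To make the idea of combining many weak learners concrete, here is a minimal sketch of generic gradient boosting in plain Python. It is a textbook-style illustration and not CatBoost's own algorithm: it assumes a single numeric feature, squared-error loss, and one-split "stumps" as the learners. The function names (`fit_stump`, `boost`, `predict`) and the toy data are made up for this example.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Pick the one-threshold split on xs that best fits the residuals (least squares)."""
    best = None
    # Assumes xs holds at least two distinct values; the largest cannot be a split point.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def boost(xs, ys, rounds=50, learning_rate=0.3):
    """Fit stumps one after another, each on the errors left by the previous ones."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled output of every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left_value if x <= threshold else right_value)
        for threshold, left_value, right_value in stumps
    )


# Toy data (invented): low targets for small x, high targets for large x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = boost(xs, ys)
baseline_mse = mean((y - mean(ys)) ** 2 for y in ys)
boosted_mse = mean((y - predict(model, x)) ** 2 for x, y in zip(xs, ys))
print(f"baseline MSE: {baseline_mse:.4f}, boosted MSE: {boosted_mse:.4f}")
```

Each stump on its own is a poor predictor, but every round fits what the ensemble so far still gets wrong, so the training error of the combined model falls below that of the constant baseline.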

CatBoost comes with support for Python and R, as well as a command-line interface to drive the machine learning library. The Python packages for CatBoost also include data visualization tools for plotting statistics of the training process. The resulting plots can be viewed in a Jupyter notebook or in CatBoost’s own data viewer application.
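As a rough illustration of the Python package, the sketch below trains a very small classifier with `CatBoostClassifier`. It assumes the `catboost` package is installed (for example with `pip install catboost`); the rows, labels, and parameter values are invented for the example and are not tuned recommendations. The helper name `train_tiny_model` is hypothetical.

```python
def train_tiny_model():
    """Train a toy CatBoost classifier and return predictions for two new rows."""
    # Imported here so the dependency is explicit: requires the `catboost` package.
    from catboost import CatBoostClassifier

    # Made-up rows: two categorical columns followed by four numeric ones.
    train_data = [
        ["a", "b", 1, 4, 5, 6],
        ["a", "b", 4, 5, 6, 7],
        ["c", "d", 30, 40, 50, 60],
        ["c", "d", 28, 39, 52, 61],
    ]
    train_labels = [1, 1, 0, 0]
    cat_features = [0, 1]  # indices of the categorical columns

    model = CatBoostClassifier(
        iterations=10,              # number of boosted trees
        learning_rate=0.5,
        depth=2,
        verbose=False,              # keep training output quiet
        allow_writing_files=False,  # do not create log files on disk
    )
    model.fit(train_data, train_labels, cat_features=cat_features)

    new_rows = [
        ["a", "b", 2, 4, 6, 8],
        ["c", "d", 29, 41, 49, 58],
    ]
    return model.predict(new_rows)
```

Calling `train_tiny_model()` returns one predicted label per new row. Categorical columns are passed as raw strings and identified by index through `cat_features`, so no manual encoding step appears in the sketch.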

Many machine learning libraries already implement some manner of gradient boosting algorithm. Python’s Scikit-learn package has one version; XGBoost is available for multiple languages and data platforms; and Microsoft has the LightGBM library as part of its Distributed Machine Learning Toolkit project.

CatBoost is meant to stand apart from those projects, according to Yandex, by being pre-tuned to perform at scale for Yandex’s own services. Yandex noted that it uses CatBoost to deliver predictions for its weather services, and that CatBoost has been deployed at the European Organization for Nuclear Research (CERN) to refine results from the particle experiments conducted there. 

Trained models created in CatBoost can be exported to Apple’s Core ML format, for use in macOS, iOS, tvOS, and watchOS apps backed by machine learning.

IDG Connect
