ClearML

Building an MLOps infrastructure on OpenShift

Most data science projects don’t pass the PoC phase and hence never generate any business value. In 2019, Gartner estimated that “through 2022, only 20% of analytic insights will deliver business outcomes”. One of the main reasons for this is undoubtedly that data scientists often lack a clear vision of how to deploy their solutions into production, how to integrate them with existing systems and workflows and how to operate and maintain them.

Enabling distributed NLP research at SIL

In my main position, as a data scientist at SIL International, I work on expanding language possibilities with AI. In practice, this includes applying recent advances in Natural Language Processing (NLP) to low-resource and multilingual contexts. We work on things like spoken language identification, multilingual dialogue systems, machine translation, and translation quality estimation.

ClearML-Data Lemonade: getting local datasets quickly and easily

Congratulations on creating a clean(ish) dataset to use for training! The dataset may be stored where everyone can access it, but distributing it is still a hassle: local workstations, local GPU machines, and cloud machines (which may be spun up and down without disk persistence) all need to fetch the data. To say it is annoying is an understatement!

Data management is ALL THE RAGE!

Everyone wants to manage their data, and if it’s a feature store, even better! But for optimal data management, we must first discuss keeping things lightweight, with zero upfront setup cost, and maximizing utility with ClearML-data. ClearML-data mimics the lightweight feel of git for data (who doesn’t know git?) and gives it a spin. It is an extremely efficient open-source dataset management tool that reflects how we view DataOps and how it differs from git-like solutions, including…

[MLOps] The Clear SHOW - S02E13 - mlops_this: Copilot Shenanigans

Ariel should have known better than to mess with shitposts on mlops.community ;) Here is a ClearML pipeline integrated with the notorious mlops_this generated by GitHub's Copilot. ClearML is the only open-source tool to manage all your MLOps in a unified and robust platform providing collaborative experiment management, powerful orchestration, easy-to-build data stores, and one-click model deployment.

[MLOps] The Clear SHOW - S02E12 - Goodbye Fig. 1 [Sculley15]

Sometimes, even in a field as young and bustling as ours, one has to say goodbye to an old friend. Today we bid adieu to Fig. 1 of D. Sculley et al., AKA "Hidden technical debt in Machine learning systems." Listen to Ariel Biller explain what's going on and what we are going to use in lieu of Fig. 1.

[MLOps] The Clear SHOW - S02E11 - DIY Strikes Back! Building the Model Store!

Ariel extends ClearML's "experiment first" approach towards a "model first" approach - by building a model store. See how easy it is to add metadata to the model artifacts. + Colab notebook (uses the demo server, just run it and see what happens)