Data and AI Summit 2022 ML Announcements

With 5,000 in-person attendees and 60,000 online attendees at the Data and AI Summit 2022 conference, San Francisco was bustling last week. The keynotes were split into four sessions spread across two days: the first day focused on Data Engineering and the second day covered Machine Learning announcements.

The Data and AI Summit 2022 Announcements blog provides an overview of some of the Data Engineering announcements. This blog will provide a quick overview of the AI and Machine Learning announcements.

Data + AI Summit 2022

Ali Ghodsi, Co-Founder and CEO of Databricks, presented the opening keynote session and mentioned how more and more companies are in the process of moving to the right-hand side of the data and AI maturity curve. On the maturity curve, the horizontal axis represents the maturity level of Data and AI, whilst the vertical axis represents the value that organisations gain. Moving to the right-hand side of the curve gives organisations a better competitive edge.

Data maturity curve

“The future of data is AI”

- Ali Ghodsi

However, most companies still face a challenge when it comes to productionising machine learning models. To help with productionising models, Databricks introduced MLflow back in 2018, and it has since become the most successful and popular open-source MLOps framework, with over 11M monthly downloads. In fact, MLflow has become the standard format for tracking, packaging and deploying machine learning applications.

MLflow downloads

MLflow 2.0

Last week, at the Data and AI Summit, Databricks unveiled MLflow 2.0, an upcoming release that introduces MLflow Pipelines. MLflow Pipelines provides a structured framework with pre-defined, production-ready templates to accelerate the deployment of machine learning models.

MLflow Pipelines

Services

Databricks announced Services, a full end-to-end deployment service for ML models inside a Databricks lakehouse. Services includes Serverless Model Endpoints and Integrated Model Monitoring. Serverless Model Endpoints let you deploy your models for real-time inference, whilst Model Monitoring lets you track the performance of your production models. At the time of this announcement, Services is in private preview; it is expected to move to public preview in a couple of months.

Serverless Model Endpoints and Integrated Model Monitoring
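
To make the idea of tracking a production model's performance concrete, here is a minimal sketch in plain Python. It is not the Databricks Model Monitoring API, which had not been publicly released at the time of writing; the `RollingAccuracyMonitor` class, its parameters and the sample data are all hypothetical and only illustrate the general technique of watching accuracy over a rolling window of recent predictions.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Hypothetical helper: tracks accuracy over the most recent predictions."""

    def __init__(self, window_size: int, alert_threshold: float) -> None:
        # Each entry is True where the prediction matched the actual label.
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.outcomes = deque(maxlen=window_size)
        self.alert_threshold = alert_threshold

    def record(self, prediction, actual) -> None:
        """Store whether a single production prediction was correct."""
        self.outcomes.append(prediction == actual)

    def accuracy(self) -> float:
        """Fraction of correct predictions in the current window."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self) -> bool:
        """Flag the model only once the window is full and accuracy is low."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.alert_threshold


# Made-up (prediction, actual) pairs, purely for illustration.
monitor = RollingAccuracyMonitor(window_size=4, alert_threshold=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)

print(monitor.accuracy())
print(monitor.needs_attention())
```

A managed monitoring service would typically go further, for example by tracking input drift and latency alongside accuracy, but the rolling-window comparison against a threshold is the core of the idea.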

Other complementary releases include Project Lightspeed, which simplifies data streaming and processing on Spark clusters, and Databricks Marketplace, which is the first open AI and data marketplace for data solutions, code, notebooks, solution accelerators and machine learning models. More about these new features is described in the Data and AI Summit 2022 Announcements blog and in the Advancing Spark - Data + AI Summit 2022 Day 1 Recap video.


Databricks have made it fairly clear that the future of data is AI, and the recent announcements reinforce this while considerably improving the machine learning capabilities of the Databricks platform.

Let us know if you’re as excited as we are about the new ML features, and if you would like a more in-depth review.