MLOps Community

Serving ML Models at a High Scale with Low Latency

Posted Jan 20
# Presentation
# Model Serving
SPEAKER
Manoj Agarwal
Software Architect @ Salesforce

Manoj Agarwal is a Software Architect on the Einstein Platform team at Salesforce. Salesforce Einstein was released in 2016, integrated with all the major Salesforce clouds. Today, Einstein delivers more than 80 billion predictions per day across the Sales, Service, Marketing & Commerce Clouds.


SUMMARY

Serving machine learning models is a scalability challenge at many companies. Most applications need only a small number of models (often fewer than 100) to serve predictions. Cloud platforms that support model serving, on the other hand, can host hundreds of thousands of models, but they provision separate hardware for different customers. Salesforce faces a challenge that very few companies share: it needs to run hundreds of thousands of models on infrastructure shared across multiple tenants to keep serving cost-effective.
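One common building block for this kind of multi-tenant serving is loading models on demand and keeping only the most recently used ones in memory, so many tenants can share a single serving cluster. The talk does not publish Salesforce's implementation; the sketch below is a minimal, hypothetical illustration of the idea using an LRU cache keyed by (tenant, model). All names (`ModelCache`, `loader`, the tenant and model IDs) are assumptions for illustration, not Einstein's actual API.

```python
from collections import OrderedDict


class ModelCache:
    """Hypothetical sketch: lets many tenants share one serving process.

    Models are loaded lazily the first time a (tenant, model) pair is
    requested, and the least recently used model is evicted once the
    configured capacity is reached.
    """

    def __init__(self, capacity, loader):
        self.capacity = capacity      # max number of models held in memory
        self.loader = loader          # callable: (tenant_id, model_id) -> model
        self._cache = OrderedDict()   # insertion order tracks recency

    def predict(self, tenant_id, model_id, features):
        key = (tenant_id, model_id)
        if key in self._cache:
            # Cache hit: mark this model as most recently used.
            self._cache.move_to_end(key)
        else:
            # Cache miss: evict the least recently used model if full,
            # then load the requested model.
            if len(self._cache) >= self.capacity:
                self._cache.popitem(last=False)
            self._cache[key] = self.loader(tenant_id, model_id)
        return self._cache[key](features)


# Usage with a stand-in loader that returns a trivial "model" function.
cache = ModelCache(
    capacity=2,
    loader=lambda tenant, model: (lambda x: (tenant, model, x)),
)
cache.predict("acme", "churn", [1.0])    # loads acme/churn
cache.predict("acme", "upsell", [2.0])   # loads acme/upsell (cache full)
cache.predict("globex", "churn", [3.0])  # evicts acme/churn (LRU)
```

The design choice here is the usual trade-off in shared serving: capacity bounds memory per host, while cold starts (loader calls on a miss) add tail latency, which is why real systems pair a cache like this with request routing that keeps a tenant's traffic on hosts that already have its models warm.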

