MLOps Community

The Birth and Growth of Spark: An Open Source Success Story

Posted Apr 23, 2023 | Views 5.7K
# Spark
# Open Source
# Databricks
SPEAKERS
Matei Zaharia
Cofounder and Chief Technologist @ Databricks

Matei Zaharia is a Co-founder and Chief Technologist at Databricks as well as an Assistant Professor of Computer Science at Stanford. He started the Apache Spark project during his Ph.D. at UC Berkeley in 2009 and has worked broadly on other widely used data and AI software, including MLflow, Delta Lake, Dolly, and ColBERT. He works on a wide variety of projects in data management and machine learning at Databricks and Stanford. Matei’s research was recognized through the 2014 ACM Doctoral Dissertation Award, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).

Vishnu Rachakonda
Data Scientist @ Firsthand

Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.

SUMMARY

We dive deep into the creation of Spark with the creator himself, Matei Zaharia, Chief Technologist at Databricks. This episode also explores the development of Databricks' other open source home run, MLflow, and the concept of "lakehouse ML". As a special treat, Matei talked to us about the details of the "DSP" (Demonstrate-Search-Predict) project, which aims to enable building applications by combining LLMs and other text-returning systems.

About the guest: Matei has the unique advantage of being able to see different perspectives, having worked in both academia and industry. He listens carefully to people's challenges and excitement about ML and uses this to come up with new ideas. As a member of Databricks, Matei also has the advantage of applying ML to Databricks' own internal practices. He is constantly asking the question, "What's a better way to do this?"

