MLOps Community

Data Selection for Data-Centric AI: Data Quality Over Quantity

Posted Oct 10, 2021 | Views 517
speaker
Cody Coleman
CEO @ Coactive

Cody Coleman recently finished his PhD in CS at Stanford University, where he was advised by Professors Matei Zaharia and Peter Bailis. His research ranges from performance benchmarking of hardware and software systems (i.e., DAWNBench and MLPerf) to computationally efficient methods for active learning and core-set selection. His work has been supported by the NSF GRFP, the Stanford DAWN Project, and the Open Phil AI Fellowship.

SUMMARY

Big data has been critical to many of the successes in ML, but it brings its own problems. Working with massive datasets is cumbersome and expensive, especially with unstructured data like images, videos, and speech. Careful data selection can mitigate the pains of big data by focusing computational and labeling resources on the most valuable examples. Cody Coleman, a recent PhD graduate of Stanford University and founding member of MLCommons, joins us to describe how a data-centric approach that focuses on data quality rather than quantity can lower the barrier to entry for AI/ML. Instead of managing clusters of machines and setting up cumbersome labeling pipelines, you can spend more time tackling real problems.
