MLOps Community

Let's Talk About Raw Documents

Posted Mar 01, 2023 | Views 500
# Preprocessing API
# Unstructured.io
# NLP-focused
speakers
Crag Wolfe
Infrastructure Team Lead @ Unstructured.io

Back-end engineer by trade, including a decade at Red Hat. Spent the previous 5 years at an NLP startup as the technical lead for a key product.

Ben Epstein
Founding Software Engineer @ Galileo

Ben was the machine learning lead for Splice Machine, leading the development of their MLOps platform and Feature Store. He is now a founding software engineer at Galileo (rungalileo.io) focused on building data discovery and data quality tooling for machine learning teams. Ben also works as an adjunct professor at Washington University in St. Louis teaching concepts in cloud computing and big data analytics.

SUMMARY

Modern ML pipelines still often need pre-processed documents. This isn't changing anytime soon; in fact, the appetite is growing.

Unstructured.io is focused on extracting structured data from raw documents (PDF, PPTX, HTML, etc.). In the near term, we're more NLP-focused.

Check out Unstructured.io's open-source libraries!

