Let's Talk About Raw Documents

Uploaded: 2023-03-01T11:59:35.068Z

Posted Mar 01, 2023 | Views 570

# Preprocessing API

# Unstructured.io

# NLP-focused

Crag Wolfe

Infrastructure Team Lead @ Unstructured.io

Back-end engineer by trade, including a decade at Red Hat. Spent the previous 5 years at an NLP startup, serving as the technical lead for a key product.


Ben Epstein

Co-Founder & CTO @ GrottoAI

Ben was the machine learning lead for Splice Machine, leading the development of their MLOps platform and Feature Store. He is now the Co-founder and CTO at GrottoAI, focused on supercharging multifamily teams and reducing vacancy loss with AI-powered guidance for leasing and renewals. Ben also works as an adjunct professor at Washington University in St. Louis, teaching concepts in cloud computing and big data analytics.


SUMMARY

Modern ML pipelines still often need pre-processed documents. This isn't changing anytime soon; in fact, the appetite is growing.

Unstructured.io is focused on extracting structured data from raw documents (PDF, PPTX, HTML, etc.). In the near term, we're more NLP-focused.
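
To make "extracting structured data from raw documents" concrete, here is a minimal sketch of the general idea for HTML, using only the Python standard library. It is not Unstructured.io's implementation or API; the `TAG_CATEGORIES` mapping, the `ElementExtractor` class, the `extract_elements` helper, and the category names are all hypothetical, chosen purely for illustration.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML tags to element categories
# (an assumption for this sketch, not a real library's schema).
TAG_CATEGORIES = {"h1": "Title", "h2": "Title", "p": "NarrativeText", "li": "ListItem"}


class ElementExtractor(HTMLParser):
    """Collects text from selected tags as {"type", "text"} records."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._current = None  # category of the tag being read, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in TAG_CATEGORIES:
            self._current = TAG_CATEGORIES[tag]
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested inline tags (e.g. <b>) is kept as well.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag in TAG_CATEGORIES:
            # Collapse runs of whitespace so records hold clean text.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.elements.append({"type": self._current, "text": text})
            self._current = None


def extract_elements(html):
    """Turn a raw HTML string into a list of typed text elements."""
    parser = ElementExtractor()
    parser.feed(html)
    parser.close()
    return parser.elements
```

Feeding this a raw HTML string yields a flat list of typed records that a downstream NLP step can filter, for example by keeping only `NarrativeText`. Real documents (PDFs, slides, messy HTML) need far more robust handling than this sketch provides.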

Check out Unstructured.io's open-source libraries!
