MLOps Community

Let's Talk About Raw Documents

Posted Mar 01, 2023 | Views 71
# Preprocessing API
# Unstructured.io
# NLP-focused
SPEAKER
 Crag Wolfe
Infrastructure Team Lead @ Unstructured.io

Back-end engineer by trade, including a decade at Red Hat. Spent the previous five years at an NLP startup as the technical lead for a key product.

SUMMARY

Modern ML pipelines still often need pre-processed documents. This isn't changing anytime soon; in fact, the appetite is growing.

Unstructured.io is focused on extracting structured data from raw documents (PDF, PPTX, HTML, etc.). In the near term, we're more NLP-focused.

Check out Unstructured.io's open-source libraries!
