MLOps Community

Different Ways to Scale Python & Pandas

Posted Nov 05, 2022 | Views 389
# Python
# Pandas
# Fugue
# Prefect.io
speaker
Kevin Kho
Open Source Community Engineer @ Prefect

Kevin Kho is an Open Source Community Engineer at Prefect, an open-source workflow orchestration system. Previously, he spent four years as a data scientist in the energy and HR spaces. Outside of work, he is a contributor to Fugue, an abstraction layer for Pandas, Spark, and Dask. He also organizes the Orlando Machine Learning and Data Science Meetup.

SUMMARY

As data volumes grow, many data practitioners need to migrate existing Python or pandas code to distributed computing frameworks such as Spark and Dask. In this tutorial, we discuss the possible solutions and their specific behaviors. Pandas-like frameworks such as Modin (for Dask) and Koalas (for Spark) promise a drop-in replacement for pandas.

Fugue, on the other hand, deliberately deviates from the pandas interface. Instead, Fugue users write minimal additional code to port existing Python and pandas code. To understand the tradeoffs of these approaches, we will cover the underlying distributed computing concepts. Attendees will deepen their understanding of distributed computing and learn the pros and cons to weigh when evaluating these options.
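One of the distributed computing concepts behind these tradeoffs is applying ordinary pandas logic to each partition of the data independently. The sketch below illustrates that idea in plain pandas only. It is not Fugue's, Dask's, or Spark's API, and the names `add_share`, `apply_per_partition`, and the `site`/`value` columns are hypothetical, chosen purely for illustration.

```python
import pandas as pd


def add_share(df: pd.DataFrame) -> pd.DataFrame:
    """Plain pandas logic written against a single partition of the data.

    Hypothetical example function: computes each row's share of the
    partition's total `value`.
    """
    out = df.copy()
    out["share"] = out["value"] / out["value"].sum()
    return out


def apply_per_partition(df: pd.DataFrame, key: str, fn) -> pd.DataFrame:
    """Run `fn` on each partition (rows sharing the same `key`) independently.

    This is a local, single-machine stand-in for what a distributed engine
    would do across workers; it is an illustrative helper, not a library API.
    """
    parts = [fn(part) for _, part in df.groupby(key, sort=False)]
    # Restore the original row order after concatenating the partitions.
    return pd.concat(parts).sort_index()


# Hypothetical sample data: two partitions, "a" and "b".
data = pd.DataFrame(
    {"site": ["a", "a", "b", "b"], "value": [1.0, 3.0, 2.0, 2.0]}
)
result = apply_per_partition(data, "site", add_share)
print(result)
```

Because `add_share` only ever sees one partition at a time, the same function could in principle be handed to a distributed engine without being rewritten, which is the kind of porting the abstract describes.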
