MLOps Community

Modern Data Science with Vaex: A New Approach to DataFrames and Pipelines

Posted Apr 15, 2022 | Views 897
# Pipelines
# DataFrames
# Vaex
# Vaex.io
# Tiqets.com
speakers
Maarten Breddels
Founder @ Vaex.io

Maarten is an entrepreneur and freelance developer, consultant, and data scientist working mostly with Python, C++, and JavaScript in the Jupyter ecosystem. He is the creator of ipyvolume and vaex and the founder of Vaex.io. His expertise ranges from fast numerical computation and API design to 3D visualization. He has a Bachelor's in ICT and a Master's and Ph.D. in Astronomy, and he likes to code and solve problems.

Jovan Veljanoski
Senior Data Scientist @ Tiqets

Jovan is a senior data scientist at Tiqets, where he creates predictive models and recommender systems centered around the e-commerce domain. Working mostly with Python in the Jupyter/PyData ecosystem, he has considerable experience in creating dashboards, clustering analysis, and predictive modeling. Jovan has a Ph.D. in Astrophysics, is a co-founder of vaex.io, and is interested in novel machine learning technologies and applications.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment, Demetrios is immersing himself in machine learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building LEGO houses with his daughter.

Ben Epstein
Founding Software Engineer @ Galileo

Ben was the machine learning lead for Splice Machine, leading the development of their MLOps platform and Feature Store. He is now a founding software engineer at Galileo (rungalileo.io), focused on building data discovery and data quality tooling for machine learning teams. Ben also works as an adjunct professor at Washington University in St. Louis, teaching concepts in cloud computing and big data analytics.

SUMMARY

Jovan and Maarten showcase Vaex, an open-source DataFrame library in Python, tailor-made for fast, interactive workflows with datasets that are too large to fit in RAM on a single node. Vaex makes this possible through lazy evaluation, efficient out-of-core algorithms, memory mapping, and computational graphs, most of which stay behind the scenes and out of the user's way.
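
To make the ideas of memory mapping and out-of-core computation concrete, here is a minimal sketch using only the Python standard library. It is not Vaex's implementation: the file layout (raw little-endian float64 values), the helper names `write_column` and `chunked_mean`, and the sample "fare" data are all assumptions made for illustration. The point is that the file is mapped rather than loaded, and the statistic is accumulated one chunk at a time, so peak memory depends on the chunk size rather than on the size of the dataset.

```python
import mmap
import os
import struct
import tempfile

ITEM = struct.Struct("<d")  # one little-endian float64 per value (assumed layout)


def write_column(path, values):
    """Write a column of floats to disk as raw float64 values."""
    with open(path, "wb") as f:
        for v in values:
            f.write(ITEM.pack(v))


def chunked_mean(path, expression=lambda x: x, chunk_rows=1024):
    """Mean of expression(x) over a memory-mapped column, one chunk at a time."""
    total, count = 0.0, 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
        n_rows = len(buf) // ITEM.size
        for start in range(0, n_rows, chunk_rows):
            stop = min(start + chunk_rows, n_rows)
            # Only this slice of the mapped file is decoded into Python objects.
            chunk = struct.unpack_from(f"<{stop - start}d", buf, start * ITEM.size)
            total += sum(expression(x) for x in chunk)
            count += stop - start
    return total / count


# Hypothetical sample data: 10,000 made-up "fare" values written to a temp file.
path = os.path.join(tempfile.mkdtemp(), "fare.f64")
write_column(path, [float(i) for i in range(10_000)])

mean_fare = chunked_mean(path)
# The expression is applied on the fly, chunk by chunk; no derived column is stored.
mean_with_tip = chunked_mean(path, expression=lambda x: x * 1.5)
```

Passing the expression as a function mimics, in a very reduced form, how a lazily evaluated column can be computed during a pass over the data instead of being materialized in memory first.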

Using data from the New York City YellowCab taxi service, comprising 1.1 billion samples and taking up over 100 GB on disk, Jovan and Maarten show how one can conduct an exploratory data analysis, complete with filtering, grouping, calculation of statistics, and interactive visualizations, on a single laptop in real time. They also demonstrate how a machine learning pipeline can be built automatically as a by-product of the exploratory data analysis, using the computational graphs in Vaex.
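
The idea that a pipeline can fall out of exploratory work can be sketched with a toy class. This is not Vaex code: `LazyFrame` and the column names are hypothetical, and the method names `state_get` and `state_set` are chosen only to echo the Vaex DataFrame methods of the same name. The sketch records each derived column as an expression instead of computing it, so the recorded expressions can later be replayed on new data.

```python
class LazyFrame:
    """Toy DataFrame: real columns hold data, virtual columns hold only expressions."""

    def __init__(self, columns):
        self.columns = dict(columns)  # name -> list of values
        self.virtual = {}             # name -> (function, input column names)

    def add_virtual(self, name, func, inputs):
        # Record the expression only; nothing is computed yet (lazy evaluation).
        self.virtual[name] = (func, tuple(inputs))

    def evaluate(self, name):
        # Resolve a column by walking the graph of recorded expressions.
        if name in self.columns:
            return self.columns[name]
        func, inputs = self.virtual[name]
        args = [self.evaluate(i) for i in inputs]
        return [func(*row) for row in zip(*args)]

    def state_get(self):
        # The recorded expressions are, in effect, the pipeline.
        return dict(self.virtual)

    def state_set(self, state):
        self.virtual = dict(state)


# "Exploration" on made-up training data: two derived columns are defined.
train = LazyFrame({"distance_km": [2.0, 10.0], "duration_min": [10.0, 30.0]})
train.add_virtual("speed_kmh", lambda d, t: d / (t / 60.0), ["distance_km", "duration_min"])
train.add_virtual("is_fast", lambda s: s > 15.0, ["speed_kmh"])
state = train.state_get()

# Replaying the same transformations on new data needs only the saved state.
new = LazyFrame({"distance_km": [5.0], "duration_min": [15.0]})
new.state_set(state)
result = new.evaluate("is_fast")
```

Because every transformation is stored as part of the frame's state, no separate pipeline code has to be written after the analysis; the state captured during exploration is what gets applied to unseen data.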

