MLOps Community
# Multimodal Data
# Data Landscape
# Aperture Data

Your Multimodal Data Is Constantly Evolving - How Bad Can It Get?

The blog post discusses the challenges and complexities of managing multimodal data (images, videos, and documents) for enterprise-scale AI as it evolves. That evolution is driven by factors such as annotations, embeddings, new classifications, and entirely new data. Traditional relational databases fall short because their rigid schemas cannot capture the dynamic relationships in multimodal data. Tracking and managing versions of the datasets used to train AI models can be tedious and costly. Scaling issues, seamlessly integrating data updates into processing pipelines, and maintaining consistent views across disjointed databases pose additional challenges. The blog concludes with how these requirements led to the design and development of ApertureDB, a database purpose-built for multimodal AI.
Vishakha Gupta · Jun 12th, 2024
Médéric Hurier · Jun 10th, 2024
The MLOps Coding Course equips data scientists with essential software development skills to meet the demands of modern AI/ML projects. This open-source course covers building, deploying, and managing AI/ML systems, focusing on code structuring, validation, automation, dependency management, and Docker environments. Support includes the MLOps Coding Assistant and mentoring sessions. The companion MLOps Python Package demonstrates best practices. Ideal for both new and experienced ML engineers, this community-driven course encourages open-source collaboration. Start mastering MLOps today.
# Coding
# Data Scientists
# Machine Learning Engineers
Sonam Gupta · Jun 5th, 2024
The blog post explains creating a PDF query assistant using Upstage AI Solar and LangChain. It involves loading and analyzing PDFs, splitting text into chunks, embedding text for vector representation, and constructing query and QA chains to handle user questions.
# PDF Query
# AI Solar
# LangChain
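The pipeline the post describes (load, split, embed, retrieve, answer) can be sketched without the Upstage AI Solar or LangChain dependencies. Everything below, including the `toy_embed` bag-of-words stand-in for a real embedding model, is a hypothetical illustration of the pattern, not the post's actual code:

```python
import math
from collections import Counter

def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks (mirrors a text splitter)."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer_query(chunks: list[str], query: str) -> str:
    """Retrieve the most similar chunk; a real QA chain would pass it to an LLM."""
    q = toy_embed(query)
    return max(chunks, key=lambda chunk: cosine(toy_embed(chunk), q))
```

In the post, the loader, splitter, embedder, and QA chain are LangChain components backed by Solar; the retrieval-then-generate structure is the same.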
Vasudev Sharma, Ayla Khan, Jess Leung & 1 more author · Jun 4th, 2024
Recursion's development of their Phenom-1 foundation model showcases the importance of MLOps practices throughout the machine learning lifecycle. To train this large model on massive image datasets, they implemented solutions for experiment tracking, resource allocation, and data management. This included using tools like PyTorch Lightning and Hydra for better control and efficiency. They also optimized data transfer speeds and storage formats to handle the workload. Furthermore, Recursion highlights the crucial role of MLOps culture, emphasizing collaboration across diverse teams and utilizing infrastructure like their BioHive-1 supercomputer effectively.
# MLOps
# Foundation Model
# Recursion
Ville Tuulos, Eddie Mattia, Vidyasagar Ananthan & 3 more authors · May 28th, 2024
This blog discusses how Metaflow and AWS Trainium enable the development and training of large machine-learning models in a cost-efficient manner. Metaflow simplifies the workflow by automating infrastructure management, while AWS Trainium provides powerful and economical hardware specifically designed for machine learning tasks. Together, they enhance productivity and reduce expenses for training large-scale models.
# LLMs
# Metaflow
# AWS Trainium
Aishwarya Prabhat · May 23rd, 2024
# RAG
# DREAM
Rajdeep Borgohain
Aishwarya Goel
Rajdeep Borgohain & Aishwarya Goel · May 21st, 2024
In this blog post, we continue our benchmarking series by evaluating five large language models (LLMs) ranging from 10B to 34B parameters across six inference libraries. Key performance metrics such as Time to First Token (TTFT), tokens per second, and total inference time were assessed using an A100 GPU on Azure. This comprehensive analysis aims to assist developers, researchers, and AI enthusiasts in selecting the most suitable LLM for their needs.
# LLMs
# Speed Benchmarks
# Inferless
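The metrics this benchmark reports can be computed from per-token arrival timestamps. This is a minimal sketch; the function name and the convention of excluding the first token from the decode rate are assumptions, not the benchmark's actual code:

```python
def throughput_metrics(token_times: list[float], request_start: float) -> dict:
    """Compute TTFT, total inference time, and tokens/second.

    token_times: timestamps (seconds) at which each output token arrived.
    request_start: timestamp at which the request was sent.
    """
    ttft = token_times[0] - request_start      # Time to First Token
    total = token_times[-1] - request_start    # total inference time
    # Decode rate is commonly measured after the first token arrives.
    decode_time = token_times[-1] - token_times[0]
    tokens_per_second = (len(token_times) - 1) / decode_time if decode_time else 0.0
    return {"ttft_s": ttft, "total_s": total, "tokens_per_s": tokens_per_second}
```

Comparing these three numbers across libraries and model sizes is exactly the trade-off space the benchmark explores: a library can win on TTFT yet lose on sustained tokens/second.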
In the race to infuse intelligence into every product and application, the speed at which machine learning (ML) teams can innovate is not just a metric of efficiency. It’s what sets industry leaders apart, empowering them to constantly improve and deliver models that provide timely, accurate predictions that adapt to evolving data and user needs. Moving quickly in ML isn’t easy, though.
# Feature Platforms
# ML Data Management
# Model Development
# Tecton Feature Platform
Practical strategies to protect language model apps (or at least doing your best). I started my career in the cybersecurity space, dancing the endless dance of deploying defense mechanisms only to be hijacked by a more brilliant attacker a few months later. Hacking language models and language-powered applications is no different.
# Jailbreaks
# LLMs
# MLops
# Prompt Injections
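One of the simplest defenses in this space is screening user input against known injection phrasings before it reaches the model. The deny-list below is made up for illustration; real defenses layer many signals (classifiers, output filters, privilege separation) because pattern lists are easy to evade:

```python
import re

# Hypothetical deny-list of common injection phrasings.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"reveal .*system prompt",
    r"disregard .*rules",
]

def looks_like_injection(user_input: str) -> bool:
    """Flag input that matches any known injection phrasing."""
    text = user_input.lower()
    return any(re.search(pattern, text) for pattern in INJECTION_PATTERNS)
```

As the author notes, this is the endless dance: a filter like this raises the bar slightly, and a more brilliant attacker rephrases around it a few months later.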
Daniel Liden · Mar 21st, 2024
Large Language Models (LLMs) are trained on vast corpora of text, giving them impressive language comprehension and generation capabilities. However, this training does not inherently provide them with the ability to directly answer questions or follow instructions. To achieve this, we need to fine-tune these models for the specific task of instruction following.
# Large Language Models
# Machine learning
# OLMo 1B
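A core step in instruction fine-tuning is rendering raw (instruction, response) pairs into a fixed prompt template for supervised training. The template below is a hypothetical illustration; each instruction-tuned model family defines its own format:

```python
# Hypothetical prompt template; real models each define their own chat/instruct format.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}"

def format_example(instruction: str, response: str) -> str:
    """Render one instruction-following example as a training string."""
    return TEMPLATE.format(instruction=instruction, response=response)

def build_dataset(pairs: list[tuple[str, str]]) -> list[str]:
    """Turn raw (instruction, response) pairs into training strings."""
    return [format_example(instruction, response) for instruction, response in pairs]
```

Fine-tuning a base model like OLMo 1B on a corpus formatted this way is what teaches it to treat the instruction section as a question to answer rather than text to continue.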
MLOps projects are straightforward to initiate but challenging to perfect. While AI/ML projects often start with a notebook for prototyping, deploying them directly in production is considered poor practice by the MLOps community. Transitioning to a dedicated Python code base is essential for industrializing the project, yet this move presents several challenges: 1) How can we maintain a code base that is robust yet flexible for agile development? 2) Is it feasible to implement proven design patterns while keeping the code base accessible to all developers? 3) How can we leverage Python's dynamic nature while adopting strong typing practices akin to static languages? Throughout my career, I have thoroughly explored various strategies to make my code base both simple and powerful.
# MLops
# Pydantic
# Python
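The strong-typing practice the post advocates, often done with Pydantic, can be sketched with the standard library's dataclasses to keep the example dependency-free. The config fields and checks are made up for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    """Typed, validated job configuration (a Pydantic-style pattern,
    shown with stdlib dataclasses to stay dependency-free)."""
    learning_rate: float
    batch_size: int
    epochs: int = 10

    def __post_init__(self) -> None:
        # Fail fast on bad values instead of deep inside a training loop.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.batch_size <= 0:
            raise ValueError("batch_size must be positive")
```

Pydantic adds coercion and schema generation on top of this pattern, but the design choice is the same: validate at the boundary so the rest of the code base can trust its inputs.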
Popular
MLOps - Design Thinking to Build ML Infra for ML and LLM Use Cases
Amritha Arun Babu, Abhik Choudhury & Demetrios Brinkmann