Racing the Playhead: Real-time Model Inference in a Video Streaming Environment
Brannon Dorsey is an early employee at Runway, where he leads the Backend team. His team keeps infrastructure and high-performance models running at scale and helps to enable a quick iteration cycle between the research and product teams.
Before joining Runway, Brannon worked on the Security Team at Linode. Brannon is also a practicing artist who uses software to explore ideas of digital literacy, agency, and complex systems.
At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building Lego houses with his daughter.
Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.
Runway ML is doing incredibly cool work applying machine learning to video editing. Brannon is a software engineer there, and he’s here to tell us all about machine learning in video and how Runway maintains its machine learning infrastructure.
Quotes
“We care about professional video editors and we’re also interested in the space of quick form video sharing that social media has created.”
“We’re targeting users who can experience something new to them and also do things that they weren’t able to do before. Something that would have taken five hours maybe now takes five minutes.”
“We’re building a tool that allows video editors of all types to use our features so we’re really excited to see who’s using it.”
“Green screen was sort of our breakout product.”
“The whole team just poured our resources into making the best rotoscoping experience.”
“We were engaging with folks who knew nothing about ML and honestly didn’t even care. ML or AI sounds cool, but they just wanted to edit their videos faster.”
“We thought we could do rotoscoping better than any tool out there. That was our pitch. We were working on some models internally and we were pleased with the performance.”
“We knew that rotoscoping is this manual process, and our bet was that if we could build something that performed well, worked quickly, and also produced a nice-looking result, people might use it.”
“So far we haven’t looked back. We’ve continued to build video editing tools. We’ve built an entire linear video editor on the web.”
“We’ve been solving challenges around one goal of ours internally which is, we want you to be able to edit your videos at the park.”
“We do a lot of model inference on the fly and I think that is one of the biggest engineering challenges of building our tools.”
“Behind the scenes is a lot of computation and it’s a challenging thing to do with a lot of concurrent users.”
“It’s hard to test because it’s qualitative. A lot of times we define our metrics or goals at the user experience level.”
“The golden metric is response time to users and user experience when using the app. That’s where we set our targets.”
“We’ve made a lot of awesome improvements. It’s still difficult and every time we bring a new model into the equation, we have to tweak our system a bit.”
“We now live in a world where Google docs exist on the web in such a natural way that we almost forget that there was another way of doing it.”
“A big part of what makes Runway ML work so well in a video editing flow I think is the fact that we quickly upload media to cloud storage and then it can be accessible from any device.”
“If we offload a lot of the hard parts from a processing perspective, then we also just make this tool available to way more people than it would have been previously.”
“We want to appeal to the lowest common denominator of edge hardware, meaning, I guess in this case, ‘users’.”
“Believe it or not, it’s actually way easier to run three or four models developed in-house at scale than it was to run 150 models from 150 different authors.”
“From the perspective of managing an engineering team, we can only put so much effort into that. What we’re trying to optimize for is end-user experience, so there’s some sub-optimal back-end system that we just have to be comfortable with, because we want to optimize for user experience. It’s a balance, a push and a pull.”
“A voice that I totally respect oftentimes says, ‘Kubernetes is way too much, way too complex. It divorces itself from the Unix style of doing something small and doing it well, and it’s like a beast. It’s unwieldy.’”
“Kubernetes is a beast but so is the territory and we’ve found it to be really helpful.”
“Make your workloads themselves, the pods, basically the atomic unit of compute, be stateless, but then have stateful things that they can grab work from and do a little bit of it at a time. If they get interrupted, maybe they’ve reached a checkpoint so that work isn’t wasted.”
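The pattern in this quote can be sketched as a minimal work-queue loop. This is an illustrative reconstruction, not Runway’s actual system: the job names, the `process_frame` stand-in for model inference, and the in-memory `checkpoints` dict are all assumptions; in practice the queue and checkpoints would live in stateful external services (e.g. a message broker and a database) that survive pod restarts.

```python
import queue

def process_frame(frame):
    # Stand-in for model inference on a single video frame.
    return frame * 2

def worker(work_queue, checkpoints, results):
    """Stateless worker: pulls jobs from a shared queue and records a
    checkpoint after each frame, so a replacement worker can resume
    mid-job instead of redoing finished work."""
    while True:
        try:
            job_id, frames = work_queue.get_nowait()
        except queue.Empty:
            return  # no more work to grab
        done = checkpoints.get(job_id, 0)  # skip frames already completed
        out = results.setdefault(job_id, [])
        for i in range(done, len(frames)):
            out.append(process_frame(frames[i]))
            checkpoints[job_id] = i + 1  # persist progress per frame

# Usage: two jobs on a shared queue; state lives in checkpoints/results,
# not in the worker, so any worker instance can pick up where another left off.
q = queue.Queue()
q.put(("job-a", [1, 2, 3]))
q.put(("job-b", [4, 5]))
checkpoints, results = {}, {}
worker(q, checkpoints, results)
```

The design choice being described is that interruption (a preempted or rescheduled pod) costs at most one unit of work past the last checkpoint, which is what makes cheap, stateless compute safe to use for long-running inference jobs.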
“I think that good MLOps in a lot of ways starts with good DevOps.”
“If we can create pipelines in our code authoring and engineering practices that deploy our changes to master, in this case to production, as quickly as possible, then we squash the feedback loop of releasing iterative changes.”
“Master always represents production.”
“We have an ethos at this company where we are constantly trying to understand how people are using stuff and if we’re wrong about an assumption we just work to change it in the next iteration.”
“If you have a very very small error rate noise floor, you should still investigate why that’s happening because it could also cause an outage in the future.”
“Don’t come to the product because it’s AI or ML. Come to the product because it’s useful in your life. I wish more companies would do that.”
“I’d like to be remembered as somebody who cared about what they did and cared about the people they did it with.”