Speed and Sensibility: Balancing Latency and UX in Generative AI

Uploaded: 2023-10-26T09:52:08.125Z

Posted Oct 26, 2023

# Conversational AI

# Humans and AI

# Deepgram

Julia Kroll

Applied Engineer @ Deepgram

Julia Kroll is an Applied Engineer at Deepgram, where she provides engineering and product expertise on speech-to-text and language models, enabling developers to use language as the universal interface between humans and machines. She previously worked as a Senior Machine Learning Engineer creating natural-sounding AI voices, following five years at Amazon, where she contributed to machine learning and data engineering for AWS and Alexa. She holds two computer science degrees, a master's from the University of Wisconsin-Madison and a bachelor's from Carleton College. Her interests lie at the intersection of technology, linguistics, and society.


Adam Becker

IRL @ MLOps Community

I'm a tech entrepreneur and I spent the last decade founding companies that drive societal change.

I am now building Deep Matter, a startup still in stealth mode...

I was most recently building Telepath, the world's most developer-friendly machine learning platform. Throughout my previous projects, I learned that building machine-learning-powered applications is hard, and especially hard when you don't have a background in data science. I believe that this is choking innovation, especially in industries that can't support large data teams.

For example, I previously co-founded Call Time AI, where we used artificial intelligence to assemble and study the largest database of political contributions. The company powered progressive campaigns from school board to the Presidency. As of October 2020, we had helped Democrats raise tens of millions of dollars. In April 2021, we sold Call Time to Political Data Inc. Our success is due in large part to our ability to productionize machine learning.

I believe that knowledge is unbounded, and that everything that is not forbidden by laws of nature is achievable, given the right knowledge. This holds immense promise for the future of intelligence and therefore for the future of well-being. I believe that the process of mining knowledge should be done honestly and responsibly, and that wielding it should be done with care. I co-founded Telepath to give more tools to more people to access more knowledge.

I'm fascinated by the relationship between technology, science and history. I graduated from UC Berkeley with degrees in Astrophysics and Classics and have published several papers on those topics. I was previously a researcher at the Getty Villa where I wrote about Ancient Greek math and at the Weizmann Institute, where I researched supernovae.

I currently live in New York City. I enjoy advising startups, thinking about how they can make for an excellent vehicle for addressing the Israeli-Palestinian conflict, and hearing from random folks who stumble on my LinkedIn profile. Reach out, friend!


SUMMARY

Conversational AI demands low latency for seamless dialogue between humans and AI. However, engineers face a dilemma: some latency is inherently required to process human speech and craft a response. Some incremental wins that shave off milliseconds come at a cost, because the AI response could have been enriched during that additional processing time. Others simply refactor out inefficiency to get more performant results from AI devtools. This talk presents best practices for designing streaming speech-to-text applications, as well as reasons to accept extra latency for the sake of an enhanced product experience.
