MLOps Community
+00:00 GMT
Sign in or Join the community to continue

Machine Learning SRE

Posted Sep 09, 2021 | Views 529
# Machine Learning
# Stanza.systems
Share

speakers

avatar
Niall Murphy
Co-founder & Consultant @ Stanza

Niall Murphy has been interested in Internet infrastructure since the mid-1990s. He has worked with all of the major cloud providers from their Dublin, Ireland offices - most recently at Microsoft, where he was global head of Azure Site Reliability Engineering (SRE). His books have sold approximately a quarter of a million copies world-wide, most notably the award-winning Site Reliability Engineering, and he is probably one of the few people in the world to hold degrees in Computer Science, Mathematics, and Poetry Studies. He lives in Dublin, Ireland, with his wife and two children.

+ Read More
avatar
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

+ Read More
avatar
David Aponte
Senior Research SDE, Applied Sciences Group @ Microsoft

David is one of the organizers of the MLOps Community. He is an engineer, teacher, and lifelong student. He loves to build solutions to tough problems and share his learnings with others. He works out of NYC and loves to hike and box for fun. He enjoys meeting new people so feel free to reach out to him!

+ Read More
avatar
Vishnu Rachakonda
Data Scientist @ Firsthand

Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.

+ Read More

SUMMARY

SRE is making its way into the machine learning world. Software engineering for machine learning requires reliability, performance, and maintainability. Site reliability engineering is the field that deals with reliability and ensuring constant, real-time performance. Niall Murphy, most recently Global Head of SRE at Microsoft Azure, helps us understand what SRE can do for modern ML products and teams. Building machine learning teams requires a diverse set of technical experiences, and Niall shares his thoughts on how to do that most effectively. Machine learning organizations need to start to take advantage of SRE best practices like SLOs, which Niall walks through. Production machine learning depends on high-quality software engineering, and we get Niall's take on how to ensure that in a machine learning context.

+ Read More

Watch More

Trustworthy Machine Learning
Posted Sep 20, 2022 | Views 1.4K
# Trustworthy ML
# IBM
# IBM Research
FastAPI for Machine Learning
Posted Apr 29, 2022 | Views 1.6K
# FastAPI
# ML Platform
# Building Communities
# forethought.ai
Monzo Machine Learning Case Study
Posted Dec 07, 2020 | Views 1.9K
# FinTech
# Case Study
# Interview