MLOps Community
AIQCON SAN FRANCISCO 2024

Enterprise AI Governance: A Comprehensive Playbook

This talk presents a comprehensive overview of enterprise AI governance, highlighting its importance, key components, and practical implementation stages. As AI systems become increasingly prevalent in business operations, organizations must establish robust governance frameworks to mitigate risks, ensure compliance, and foster responsible AI innovation. I define AI governance and articulate its relationship to AI risks and to a common set of emerging regulatory requirements. I then outline a three-stage approach to enterprise AI governance: organization-level governance, intake, and ongoing governance. At each stage I give examples of actions that support effective oversight and explain how they are operationalized in practice.
Ian Eisenberg · Aug 16th, 2024
Paco Nathan · Aug 16th, 2024
Knowledge graphs have spiked recently in popular use, for example in _retrieval augmented generation_ (RAG) methods used to mitigate hallucination in LLMs. Graphs emphasize _relationships_ in data, adding _semantics_ — more so than with SQL or vector databases. However, data quality issues can degrade linking during KG construction and updating, which makes downstream use cases inaccurate and defeats the point of using a graph. When you have join keys (unique identifiers), building relationships in a graph may be straightforward, although false positives (duplicate nodes) can result from: typos or minor differences in attributes like name, address, phone, etc.; family members sharing email; duplicate customer entries, and so on. This talk describes what an _Entity Resolved Knowledge Graph_ is, why it's important, plus patterns for deploying _entity resolution_ (ER) which are proven to work. We'll cover how to make graphs more meaningful in data-centric architectures by repairing connected data: unify connected data from across multiple data sources; consolidate duplicate nodes and reveal hidden connections; create more accurate, intuitive graphs which provide greater downstream utility for AI applications.
9:07
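To make the idea concrete, here is a minimal sketch of consolidating duplicate nodes before graph construction; the record attributes, the `difflib`-based matcher, and the use of `networkx` are illustrative stand-ins for a real entity-resolution engine, not the approach presented in the talk.

```python
# Minimal sketch: consolidate duplicate records into canonical entities before
# building a graph, so relationships are not split across duplicate nodes.
from difflib import SequenceMatcher

import networkx as nx

records = [
    {"id": 1, "name": "Jane Doe", "email": "jane@example.com"},
    {"id": 2, "name": "Jane  Doe", "email": "jane@example.com"},   # typo / duplicate
    {"id": 3, "name": "John Smith", "email": "john@example.com"},
]

def same_entity(a, b, threshold=0.9):
    """Treat two records as one entity if emails match or names are near-identical."""
    if a["email"] == b["email"]:
        return True
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio() >= threshold

# Map each record to a canonical entity (first matching record wins).
canonical = {}
for r in records:
    match = next((c for c in canonical.values() if same_entity(r, c)), None)
    canonical[r["id"]] = match or r

# Build the graph over canonical nodes only.
g = nx.Graph()
for r in records:
    g.add_node(canonical[r["id"]]["id"], name=canonical[r["id"]]["name"])

print(g.nodes(data=True))  # -> two nodes instead of three
```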
Mona Rakibe · Aug 16th, 2024
Often, the data used for AI workloads is kept in raw form in data lakes with open formats. This talk will focus on designing a data quality strategy for these raw formats.
14:22
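As a rough illustration of where such a strategy might start, the sketch below runs a few lightweight checks on a raw Parquet file; the file name, columns, and checks are assumptions, not content from the talk.

```python
# Minimal sketch: lightweight quality checks on a raw file sitting in a data lake.
# Assumptions: an "events.parquet" file with "user_id" and "event_ts" columns;
# a real strategy would run such checks continuously and track results over time.
import pandas as pd

df = pd.read_parquet("events.parquet")  # hypothetical raw file

checks = {
    "non_empty": len(df) > 0,
    "no_null_keys": df["user_id"].notna().all(),
    "no_duplicate_rows": not df.duplicated().any(),
    "timestamps_parse": pd.to_datetime(df["event_ts"], errors="coerce").notna().all(),
}

failed = [name for name, ok in checks.items() if not ok]
if failed:
    raise ValueError(f"Data quality checks failed: {failed}")
```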
A practitioner's take on how you can consistently build robust, performant, and trustworthy Gen. AI products at scale. The talk will touch on different parts of the Gen. AI product development cycle, covering the must-haves, the gotchas, and insights from existing products in the market.
16:32
Linus Lee · Aug 15th, 2024
Since our first LLM product a year and a half ago, Notion's AI team has learned a lot about evaluating LLM-based systems through the full life cycle of a feature, from ideation and prototyping to production and iteration, from single-shot text completion models to agents. In this talk, rather than focus on the nitty-gritty details of specific evaluation metrics or tools, I'll share the biggest, most transferable lessons we learned about evaluating frontier AI products, and the role eval plays in Notion's AI team, on our journey to serving tens of billions of tokens weekly today.
# Evaluations
# LLMs
# AI
# Notion
18:17
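As a rough companion to the lessons above, here is a minimal sketch of an eval harness for an LLM feature; the `generate` hook, the test cases, and the substring grader are placeholders rather than Notion's actual setup.

```python
# Minimal sketch: run a small eval set through an LLM feature and report a pass rate.
# Assumptions: `generate(prompt)` wraps whatever model/feature is under test;
# the exact-substring grader stands in for task-specific or model-graded checks.
from typing import Callable

eval_set = [
    {"prompt": "Summarize: The meeting was moved to Friday.", "must_contain": "Friday"},
    {"prompt": "Translate to French: good morning", "must_contain": "bonjour"},
]

def run_evals(generate: Callable[[str], str]) -> float:
    passed = 0
    for case in eval_set:
        output = generate(case["prompt"])
        if case["must_contain"].lower() in output.lower():
            passed += 1
    return passed / len(eval_set)

# Example with a trivial stand-in "model":
score = run_evals(lambda p: "Bonjour, the meeting is on Friday.")
print(f"pass rate: {score:.0%}")
```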
Mohamed El-Geish · Aug 15th, 2024
Evaluation seeks to assess the quality, reliability, latency, cost, and generalizability of ML systems, given assumptions about operating conditions in the real world. That is easier said than done! This talk presents some of the common pitfalls that ML practitioners ought to avoid and makes the case for tying model evaluation to business objectives.
# Evaluation
# ML Systems
# Bank Of America
23:15
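One way to tie evaluation to business objectives, sketched below, is to weight errors by their business cost instead of reporting raw accuracy; the task, predictions, and dollar figures are purely illustrative.

```python
# Minimal sketch: score a classifier by business cost rather than raw accuracy.
# Assumptions: a fraud-style task where false negatives cost far more than
# false positives; the per-error costs are illustrative only.
costs = {"false_negative": 500.0, "false_positive": 5.0}

def business_cost(y_true, y_pred):
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return fn * costs["false_negative"] + fp * costs["false_positive"]

y_true  = [1, 0, 0, 1, 0]
model_a = [0, 0, 0, 1, 1]  # misses one fraud case, one false alarm
model_b = [1, 1, 1, 1, 1]  # catches all fraud, three false alarms

# Model B is less accurate, yet far cheaper in business terms (15 vs. 505).
print(business_cost(y_true, model_a), business_cost(y_true, model_b))
```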
Model quality is the only true end-to-end measure of training performance and correctness, but it is usually far too slow to be useful from a production point of view: it tells us what happened in training hours or days ago, after we have already suffered some kind of problem. To improve the reliability of training, we need short-term proxies for the kinds of problems we experience in large model training. This talk will identify some of the common failures that happen during model training and some faster, cheaper metrics that can serve as reasonable proxies for those failures.
# Training Metrics
# AI
# OpenAI
23:58
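The sketch below illustrates the kind of cheap, per-step proxy checks the abstract describes, in a PyTorch training loop; the specific thresholds and checks are assumptions, not the metrics proposed in the talk.

```python
# Minimal sketch: per-step proxies for training failures, rather than waiting
# on slow end-to-end model quality. Assumptions: a PyTorch training loop where
# `prev_loss` is the previous step's loss as a float; thresholds are placeholders.
import torch

def check_step_health(model, loss, prev_loss, spike_factor=3.0, max_grad_norm=1e3):
    alerts = []
    # Loss spikes often precede divergence long before eval metrics move.
    if prev_loss is not None and loss.item() > spike_factor * prev_loss:
        alerts.append(f"loss spike: {prev_loss:.3f} -> {loss.item():.3f}")
    # NaN/Inf loss means the run is already broken.
    if not torch.isfinite(loss):
        alerts.append("non-finite loss")
    # Exploding gradient norm is another early proxy for instability.
    # (max_norm=inf computes the total norm without actually clipping.)
    grad_norm = float(torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=float("inf")))
    if grad_norm > max_grad_norm:
        alerts.append(f"gradient norm {grad_norm:.1f} exceeds {max_grad_norm}")
    return alerts

# Inside the training loop (after loss.backward(), before optimizer.step()):
#     for alert in check_step_health(model, loss, prev_loss):
#         logger.warning(alert)
```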
Emmanuel Ameisen · Aug 15th, 2024
Learn about best practices when integrating Large Language Models (LLMs) into product development. We will discuss the strengths of modern LLMs like Claude and how they can be leveraged to enable and enhance various applications. The presentation will cover simple prompting strategies and design patterns that facilitate the effective incorporation of LLMs into products.
# LLMs
# AI Products
# Anthropic
23:56
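As a minimal illustration of the kind of prompting pattern discussed, the sketch below calls Claude through the Anthropic Python SDK; the model name, system prompt, and task are examples, not recommendations from the talk.

```python
# Minimal sketch: a simple structured-prompting pattern with the Anthropic SDK.
# Assumptions: ANTHROPIC_API_KEY is set in the environment; the model name and
# prompt structure are illustrative.
import anthropic

client = anthropic.Anthropic()

def summarize(document: str) -> str:
    # Pattern: keep role and instructions in the system prompt, keep the variable
    # content in the user message, and ask for a constrained output format.
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=300,
        system="You are a concise technical summarizer. Reply with 3 bullet points.",
        messages=[{"role": "user", "content": f"Summarize this document:\n\n{document}"}],
    )
    return response.content[0].text

print(summarize("LLMs can be integrated into products via simple prompting patterns..."))
```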
Large Language Models (LLMs) are revolutionizing how users can search for, interact with, and generate new content, leading to a huge wave of developer-led, context-augmented LLM applications. Some recent stacks and toolkits around Retrieval-Augmented Generation (RAG) have emerged, enabling developers to build applications such as chatbots using LLMs on their private data. However, while setting up basic RAG-powered QA is straightforward, solving complex question-answering over large quantities of complex data requires new data, retrieval, and LLM architectures. This talk provides an overview of these agentic systems, the opportunities they unlock, how to build them, as well as remaining challenges.
# LLMs
# RAG
# LlamaIndex
16:49
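For contrast with the agentic systems the talk covers, here is a minimal sketch of the basic RAG setup they build on, using LlamaIndex; the data directory and default OpenAI-backed models are assumptions, and import paths may differ across library versions.

```python
# Minimal sketch: basic RAG over local documents with LlamaIndex.
# Assumptions: a ./data directory of documents and an OPENAI_API_KEY for the
# default LLM and embeddings; imports follow recent llama-index releases.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # ingest private data
index = VectorStoreIndex.from_documents(documents)      # embed and index it

query_engine = index.as_query_engine()                  # retrieval + synthesis
print(query_engine.query("What does our refund policy say?"))
```

Agentic systems layer tool use, routing, and multi-step reasoning on top of this baseline, which is where the new data, retrieval, and LLM architectures mentioned above come in.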
Will Gaviria Rojas & Florent Blachot · Aug 15th, 2024
In this presentation, we'll unpack how multimodal AI enhances content understanding of visual data, which is pivotal for ensuring quality and trust in digital communities. Through our work to date, we find that striking a balance between model sophistication and transparency is key to fostering trust, alongside the ability to swiftly adapt to evolving user behaviors and moderation standards. We'll discuss the overall importance of AI quality in this context, focusing on interpretability and feedback mechanisms needed to achieve these goals. More broadly, we'll also highlight how AI quality forms the foundation for the transformative impact of multimodal AI in creating safer digital environments.
# AI Quality
# Visual Data
# CoActive
# Fandom
13:54
Chang She · Aug 15th, 2024
Higher-quality retrieval isn't just about more complex retrieval techniques. Using user feedback to improve model results is a tried and true technique from the ancient days of *checks notes* recommender systems. And if you know something about the patterns in your data and user queries, even synthetic data can produce fine-tuned models that significantly improve retrieval quality.
# RAG
# AI Quality
# LanceDB
21:00
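A minimal sketch of the feedback-to-fine-tuning loop described above, using sentence-transformers; the feedback log, base model, and training settings are illustrative, and the same pairs could equally be generated synthetically.

```python
# Minimal sketch: turn user feedback into fine-tuning pairs for a retrieval
# embedding model. Assumptions: a log of (query, accepted passage) pairs and
# MultipleNegativesRankingLoss as the training objective.
from sentence_transformers import InputExample, SentenceTransformer, losses
from torch.utils.data import DataLoader

# (query, passage the user clicked or rated positively)
feedback_log = [
    ("how do I reset my password", "To reset your password, open Settings > Security..."),
    ("refund policy for annual plans", "Annual plans can be refunded within 30 days..."),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
train_examples = [InputExample(texts=[q, p]) for q, p in feedback_log]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives

# A short fine-tune; in practice you'd hold out pairs to measure retrieval lift.
model.fit(train_objectives=[(train_loader, loss)], epochs=1, warmup_steps=0)
```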