Return of the Nuggets – AI Engineering
Charles F. Vardeman II
Center for Research Computing, University of Notre Dame
2023-11-17
This is all Peter’s fault!
Peter: How do we prompt better?
Prompting in the context of AI Engineering
AI Engineering as a Practice?
The Rise of the AI Engineer – by swyx – Latent Space
AI Engineering Summit
AI Engineering Summit – YouTube
LLMOps Engineering
A Survey of Techniques for Maximizing LLM Performance
LLM Engineering – Knowledge Engineering, RAG Engineering, Fine-Tuning Engineering.
LLMOps – Cognitive Agents
Wang, Lei, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, et al. 2023. “A Survey on Large Language Model Based Autonomous Agents.” arXiv. http://arxiv.org/abs/2308.11432.
A Caution: Outward-Facing Data Fabric vs. Inward-Facing…
We don’t want to be engineering data silos!
With Agents, the “World Wide Web” is a Data Fabric!
We want to expose some information as Distributed, Decentralized Knowledge Graphs!
Data Fabrics are going to be used as Data Engines!
Twitter: Andrej Karpathy
So, it’s creepy-looking AI turtles all the way down…
How to Prompt?
How to Prompt Engineer…
Prompting Guide
Prompt Engineering Guide
OpenAI Cookbook
OpenAI Examples
Important: Prompt Structure Performance Changes with Model!
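A minimal sketch of the point above: the same question can be packaged as a chat-message list (for chat-tuned models) or as a flat completion prompt (for base models), and which structure performs best varies by model. The function names and wording here are illustrative assumptions, not from any specific library.

```python
# Sketch: the same task packaged two ways, since prompt structure that
# works well for one model family may underperform on another.

def as_chat_messages(question: str, context: str) -> list[dict]:
    """Chat-style structure, e.g. for chat-tuned models."""
    return [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

def as_completion_prompt(question: str, context: str) -> str:
    """Flat completion-style prompt, e.g. for base models."""
    return (
        "Answer using only the provided context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Evaluating both framings per model, rather than assuming one, is the practical takeaway.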
Prompt Testing?
Evaluating LLMs is a minefield
Challenges in evaluating AI systems
Anthropic Challenges in evaluating AI systems
Challenges with prompt structure in evals
Retrieval Augmented Generation
Unit Testing of LLMs
A Survey of Techniques for Maximizing LLM Performance
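One way to make “unit testing of LLMs” concrete: wrap the model call behind a function and test it with a deterministic stub, so assertions run without network access. The `fake_model` stand-in below is an assumption for illustration; in practice it would be replaced by a real completion endpoint.

```python
# Sketch: unit-testing an LLM-backed function with a stubbed model,
# so assertions are deterministic and require no API calls.

def fake_model(prompt: str) -> str:
    # Deterministic stand-in for a real LLM completion call.
    if "capital of France" in prompt:
        return "Paris"
    return "I don't know."

def answer(question: str, model=fake_model) -> str:
    return model(f"Question: {question}\nAnswer:").strip()

def test_known_fact():
    assert answer("What is the capital of France?") == "Paris"

def test_refuses_unknown():
    assert "don't know" in answer("What is the airspeed of a swallow?")
```

The same test shape works against a live model, with looser (e.g. substring or judge-based) assertions.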
Prompt Engineering is about adding context!
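A toy sketch of “adding context”: retrieve the most relevant snippets (here scored by naive word overlap, a deliberate simplification of a real retriever) and splice them into the prompt. The corpus and scoring are illustrative assumptions only.

```python
# Sketch: RAG-style prompt assembly -- retrieve top-k snippets by naive
# word overlap, then inject them as context for the model.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str, corpus: list[str]) -> str:
    context = "\n".join(f"- {c}" for c in retrieve(question, corpus))
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )
```

A production retriever would use embeddings or a knowledge graph, but the prompt-assembly step looks much the same.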
KGs for Context
Sequeda, Juan, Dean Allemang, and Bryon Jacob. 2023. “A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model’s Accuracy for Question Answering on Enterprise SQL Databases.” arXiv. http://arxiv.org/abs/2311.07509.
Information Extraction for RAG (Tool Use)
Xu, Silei, Shicheng Liu, Theo Culhane, Elizaveta Pertseva, Meng-Hsi Wu, Sina J. Semnani, and Monica S. Lam. 2023. “Fine-Tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata.” arXiv. http://arxiv.org/abs/2305.14202.
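In the spirit of Xu et al.’s few-shot semantic parsing over Wikidata, a prompt can pair natural-language questions with target queries so the model learns the mapping in-context. This is a hedged sketch of the prompt shape only, not the paper’s method; the two example pairs use real Wikidata identifiers (Q25188 Inception, P57 director; Q7186 Marie Curie, P569 date of birth) but are toy illustrations.

```python
# Sketch: few-shot prompt for seq2seq semantic parsing
# (question -> SPARQL-style query over Wikidata).

FEW_SHOT = [
    ("Who directed Inception?",
     "SELECT ?x WHERE { wd:Q25188 wdt:P57 ?x }"),
    ("When was Marie Curie born?",
     "SELECT ?x WHERE { wd:Q7186 wdt:P569 ?x }"),
]

def parsing_prompt(question: str) -> str:
    shots = "\n\n".join(f"Q: {q}\nQuery: {s}" for q, s in FEW_SHOT)
    return f"{shots}\n\nQ: {question}\nQuery:"
```

The model’s completion is then executed against the knowledge base, grounding the answer in retrievable facts rather than parametric memory.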
Prompting Patterns for RAG – Planning and Action
Prasad, Archiki, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, and Tushar Khot. 2023. “ADaPT: As-Needed Decomposition and Planning with Language Models.” arXiv. https://doi.org/10.48550/arXiv.2311.05772.
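The ADaPT idea above can be sketched as a recursive loop: attempt the task with an executor, and only on failure ask a planner to decompose it and recurse on the sub-tasks. The executor and planner here are deterministic stubs (assumptions for illustration); in the paper both are LLM calls.

```python
# Sketch of as-needed decomposition (ADaPT-style): try, and only
# decompose when the executor fails.

def execute(task: str) -> bool:
    # Stub executor: "atomic" tasks succeed, composite ones fail.
    return task.startswith("atomic:")

def decompose(task: str) -> list[str]:
    # Stub planner: split a composite task into atomic sub-tasks.
    return [f"atomic:{part.strip()}" for part in task.split(" and ")]

def adapt(task: str, depth: int = 0, max_depth: int = 3) -> bool:
    if execute(task):
        return True
    if depth >= max_depth:
        return False
    return all(adapt(sub, depth + 1, max_depth) for sub in decompose(task))
```

The key design choice is decomposing *only when needed*, which keeps easy tasks cheap and reserves planning for the hard ones.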
Training and Fine-Tuning
Textbooks are all you need II: phi-1.5
Li, Yuanzhi, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, and Yin Tat Lee. 2023. “Textbooks Are All You Need II: Phi-1.5 Technical Report.” arXiv. http://arxiv.org/abs/2309.05463.
Textbooks are all you need III: phi-2
Sebastien Bubeck X
phi-2 metrics
Sebastien Bubeck X
Microsoft Ignite
AI+Mixed Reality for the Front Line
Maybe we need more than Textbooks?
A “Curriculum” for Logic?
Feng, Jiazhan, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, and Weizhu Chen. 2023. “Language Models Can Be Logical Solvers.” arXiv. http://arxiv.org/abs/2311.06158.
Fine-Tuning for Truthfulness
Tian, Katherine, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, and Chelsea Finn. 2023. “Fine-Tuning Language Models for Factuality.” arXiv. http://arxiv.org/abs/2311.08401.
DoD Need for Smaller, Private Models
LLMs-at-DoD Chatting with your Data