Return of the Nuggets – AI Engineering
Charles F. Vardeman II
Center for Research Computing, University of Notre Dame
2023-11-17
This is all Peter’s fault!
Peter: How do we prompt better?
Prompting in the context of AI Engineering
AI Engineering as a Practice?
The Rise of the AI Engineer – by swyx – Latent Space
AI Engineering Summit
AI Engineering Summit – YouTube
LLMOps Engineering
A Survey of Techniques for Maximizing LLM Performance
LLM Engineering – Knowledge Engineering, RAG Engineering, Fine-Tuning Engineering.
LLMOps – Cognitive Agents
Wang, Lei, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, et al. 2023. “A Survey on Large Language Model Based Autonomous Agents.” arXiv. http://arxiv.org/abs/2308.11432.
A Caution: Outward-Facing Data Fabric vs. Inward-Facing…
We don’t want to be engineering data silos!
With Agents, the “World Wide Web” is a Data Fabric!
We want to expose some information as Distributed, Decentralized Knowledge Graphs!
Data Fabrics are going to be used as Data Engines!
Twitter: Andrej Karpathy
So, it’s creepy-looking AI turtles all the way down…
How to Prompt?
How to Prompt Engineer…
Prompting Guide
Prompt Engineering Guide
OpenAI Cookbook
OpenAI Examples
Important: Prompt Structure Performance Changes with Model!
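A minimal sketch of the point above: the same question can be packaged as a chat-message list (for chat-tuned models) or as a flat completion prompt (for base models), and which structure performs best varies by model. The function names and wording here are illustrative assumptions, not from any specific library.

```python
# Sketch: the same task packaged two ways, since prompt structure that
# works well for one model family may underperform on another.

def as_chat_messages(question: str, context: str) -> list[dict]:
    """Chat-style structure, e.g. for chat-tuned models."""
    return [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

def as_completion_prompt(question: str, context: str) -> str:
    """Flat completion-style prompt, e.g. for base models."""
    return (
        "Answer using only the provided context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Evaluating both framings per model, rather than assuming one, is the practical takeaway.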
Prompt Testing?
Evaluating LLMs is a minefield
Challenges in evaluating AI systems
Anthropic Challenges in evaluating AI systems
Challenges with prompt structure in evals
Retrieval Augmented Generation
Unit Testing of LLMs
A Survey of Techniques for Maximizing LLM Performance
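One way to make “unit testing of LLMs” concrete: wrap the model call behind a function and test it with a deterministic stub, so assertions run without network access. The `fake_model` stand-in below is an assumption for illustration; in practice it would be replaced by a real completion endpoint.

```python
# Sketch: unit-testing an LLM-backed function with a stubbed model,
# so assertions are deterministic and require no API calls.

def fake_model(prompt: str) -> str:
    # Deterministic stand-in for a real LLM completion call.
    if "capital of France" in prompt:
        return "Paris"
    return "I don't know."

def answer(question: str, model=fake_model) -> str:
    return model(f"Question: {question}\nAnswer:").strip()

def test_known_fact():
    assert answer("What is the capital of France?") == "Paris"

def test_refuses_unknown():
    assert "don't know" in answer("What is the airspeed of a swallow?")
```

The same test shape works against a live model, with looser (e.g. substring or judge-based) assertions.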
Prompt Engineering is about adding context!
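A toy sketch of “adding context”: retrieve the most relevant snippets (here scored by naive word overlap, a deliberate simplification of a real retriever) and splice them into the prompt. The corpus and scoring are illustrative assumptions only.

```python
# Sketch: RAG-style prompt assembly -- retrieve top-k snippets by naive
# word overlap, then inject them as context for the model.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question: str, corpus: list[str]) -> str:
    context = "\n".join(f"- {c}" for c in retrieve(question, corpus))
    return (
        "Use only the context below to answer.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )
```

A production retriever would use embeddings or a knowledge graph, but the prompt-assembly step looks much the same.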
KGs for Context
Sequeda, Juan, Dean Allemang, and Bryon Jacob. 2023. “A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model’s Accuracy for Question Answering on Enterprise SQL Databases.” arXiv. http://arxiv.org/abs/2311.07509.
Information Extraction for RAG (Tool Use)
Xu, Silei, Shicheng Liu, Theo Culhane, Elizaveta Pertseva, Meng-Hsi Wu, Sina J. Semnani, and Monica S. Lam. 2023. “Fine-Tuned LLMs Know More, Hallucinate Less with Few-Shot Sequence-to-Sequence Semantic Parsing over Wikidata.” arXiv. http://arxiv.org/abs/2305.14202.
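In the spirit of Xu et al.’s few-shot semantic parsing over Wikidata, a prompt can pair natural-language questions with target queries so the model learns the mapping in-context. This is a hedged sketch of the prompt shape only, not the paper’s method; the two example pairs use real Wikidata identifiers (Q25188 Inception, P57 director; Q7186 Marie Curie, P569 date of birth) but are toy illustrations.

```python
# Sketch: few-shot prompt for seq2seq semantic parsing
# (question -> SPARQL-style query over Wikidata).

FEW_SHOT = [
    ("Who directed Inception?",
     "SELECT ?x WHERE { wd:Q25188 wdt:P57 ?x }"),
    ("When was Marie Curie born?",
     "SELECT ?x WHERE { wd:Q7186 wdt:P569 ?x }"),
]

def parsing_prompt(question: str) -> str:
    shots = "\n\n".join(f"Q: {q}\nQuery: {s}" for q, s in FEW_SHOT)
    return f"{shots}\n\nQ: {question}\nQuery:"
```

The model’s completion is then executed against the knowledge base, grounding the answer in retrievable facts rather than parametric memory.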
Prompting Patterns for RAG – Planning and Action
Prasad, Archiki, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, and Tushar Khot. 2023. “ADaPT: As-Needed Decomposition and Planning with Language Models.” arXiv. https://doi.org/10.48550/arXiv.2311.05772.
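The ADaPT idea above can be sketched as a recursive loop: attempt the task with an executor, and only on failure ask a planner to decompose it and recurse on the sub-tasks. The executor and planner here are deterministic stubs (assumptions for illustration); in the paper both are LLM calls.

```python
# Sketch of as-needed decomposition (ADaPT-style): try, and only
# decompose when the executor fails.

def execute(task: str) -> bool:
    # Stub executor: "atomic" tasks succeed, composite ones fail.
    return task.startswith("atomic:")

def decompose(task: str) -> list[str]:
    # Stub planner: split a composite task into atomic sub-tasks.
    return [f"atomic:{part.strip()}" for part in task.split(" and ")]

def adapt(task: str, depth: int = 0, max_depth: int = 3) -> bool:
    if execute(task):
        return True
    if depth >= max_depth:
        return False
    return all(adapt(sub, depth + 1, max_depth) for sub in decompose(task))
```

The key design choice is decomposing *only when needed*, which keeps easy tasks cheap and reserves planning for the hard ones.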
Training and Fine-Tuning
Textbooks are all you need II: phi-1.5
Li, Yuanzhi, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar, and Yin Tat Lee. 2023. “Textbooks Are All You Need II: Phi-1.5 Technical Report.” arXiv. http://arxiv.org/abs/2309.05463.
Textbooks are all you need III: phi-2
Sebastien Bubeck X
phi-2 metrics
Sebastien Bubeck X
Microsoft Ignite
AI+Mixed Reality for the Front Line
Maybe we need more than Textbooks?
A “Curriculum” for Logic?
Feng, Jiazhan, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, and Weizhu Chen. 2023. “Language Models Can Be Logical Solvers.” arXiv. http://arxiv.org/abs/2311.06158.
Fine-Tuning for Truthfulness
Tian, Katherine, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, and Chelsea Finn. 2023. “Fine-Tuning Language Models for Factuality.” arXiv. http://arxiv.org/abs/2311.08401.
DoD Need for Smaller, Private Models
LLMs-at-DoD Chatting with your Data