AI Success Factors: Engineering Trust in Deployments – Towards Trusted LLM based Curator Agents

GitHub Repo

Trusted AI - Towards a Curator (TAITaC)

DoD Data Vision

Norquist, David L. n.d. “DOD Data Strategy.” https://media.defense.gov/2020/Oct/08/2002514180/-1/-1/0/DOD-DATA-STRATEGY.PDF.

Ontology Design Patterns as a Semantic Bridge

Ontology Engineering: A View from the Trenches - WOP 2015 Keynote | PPT (slideshare.net)

AI Agents for Interoperability

Tangi, Luca, Marco Combetto, Jaume Martin Bosch, and Paula Rodriguez Müller. 2023. “Artificial Intelligence for Interoperability in the European Public Sector.” JRC Publications Repository. October 4, 2023. https://doi.org/10.2760/633646.

Problem – Can we use LLM Based Cognitive Agents to accelerate and create “Active Metadata”?

Problem – How can LLM Based Cognitive Agents use Data Centric AI to be more FACTUAL through Retrieval Augmented Generation (RAG) and Tool Use?

Problem – Data Centric AI is Hard but necessary for Trusted AI – Can we use LLM Based Cognitive Agents to lower the barrier to Data Centric AI?

Problem – How can we Trust, Validate, and integrate Human in the loop for LLM Based Agents used for Data Curation?

Motivation: TAMMS KG

Starting Architecture…

AI Curator “Agents”: Team “LEMON”

Framework for architecture design of LLM Based Agents

Wang, Lei, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, et al. 2023. “A Survey on Large Language Model Based Autonomous Agents.” arXiv. http://arxiv.org/abs/2308.11432.

Cognitive Architectures for Language Agents

LLM Powered Agents

Example

Activity Specific Agents: Visual Agents

Visual Agents Architecture: Different LLMs based on Role

Activity Specific Agents: Visual Agents Transition Graph

Different LLMs for Different Tasks

Lessons from the Kaggle Science Exam competition winning team’s solution
Becomes a search and context-retrieval problem for the curator LLM

Local LLMs vs API based LLMs

LM Compatibility Tracking

LLMs fine tuned to be Agents

Zeng, Aohan, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, and Jie Tang. 2023. “AgentTuning: Enabling Generalized Agent Abilities for LLMs.” arXiv. http://arxiv.org/abs/2310.12823.

GPTQ model files for Knowledge Engineering Group (KEG)’s AgentLM 70B.

Tool Use (Calling Python Functions)

Structured Responses and LLMs

AWS Agents for Amazon Bedrock

Fully Managed Agents – Amazon Bedrock – AWS

Curation State Graphs?

Modeling The World!

Ontology Engineering: A View from the Trenches - WOP 2015 Keynote | PPT (slideshare.net)

Moo Architecture

We need to think through what “Trusted” means!

Frameworks – Data Engine

Copying Tesla’s Data Engine

Frameworks to Capture Provenance of Models!

SBoMs and AI BoMs for Agents
- They are KGs Themselves!
Data Cards and Model Cards for Models
Agents will be exposed as Microservices themselves
We should be able to ask the Microservice Layer for “Trust Information”
Agents should store “Metadata” in the Graph Fragment they are constructing.
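
One way an agent might store trust metadata in the graph fragment it builds is to attach W3C PROV-O statements to each node. The following is a minimal sketch, not the project's implementation: the function name `annotate_fragment`, the `ex:` namespace, the `ex:usesModel` property, and all identifiers in the usage note are hypothetical. Only the `prov:` terms come from the PROV-O vocabulary.

```python
from datetime import datetime, timezone

# JSON-LD context: "prov" is the W3C PROV-O namespace; "ex" is a
# hypothetical namespace used only for this illustration.
PROV_CONTEXT = {
    "prov": "http://www.w3.org/ns/prov#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "https://example.org/curation/",
}


def annotate_fragment(nodes, agent_id, model_name, source_uri, generated_at=None):
    """Return a JSON-LD document in which every node records which agent
    produced it, from which source, and when."""
    timestamp = (generated_at or datetime.now(timezone.utc)).isoformat()

    # The agent itself becomes a node, so "trust information" about it
    # can be queried from the same graph.
    agent_node = {
        "@id": agent_id,
        "@type": "prov:SoftwareAgent",
        "ex:usesModel": model_name,  # hypothetical property
    }

    annotated = [
        {
            **node,
            "prov:wasAttributedTo": {"@id": agent_id},
            "prov:wasDerivedFrom": {"@id": source_uri},
            "prov:generatedAtTime": {"@value": timestamp, "@type": "xsd:dateTime"},
        }
        for node in nodes
    ]
    return {"@context": PROV_CONTEXT, "@graph": [agent_node, *annotated]}
```

Calling `annotate_fragment([{"@id": "ex:record/1"}], "ex:agent/column-annotator", "some-model", "ex:source/maintenance.csv")` yields a fragment whose record node points back to the agent and the source file, so a microservice layer could answer trust questions with an ordinary graph query.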

Starting with a CSV (Navy Maintenance Data)

Korini, Keti, and Christian Bizer. 2023. “Column Type Annotation Using ChatGPT.” arXiv. http://arxiv.org/abs/2306.00745.

Context Matters!

Converting Legacy Enterprise Data into Knowledge Graphs with AI and JSON LD | Eliud Polanco

JSON-LD as a Bridge

Converting Legacy Enterprise Data into Knowledge Graphs with AI and JSON LD | Eliud Polanco

Aside: Curator AIs should be multimodal

Dr. Vardeman’s Law: Data “Lives” in different locations and formats – not every digital object can or should be in the KG layer. The Curator AI should “Catalog” this information.
Multimodal LLMs like AVIS can bridge that gap!

Semantic AI-based Micro Services

How can we create “Semantic Microservices”?

Tim Berners-Lee, James Hendler, and Ora Lassila. “The Semantic Web.” Scientific American 284, no. 5 (2001): 34–43. https://lassila.org/publications/2001/SciAm.html

Semantic Web “Layer Cake”

John Sowa, “Semantics.” n.d. Accessed October 17, 2023. https://www.jfsowa.com/ikl/. Q92665

Aside: Sowa’s law of standards

“Whenever a major organization develops a new system as an official standard for X, the primary result is the widespread adoption of some simpler system as a de facto standard for X.”

Jano’s Layer Cake

Ontology Engineering: A View from the Trenches - WOP 2015 Keynote | PPT (slideshare.net)

Distributed Knowledge Graph Layer Cake

DKG Example

OriginTrail powering consumer interaction with the new GS1 Digital Link

“Web 2.0 Architecture – Microservices”

RESTful web API design

Documenting REST-APIs

About Swagger Specification | Documentation | Swagger

Example: HuggingFace Embedding Service

A blazing fast inference solution for text embeddings models

Example: HuggingFace Embedding Service

Text Generation Inference API

OpenAI “Plugins”

Microsoft and “OpenAI Plugins”

Create and run a ChatGPT plugin with Semantic Kernel | Microsoft Learn

Bridging REST to AI using JSON-LD

JSON-LD 1.1 – A JSON-based Serialization for Linked Data

JSON-LD Best Practices

JSON as JSON-LD

GET /ordinary-json-document.json HTTP/1.1
Host: example.com
Accept: application/ld+json,application/json,*/*;q=0.1

====================================

HTTP/1.1 200 OK
...
Content-Type: application/json
Link: <https://json-ld.org/contexts/person.jsonld>; rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"

{
  "name": "Markus Lanthaler",
  "homepage": "http://www.markus-lanthaler.com/",
  "image": "http://twitter.com/account/profile_image/markuslanthaler"
}
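
The exchange above can be handled on the client side roughly as follows. This is a minimal sketch, assuming the response body, `Content-Type`, and `Link` header are already available as strings. The function names `context_from_link_header` and `as_json_ld` are hypothetical. Injecting `@context` into the document is a simplification of what a conforming JSON-LD processor does when it applies the linked context.

```python
import json
import re

# Link relation defined by JSON-LD for pointing plain JSON at a context.
JSONLD_CONTEXT_REL = "http://www.w3.org/ns/json-ld#context"


def context_from_link_header(link_header):
    """Return the context URL advertised in an HTTP Link header, or None."""
    # A Link header may carry several comma-separated link-values,
    # each starting with "<url>".
    for link_value in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", link_value, re.DOTALL)
        if not match:
            continue
        url, params = match.groups()
        rel = re.search(r'rel\s*=\s*"([^"]*)"', params)
        if rel and JSONLD_CONTEXT_REL in rel.group(1).split():
            return url
    return None


def as_json_ld(body, content_type, link_header=None):
    """Parse a JSON body and, for plain JSON, attach the linked context."""
    doc = json.loads(body)
    media_type = content_type.split(";")[0].strip().lower()
    # The linked context only applies to ordinary JSON; a response already
    # served as application/ld+json is left untouched.
    if media_type != "application/ld+json" and link_header and isinstance(doc, dict):
        context_url = context_from_link_header(link_header)
        if context_url:
            # A context already present in the document takes precedence.
            return {"@context": context_url, **doc}
    return doc
```

With the headers shown above, `as_json_ld(body, "application/json", link)` returns the same three properties plus an `@context` pointing at `https://json-ld.org/contexts/person.jsonld`, which a JSON-LD library can then expand into linked data.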

Gorilla: Retrieval Aware Training for APIs

Problem with REST – Interoperability, Scale and Queryability

“Semantic APIs for KGs”

SPARQL 1.1 Federated Queries

How do we provide “Context” to LLMs to QUERY a KG?

SPARQL 1.1 Service Description to provide Context!

Dataset and API Discovery in Linked Data

Example in the Wild

Swiss Linked Data: https://geo.ld.admin.ch/.well-known/void

UniProt: https://sparql.uniprot.org/.well-known/void
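
Both examples follow the VoID convention of publishing a dataset description at `/.well-known/void` on the host root. A minimal sketch of how an agent might derive that discovery URL from a SPARQL endpoint URL follows. The function name `well_known_void_url` is hypothetical, and the endpoint path in the usage note is used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_void_url(endpoint_url):
    """Derive the /.well-known/void discovery URL for an endpoint's host."""
    parts = urlsplit(endpoint_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {endpoint_url!r}")
    # The well-known location sits at the root of the host, regardless of
    # the path, query, or fragment of the endpoint itself.
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/void", "", ""))
```

For instance, `well_known_void_url("https://sparql.uniprot.org/sparql")` gives the UniProt URL listed above. The description retrieved from that URL could then be passed to an LLM as context before it writes a query.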

ChatGPT “Plugin” Architecture as Example

Example Service – Retrieval Augmented Generation (We’re not doing this yet!)

Experiments in extracting tables from the Navy 3-M manual for OPNAV 4790/2K data structure resources
Sample KG construction using OPNAV forms 4790/2K as a schema template
Repository for formatting the Joint Electronics Type Designation System for ML and KG usage
Likely needs to be stored in a vector store

SPARQL Interfaces

KG Interpretation in Contexts

FAIR Vocabularies and Ontologies