RLM Tutorial: Progressive Disclosure Over RDF Graphs

# Shared namespace for the entire notebook - demonstrating REPL persistence
import sys
import os
from pathlib import Path

ns = {}  # This single namespace persists throughout the tutorial

def require_anthropic_api_key():
    """Fail fast if the Claude API key is not configured."""
    if not os.getenv('ANTHROPIC_API_KEY'):
        raise RuntimeError(
            "Missing ANTHROPIC_API_KEY. Set it in your environment to run llm_query()/rlm_run() cells."
        )

1. Core RLM Loop

The llm_query() function delegates a question to Claude and stores the result in the shared namespace under the given name.

from rlm.core import llm_query

require_anthropic_api_key()

# Use shared ns - result will persist
result = llm_query("What is 2+2? Answer with just the number.", ns, name='math')
print(f"Result: {result}")
print(f"Stored as: ns['math'] = {ns.get('math', 'not found')}")
Result: 4
Stored as: ns['math'] = 4

The rlm_run() function runs the full RLM loop: the model emits code, the harness executes it in a REPL, and the loop iterates until the model produces a final answer.

from rlm.core import rlm_run

require_anthropic_api_key()

# Continue using shared ns
answer, iterations, ns = rlm_run(
    "Calculate the sum of squares of 1, 2, and 3.",
    "You can use Python to calculate.",
    ns=ns,
    max_iters=3
)
print(f"Answer: {answer}")
print(f"Iterations: {len(iterations)}")
print(f"ns still has 'math': {ns.get('math', 'not found')}")
Answer: 14
Iterations: 1
ns still has 'math': 4
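The control flow above can be sketched in a few lines. This is a hedged illustration, not the actual rlm_run() implementation: the model is stubbed with a canned reply, code blocks are extracted with a regex, and the loop stops when the stub sets a FINAL variable.

```python
import re

FENCE = "`" * 3  # build the code-fence delimiter rather than writing it literally

def fake_model(prompt, history):
    # Stub standing in for the Claude call that rlm_run() makes each iteration.
    return f"Computing:\n{FENCE}python\nFINAL = sum(i * i for i in [1, 2, 3])\n{FENCE}"

def mini_rlm_run(task, ns, max_iters=3):
    """Toy RLM loop: ask the model, exec its code blocks in ns, stop on FINAL."""
    history = []
    for _ in range(max_iters):
        reply = fake_model(task, history)
        history.append(reply)
        for block in re.findall(FENCE + r"python\n(.*?)" + FENCE, reply, re.DOTALL):
            exec(block, ns)      # the REPL step: code runs in the shared namespace
        if "FINAL" in ns:        # the model signalled a final answer
            return ns["FINAL"], history
    return None, history

answer, history = mini_rlm_run("sum of squares of 1..3", {})
print(answer)  # 14
```

Because the namespace is passed in and mutated, anything a code block defines survives into later iterations, which is the same persistence the shared `ns` demonstrates throughout this tutorial.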

2. Ontology Loading

Load RDF ontologies and explore them with bounded view functions. The key insight: we never dump the full graph into context.

from rlm.ontology import setup_ontology_context

# Add PROV ontology to shared ns
setup_ontology_context('ontology/prov.ttl', ns, name='prov')
print(ns['prov_meta'].summary())
print(f"\nns now contains: {[k for k in ns.keys() if not k.startswith('_')]}")
Graph 'prov': 1,664 triples
Classes: 59
Properties: 89
Individuals: 1
Namespaces: brick, csvw, dc, dcat, dcmitype, dcterms, dcam, doap, foaf, geo, odrl, org, prof, qb, schema, sh, skos, sosa, ssn, time, vann, void, wgs, owl, rdf, rdfs, xsd, xml, prov

ns now contains: ['math', 'context', 'llm_query', 'llm_query_batched', 'FINAL_VAR', 'llm_res', 'analysis', 'sum_of_squares', 'prov', 'prov_meta', 'prov_graph_stats', 'prov_search_by_label', 'prov_describe_entity', 'prov_search_entity', 'prov_probe_relationships', 'prov_find_path', 'prov_predicate_frequency', 'graph_stats', 'search_by_label', 'describe_entity', 'search_entity', 'probe_relationships', 'find_path', 'predicate_frequency']
# Search for classes related to "Activity"
results = ns['prov_search_by_label']('Activity', limit=5)
for uri, label in results:
    print(f"{label}: {uri}")
Activity: http://www.w3.org/ns/prov#Activity
ActivityInfluence: http://www.w3.org/ns/prov#ActivityInfluence
activity: http://www.w3.org/ns/prov#activity
hadActivity: http://www.w3.org/ns/prov#hadActivity
activityOfInfluence: http://www.w3.org/ns/prov#activityOfInfluence
# Get bounded description of Activity class
desc = ns['prov_describe_entity']('http://www.w3.org/ns/prov#Activity', limit=10)
print(f"Label: {desc['label']}")
print(f"Types: {desc['types']}")
print(f"Comment: {desc['comment'][:100] if desc['comment'] else 'None'}...")
print(f"Outgoing triples (sample): {len(desc['outgoing_sample'])}")
Label: Activity
Types: ['http://www.w3.org/2002/07/owl#Class']
Comment: None...
Outgoing triples (sample): 10
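A bounded view is just a query helper that caps its own output. Here is a minimal sketch of the idea over plain (subject, predicate, object) tuples; the real prov_search_by_label operates on an rdflib Graph, so names and behavior here are illustrative only.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# A toy triple store: plain (subject, predicate, object) tuples.
triples = [
    ("http://www.w3.org/ns/prov#Activity", RDFS_LABEL, "Activity"),
    ("http://www.w3.org/ns/prov#ActivityInfluence", RDFS_LABEL, "ActivityInfluence"),
    ("http://www.w3.org/ns/prov#Agent", RDFS_LABEL, "Agent"),
]

def search_by_label(term, triples, limit=5):
    """Bounded view: case-insensitive label search, never more than `limit` hits."""
    hits = [(s, o) for s, p, o in triples
            if p == RDFS_LABEL and term.lower() in o.lower()]
    return hits[:limit]  # the cap is what keeps the LLM's context small

for uri, label in search_by_label("activity", triples, limit=2):
    print(f"{label}: {uri}")
```

The `limit` parameter is the whole trick: the model can always ask again with a narrower term, but no single call can flood its context window.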

3. RLM with Ontology Exploration

Combine the RLM loop with ontology tools for intelligent exploration. The model uses bounded views to progressively discover information.

from rlm.core import rlm_run
from rlm.ontology import setup_ontology_context

require_anthropic_api_key()

# PROV is already loaded in ns from previous section
query = "What is prov:Activity? Use search_by_label and describe_entity."
context = ns['prov_meta'].summary()

answer, iterations, ns = rlm_run(
    query,
    context,
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer: {answer[:500] if answer else 'No answer'}...")
print(f"Iterations: {len(iterations)}")
Answer: [Max iterations] Last output: Description of prov:Activity:
{'uri': 'http://www.w3.org/ns/prov#Activity', 'label': 'Activity', 'types': ['http://www.w3.org/2002/07/owl#Class'], 'comment': None, 'outgoing_sample': [('http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.w3.org/2002/07/owl#Class'), ('http://www.w3.org/2000/01/rdf-schema#isDefinedBy', 'http://www.w3.org/ns/prov-o#'), ('http://www.w3.org/2000/01/rdf-schema#label', 'Activity'), ('http://www.w3.org/2002/07/owl#disjointWith', '...
Iterations: 3
# Show what code the LLM executed
for i, it in enumerate(iterations):
    if it.code_blocks:
        print(f"Iteration {i}:")
        for cb in it.code_blocks:
            print(f"  Code: {cb.code[:100]}...")
Iteration 0:
  Code: print("Context content:")
print(context)
print("\n" + "="*50)
print(f"Context type: {type(context)}"...
Iteration 1:
  Code: # Search for "Activity" using search_by_label
activity_search = search_by_label("Activity")
print("S...
Iteration 2:
  Code: # Describe the prov:Activity entity
activity_description = describe_entity("http://www.w3.org/ns/pro...

4. Dataset Memory

Store discovered facts in an RDF Dataset with provenance tracking. Facts persist within the same namespace/session; use the snapshot_dataset() and load_snapshot() APIs to persist them across sessions.

from rlm.dataset import setup_dataset_context

# Add dataset to shared ns (alongside previously loaded ontology)
setup_dataset_context(ns)
print(ns['dataset_stats']())
print(f"\nPROV ontology still accessible: 'prov_meta' in ns = {'prov_meta' in ns}")
Dataset 'ds' (session: f7322b86)
mem: 0 triples
prov: 0 events
work graphs: 0
onto graphs: 0

PROV ontology still accessible: 'prov_meta' in ns = True
# Add a fact we discovered
ns['mem_add'](
    'http://example.org/myAnalysis',
    'http://www.w3.org/ns/prov#wasGeneratedBy',
    'http://example.org/rlmSession1'
)

# Check stats
print(ns['dataset_stats']())
Dataset 'ds' (session: f7322b86)
mem: 1 triples
prov: 7 events
work graphs: 0
onto graphs: 0
# Query the memory graph
results = ns['mem_query']("""
    SELECT ?s ?p ?o WHERE { ?s ?p ?o }
""")
for r in results:
    print(r)
{'s': 'http://example.org/myAnalysis', 'p': 'http://www.w3.org/ns/prov#wasGeneratedBy', 'o': 'http://example.org/rlmSession1'}
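snapshot_dataset() and load_snapshot() are not demonstrated above, but the underlying idea is simple: serialize the memory triples at the end of a session and reload them in the next one. A sketch of that round trip using json; the function names and format here are illustrative, not the library's.

```python
import json, os, tempfile

# The one fact we added to the memory graph above, as plain tuples.
mem = [("http://example.org/myAnalysis",
        "http://www.w3.org/ns/prov#wasGeneratedBy",
        "http://example.org/rlmSession1")]

def snapshot(triples, path):
    """Write the memory graph out so a later session can reload it."""
    with open(path, "w") as f:
        json.dump(triples, f)

def load(path):
    """Read a snapshot back; JSON round-trips tuples as lists, so restore them."""
    with open(path) as f:
        return [tuple(t) for t in json.load(f)]

path = os.path.join(tempfile.mkdtemp(), "mem.json")
snapshot(mem, path)
restored = load(path)
print(restored == mem)  # True
```

The real APIs presumably serialize the full Dataset (memory, provenance events, and work graphs) in an RDF format rather than JSON, but the session boundary works the same way.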

5. SPARQL Result Handles

Queries return handles carrying metadata instead of raw data dumps. Handles support bounded sampling (e.g., rows[:n]) and summary statistics.

Note: Results are still fetched into memory; handles provide metadata-first access patterns rather than true server-side pagination.

from rlm.sparql_handles import SPARQLResultHandle

# Simulating a large result set
handle = SPARQLResultHandle(
    rows=[{'name': f'Item{i}', 'value': i} for i in range(100)],
    result_type='select',
    query='SELECT ?name ?value WHERE { ... }',
    endpoint='local',
    columns=['name', 'value'],
    total_rows=100
)

print(handle.summary())
print(f"First 3 rows: {handle.rows[:3]}")
SELECT: 100 rows, columns=['name', 'value']
First 3 rows: [{'name': 'Item0', 'value': 0}, {'name': 'Item1', 'value': 1}, {'name': 'Item2', 'value': 2}]
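The handle pattern itself is small. A sketch of a minimal version (not the actual SPARQLResultHandle class) shows why it pairs well with the RLM loop:

```python
from dataclasses import dataclass

@dataclass
class ResultHandle:
    """Metadata-first wrapper: a summary and bounded samples, never a full dump."""
    rows: list
    columns: list
    result_type: str = "select"

    def summary(self):
        # What the LLM sees first: the shape of the result, not its contents.
        return f"{self.result_type.upper()}: {len(self.rows)} rows, columns={self.columns}"

    def sample(self, n=3):
        # Bounded peek: the caller decides how many rows enter context.
        return self.rows[:n]

h = ResultHandle(rows=[{"value": i} for i in range(100)], columns=["value"])
print(h.summary())       # SELECT: 100 rows, columns=['value']
print(len(h.sample(3)))  # 3
```

The model reads summary() first and only samples rows when it needs them, which is the same progressive-disclosure move the ontology views make.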

6. Procedural Memory

Store and retrieve methods learned from past trajectories. Retrieval uses BM25 to rank stored memories by similarity to the current task.

from rlm.procedural_memory import MemoryStore, MemoryItem, retrieve_memories
from datetime import datetime, timezone
import uuid

store = MemoryStore()

# Add a learned procedure
item = MemoryItem(
    id=str(uuid.uuid4()),
    title='Find Activity classes in PROV',
    description='How to discover Activity-related classes',
    content='1. Use search_by_label("Activity")\n2. Use describe_entity() on results',
    source_type='success',
    task_query='find activities in PROV',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['prov', 'ontology', 'exploration']
)
store.add(item)

print(f"Store has {len(store.memories)} memories")
Store has 1 memories
# Retrieve relevant memories for a new task
retrieved = retrieve_memories(store, 'how to explore PROV ontology activities', k=1)
for mem in retrieved:
    print(f"Title: {mem.title}")
    print(f"Content:\n{mem.content}")
Title: Find Activity classes in PROV
Content:
1. Use search_by_label("Activity")
2. Use describe_entity() on results
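BM25 itself fits in a few lines. A sketch of the scoring formula behind this kind of retrieval; the real retrieve_memories likely tokenizes and weights the title, description, and tag fields differently.

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc (a list of tokens) against the query tokens with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    scores = []
    for doc in docs:
        score = 0.0
        for term in query:
            tf = doc.count(term)
            df = sum(1 for d in docs if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # Lucene-style IDF
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores

docs = [
    "find activity classes in prov".split(),
    "snapshot dataset to disk".split(),
]
scores = bm25_scores("explore prov activities".split(), docs)
print(scores[0] > scores[1])  # True: the PROV memory matches the query better
```

Note that exact-token matching is a real limitation here ("activities" does not match "activity"), which is why production stores usually stem or lemmatize before indexing.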

7. SHACL Shape Indexing

Detect and index SHACL shapes for schema discovery and constraint inspection.

Note: This provides shape detection and constraint inspection (targets, properties, cardinalities), not runtime validation. Use a SHACL validator for actual data validation.

from rlm.shacl_examples import detect_shacl, build_shacl_index, search_shapes
from rdflib import Graph

# Load DCAT-AP shapes
g = Graph()
g.parse('ontology/dcat-ap/dcat-ap-SHACL.ttl')

# Detect SHACL content
detection = detect_shacl(g)
print(f"Node shapes: {detection['node_shapes']}")
print(f"Property shapes: {detection['property_shapes']}")
Node shapes: 42
Property shapes: 0
# Build index and search
index = build_shacl_index(g)
results = search_shapes(index, 'dataset', limit=3)

for r in results:
    print(f"{r['uri'].split('#')[-1]}: targets {r['targets']}")
dcat:CatalogShape: targets ['http://www.w3.org/ns/dcat#Catalog']
dcat:DatasetShape: targets ['http://www.w3.org/ns/dcat#Dataset']
dcat:DataServiceShape: targets ['http://www.w3.org/ns/dcat#DataService']
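At its core, shape detection reduces to a type scan over the graph. A sketch over plain tuples; the real detect_shacl operates on an rdflib Graph and the example shape names below are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SH = "http://www.w3.org/ns/shacl#"

# A toy shapes graph with one node shape and one property shape.
triples = [
    ("ex:DatasetShape", RDF_TYPE, SH + "NodeShape"),
    ("ex:DatasetShape", SH + "targetClass", "http://www.w3.org/ns/dcat#Dataset"),
    ("ex:titleShape", RDF_TYPE, SH + "PropertyShape"),
]

def detect_shacl(triples):
    """Count node and property shapes by scanning rdf:type assertions."""
    return {
        "node_shapes": sum(1 for s, p, o in triples
                           if p == RDF_TYPE and o == SH + "NodeShape"),
        "property_shapes": sum(1 for s, p, o in triples
                               if p == RDF_TYPE and o == SH + "PropertyShape"),
    }

print(detect_shacl(triples))  # {'node_shapes': 1, 'property_shapes': 1}
```

Indexing then just groups each shape's sh:targetClass and constraint triples under its URI so search_shapes can match them by keyword.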

8. Full Integration: Multi-Ontology Comparison

Putting it all together: load multiple ontologies, build sense documents, and use RLM to answer complex questions.

from rlm.ontology import build_sense

require_anthropic_api_key()

# Build PROV sense document in shared ns
build_sense('ontology/prov.ttl', name='prov_sense', ns=ns)
print("PROV sense document built")
print(f"Summary length: {len(ns['prov_sense'].summary)} chars")
PROV sense document built
Summary length: 6275 chars
from rlm.core import rlm_run

require_anthropic_api_key()

# Build sense for SIO in shared ns
build_sense('ontology/sio/sio-release.owl', name='sio_sense', ns=ns)

# Context as dict - model can inspect context['prov'] / context['sio'] directly
context = {
    'prov': ns['prov_sense'].summary[:2000],  # Truncate for demo
    'sio': ns['sio_sense'].summary[:2000]
}

query = "What are the key differences between PROV and SIO ontologies?"

# Pass dict context directly (not str(context)) for progressive disclosure
answer, iterations, ns = rlm_run(
    query,
    context,  # Keep dict structure for model inspection
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer:\n{answer[:800] if answer else 'No answer'}...")
print(f"\nIterations: {len(iterations)}")
Answer:
[Max iterations] Last output: # Comprehensive Comparison: PROV vs SIO Ontologies

## 1. Primary Purposes and Domains

### **PROV Ontology**
- **Purpose**: Specialized ontology for **provenance tracking** - documenting the history, lineage, and accountability of resources
- **Focus**: Temporal chains of causation, responsibility, and influence
- **Domain**: Cross-domain provenance (applicable to any field requiring audit trails)
- **Philosophical stance**: Process-centric view of how things came to be

### **SIO Ontology**
- ......

Iterations: 3
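The comment about passing the dict directly rather than str(context) matters because the model can then enumerate keys first and pull values on demand. A sketch of that access pattern (illustrative, not rlm_run internals):

```python
# Toy context dict standing in for the sense-document summaries above.
context = {
    "prov": "PROV-O models provenance: activities, agents, entities ... " * 20,
    "sio": "SIO is a broad science ontology: processes, qualities ... " * 20,
}

# Step 1: the model peeks at structure -- a few bytes, not the full payload.
print(list(context.keys()))  # ['prov', 'sio']

# Step 2: it drills into one key, still bounded by an explicit slice.
print(context["prov"][:60])

# Contrast: str(context) forces the entire payload into one opaque string.
print(len(str(context)) > len(str(list(context.keys()))))  # True
```

Keeping the dict structure is progressive disclosure at the context level: the keys are the summary, and each value is fetched only when the model asks for it.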

Summary

This tutorial demonstrated:

  1. Core RLM loop: llm_query() and rlm_run() for LLM-driven exploration
  2. Ontology loading: Bounded views prevent context overflow
  3. Progressive disclosure: Start small, explore as needed
  4. Dataset memory: Persist discovered facts with provenance (within a session/namespace)
  5. SPARQL handles: Metadata-first result handling with bounded sampling
  6. Procedural memory: Learn and reuse exploration strategies
  7. SHACL indexing: Schema discovery and constraint inspection through shape search
  8. Full integration: Multi-ontology comparison with sense documents and dict contexts

Environment requirements: Sections using llm_query() or rlm_run() require network access and ANTHROPIC_API_KEY set in your environment. Non-LLM sections work offline.