# RLM Tutorial: Progressive Disclosure Over RDF Graphs


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

``` python
# Shared namespace for the entire notebook - demonstrating REPL persistence
import os

ns = {}  # This single namespace persists throughout the tutorial

def require_anthropic_api_key():
    """Fail fast if the Claude API key is not configured."""
    if not os.getenv('ANTHROPIC_API_KEY'):
        raise RuntimeError(
            "Missing ANTHROPIC_API_KEY. Set it in your environment to run llm_query()/rlm_run() cells."
        )
```

## 1. Core RLM Loop

The `llm_query()` function delegates a single question to Claude and stores the
result in the shared namespace under the given `name`.

``` python
from rlm.core import llm_query

require_anthropic_api_key()

# Use shared ns - result will persist
result = llm_query("What is 2+2? Answer with just the number.", ns, name='math')
print(f"Result: {result}")
print(f"Stored as: ns['math'] = {ns.get('math', 'not found')}")
```

    Result: 4
    Stored as: ns['math'] = 4

The `rlm_run()` function runs the full RLM loop: the model emits code, the
framework executes it in the persistent REPL namespace, and the cycle repeats
until the model produces an answer or the iteration limit is reached. A
conceptual sketch of the loop follows the example below.

``` python
from rlm.core import rlm_run

require_anthropic_api_key()

# Continue using shared ns
answer, iterations, ns = rlm_run(
    "Calculate the sum of squares of 1, 2, and 3.",
    "You can use Python to calculate.",
    ns=ns,
    max_iters=3
)
print(f"Answer: {answer}")
print(f"Iterations: {len(iterations)}")
print(f"ns still has 'math': {ns.get('math', 'not found')}")
```

    Answer: 14
    Iterations: 1
    ns still has 'math': 4
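
Under the hood, the loop is conceptually simple. The sketch below is a toy
illustration of that control flow, not the actual `rlm.core` implementation:
the hypothetical `ask_model()` stub stands in for the LLM call, and each
emitted code block is executed against the shared namespace.

``` python
# Toy illustration of the rlm_run() control flow -- NOT the rlm.core implementation.
# `ask_model` is a hypothetical stub standing in for the real LLM call.
import io
import contextlib

def ask_model(transcript):
    # Stub: pretend the model first emits code, then reads the output and answers.
    if 'sum_of_squares' not in transcript:
        return {'code': 'sum_of_squares = 1**2 + 2**2 + 3**2\nprint(sum_of_squares)'}
    return {'answer': '14'}

def toy_rlm_loop(ns, max_iters=3):
    transcript = ''
    for _ in range(max_iters):
        step = ask_model(transcript)
        if 'answer' in step:              # model decided it is done
            return step['answer']
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(step['code'], ns)        # execute emitted code in the persistent namespace
        transcript += step['code'] + '\n' + buf.getvalue()
    return None

print(toy_rlm_loop({}))  # -> 14
```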

## 2. Ontology Loading

Load RDF ontologies and explore them through bounded view functions. The key
insight: we never dump the full graph into context. A sketch of how such a
bounded view might be built appears at the end of this section.

``` python
from rlm.ontology import setup_ontology_context

# Add PROV ontology to shared ns
setup_ontology_context('ontology/prov.ttl', ns, name='prov')
print(ns['prov_meta'].summary())
print(f"\nns now contains: {[k for k in ns.keys() if not k.startswith('_')]}")
```

    Graph 'prov': 1,664 triples
    Classes: 59
    Properties: 89
    Individuals: 1
    Namespaces: brick, csvw, dc, dcat, dcmitype, dcterms, dcam, doap, foaf, geo, odrl, org, prof, qb, schema, sh, skos, sosa, ssn, time, vann, void, wgs, owl, rdf, rdfs, xsd, xml, prov

    ns now contains: ['math', 'context', 'llm_query', 'llm_query_batched', 'FINAL_VAR', 'llm_res', 'analysis', 'sum_of_squares', 'prov', 'prov_meta', 'prov_graph_stats', 'prov_search_by_label', 'prov_describe_entity', 'prov_search_entity', 'prov_probe_relationships', 'prov_find_path', 'prov_predicate_frequency', 'graph_stats', 'search_by_label', 'describe_entity', 'search_entity', 'probe_relationships', 'find_path', 'predicate_frequency']

``` python
# Search for classes related to "Activity"
results = ns['prov_search_by_label']('Activity', limit=5)
for uri, label in results:
    print(f"{label}: {uri}")
```

    Activity: http://www.w3.org/ns/prov#Activity
    ActivityInfluence: http://www.w3.org/ns/prov#ActivityInfluence
    activity: http://www.w3.org/ns/prov#activity
    hadActivity: http://www.w3.org/ns/prov#hadActivity
    activityOfInfluence: http://www.w3.org/ns/prov#activityOfInfluence

``` python
# Get bounded description of Activity class
desc = ns['prov_describe_entity']('http://www.w3.org/ns/prov#Activity', limit=10)
print(f"Label: {desc['label']}")
print(f"Types: {desc['types']}")
print(f"Comment: {desc['comment'][:100] if desc['comment'] else 'None'}...")
print(f"Outgoing triples (sample): {len(desc['outgoing_sample'])}")
```

    Label: Activity
    Types: ['http://www.w3.org/2002/07/owl#Class']
    Comment: None...
    Outgoing triples (sample): 10
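
The bounded views used above are, conceptually, just small queries with hard
limits. A minimal sketch of a label search built directly on `rdflib` is shown
below; this is illustrative only and not the `rlm.ontology` implementation,
which may index labels differently.

``` python
# Minimal sketch of a bounded "search by label" view built directly on rdflib.
# Illustrative only -- not the rlm.ontology implementation.
from rdflib import Graph, RDFS

def bounded_search_by_label(graph, needle, limit=5):
    """Return at most `limit` (uri, label) pairs whose rdfs:label contains `needle`."""
    hits = []
    for s, _, label in graph.triples((None, RDFS.label, None)):
        if needle.lower() in str(label).lower():
            hits.append((str(s), str(label)))
            if len(hits) >= limit:  # hard cap keeps the result context-sized
                break
    return hits

g = Graph()
g.parse('ontology/prov.ttl')
print(bounded_search_by_label(g, 'Activity'))
```

The point is the hard `limit`: every view returns a context-sized slice, so the
model can always ask for more but never receives the whole graph at once.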

## 3. RLM with Ontology Exploration

Combine the RLM loop with ontology tools for intelligent exploration.
The model uses bounded views to progressively discover information.

``` python
from rlm.core import rlm_run
from rlm.ontology import setup_ontology_context

require_anthropic_api_key()

# PROV is already loaded in ns from previous section
query = "What is prov:Activity? Use search_by_label and describe_entity."
context = ns['prov_meta'].summary()

answer, iterations, ns = rlm_run(
    query,
    context,
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer: {answer[:500] if answer else 'No answer'}...")
print(f"Iterations: {len(iterations)}")
```

    Answer: [Max iterations] Last output: Description of prov:Activity:
    {'uri': 'http://www.w3.org/ns/prov#Activity', 'label': 'Activity', 'types': ['http://www.w3.org/2002/07/owl#Class'], 'comment': None, 'outgoing_sample': [('http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 'http://www.w3.org/2002/07/owl#Class'), ('http://www.w3.org/2000/01/rdf-schema#isDefinedBy', 'http://www.w3.org/ns/prov-o#'), ('http://www.w3.org/2000/01/rdf-schema#label', 'Activity'), ('http://www.w3.org/2002/07/owl#disjointWith', '...
    Iterations: 3

``` python
# Show what code the LLM executed
for i, it in enumerate(iterations):
    if it.code_blocks:
        print(f"Iteration {i}:")
        for cb in it.code_blocks:
            print(f"  Code: {cb.code[:100]}...")
```

    Iteration 0:
      Code: print("Context content:")
    print(context)
    print("\n" + "="*50)
    print(f"Context type: {type(context)}"...
    Iteration 1:
      Code: # Search for "Activity" using search_by_label
    activity_search = search_by_label("Activity")
    print("S...
    Iteration 2:
      Code: # Describe the prov:Activity entity
    activity_description = describe_entity("http://www.w3.org/ns/pro...

## 4. Dataset Memory

Store discovered facts in an RDF Dataset with provenance tracking. Facts
persist within the same namespace/session; use the `snapshot_dataset()` and
`load_snapshot()` APIs for persistence across sessions (a hypothetical usage
sketch closes this section).

``` python
from rlm.dataset import setup_dataset_context

# Add dataset to shared ns (alongside previously loaded ontology)
setup_dataset_context(ns)
print(ns['dataset_stats']())
print(f"\nPROV ontology still accessible: 'prov_meta' in ns = {'prov_meta' in ns}")
```

    Dataset 'ds' (session: f7322b86)
    mem: 0 triples
    prov: 0 events
    work graphs: 0
    onto graphs: 0

    PROV ontology still accessible: 'prov_meta' in ns = True

``` python
# Add a fact we discovered
ns['mem_add'](
    'http://example.org/myAnalysis',
    'http://www.w3.org/ns/prov#wasGeneratedBy',
    'http://example.org/rlmSession1'
)

# Check stats
print(ns['dataset_stats']())
```

    Dataset 'ds' (session: f7322b86)
    mem: 1 triples
    prov: 7 events
    work graphs: 0
    onto graphs: 0

``` python
# Query the memory graph
results = ns['mem_query']("""
    SELECT ?s ?p ?o WHERE { ?s ?p ?o }
""")
for r in results:
    print(r)
```

    {'s': 'http://example.org/myAnalysis', 'p': 'http://www.w3.org/ns/prov#wasGeneratedBy', 'o': 'http://example.org/rlmSession1'}
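
The persistence APIs mentioned at the top of this section are not exercised in
this notebook. The sketch below shows one plausible way they might be called;
the import location, argument order, and snapshot path are assumptions, so
check the actual signatures before relying on it.

``` python
# Hypothetical cross-session persistence using the APIs named above.
# ASSUMPTIONS: snapshot_dataset()/load_snapshot() live in rlm.dataset and take
# (ns, path) / (path, ns) respectively -- verify against the real signatures.
from rlm.dataset import setup_dataset_context, snapshot_dataset, load_snapshot

snapshot_dataset(ns, 'snapshots/session1')     # assumed: persist the current dataset to disk

fresh_ns = {}                                  # simulate a brand-new session
setup_dataset_context(fresh_ns)
load_snapshot('snapshots/session1', fresh_ns)  # assumed: restore the saved facts
print(fresh_ns['dataset_stats']())
```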

## 5. SPARQL Result Handles

Query results return handles with metadata, not raw data dumps. Handles
support bounded sampling (e.g., `rows[:n]`) and summary statistics.

**Note:** Results are still fetched into memory; handles provide
metadata-first access patterns rather than true server-side pagination.

``` python
from rlm.sparql_handles import SPARQLResultHandle

# Simulating a large result set
handle = SPARQLResultHandle(
    rows=[{'name': f'Item{i}', 'value': i} for i in range(100)],
    result_type='select',
    query='SELECT ?name ?value WHERE { ... }',
    endpoint='local',
    columns=['name', 'value'],
    total_rows=100
)

print(handle.summary())
print(f"First 3 rows: {handle.rows[:3]}")
```

    SELECT: 100 rows, columns=['name', 'value']
    First 3 rows: [{'name': 'Item0', 'value': 0}, {'name': 'Item1', 'value': 1}, {'name': 'Item2', 'value': 2}]
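
The same metadata-first habit applies when consuming a handle: check the size
first, then pull bounded chunks instead of printing everything. Plain Python
over the handle constructed above.

``` python
# Metadata-first consumption: inspect the handle, then sample in bounded chunks.
if handle.total_rows > 10:
    print(f"Large result ({handle.total_rows} rows) -- sampling instead of printing all")

chunk_size = 5
for start in range(0, min(handle.total_rows, 15), chunk_size):
    chunk = handle.rows[start:start + chunk_size]
    print(f"rows {start}-{start + len(chunk) - 1}: {[r['name'] for r in chunk]}")
```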

## 6. Procedural Memory

Store and retrieve methods learned from past trajectories. Retrieval uses BM25
similarity over the stored memory text; an illustrative scoring sketch follows
the example below.

``` python
from rlm.procedural_memory import MemoryStore, MemoryItem, retrieve_memories
from datetime import datetime, timezone
import uuid

store = MemoryStore()

# Add a learned procedure
item = MemoryItem(
    id=str(uuid.uuid4()),
    title='Find Activity classes in PROV',
    description='How to discover Activity-related classes',
    content='1. Use search_by_label("Activity")\n2. Use describe_entity() on results',
    source_type='success',
    task_query='find activities in PROV',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['prov', 'ontology', 'exploration']
)
store.add(item)

print(f"Store has {len(store.memories)} memories")
```

    Store has 1 memories

``` python
# Retrieve relevant memories for a new task
retrieved = retrieve_memories(store, 'how to explore PROV ontology activities', k=1)
for mem in retrieved:
    print(f"Title: {mem.title}")
    print(f"Content:\n{mem.content}")
```

    Title: Find Activity classes in PROV
    Content:
    1. Use search_by_label("Activity")
    2. Use describe_entity() on results
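
To make the retrieval step concrete, the sketch below scores the stored
memories against a query with the `rank_bm25` package. This illustrates the
BM25 idea only; `retrieve_memories()` may tokenize and score differently, and
`rank_bm25` is an extra dependency assumed to be installed.

``` python
# Illustrative BM25 scoring over memory text using the rank_bm25 package.
# Not necessarily how rlm.procedural_memory implements retrieval.
from rank_bm25 import BM25Okapi

docs = [f"{m.title} {m.description} {m.content}" for m in store.memories]
tokenized = [d.lower().split() for d in docs]
bm25 = BM25Okapi(tokenized)

query_tokens = 'how to explore PROV ontology activities'.lower().split()
scores = bm25.get_scores(query_tokens)
best = max(range(len(docs)), key=lambda i: scores[i])
print(f"Best match: {store.memories[best].title} (score={scores[best]:.2f})")
```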

## 7. SHACL Shape Indexing

Detect and index SHACL shapes for schema discovery and constraint
inspection.

**Note:** This provides shape detection and constraint inspection (targets,
properties, cardinalities), not runtime validation. Use a dedicated SHACL
validator for actual data validation; a pySHACL sketch closes this section.

``` python
from rlm.shacl_examples import detect_shacl, build_shacl_index, search_shapes
from rdflib import Graph

# Load DCAT-AP shapes
g = Graph()
g.parse('ontology/dcat-ap/dcat-ap-SHACL.ttl')

# Detect SHACL content
detection = detect_shacl(g)
print(f"Node shapes: {detection['node_shapes']}")
print(f"Property shapes: {detection['property_shapes']}")
```

    Node shapes: 42
    Property shapes: 0

``` python
# Build index and search
index = build_shacl_index(g)
results = search_shapes(index, 'dataset', limit=3)

for r in results:
    print(f"{r['uri'].split('#')[-1]}: targets {r['targets']}")
```

    dcat:CatalogShape: targets ['http://www.w3.org/ns/dcat#Catalog']
    dcat:DatasetShape: targets ['http://www.w3.org/ns/dcat#Dataset']
    dcat:DataServiceShape: targets ['http://www.w3.org/ns/dcat#DataService']
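
If you do need to validate instance data against these shapes (as the note
above suggests), a standalone validator such as pySHACL can reuse the same
shapes graph. The data file path below is hypothetical.

``` python
# Actual SHACL validation with pySHACL, reusing the shapes graph `g` loaded above.
from pyshacl import validate
from rdflib import Graph

data_graph = Graph()
data_graph.parse('data/my_catalog.ttl')  # hypothetical instance data to check

conforms, report_graph, report_text = validate(data_graph, shacl_graph=g)
print(f"Conforms: {conforms}")
if not conforms:
    print(report_text[:500])
```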

## 8. Full Integration: Multi-Ontology Comparison

Putting it all together: load multiple ontologies, build sense
documents, and use RLM to answer complex questions.

``` python
from rlm.ontology import build_sense

require_anthropic_api_key()

# Build PROV sense document in shared ns
build_sense('ontology/prov.ttl', name='prov_sense', ns=ns)
print("PROV sense document built")
print(f"Summary length: {len(ns['prov_sense'].summary)} chars")
```

    PROV sense document built
    Summary length: 6275 chars
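
Before handing a sense document to the model, it can be worth peeking at just
its start rather than printing the whole thing: the same progressive-disclosure
habit applied to our own artifacts.

``` python
# Peek at the beginning of the sense document instead of dumping all ~6 KB of it
print(ns['prov_sense'].summary[:300])
```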

``` python
from rlm.core import rlm_run

require_anthropic_api_key()

# Build sense for SIO in shared ns
build_sense('ontology/sio/sio-release.owl', name='sio_sense', ns=ns)

# Context as dict - model can inspect context['prov'] / context['sio'] directly
context = {
    'prov': ns['prov_sense'].summary[:2000],  # Truncate for demo
    'sio': ns['sio_sense'].summary[:2000]
}

query = "What are the key differences between PROV and SIO ontologies?"

# Pass dict context directly (not str(context)) for progressive disclosure
answer, iterations, ns = rlm_run(
    query,
    context,  # Keep dict structure for model inspection
    ns=ns,
    max_iters=3,
    verbose=False
)

print(f"Answer:\n{answer[:800] if answer else 'No answer'}...")
print(f"\nIterations: {len(iterations)}")
```

    Answer:
    [Max iterations] Last output: # Comprehensive Comparison: PROV vs SIO Ontologies

    ## 1. Primary Purposes and Domains

    ### **PROV Ontology**
    - **Purpose**: Specialized ontology for **provenance tracking** - documenting the history, lineage, and accountability of resources
    - **Focus**: Temporal chains of causation, responsibility, and influence
    - **Domain**: Cross-domain provenance (applicable to any field requiring audit trails)
    - **Philosophical stance**: Process-centric view of how things came to be

    ### **SIO Ontology**
    - ......

    Iterations: 3

## Summary

This tutorial demonstrated:

1.  **Core RLM loop**: `llm_query()` and `rlm_run()` for LLM-driven
    exploration
2.  **Ontology loading**: Bounded views prevent context overflow
3.  **Progressive disclosure**: Start small, explore as needed
4.  **Dataset memory**: Persist discovered facts with provenance (within
    a session/namespace)
5.  **SPARQL handles**: Metadata-first result handling with bounded
    sampling
6.  **Procedural memory**: Learn and reuse exploration strategies
7.  **SHACL indexing**: Schema discovery and constraint inspection
    through shape search

**Environment requirements:** Sections using `llm_query()` or
`rlm_run()` require network access and `ANTHROPIC_API_KEY` set in your
environment. Non-LLM sections work offline.
