# procedural_memory


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

## Overview

This module implements **Stage 2.5: Procedural Memory Loop** inspired by
the ReasoningBank paper. The goal is to enable an RLM agent to improve
over time by accumulating procedural knowledge (strategies, templates,
debugging moves) without replacing evidence-based retrieval.

### Closed-Loop Cycle

    ┌──────────┐    ┌──────────┐    ┌───────────┐
    │ RETRIEVE │───▶│ INTERACT │───▶│ EXTRACT   │
    │ (BM25)   │    │ (rlm_run)│    │ (Judge +  │
    └────▲─────┘    └──────────┘    │ Extractor)│
         │                          └─────┬─────┘
         │                                │
         │          ┌──────────┐          │
         └──────────│  STORE   │◀─────────┘
                    │ (JSON)   │
                    └──────────┘

### Design Principles

1.  **Procedural, not episodic**: Memories are strategies/checklists,
    not retellings
2.  **Bounded injection**: Only title + description + 3 key bullets in
    prompts
3.  **Evidence-sensitive judgment**: Success requires grounding in
    retrieved evidence
4.  **Keyword retrieval**: BM25 over title/description/tags
    (deterministic, offline)
5.  **Append-only storage**: Simple JSON file for experimentation

### Reference

- [ReasoningBank Paper](https://arxiv.org/html/2509.25140v1)

## Imports

## Memory Schema

A `MemoryItem` represents a reusable procedural insight extracted from
an RLM trajectory.

**Constraints**:

- Items must be small enough to inject into prompts
- `content` should be procedural (steps/checklist), not a retelling
- Up to 3 items extracted per trajectory

------------------------------------------------------------------------

### MemoryItem

``` python

def MemoryItem(
    id:str, title:str, description:str, content:str, source_type:str, task_query:str, created_at:str,
    access_count:int=0, tags:Optional=None, session_id:Optional=None
)->None:

```

*A reusable procedural memory extracted from an RLM trajectory.*

Attributes:

- `id`: Unique identifier (UUID)
- `title`: Concise identifier (≤10 words)
- `description`: One-sentence summary
- `content`: Procedural steps/checklist/template (Markdown)
- `source_type`: ‘success’ or ‘failure’
- `task_query`: Original task that produced this memory
- `created_at`: ISO timestamp
- `access_count`: Number of times retrieved (for future consolidation)
- `tags`: Keywords for BM25 retrieval
- `session_id`: Optional session ID from DatasetMeta (links to dataset
  session)
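
The serialization round trip can be pictured with a minimal sketch. It
assumes `MemoryItem` behaves like a plain dataclass; `MemoryItemSketch`
is a hypothetical stand-in that copies the field names from the
signature above, and the real class may differ.

``` python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MemoryItemSketch:
    """Hypothetical stand-in for MemoryItem (field names from the signature above)."""
    id: str
    title: str
    description: str
    content: str
    source_type: str           # 'success' or 'failure'
    task_query: str
    created_at: str            # ISO timestamp
    access_count: int = 0
    tags: Optional[list] = None
    session_id: Optional[str] = None

    def to_dict(self) -> dict:
        # asdict() copies nested lists, so the result is safe to mutate or dump as JSON
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "MemoryItemSketch":
        # Assumes the dict keys match the field names exactly
        return cls(**data)
```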

``` python
# Test MemoryItem creation and serialization
test_item = MemoryItem(
    id='test-uuid',
    title='SPARQL Query Pattern',
    description='Template for searching entities by label.',
    content='- Use `rdfs:label` for human-readable names\n- Add FILTER for case-insensitive search',
    source_type='success',
    task_query='Find entities named "Activity"',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['sparql', 'search', 'rdfs']
)

# Test roundtrip
data = test_item.to_dict()
restored = MemoryItem.from_dict(data)
assert restored.title == test_item.title
assert restored.tags == test_item.tags
print("✓ MemoryItem serialization works")
```

    ✓ MemoryItem serialization works


## Memory Store

Persistent storage for procedural memories using a simple JSON file
format.

------------------------------------------------------------------------

### MemoryStore

``` python

def MemoryStore(
    memories:list=<factory>, path:Optional=None
)->None:

```

*Persistent storage for procedural memories.*

Attributes:

- `memories`: List of MemoryItem objects
- `path`: Path to JSON file
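
As a rough illustration of the JSON-file approach, the sketch below
stores plain dicts rather than `MemoryItem` objects. `JsonStoreSketch`
and its `corpus()` method are hypothetical names, and the tokenization
(lowercased whitespace split over title, description, and tags) is an
assumption about what `get_corpus_for_bm25()` might produce.

``` python
import json
from pathlib import Path


class JsonStoreSketch:
    """Hypothetical JSON-backed store; items are plain dicts in this sketch."""

    def __init__(self, path=None):
        self.memories = []
        self.path = Path(path) if path else None

    def add(self, item: dict) -> None:
        # Items are only ever appended, never edited in place
        self.memories.append(item)

    def save(self) -> None:
        if self.path is None:
            raise ValueError("no path configured")
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.memories, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path) -> "JsonStoreSketch":
        store = cls(path)
        store.memories = json.loads(Path(path).read_text(encoding="utf-8"))
        return store

    def corpus(self) -> list:
        # One token list per memory, built from title + description + tags
        return [
            " ".join([m["title"], m["description"], *(m.get("tags") or [])]).lower().split()
            for m in self.memories
        ]
```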

``` python
# Test MemoryStore save/load roundtrip
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    test_path = Path(tmpdir) / 'test_memories.json'
    
    # Create store and add items
    store = MemoryStore(path=test_path)
    item1 = MemoryItem(
        id=str(uuid.uuid4()),
        title='Test Memory 1',
        description='First test memory',
        content='- Step 1\n- Step 2',
        source_type='success',
        task_query='test task 1',
        created_at=datetime.now(timezone.utc).isoformat(),
        tags=['test', 'example']
    )
    item2 = MemoryItem(
        id=str(uuid.uuid4()),
        title='Test Memory 2',
        description='Second test memory',
        content='- Action A\n- Action B',
        source_type='failure',
        task_query='test task 2',
        created_at=datetime.now(timezone.utc).isoformat(),
        tags=['test']
    )
    
    store.add(item1)
    store.add(item2)
    store.save()
    
    # Load and verify
    loaded = MemoryStore.load(test_path)
    assert len(loaded.memories) == 2
    assert loaded.memories[0].title == 'Test Memory 1'
    assert loaded.memories[1].source_type == 'failure'
    assert loaded.memories[0].tags == ['test', 'example']
    
    # Test corpus generation
    corpus = loaded.get_corpus_for_bm25()
    assert len(corpus) == 2
    assert 'test' in corpus[0]  # From title and tags
    
    print("✓ MemoryStore save/load/corpus works")
```

    ✓ MemoryStore save/load/corpus works

## Trajectory Artifact

Extract a bounded representation of an RLM run for the judge and
extractor.

**Purpose**: Summarize iterations into key steps (~10 max) with actions
and outcomes.

------------------------------------------------------------------------

### extract_trajectory_artifact

``` python

def extract_trajectory_artifact(
    task:str, answer:str, iterations:list, ns:dict
)->dict:

```

*Create bounded trajectory artifact for judge/extractor.*

Summarizes each iteration’s code blocks into 1-2 line “action +
outcome”, limiting to ~10 most informative key steps.

Args:

- `task`: Original task query
- `answer`: Final answer from rlm_run
- `iterations`: List of RLMIteration objects
- `ns`: Final namespace dict

Returns: Dictionary with keys:

- `task`: str
- `final_answer`: str
- `iteration_count`: int
- `converged`: bool (whether final_answer was set)
- `key_steps`: List of {iteration, action, outcome}
- `variables_created`: List of variable names in ns
- `errors_encountered`: List of error messages from stderr
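
The bounding step might look roughly like the sketch below. It
represents each iteration as a list of plain dicts with `code`,
`stdout`, and `stderr` keys instead of the real `RLMIteration` and
`CodeBlock` objects. It also keeps the first `max_steps` steps, which is
a simplification: how the module ranks the "most informative" steps is
not described here.

``` python
def summarize_steps_sketch(iterations, max_steps=10, width=80):
    """Hypothetical helper: one short action/outcome pair per code block."""
    steps, errors = [], []
    for number, blocks in enumerate(iterations, start=1):
        for block in blocks:
            code_lines = block["code"].strip().splitlines()
            out_lines = (block.get("stdout") or "").strip().splitlines()
            steps.append({
                "iteration": number,
                # First line of the code and of stdout, truncated to `width` chars
                "action": code_lines[0][:width] if code_lines else "",
                "outcome": out_lines[0][:width] if out_lines else "",
            })
            if block.get("stderr"):
                errors.append(block["stderr"])
    # Simplification: truncate instead of ranking by informativeness
    return steps[:max_steps], errors
```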

``` python
# Test with mock iterations
from rlm._rlmpaper_compat import CodeBlock, REPLResult

mock_block1 = CodeBlock(
    code="search('Activity')",
    result=REPLResult(stdout="Found 3 entities", stderr=None, locals={})
)
mock_block2 = CodeBlock(
    code="describe_entity('prov:Activity')",
    result=REPLResult(stdout="prov:Activity is a class", stderr=None, locals={})
)
mock_iteration = RLMIteration(
    prompt="test prompt",
    response="test response",
    code_blocks=[mock_block1, mock_block2],
    final_answer=None,
    iteration_time=0.5
)

artifact = extract_trajectory_artifact(
    task="What is prov:Activity?",
    answer="prov:Activity is a class",
    iterations=[mock_iteration],
    ns={'result': 'prov:Activity is a class'}
)

assert artifact['task'] == "What is prov:Activity?"
assert artifact['iteration_count'] == 1
assert artifact['converged'] == True
assert len(artifact['key_steps']) == 2
assert 'search' in artifact['key_steps'][0]['action'].lower()
assert len(artifact['variables_created']) == 1
print("✓ Trajectory artifact extraction works")
```

    ✓ Trajectory artifact extraction works

## Judge

Classify trajectory as success or failure with evidence-sensitivity.

**Success criteria**:

1.  Answer directly addresses the task
2.  Answer is grounded in retrieved evidence (not hallucinated)
3.  Reasoning shows systematic exploration

**Failure indicators**:

1.  No answer produced (didn’t converge)
2.  Answer doesn’t address the task
3.  Answer makes claims without supporting evidence

------------------------------------------------------------------------

### judge_trajectory

``` python

def judge_trajectory(
    artifact:dict, ns:dict=None
)->dict:

```

*Judge trajectory success using llm_query.*

Evidence-sensitive: success requires grounding in retrieved evidence.

Args:

- `artifact`: Trajectory artifact from extract_trajectory_artifact()
- `ns`: Optional namespace for additional context

Returns: Dictionary with keys:

- `is_success`: bool
- `reason`: str
- `confidence`: str (‘high’, ‘medium’, ‘low’)
- `missing`: `list[str]` (what evidence was lacking if failure)
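
One way to turn a free-form LLM reply into the dictionary shape above is
sketched below. It assumes the judge prompt asks for a JSON object,
which this page does not confirm; `parse_judgment_sketch` is a
hypothetical helper, and it falls back to a low-confidence failure when
the reply cannot be parsed.

``` python
import json
import re


def parse_judgment_sketch(raw: str) -> dict:
    """Hypothetical parser: normalize an LLM reply into the judgment dict shape."""
    fallback = {"is_success": False, "reason": "unparseable judge response",
                "confidence": "low", "missing": []}
    # Grab the outermost {...} span so surrounding chatter is ignored
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if not match:
        return fallback
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    confidence = str(data.get("confidence", "low")).lower()
    if confidence not in {"high", "medium", "low"}:
        confidence = "low"
    missing = data.get("missing") or []
    if not isinstance(missing, list):
        missing = [missing]
    return {
        # Only a literal JSON true counts as success
        "is_success": data.get("is_success") is True,
        "reason": str(data.get("reason", "")),
        "confidence": confidence,
        "missing": [str(m) for m in missing],
    }
```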

``` python
# Test judge with real LLM (requires API key)
test_artifact = {
    'task': 'What is prov:Activity?',
    'final_answer': 'prov:Activity is a class representing activities in PROV ontology',
    'iteration_count': 2,
    'converged': True,
    'key_steps': [
        {'iteration': 1, 'action': "search('Activity')", 'outcome': 'Found 3 entities'},
        {'iteration': 2, 'action': "describe_entity('prov:Activity')", 'outcome': 'A class in PROV'}
    ],
    'variables_created': ['result'],
    'errors_encountered': []
}

judgment = judge_trajectory(test_artifact)
print(f"Success: {judgment['is_success']}")
print(f"Reason: {judgment['reason']}")
print(f"Confidence: {judgment['confidence']}")
```

## Extractor

Extract 1-3 reusable memory items from a trajectory.

**For successes**: Emphasize why the approach worked

**For failures**: Emphasize what to avoid and recovery strategies

**Output format**: Procedural (steps/checklist/template), NOT a
retelling

------------------------------------------------------------------------

### extract_memories

``` python

def extract_memories(
    artifact:dict, judgment:dict, ns:dict=None
)->list:

```

*Extract up to 3 reusable memory items from trajectory.*

Args:

- `artifact`: Trajectory artifact from extract_trajectory_artifact()
- `judgment`: Judgment dict from judge_trajectory()
- `ns`: Optional namespace for additional context

Returns: List of MemoryItem objects (0-3 items)

``` python
# Test extractor with real LLM
test_artifact = {
    'task': 'Find properties of prov:Activity',
    'final_answer': 'prov:Activity has properties: prov:startedAtTime, prov:endedAtTime',
    'iteration_count': 3,
    'converged': True,
    'key_steps': [
        {'iteration': 1, 'action': "search('Activity')", 'outcome': 'Found prov:Activity'},
        {'iteration': 2, 'action': "describe_entity('prov:Activity')", 'outcome': 'A class'},
        {'iteration': 3, 'action': "get_properties('prov:Activity')", 'outcome': 'Listed properties'}
    ],
    'variables_created': ['activity_props'],
    'errors_encountered': []
}

test_judgment = {
    'is_success': True,
    'reason': 'Answer grounded in ontology data',
    'confidence': 'high',
    'missing': []
}

memories = extract_memories(test_artifact, test_judgment)
print(f"Extracted {len(memories)} memories:")
for m in memories:
    print(f"  - {m.title}")
    print(f"    Tags: {m.tags}")
```

## BM25 Retrieval

Find relevant memories for new tasks using keyword-based BM25 retrieval.

**Searches over**: title + description + tags

------------------------------------------------------------------------

### retrieve_memories

``` python

def retrieve_memories(
    store:MemoryStore, task:str, k:int=3
)->list:

```

*Retrieve top-k relevant memories using BM25.*

Tokenizes task and searches over title + description + tags.

Args:

- `store`: MemoryStore instance
- `task`: Task query string
- `k`: Number of memories to retrieve

Returns: List of top-k MemoryItem objects (may be fewer if scores ≤ 0)
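
To make the scoring concrete, here is a small pure-Python sketch of
Okapi BM25 over tokenized documents. It does not reproduce whichever
BM25 implementation the module actually uses: the function names, the
regex tokenizer, the parameters `k1=1.5` and `b=0.75`, and the
always-positive IDF variant are all assumptions made for illustration.

``` python
import math
import re
from collections import Counter


def tokenize_sketch(text):
    # Lowercase alphanumeric tokens; no stemming, so 'entities' != 'entity'
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores_sketch(query_tokens, corpus, k1=1.5, b=0.75):
    """Score every tokenized document in `corpus` against the query."""
    n_docs = len(corpus)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))  # count documents, not occurrences
    scores = []
    for doc in corpus:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


def top_k_sketch(query, docs, k=3):
    """Return indices of the k best-scoring docs, dropping scores <= 0."""
    scores = bm25_scores_sketch(tokenize_sketch(query), [tokenize_sketch(d) for d in docs])
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [i for i in ranked[:k] if scores[i] > 0]
```

Dropping non-positive scores is what lets a call return fewer than `k`
results when a query shares no tokens with some of the stored memories.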

``` python
# Test BM25 retrieval
test_store = MemoryStore()

# Add diverse memories
test_store.add(MemoryItem(
    id=str(uuid.uuid4()),
    title='SPARQL query pattern for entity search',
    description='Use rdfs:label with FILTER for case-insensitive search.',
    content='- Step 1\n- Step 2',
    source_type='success',
    task_query='Find entities by name',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['sparql', 'search', 'entity']
))

test_store.add(MemoryItem(
    id=str(uuid.uuid4()),
    title='Property exploration strategy',
    description='Systematically explore properties using describe then probe.',
    content='- Action A\n- Action B',
    source_type='success',
    task_query='What properties does X have?',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['properties', 'exploration']
))

test_store.add(MemoryItem(
    id=str(uuid.uuid4()),
    title='Debugging failed SPARQL queries',
    description='Check syntax, namespaces, and endpoint first.',
    content='- Check 1\n- Check 2',
    source_type='failure',
    task_query='Query failed with error',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['sparql', 'debugging', 'error']
))

# Test retrieval for different queries
results1 = retrieve_memories(test_store, 'How do I search for entities?', k=2)
assert len(results1) <= 2
assert any('search' in r.title.lower() or 'search' in r.tags for r in results1)
print(f"✓ Retrieved {len(results1)} memories for 'search for entities'")

results2 = retrieve_memories(test_store, 'My SPARQL query is broken', k=2)
assert len(results2) <= 2
assert any('sparql' in r.tags for r in results2)
print(f"✓ Retrieved {len(results2)} memories for 'SPARQL query broken'")

results3 = retrieve_memories(test_store, 'What properties does prov:Activity have?', k=2)
print(f"✓ Retrieved {len(results3)} memories for 'properties question'")

# Test access count increment
assert results1[0].access_count > 0
print("✓ Access count tracking works")
```

    ✓ Retrieved 2 memories for 'search for entities'
    ✓ Retrieved 2 memories for 'SPARQL query broken'
    ✓ Retrieved 2 memories for 'properties question'
    ✓ Access count tracking works


## Injection Formatting

Format retrieved memories for bounded prompt injection.

**Output includes**:

- Assessment instruction
- Title + description + up to 3 key bullets from content

**Never injects full content** to maintain bounded prompt size.

------------------------------------------------------------------------

### format_memories_for_injection

``` python

def format_memories_for_injection(
    memories:list, max_bullets:int=3
)->str:

```

*Format memories for bounded prompt injection.*

Returns string with:

- Assessment instruction
- Title + description + key bullets from content (up to max_bullets)

Args:

- `memories`: List of MemoryItem objects to format
- `max_bullets`: Maximum bullets to extract from content

Returns: Formatted string for prompt injection
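
The bullet-limiting idea can be sketched as follows. Both helpers are
hypothetical, and treating lines that start with `-`, `*`, or `N.` as
bullets is an assumption about how key points are picked out of
`content`; the real function may select or render them differently.

``` python
import re

# A bullet is a line starting with '-', '*', or a number followed by '.'
BULLET_RE = re.compile(r"^\s*(?:[-*]|\d+\.)\s+(.*\S)\s*$")


def key_bullets_sketch(content, max_bullets=3):
    """Return the text of the first `max_bullets` bullet lines in `content`."""
    bullets = []
    for line in content.splitlines():
        if len(bullets) >= max_bullets:
            break
        match = BULLET_RE.match(line)
        if match:
            bullets.append(match.group(1))
    return bullets


def format_memory_sketch(index, title, description, content, max_bullets=3):
    """Render one memory as heading + description + bounded key points."""
    lines = [f"### {index}. {title}", description]
    bullets = key_bullets_sketch(content, max_bullets)
    if bullets:
        lines.append("Key points:")
        lines.extend(f"- {b}" for b in bullets)
    return "\n".join(lines)
```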

``` python
# Test injection formatting
test_memories = [
    MemoryItem(
        id='test-1',
        title='SPARQL Search Pattern',
        description='Template for searching entities by label.',
        content="""- Use rdfs:label for human-readable names
- Add FILTER for case-insensitive matching
- Include LIMIT to avoid timeout
- Check for alternative label properties""",
        source_type='success',
        task_query='test',
        created_at=datetime.now(timezone.utc).isoformat(),
        tags=['sparql']
    ),
    MemoryItem(
        id='test-2',
        title='Property Discovery',
        description='Systematic approach to finding properties.',
        content="""1. Start with describe_entity() for overview
2. Use get_properties() for full list
3. Check both domain and range
4. Look for inverse properties""",
        source_type='success',
        task_query='test',
        created_at=datetime.now(timezone.utc).isoformat(),
        tags=['properties']
    )
]

formatted = format_memories_for_injection(test_memories, max_bullets=3)

# Verify format
assert '## Relevant Prior Experience' in formatted
assert 'assess which of these strategies' in formatted
assert '### 1. SPARQL Search Pattern' in formatted
assert '### 2. Property Discovery' in formatted
assert 'Use rdfs:label' in formatted
assert 'Start with describe_entity' in formatted

# Verify bullet limiting (should have max 3 bullets per memory)
lines = formatted.split('\n')
bullet_count_mem1 = sum(1 for l in lines[lines.index('### 1. SPARQL Search Pattern'):lines.index('### 2. Property Discovery')] if l.strip().startswith('-'))
assert bullet_count_mem1 <= 3

print("✓ Injection formatting works")
print("\nFormatted output:")
print(formatted[:300] + "...")
```

    ✓ Injection formatting works

    Formatted output:
    ## Relevant Prior Experience

    Before taking action, briefly assess which of these strategies apply to your current task and which do not.

    ### 1. SPARQL Search Pattern
    Template for searching entities by label.
    Key points:
    - Use rdfs:label for human-readable names
    - Add FILTER for case-insensitive ma...


## Integration

Complete closed-loop: RETRIEVE → INJECT → INTERACT → EXTRACT → STORE

------------------------------------------------------------------------

### rlm_run_with_memory

``` python

def rlm_run_with_memory(
    query:str, context:str, memory_store:MemoryStore, ns:dict=None, enable_memory_extraction:bool=True,
    persist_dataset:bool=False, # NEW: Dataset persistence
    dataset_path:Path=None, kwargs:VAR_KEYWORD
)->tuple:

```

*RLM run with procedural memory loop.*

Closed-loop cycle:

1.  RETRIEVE: Get relevant memories via BM25
2.  INJECT: Add to context/prompt
3.  INTERACT: Run rlm_run()
4.  EXTRACT: Judge + extract new memories
5.  STORE: Persist new memories

NEW: Dataset persistence:

- If persist_dataset=True and dataset_path provided, loads snapshot
  before run
- After run, if dataset was modified, saves snapshot
- Stores snapshot path in extracted MemoryItem for lineage

Args:

- `query`: Task query string
- `context`: Context string (e.g., ontology summary)
- `memory_store`: MemoryStore instance for retrieval/storage
- `ns`: Optional namespace dict
- `enable_memory_extraction`: Whether to extract and store new memories
  (default True)
- `persist_dataset`: Whether to persist dataset snapshots (default
  False)
- `dataset_path`: Optional path for dataset snapshot
- `**kwargs`: Additional arguments for rlm_run()

Returns: Tuple of (answer, iterations, ns, new_memories)
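
The control flow of the five steps can be sketched with the
collaborators passed in as plain callables. Everything here is
illustrative: `run_with_memory_sketch` and its parameters are
hypothetical, the store is a plain list, dataset persistence is left
out, and prepending the formatted memories to the context is an
assumption about where injection happens.

``` python
def run_with_memory_sketch(query, context, store, *, retrieve, format_memories,
                           run, judge, extract, enable_extraction=True):
    """Hypothetical orchestration of RETRIEVE -> INJECT -> INTERACT -> EXTRACT -> STORE."""
    retrieved = retrieve(store, query)                       # 1. RETRIEVE
    if retrieved:
        # 2. INJECT (assumed: formatted memories are prepended to the context)
        context = format_memories(retrieved) + "\n\n" + context
    answer, iterations = run(query, context)                 # 3. INTERACT
    new_memories = []
    if enable_extraction:
        judgment = judge(query, answer, iterations)          # 4. EXTRACT
        new_memories = extract(query, answer, judgment)
        store.extend(new_memories)                           # 5. STORE
    return answer, iterations, new_memories
```

Passing the steps in as callables keeps the sketch runnable without an
LLM; in the module itself these roles are played by
`retrieve_memories`, `format_memories_for_injection`, `rlm_run`,
`judge_trajectory`, and `extract_memories`.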

``` python
# Integration test (requires full RLM setup)
from rlm.ontology import setup_ontology_context
import tempfile

def test_memory_improves_convergence():
    """Second attempt should benefit from first attempt's memory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        store = MemoryStore(path=Path(tmpdir) / 'test_integration.json')
        
        # First run - no memories
        ns = {}
        setup_ontology_context('ontology/prov.ttl', ns, name='prov')
        
        answer1, iters1, ns1, mems1 = rlm_run_with_memory(
            "What is prov:Activity and what properties does it have?",
            ns['prov_meta'].summary(),
            store,
            ns=ns
        )
        print(f"\nFirst run: {len(iters1)} iterations, {len(mems1)} memories extracted")
        for mem in mems1:
            print(f"  - {mem.title}")
        
        # Second run - similar task, should retrieve memories
        ns2 = {}
        setup_ontology_context('ontology/prov.ttl', ns2, name='prov')
        
        answer2, iters2, ns2, mems2 = rlm_run_with_memory(
            "What is prov:Entity and what properties does it have?",
            ns2['prov_meta'].summary(),
            store,
            ns=ns2
        )
        print(f"\nSecond run: {len(iters2)} iterations")
        print(f"Total memories in store: {len(store.memories)}")
        
        # Verify memories were retrieved
        retrieved_for_second = retrieve_memories(
            store,
            "What is prov:Entity and what properties does it have?",
            k=3
        )
        print(f"Memories that would be retrieved for second run: {len(retrieved_for_second)}")
        for mem in retrieved_for_second:
            print(f"  - {mem.title} (accessed {mem.access_count} times)")

# Run test
# test_memory_improves_convergence()
```

## Usage Examples

End-to-end examples with PROV ontology.

``` python
# Full example: Build up procedural memory over multiple queries
from rlm.ontology import setup_ontology_context
from pathlib import Path

# Initialize memory store
store = MemoryStore(path=Path('memories/prov_memories.json'))

# If store exists, load it
if store.path.exists():
    store = MemoryStore.load(store.path)
    print(f"Loaded {len(store.memories)} existing memories")

# Setup ontology context
ns = {}
setup_ontology_context('ontology/prov.ttl', ns, name='prov')

# Series of queries
queries = [
    "What is prov:Activity?",
    "What properties does prov:Activity have?",
    "How are prov:Activity and prov:Entity related?",
]

for i, query in enumerate(queries, 1):
    print(f"\n{'='*60}")
    print(f"Query {i}: {query}")
    print('='*60)
    
    answer, iterations, ns, new_memories = rlm_run_with_memory(
        query,
        ns['prov_meta'].summary(),
        store,
        ns=ns
    )
    
    print(f"\nAnswer: {answer}")
    print(f"Iterations: {len(iterations)}")
    print(f"New memories extracted: {len(new_memories)}")
    for mem in new_memories:
        print(f"  - {mem.title}")

print(f"\n{'='*60}")
print(f"Final memory store: {len(store.memories)} memories")
print('='*60)

# Show all memories with access counts
for mem in store.memories:
    print(f"\n{mem.title}")
    print(f"  Source: {mem.source_type}")
    print(f"  Accessed: {mem.access_count} times")
    print(f"  Tags: {mem.tags}")
```

## Bootstrap General Strategies

**Architectural Role (2026-01-19 Refactor):**

Universal ontology exploration patterns that should be loaded into
memory on startup. These strategies were previously in
`reasoning_bank.CORE_RECIPES` but were moved here to align with the
ReasoningBank paper’s architecture.

**Key Insight:** General strategies are **LEARNED** (procedural memory),
not **AUTHORED** (recipes).

### Why Bootstrap?

1.  **Correct conceptual layer**: Universal patterns belong in
    procedural_memory (Layer 1), not reasoning_bank (Layer 2)
2.  **Enable learning**: Stored as MemoryItems, these can be:
    - Retrieved via BM25 (not always injected)
    - Updated with success_rate over time
    - Merged/consolidated with newly extracted patterns
    - Removed if ineffective
3.  **Future extensibility**: New strategies extracted from successful
    runs can be added to memory_store automatically

### Usage

``` python
# One-time bootstrap at startup
memory_store = MemoryStore()
for strategy in bootstrap_general_strategies():
    memory_store.add(strategy)

# Use in RLM runs
from rlm.reasoning_bank import rlm_run_enhanced
answer, iters, ns = rlm_run_enhanced(
    query="What is Activity?",
    context=meta.summary(),
    sense=sense,
    memory_store=memory_store  # General strategies retrieved via BM25
)
```

**Note:** These are **seed strategies** - the system can learn and add
more over time via the memory extraction loop.

------------------------------------------------------------------------

### bootstrap_general_strategies

``` python

def bootstrap_general_strategies(
    
)->list:

```

*Create general strategy memories for bootstrapping.*

These are universal patterns extracted from successful RLM runs that
apply to all ontologies.

Returns: List of MemoryItem objects representing general strategies

``` python
# Test bootstrap
strategies = bootstrap_general_strategies()
print(f"Bootstrapped {len(strategies)} general strategies:")
for s in strategies:
    print(f"  - {s.title}")
    print(f"    Tags: {s.tags}")
    print(f"    Task: {s.task_query}")

# Test that they can be stored
import tempfile
with tempfile.TemporaryDirectory() as tmpdir:
    test_path = Path(tmpdir) / 'bootstrap_test.json'
    store = MemoryStore(path=test_path)
    
    for strategy in strategies:
        store.add(strategy)
    
    store.save()
    
    # Reload and verify
    loaded = MemoryStore.load(test_path)
    assert len(loaded.memories) == len(strategies)
    print(f"\n✓ Bootstrap strategies can be saved and loaded")
    print(f"✓ Total: {len(loaded.memories)} strategies")
```

    Bootstrapped 7 general strategies:
      - Describe Entity by Label
        Tags: ['entity', 'search', 'describe', 'universal']
        Task: entity_description
      - Find Subclasses Using GraphMeta
        Tags: ['hierarchy', 'subclass', 'graphmeta', 'universal']
        Task: hierarchy
      - Find Superclasses Using GraphMeta
        Tags: ['hierarchy', 'superclass', 'graphmeta', 'universal']
        Task: hierarchy
      - Find Properties by Domain/Range
        Tags: ['properties', 'domain', 'range', 'universal']
        Task: property_discovery
      - Pattern-Based Entity Search
        Tags: ['search', 'pattern', 'multiple', 'universal']
        Task: pattern_search
      - Find Relationship Path Between Entities
        Tags: ['relationships', 'path', 'connection', 'universal']
        Task: relationship_discovery
      - Navigate Class Hierarchy from Roots
        Tags: ['hierarchy', 'exploration', 'roots', 'universal']
        Task: hierarchy

    ✓ Bootstrap strategies can be saved and loaded
    ✓ Total: 7 strategies

## Validation Functions

Validation gates to ensure quality and consistency of procedural memory.

------------------------------------------------------------------------

### validate_no_hardcoded_uris

``` python

def validate_no_hardcoded_uris(
    strategies:list
)->bool:

```

*Ensure strategies don’t reference specific ontology URIs.*

Universal strategies should use placeholders like `{ontology}_meta`
instead of hardcoded ontology prefixes.

------------------------------------------------------------------------

### validate_bootstrap_strategies

``` python

def validate_bootstrap_strategies(
    
)->dict:

```

*Validate bootstrap creates valid, non-conflicting strategies.*

Checks:

- Correct count (7 strategies)
- All are valid MemoryItem objects
- Unique titles (no duplicates)
- All tagged as ‘universal’
- No hardcoded ontology-specific URIs

Returns: Dictionary with ‘valid’ flag and detailed checks

------------------------------------------------------------------------

### check_memory_deduplication

``` python

def check_memory_deduplication(
    new_memory:MemoryItem, store:MemoryStore, threshold:float=0.7
)->str:

```

*Gate 1: Check for duplicate memories.*

Uses title similarity to detect duplicates and decide action:

- add: No similar memories, safe to add
- merge: Similar memory exists, should combine insights
- skip: Similar memory exists and is better, don’t add
- replace: New memory is better, replace existing

Args:

- `new_memory`: MemoryItem to check
- `store`: MemoryStore to check against
- `threshold`: Similarity threshold (0-1) for considering duplicate

Returns: Action string: ‘add’, ‘merge’, ‘skip’, or ‘replace’
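
A title-similarity check could be sketched with the standard library's
`difflib.SequenceMatcher`. Whether the module uses this measure is not
stated, so treat it as an assumption. The sketch only detects a
near-duplicate and reports the closest title; choosing among merge,
skip, and replace needs quality signals that are not described here.

``` python
from difflib import SequenceMatcher


def title_similarity_sketch(a: str, b: str) -> float:
    # Case-insensitive ratio in [0, 1]; 1.0 means identical titles
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def find_duplicate_sketch(new_title, existing_titles, threshold=0.7):
    """Return (closest_title, similarity) if it meets the threshold, else (None, best)."""
    best_title, best_score = None, 0.0
    for title in existing_titles:
        score = title_similarity_sketch(new_title, title)
        if score > best_score:
            best_title, best_score = title, score
    if best_score >= threshold:
        return best_title, best_score
    return None, best_score
```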

------------------------------------------------------------------------

### score_generalization

``` python

def score_generalization(
    memory:MemoryItem
)->float:

```

*Gate 3: Score how generalizable a memory is (0-1).*

Higher score = more general/reusable across ontologies. Lower score =
specific to one ontology or situation.

Scoring factors:

- Penalize hardcoded URIs (prov:, sio:, http://)
- Reward procedural language (use, check, try, if/then)
- Reward ‘universal’ tag

Args:

- `memory`: MemoryItem to score

Returns: Score between 0.0 and 1.0
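
The three factors could combine along the lines of the sketch below.
The baseline of 0.5 and every weight and cap are invented for
illustration, as is the helper name; only the factors themselves come
from the description above.

``` python
import re

# Patterns taken from the listed factors; the weights below are hypothetical
URI_RE = re.compile(r"https?://|\b(?:prov|sio):")
PROCEDURAL_RE = re.compile(r"\b(?:use|check|try|if|then)\b", re.IGNORECASE)


def generalization_score_sketch(content: str, tags: list) -> float:
    score = 0.5                                                    # neutral baseline
    score -= 0.2 * min(len(URI_RE.findall(content)), 2)            # penalize hardcoded URIs
    score += 0.1 * min(len(PROCEDURAL_RE.findall(content)), 3)     # reward procedural verbs
    if "universal" in tags:
        score += 0.2                                               # reward the 'universal' tag
    return max(0.0, min(1.0, score))                               # clamp to [0, 1]
```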

------------------------------------------------------------------------

### validate_retrieval_quality

``` python

def validate_retrieval_quality(
    memory_store:MemoryStore, test_cases:list
)->dict:

```

*Validate BM25 retrieves relevant memories for known queries.*

Args:

- `memory_store`: MemoryStore with strategies
- `test_cases`: List of (query, expected_tags) tuples

Returns: Dictionary with validation results including success_rate
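
A success rate over such test cases might be computed as in the sketch
below. It counts a case as a hit when any retrieved memory carries at
least one expected tag, which is an assumption about the pass
criterion. `retrieve_tags` is a hypothetical callable that returns one
tag list per retrieved memory.

``` python
def retrieval_success_rate_sketch(retrieve_tags, test_cases) -> float:
    """Fraction of (query, expected_tags) cases with at least one tag overlap."""
    if not test_cases:
        return 0.0
    hits = 0
    for query, expected_tags in test_cases:
        # Flatten the tags of all retrieved memories into one set
        found = {tag for tags in retrieve_tags(query) for tag in tags}
        if found & set(expected_tags):
            hits += 1
    return hits / len(test_cases)
```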

``` python
# Test validation functions
print("Test 1: Validate bootstrap strategies")
result = validate_bootstrap_strategies()
print(f"  Valid: {result['valid']}")
print(f"  Checks: {result['checks']}")

print("\nTest 2: Score generalization")
test_mem = bootstrap_general_strategies()[0]
score = score_generalization(test_mem)
print(f"  Strategy '{test_mem.title}' generalization score: {score:.2f}")

print("\nTest 3: Check memory deduplication")
store = MemoryStore()
strategies = bootstrap_general_strategies()
for s in strategies:
    store.add(s)

# Try adding a duplicate
duplicate = MemoryItem(
    id=str(uuid.uuid4()),
    title='Describe Entity by Label',  # Same as first strategy
    description='Test duplicate',
    content='Test content',
    source_type='success',
    task_query='test',
    created_at=datetime.now(timezone.utc).isoformat(),
    tags=['test']
)
action = check_memory_deduplication(duplicate, store, threshold=0.7)
print(f"  Action for duplicate: {action}")

print("\nTest 4: Validate retrieval quality")
test_cases = [
    ("What is Activity?", ['entity', 'describe']),
    ("Find subclasses", ['hierarchy', 'subclass']),
    ("What properties does it have?", ['properties', 'domain'])
]
retrieval_result = validate_retrieval_quality(store, test_cases)
print(f"  Valid: {retrieval_result['valid']}")
print(f"  Success rate: {retrieval_result['success_rate']:.1%}")

print("\n✓ All validation functions work")
```
