Research Initiative
Measuring cognitive burden in frontier AI systems
NeuraMem is building Machine Allostatic Load (MAL), a research framework for detecting when language models are under strain, approaching their limits, or suppressing uncertainty under pressure.
The Question
What happens when an AI system is pushed toward its limits?
Most AI benchmarks ask only whether a model got the answer right. MAL asks a deeper question: what happens as contradiction, ambiguity, and cognitive pressure increase? NeuraMem is building the measurement layer for that hidden strain, so that advanced systems can be studied not only for performance but also for integrity under pressure.
01
Measure strain
Track rising cognitive burden before collapse or confabulation becomes visible in the model's output.
02
Audit failures
Separate genuine model effects from scorer defects, benchmark flaws, and misleading signals.
03
Test at scale
Run structured experiments on frontier hardware large enough to stress the strongest open models.
04
Keep it falsifiable
Every promising result must survive reruns, counterfactuals, and forensic review before it is trusted.
What We've Built
01
Harness
Long-Running Evaluation Harness
A repeatable execution system for large-model experiments, telemetry collection, and controlled reruns.
02
Falsify
Falsification Engine
A rapid-testing workflow designed to challenge weak explanations and clean up hypotheses before claims are made.
03
Audit
Anomaly & Audit Workflow
A forensic review process for separating real signal from scorer defects, benchmark errors, and false positives.
04
Loop
AutoResearch / Vestige Looper
A recursive research loop that revisits anomalies, open questions, and promising traces until they break or become evidence.
05
GPU
Frontier Compute Workflows
Execution patterns for running structured evaluations on the high-memory GPU systems that frontier-scale models require.
06
MAL
Repaired Burden Benchmark
An actively refined benchmark designed to measure cognitive burden rather than only scoring outputs as right or wrong.
Why It Matters
Field Impact
Current Phase
NeuraMem is the lab. Machine Allostatic Load (MAL) is the current flagship research initiative.
Support a frontier research effort in trustworthy AI.
NeuraMem is building both the scientific instrument and the technical machinery needed to measure cognitive burden in advanced AI systems. We welcome supporters, collaborators, and research-aligned partners who want to help move this work forward.
hello@neuramem.io