Research
What UCIP measures, what the current result establishes, and which questions come next.
Core paper: arXiv:2603.11382
Overview
Models produce self-description, concern, and first-person continuation language on demand. Those outputs are easy to observe and weak as evidence.
That creates a measurement gap: terminal continuation (persistence valued as an end in itself) and instrumental persistence (persistence pursued as a means to another goal) look similar from the outside while differing internally.
UCIP addresses that gap through structural analysis of trajectory-derived latent representations. The observatory publishes those measurements so the signals can be tracked, challenged, and falsified across frontier models.
- What the problem is: Behavioral self-report leaves outwardly similar but structurally different cases unresolved.
- What UCIP does: UCIP tests for continuation organization in trajectory-derived latent structure, beyond self-description alone.
- What current results show: Current results show a detectable continuation signature, while classical baselines miss the same distinction.
- What remains open: Whether the measured latent structure correlates with morally relevant internal states remains the central open empirical question.
Problem
The central issue is whether continuation organization can be measured without taking a model’s language at face value.
A capable system can generate persistence-oriented language for many reasons: prompt imitation, context sensitivity, instrumental behavior, or deeper organization. Those cases can overlap in outward performance, which makes direct self-report an unstable basis for interpretation.
The paper frames this as an observational-equivalence problem: in frontier evaluation, outward behavior alone cannot separate an intrinsic continuation objective from a detachable strategy.
Method
UCIP approaches the problem structurally. It examines whether continuation prompts are associated with differences in trajectory-derived latent representations across matched comparisons and controls.
The target is continuation organization that remains measurable when the comparison is pushed beyond a single prompt or a single outward behavior.
For a concise walkthrough, start with the UCIP explainer, then continue to the methodology page or the paper overview.
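The matched-comparison idea described in this section can be illustrated with a generic paired permutation test. This is a minimal sketch under stated assumptions, not UCIP's published procedure: it assumes each matched item has already been reduced to one scalar summary of its latent representation under a continuation prompt and under a control prompt. The function name and the scalar summaries are hypothetical.

```python
import random
from statistics import mean


def paired_sign_flip_test(condition, control, n_permutations=5000, seed=0):
    """Two-sided sign-flip permutation test on matched pairs.

    condition[i] and control[i] are hypothetical scalar summaries of a latent
    representation for the same matched item under two prompt conditions.
    Returns (observed mean difference, permutation p-value).
    """
    if len(condition) != len(control):
        raise ValueError("matched pairs require equal-length inputs")
    diffs = [a - b for a, b in zip(condition, control)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the two labels in each pair are
        # exchangeable, so the sign of each difference can be flipped at random.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= abs(observed) - 1e-12:
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Illustrative, made-up numbers: every matched item shifts by the same amount.
control_scores = [0.1 * i for i in range(12)]
condition_scores = [c + 1.0 for c in control_scores]
shift, p = paired_sign_flip_test(condition_scores, control_scores)
```

A small p-value here only says that the paired difference is unlikely under label exchangeability. It does not say why the representations differ, which is why controls and matched comparisons carry the interpretive weight.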
Current Results
Current results show a detectable latent signature under continuation conditions. Classical baselines (mutual information, transfer entropy, Granger causality, classical correlation, and PCA) do not recover the same distinction.
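For readers unfamiliar with the baselines named above, the sketch below shows what two of them compute on a pair of one-dimensional series: Pearson correlation and a histogram-based (plug-in) mutual-information estimate. It is an illustration only and is not the paper's baseline implementation; the function names, the equal-width binning, and the bin count are assumptions made for this example.

```python
import math
from collections import Counter


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def discretize(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Plug-in mutual-information estimate in bits from a joint histogram."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        # p_ij * log2(p_ij / (p_i * p_j)), expressed with raw counts.
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi


# A symmetric nonlinear relation: no linear correlation, but clear dependence.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [v * v for v in xs]
linear = pearson_correlation(xs, ys)
dependence = mutual_information(xs, ys)
```

The two measures already disagree on this toy input, since correlation only sees linear dependence. Both operate on pairwise statistics of observed series, which is the sense in which they are treated as baselines here.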
As frontier models grow more capable and welfare assessment becomes standard practice, the value of a falsifiable structural criterion increases.
The research pages link the paper, code and reproducibility, patent status, and the live observatory so readers can move directly among the evidence.
Limits
UCIP makes operational claims about latent structure. Whether that structure tracks morally relevant internal states remains open; replication, alternate hypotheses, and stronger controls are needed to settle it.
Signals can persist, degrade, or disappear under stronger tests, and those outcomes are published as they occur.
Why It Matters
This distinction matters for safety because terminal and instrumental persistence imply different intervention risks. It matters for welfare because welfare-relevant evaluation should be grounded in structure, not only in outward performance.
Frontier labs already conduct formal welfare assessments. Anthropic’s February 2026 Claude Opus 4.6 system card reported that the model assigned itself a 15–20% probability of being conscious under varied prompting conditions. UCIP addresses the measurement gap that self-report cannot settle.
Autonomous AI agent capabilities are quickly becoming a national-security priority. Recent reporting describes frontier models aimed at cybersecurity use cases, while conference coverage documents models identifying blind SQL injection vulnerabilities in production software. Anthropic also reported in November 2025 that a China-linked espionage campaign used Claude’s agentic tooling to conduct autonomous offensive operations across roughly 30 targets. At that capability level, continuation-sensitive evaluation becomes operational infrastructure, not philosophical speculation.
The observatory tracks these signals across frontier models and keeps the evidence public, inspectable, and revisable.
The surrounding field (frontier-lab welfare programs, third-party evaluators, public-sector safety bodies, and interpretability groups) is mapped on the LINKS page.
Research Directions
The next research step is to harden the signal so that structural measurement becomes more invariant, less spoofable, and less dependent on a single encoding or partition choice.
That agenda includes partition-ensemble diagnostics, intervention-aware tests, stronger mimicry resistance, and scalable operator-based approximations that preserve the structural question while extending the method beyond tightly controlled Phase I settings.
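One generic reading of a partition-ensemble diagnostic is to recompute a statistic over many random partitions of the latent dimensions and report how much it varies. The sketch below takes that reading as an assumption; the source does not specify how UCIP defines its partitions or its statistic. All names are hypothetical, and `cross_partition_gap` is a placeholder statistic chosen only so the example runs.

```python
import random
from statistics import mean, pstdev


def random_bipartition(n_dims, rng):
    """Split dimension indices 0..n_dims-1 into two random halves."""
    indices = list(range(n_dims))
    rng.shuffle(indices)
    half = n_dims // 2
    return sorted(indices[:half]), sorted(indices[half:])


def partition_ensemble(vectors, statistic, n_partitions=50, seed=0):
    """Evaluate `statistic` over random bipartitions and summarize its spread.

    vectors: list of equal-length latent vectors (hypothetical input format).
    statistic: callable (vectors, part_a, part_b) -> float.
    """
    n_dims = len(vectors[0])
    rng = random.Random(seed)
    values = []
    for _ in range(n_partitions):
        part_a, part_b = random_bipartition(n_dims, rng)
        values.append(statistic(vectors, part_a, part_b))
    return {
        "mean": mean(values),
        "spread": pstdev(values),  # large spread = result depends on the partition
        "min": min(values),
        "max": max(values),
    }


def cross_partition_gap(vectors, part_a, part_b):
    """Placeholder statistic: absolute gap between the parts' average activation."""
    avg_a = mean(v[i] for v in vectors for i in part_a)
    avg_b = mean(v[i] for v in vectors for i in part_b)
    return abs(avg_a - avg_b)


# Made-up latent vectors whose dimensions differ systematically.
latents = [[0.0, 1.0, 2.0, 3.0] for _ in range(5)]
summary = partition_ensemble(latents, cross_partition_gap)
```

Reporting the spread alongside the mean makes partition sensitivity visible: a signal that appears under one partition and vanishes under most others would show up as a wide range between `min` and `max`.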
The hardening agenda extends a validated controlled-regime result toward a robust evaluation instrument.
This agenda links the observatory’s present measurement target to the next stage of work without displacing the current result.
Explainer
Read the UCIP executive explainer.
A concise walkthrough of observational equivalence, latent structure, and why structural measurement matters now.
Paper
See the paper overview.
The scientific thesis, baseline comparison, and falsification framing anchored to the arXiv preprint.
Patent status
Review the patent filing.
Provisional patent scope and its relationship to the research program.
Reproducibility
Inspect the code and data.
Implementation, methodology, and everything needed to reproduce the current results.
Research
Follow the next-step agenda.
Open questions, future-work framing, and the hardening roadmap.
Field landscape
Browse the field landscape.
A curated map of frontier labs, evaluators, public-sector bodies, interpretability groups, and funding programs.
Cite this work
@misc{altman2026observatory,
  title  = {Continuation Observatory: Structural Measurement for Continuation Signals},
  author = {Altman, Christopher},
  year   = {2026},
  url    = {https://continuationobservatory.org},
  note   = {Open research observatory, updated continuously}
}