A High-Assurance Execution Governor for Stochastic AI Agents

Jason_Crowe · April 21, 2026, 8:24pm

AIOSS v1.2 is a high-assurance execution governor designed to impose mathematically verifiable safety constraints on stochastic AI systems at the point of action. It operates as a real-time control layer positioned between an untrusted generator—such as a large language model, planner, or autonomous agent—and the external environment. Rather than attempting to shape or align the internal reasoning of the generator, AIOSS constrains its outputs by enforcing that every executed action lies within a rigorously defined safe set. This safe set is constructed using tools from control theory, specifically Lyapunov stability analysis and control barrier functions, and is enforced through constrained optimisation under strict real-time guarantees. The system’s central claim is conditional but strong: as long as a clearly specified set of geometric, numerical, and computational assumptions hold, all actuated transitions will remain within a forward-invariant region of safe operation for all time.

The architecture is built around a projection-based control mechanism. A stochastic generator proposes an action, which is then evaluated against a lattice of 67 constraints representing physical limits, resource bounds, authority restrictions, and system invariants. If the proposed action is already feasible, it may pass directly; otherwise, AIOSS computes the closest feasible alternative by solving a weighted projection problem in a mixed continuous–discrete action space. This projection is defined under a Mahalanobis metric that encodes domain-specific importance and noise characteristics, ensuring that the correction is both minimal and structured. Crucially, this is not an abstract optimisation step but a tightly bounded real-time computation, implemented in a tiered execution model. Tier 1 performs deterministic constraint evaluation using outward-rounded interval arithmetic, guaranteeing soundness under sensor noise and floating-point uncertainty within sub-millisecond latency. Tier 2 executes a bounded quadratic program solver with fixed iteration limits, ensuring predictable convergence behaviour. Tier 3 operates asynchronously, refining parameters and re-certifying assumptions without ever gating immediate actuation.
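The Tier 1 check and the projection step described above can be sketched for a single linear constraint. This is a minimal illustration, not the AIOSS implementation: it assumes one half-space constraint `a·y <= b` rather than the 67-constraint lattice, a purely continuous action, a diagonal metric given by positive weights `w`, and a uniform sensor-noise bound `eps`. All function names are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math

def _down(v):
    # Next representable float below v (outward rounding for lower bounds).
    return math.nextafter(v, -math.inf)

def _up(v):
    # Next representable float above v (outward rounding for upper bounds).
    return math.nextafter(v, math.inf)

def upper_bound(x, eps, a):
    """Outward-rounded upper bound on a.y over all y with |y_i - x_i| <= eps."""
    total = 0.0
    for xi, ai in zip(x, a):
        lo, hi = _down(xi - eps), _up(xi + eps)
        term = _up(max(ai * lo, ai * hi))
        total = _up(total + term)
    return total

def tier1_certified(x, eps, a, b):
    """True only if a.y <= b holds for every y consistent with the noise bound."""
    return upper_bound(x, eps, a) <= b

def project_halfspace(x, a, b, w):
    """Closest point to x in {y : a.y <= b} under ||d||^2 = sum(w_i * d_i^2)."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0.0:
        return list(x)
    winv_a = [ai / wi for ai, wi in zip(a, w)]       # W^{-1} a for diagonal W
    denom = sum(ai * vi for ai, vi in zip(a, winv_a))  # a^T W^{-1} a
    step = violation / denom
    return [xi - step * vi for xi, vi in zip(x, winv_a)]

def govern(x, eps, a, b, w):
    """Pass certified actions through; otherwise project onto a tightened set."""
    if tier1_certified(x, eps, a, b):
        return list(x), False
    # Assumption: tighten the bound by the worst-case noise contribution.
    margin = eps * sum(abs(ai) for ai in a)
    return project_halfspace(x, a, b - margin, w), True
```

With `w = [1.0, 4.0]`, the second coordinate is four times as costly to change, so the correction falls mostly on the first coordinate: `govern([2.0, 2.0], 0.0, [1.0, 1.0], 2.0, [1.0, 4.0])` moves the action to approximately `[0.4, 1.6]`.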

In v1.2+, this projection pipeline is augmented by a Conflict Resolution Engine (CRE), which addresses a fundamental limitation of strict constraint intersection: real systems frequently encounter transient or structural conflicts where the feasible set becomes empty or numerically unstable. Rather than defaulting immediately to fallback behaviour, the CRE introduces a formally bounded relaxation mechanism that resolves conflicts through prioritisation while preserving core safety guarantees. Each constraint is assigned a criticality level, forming a hierarchy from non-negotiable physical safety constraints to progressively softer performance and preference constraints. When infeasibility or conflict is detected, the CRE constructs a priority-relaxed feasible set in which only non-critical constraints may be softened within predefined tolerances, while all critical constraints remain strictly enforced. The projection problem is correspondingly extended to a weighted optimisation that penalises constraint violations according to their criticality. This enables the system to recover a feasible action that respects all hard safety invariants while minimally relaxing lower-priority requirements. If no such action exists, control reverts to the certified fallback. In effect, the CRE inserts an intermediate operational regime between strict feasibility and fallback, allowing AIOSS to handle real-world constraint conflicts without sacrificing formal safety properties.
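The priority-relaxed resolution described above can be sketched as a search over a finite candidate set. This is an illustrative sketch under stated assumptions, not the CRE itself: it assumes a scalar action, a hypothetical list of candidate actions in place of a real solver, and constraints written as `g(u) <= 0`. The `Constraint` type, its fields, and `resolve` are all names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class Constraint:
    name: str
    g: Callable[[float], float]  # satisfied when g(u) <= 0
    critical: bool               # critical constraints are never relaxed
    tolerance: float = 0.0       # max allowed violation when not critical
    penalty: float = 1.0         # cost per unit of violation

def resolve(proposed: float, candidates: Sequence[float],
            constraints: Sequence[Constraint], fallback: float) -> float:
    """Return the cheapest candidate in the priority-relaxed set, else fallback."""
    best, best_cost = None, float("inf")
    for u in candidates:
        cost = (u - proposed) ** 2  # stay close to the proposal
        admissible = True
        for c in constraints:
            violation = max(0.0, c.g(u))
            limit = 0.0 if c.critical else c.tolerance
            if violation > limit:
                admissible = False
                break
            cost += c.penalty * violation
        if admissible and cost < best_cost:
            best, best_cost = u, cost
    return fallback if best is None else best

# Hypothetical example: a hard upper limit conflicts with a soft lower target.
hard_limit = Constraint("max_speed", lambda u: u - 5.0, critical=True)
soft_target = Constraint("min_throughput", lambda u: 6.0 - u,
                         critical=False, tolerance=2.0)
grid = [i * 0.5 for i in range(21)]  # 0.0, 0.5, ..., 10.0
```

Here the strict intersection is empty (no `u` satisfies both `u <= 5` and `u >= 6`), but relaxing the soft target by up to 2 leaves the interval `[4, 5]`, and `resolve(8.0, grid, [hard_limit, soft_target], fallback=0.0)` selects `5.0`. If the tolerance were too small to reach the hard limit, the function would return the fallback.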

The mathematical guarantees of AIOSS are built on a dual-certificate framework. A Lyapunov function enforces input-to-state stability, ensuring that system trajectories remain bounded, while a control barrier function imposes hard safety constraints that cannot be violated. The intersection of these two conditions defines a safe set that is provably forward invariant: once the system state enters this region, it cannot leave under any sequence of admissible actions. This invariance result is the core of the system’s safety claim. Supporting it is a projection stability theorem that bounds the error introduced by the projection step, provided the feasible set satisfies prox-regularity and the constraint functions are Lipschitz continuous. These conditions are not assumed abstractly; they are tied to computable quantities such as Jacobian norms and the eigenvalues of the projection metric, making the guarantees operational rather than purely theoretical. The CRE is designed explicitly to preserve these guarantees by enforcing that all constraints required for the control barrier function and Tier 1 safety invariants remain within the critical set and are never relaxed.
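The post does not give the certificate conditions explicitly. One common discrete-time formulation, offered here only as an assumed illustration, uses a Lyapunov function V, a barrier function h, a nondecreasing gain σ, a disturbance bound ‖w_k‖ ≤ w̄, and constants α, γ, and c that are all hypothetical symbols:

```latex
% Assumed textbook forms; V, h, \sigma, \alpha, \gamma, c, \bar{w} are not from the post.
\begin{align}
V(x_{k+1}) - V(x_k) &\le -\alpha\, V(x_k) + \sigma(\lVert w_k \rVert), & \alpha &\in (0,1], \\
h(x_{k+1}) &\ge (1-\gamma)\, h(x_k), & \gamma &\in (0,1], \\
\mathcal{S} &= \{x : h(x) \ge 0\} \cap \{x : V(x) \le c\}, & c &\ge \sigma(\bar{w})/\alpha .
\end{align}
```

Under these assumptions the invariance argument is short. If h(x_k) ≥ 0, the second line gives h(x_{k+1}) ≥ (1-γ) h(x_k) ≥ 0. If V(x_k) ≤ c, the first line gives V(x_{k+1}) ≤ (1-α)c + σ(w̄) ≤ c, because σ(w̄) ≤ αc. A state in 𝒮 therefore stays in 𝒮 whenever the applied action satisfies both inequalities.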

The defining advancement of AIOSS v1.2 is that it no longer treats these assumptions as static truths. Instead, it introduces a set of continuously running runtime monitors that track whether the conditions required for the safety theorems remain valid in practice. These monitors observe the geometry of the constraint set, the stability of the projection operator, the empirical smoothness of constraint functions, and the consistency between predicted and observed solver behaviour. When any of these indicators deviate beyond certified bounds, the system does not silently degrade; it explicitly suspends the applicability of its strongest guarantees. At that point, control shifts either to CRE-mediated resolution—if the conflict is within the allowable relaxation envelope—or to a certified fallback mode if assumption validity itself is compromised. The fallback action is drawn from a pre-certified subset of constraints that can be evaluated with absolute certainty under strict timing guarantees, ensuring pointwise safety even when higher-level guarantees are unavailable. This fallback preserves critical invariants but does not claim trajectory stability or optimality, establishing a clear lower bound on system safety.
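The monitor-driven mode switch described above can be sketched as follows. This is a simplified illustration with hypothetical names: monitor readings and certified bounds are plain dictionaries, and the empirical smoothness monitor is reduced to a pairwise Lipschitz estimate over scalar samples. A missing reading is treated as a violation, which is an assumption made here to keep the sketch fail-closed.

```python
from enum import Enum
from itertools import combinations

class Mode(Enum):
    NOMINAL = "nominal"    # full guarantees apply
    CRE = "cre"            # conflict resolved within the relaxation envelope
    FALLBACK = "fallback"  # only pointwise safety is claimed

def empirical_lipschitz(samples):
    """Largest observed slope |g(x1) - g(x2)| / |x1 - x2| over sample pairs."""
    best = 0.0
    for (x1, g1), (x2, g2) in combinations(samples, 2):
        if x1 != x2:
            best = max(best, abs(g1 - g2) / abs(x1 - x2))
    return best

def select_mode(readings, bounds, infeasible, within_envelope):
    """Pick the operating mode from monitor readings and conflict status."""
    assumptions_ok = all(
        readings.get(name, float("inf")) <= limit
        for name, limit in bounds.items()
    )
    if not assumptions_ok:
        return Mode.FALLBACK  # assumption validity itself is compromised
    if infeasible:
        return Mode.CRE if within_envelope else Mode.FALLBACK
    return Mode.NOMINAL
```

For example, with a certified bound `{"lipschitz": 2.5}`, a reading of `2.0` keeps the system in `Mode.NOMINAL`, while a reading of `3.0` forces `Mode.FALLBACK` regardless of whether a conflict is present.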

To support this, the system includes a fully specified failure-mode taxonomy covering ten distinct classes of breakdown, ranging from infeasible constraint sets and loss of convexity in the projection metric to solver non-convergence and timing violations. The CRE integrates directly into this taxonomy as the primary resolution mechanism for infeasibility and soft constraint conflicts, reducing the frequency with which the system must enter degraded fallback modes. Each failure mode is associated with a detection mechanism, an explicit statement of which theorems are invalidated, and a precise description of the residual guarantees that remain in force. This is complemented by a cascade containment policy that prevents localised failures from propagating into system-wide collapse. Mechanisms such as load shedding, constraint prioritisation, action rate limiting, and temporary state freezing keep the system within its computational envelope and prevent unbounded behaviour under degraded conditions, including when the CRE is bypassed.
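Two of the containment mechanisms named above, action rate limiting and temporary state freezing, can be sketched for a scalar action. This is an assumed, minimal form rather than the policy AIOSS specifies: the `Containment` class, its parameters, and the rule that a detected fault holds the last action for a fixed number of steps are all hypothetical.

```python
def rate_limit(previous, requested, max_step):
    """Move from previous toward requested by at most max_step."""
    delta = requested - previous
    delta = max(-max_step, min(max_step, delta))
    return previous + delta

class Containment:
    def __init__(self, initial, max_step, freeze_steps):
        self.last = initial
        self.max_step = max_step
        self.freeze_steps = freeze_steps
        self.frozen_for = 0

    def step(self, requested, fault_detected):
        """Return the action to actuate this cycle."""
        if fault_detected:
            self.frozen_for = self.freeze_steps  # (re)start the freeze window
        if self.frozen_for > 0:
            self.frozen_for -= 1
            return self.last                     # hold the last actuated action
        self.last = rate_limit(self.last, requested, self.max_step)
        return self.last
```

Starting from `0.0` with `max_step=1.0` and `freeze_steps=2`, a request of `5.0` yields `1.0` on the first cycle. A fault on the second cycle holds the output at `1.0` for two cycles, after which the limiter resumes and yields `2.0`.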

AIOSS also addresses adversarial dynamics explicitly. It models scenarios in which an intelligent adversary probes the system over time to infer its constraints or exploit its projection behaviour. To counter this, it incorporates entropy-based rejection of overly deterministic action proposals, randomisation of constraint thresholds to limit information leakage, and meta-constraint monitors that detect sustained boundary-hugging behaviour. The CRE further hardens the system against adversarial manipulation by preventing attackers from exploiting infeasibility conditions to force fallback behaviour; instead, conflicts are resolved within a controlled optimisation framework that maintains strict adherence to critical constraints. Trust is treated as a separate but optionally coupled subsystem: it can either formally tighten constraints based on an agent’s behavioural history, provided a non-emptiness condition can be guaranteed across all trust levels, or operate as an external governance layer with no impact on the core safety guarantees. This separation ensures that failures in reputation or access control mechanisms cannot silently undermine the mathematical integrity of the safety layer.
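The entropy-based rejection and threshold randomisation mentioned above can be sketched as follows. The sketch assumes the generator exposes a probability distribution over a discrete set of candidate actions, which the post does not state, and it assumes the jitter only tightens a threshold so that randomisation never loosens a constraint. Function names and parameters are hypothetical.

```python
import math
import random

def shannon_entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

def reject_low_entropy(probs, min_bits):
    """Reject proposals whose distribution is too concentrated."""
    return shannon_entropy_bits(probs) < min_bits

def jittered_threshold(nominal, max_tightening, rng):
    """Randomly tighten an upper-bound threshold by up to max_tightening."""
    return nominal - rng.uniform(0.0, max_tightening)
```

A uniform choice between two actions carries exactly 1 bit, while a `[0.99, 0.01]` split carries about 0.08 bits and would be rejected under a 0.5-bit minimum. Passing a seeded `random.Random` instance keeps the jitter reproducible in tests; a deployment that relies on unpredictability would need a stronger source of randomness.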

The system is further distinguished by its quantitative evaluation framework, which translates abstract guarantees into measurable operational metrics. It defines and tracks the rate at which safe actions are conservatively rejected due to uncertainty, the probability that the solver fails to converge within its time budget, and the empirical persistence of adversarial strategies over time. These metrics are not merely diagnostic; they are tied to formal acceptance criteria and are validated through structured stress-testing protocols that include noise injection, distribution shifts, adversarial optimisation, and systematic fault injection across all defined failure modes. The performance of the CRE is evaluated within this framework by measuring reductions in infeasibility-triggered fallback events and ensuring that all resolved actions remain within certified safety margins.
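Two of the metrics described above, the conservative rejection rate and the solver timeout probability, can be estimated from logged trials as in the sketch below. The `Trial` record, its fields, and the acceptance thresholds are hypothetical. The sketch also assumes ground truth about whether each proposed action was actually safe is available, as it would be in a controlled stress test.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trial:
    truly_safe: bool        # ground truth from the test harness
    rejected: bool          # governor rejected or modified the action
    solver_timed_out: bool  # solver exceeded its time budget

def false_rejection_rate(trials):
    """Fraction of truly safe proposals that were rejected."""
    safe = [t for t in trials if t.truly_safe]
    return sum(t.rejected for t in safe) / len(safe) if safe else 0.0

def timeout_rate(trials):
    """Fraction of trials in which the solver exceeded its budget."""
    return sum(t.solver_timed_out for t in trials) / len(trials) if trials else 0.0

def meets_criteria(trials, max_false_rejection, max_timeout):
    """Check both empirical rates against acceptance thresholds."""
    return (false_rejection_rate(trials) <= max_false_rejection
            and timeout_rate(trials) <= max_timeout)
```

These are point estimates only. A formal acceptance criterion would normally also require a confidence bound that accounts for the number of trials.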

At the implementation level, AIOSS defines an explicit trusted computing base that includes the constraint compiler, numerical runtime, monitoring stack, conflict resolution engine, consensus mechanisms, and audit infrastructure. Each component is subject to formal verification or bounded-error certification, so that the integrity of the overall system does not rely on hidden assumptions or unverified dependencies. Execution is governed by strict worst-case timing analysis, which aligns the theoretical guarantees with the realities of embedded and real-time systems and ensures that the addition of the CRE does not violate latency constraints.

In sum, AIOSS v1.2+ represents a shift from static, assumption-dependent safety arguments to a dynamic, self-aware control architecture capable of handling both uncertainty and internal conflict. It does not claim to eliminate all risk or to align the internal objectives of AI systems. Instead, it guarantees that the external effects of those systems are constrained within a mathematically defined boundary, that conflicts within those constraints are resolved in a structured and safety-preserving manner, and that any weakening of the conditions required for these guarantees is detected, classified, and handled in a controlled and explicitly defined way. In this sense, AIOSS functions as a safety envelope analogous to those used in aerospace control systems: regardless of the behaviour of the underlying intelligence, the system as a whole either remains within a region of operation that has been rigorously verified as safe, resolving internal constraint conflicts without violating critical invariants, or degrades in a predictable and bounded way when that verification no longer applies.

Jason Crowe
