# Deterministic Proof Synthesis: Formal Specification & Execution Trace

This document provides the technical formalisms, scope boundaries, and a detailed execution trace for the Thinking Machine reasoning system.

---

## 1. Kernel Decision Rules (Formal Pseudocode)

The Thinking Machine Kernel operates as a deterministic evaluator $K$ that accepts a candidate reasoning step $S$ and a global blackboard state $B$, and returns a status for $S$.

### **Algorithm: EvaluateStep(S, B)**
```text
INPUT: Candidate Synthesis (S), Global Blackboard (B)
OUTPUT: Status (REJECT | INVALIDATE | DOWNGRADE | PROVISIONAL_ACCEPT | PARTIAL_TRUTH)

1. STRUCTURAL_CHECK:
   IF S does not match Required_Grammar(B.Definition) THEN 
       RETURN Status: REJECT, Reason: "Malformed synthesis"

2. PREMISE_CHECK:
   FOR EACH premise P in B.Premises:
       IF S violates P THEN 
           RETURN Status: REJECT, Reason: "Violation of premise {P}"

3. INVARIANT_VERIFICATION:
   C = Extract_Claim(S)
   IF Exists_Counterexample(C, B.Transition_Space) THEN
       RETURN Status: INVALIDATE, Result: "Falsified"

   IF IsEmpiricallyValid(C, B.Data) AND NOT IsAnalyticallyProven(C) THEN
       RETURN Status: DOWNGRADE, Result: "Empirical heuristic"

4. COVERAGE_CHECK:
   IF All_Transitions_Exhausted(S, B.Problem_Space) AND Invariant_Holds(S) THEN
       RETURN Status: PROVISIONAL_ACCEPT, Result: "Proven (Closed System)"

5. RETURN Status: PARTIAL_TRUTH
```
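
The decision rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the kernel's actual code: the `Step` and `Blackboard` types, their fields, and the representation of a claim as a predicate over integers are all hypothetical. The sketch runs the counterexample check before the empirical downgrade, so a falsified claim is never reported as a heuristic.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Step:
    """Hypothetical candidate synthesis S."""
    text: str                          # raw synthesis text
    claim: Callable[[int], bool]       # claim C, modeled as a predicate
    analytically_proven: bool = False  # set by an external proof step


@dataclass(frozen=True)
class Blackboard:
    """Hypothetical blackboard state B."""
    grammar: Callable[[str], bool]              # structural check
    premises: Sequence[Callable[[Step], bool]]  # True means "not violated"
    data: Sequence[int]                         # empirical sample
    transition_space: Sequence[int]             # enumerated states
    space_exhausted: bool = False               # whole problem space covered?


def evaluate_step(step: Step, board: Blackboard) -> Tuple[str, str]:
    # 1. Structural check
    if not board.grammar(step.text):
        return ("REJECT", "Malformed synthesis")
    # 2. Premise check
    for index, premise in enumerate(board.premises):
        if not premise(step):
            return ("REJECT", f"Violation of premise {index}")
    # 3. Invariant verification: a single counterexample falsifies the claim
    if any(not step.claim(x) for x in board.transition_space):
        return ("INVALIDATE", "Falsified")
    if all(step.claim(x) for x in board.data) and not step.analytically_proven:
        return ("DOWNGRADE", "Empirical heuristic")
    # 4. Coverage check (the invariant already held on every enumerated state)
    if board.space_exhausted:
        return ("PROVISIONAL_ACCEPT", "Proven (Closed System)")
    # 5. Default
    return ("PARTIAL_TRUTH", "")
```

Because every branch depends only on `step` and `board`, the same inputs always produce the same status, which is the sense in which the evaluator is deterministic.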

---

## 2. Non-Goals and Explicit Scope Boundaries

To prevent misinterpretation of the system's capabilities, the following boundaries are defined:

### **Explicit Non-Goals (What This System Does NOT Do)**
- ❌ **End-to-End Theorem Discovery**: The system does not "stumble" upon new theorems without human-defined boundaries and hypotheses.
- ❌ **Symbolic Proof Generation**: It does not output raw Lean or Coq code autonomously (though it can be used to synthesize the logic for them).
- ❌ **Recursive Self-Improvement**: The kernel logic is frozen; it does not "learn" or change its decision rules based on inputs.
- ❌ **Heuristic Smoothing**: The kernel does not discard a counterexample, however "small", to preserve a "mostly correct" narrative, as an unconstrained LLM might.

### **Limitations**
- **State Space Sensitivity**: The coverage check can only certify transitions that were actually enumerated, so a "Proven (Closed System)" result is bounded by the depth to which the transition space was computed.
- **Human-in-the-Loop**: The system requires precise human framing of the problem and of the initial blackboard state.
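
The state-space limitation can be made concrete with a bounded enumeration. The sketch below is an assumption about how such a check might look, not the kernel's implementation: `coverage_check`, its parameters, and the `max_states` budget are hypothetical names. It returns "Proven (Closed System)" only when every reachable state was visited within the budget.

```python
from collections import deque
from typing import Callable, Hashable, Iterable


def coverage_check(
    start: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    invariant: Callable[[Hashable], bool],
    max_states: int,
) -> str:
    """Breadth-first enumeration of reachable states under a state budget."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return "Falsified"  # one counterexample is enough
        for nxt in successors(state):
            if nxt in seen:
                continue
            if len(seen) >= max_states:
                # Budget exhausted before the space was: no closure claim.
                return "Partial"
            seen.add(nxt)
            queue.append(nxt)
    # Queue drained: every reachable state was visited and satisfied the invariant.
    return "Proven (Closed System)"
```

For example, the map $n \mapsto (5n+1) \bmod 8$ started at 1 visits all eight residues, so a budget of 8 or more closes the system, while a budget of 3 yields only a partial result.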

---

## 3. End-to-End Walkthrough: Syracuse Divergence Trace

The following is a reconstruction of the **Investigation #1** trace, demonstrating the system's reaction to the $5x+1$ problem.

### **Trace: Project Syracuse**
- **Cycle 1: Initialization**
    - **Human Input**: Map $T(n) = (5n+1)/2^k$ for odd $n$, where $2^k$ is the largest power of 2 dividing $5n+1$. Objective: Convergence.
    - **Blackboard**: `Goal: Convergence; Method: Monotonic Invariant Search`.

- **Cycle 2: Hypothesis Proposal (Ollama)**
    - **Model Output**: "The $5x+1$ map behaves similarly to $3x+1$ and should converge to 1."
    - **Kernel Action**: Logs as `HYP_ANALOGY`. No truth status assigned.

- **Cycle 3: Invariant Search**
    - **Human/Model Input**: Propose Invariant $I(n) = \log(n)$.
    - **Kernel Audit**: Executes growth logging for $n=7$.
    - **Observation**: $n=7 \to 36 \to 18 \to 9 \to 46 \to 23 \to \dots \to 9 \times 10^{15}$.
    - **Kernel Decision**: `I(n)` violated. `HYP_ANALOGY` flag set to `SUSPECTED_FALSE`.

- **Cycle 4: Adversarial Transition Audit**
    - **Instruction**: Find average multiplier for $5x+1$ vs $3x+1$.
    - **Kernel Computation**: average $k \approx 1.5$. Multiplier $5/2^{1.5} \approx 1.77$.
    - **Rule Trigger**: `IF multiplier > 1 THEN Classification: DIVERGENT`.

- **Cycle 5: Classification (Final)**
    - **Kernel Result**: **FALSE CONJECTURE**.
    - **Output**: "Systemic divergence detected. Monotonic convergence hypothesis rejected."
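
The trace above can be replayed with a short Python sketch. It is an illustration only, not the kernel's growth logger: `odd_step`, `odd_orbit`, `find_cycle`, and `mean_halvings` are hypothetical names, and the map is taken in its odd-to-odd form (multiply by 5, add 1, divide out every factor of 2).

```python
from typing import List, Optional, Tuple


def odd_step(n: int, a: int = 5) -> Tuple[int, int]:
    """Apply n -> (a*n + 1) / 2^k to an odd n; return (next odd value, k)."""
    m = a * n + 1
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return m, k


def odd_orbit(n: int, steps: int, a: int = 5) -> List[int]:
    """First `steps` iterates of the odd-to-odd map, starting value included."""
    orbit = [n]
    for _ in range(steps):
        n, _k = odd_step(n, a)
        orbit.append(n)
    return orbit


def find_cycle(n: int, max_steps: int = 1000, a: int = 5) -> Optional[List[int]]:
    """Return the cycle entered from n within max_steps, or None if none is seen."""
    position = {}  # value -> index in path at first visit
    path: List[int] = []
    for _ in range(max_steps):
        if n in position:
            return path[position[n]:]
        position[n] = len(path)
        path.append(n)
        n, _k = odd_step(n, a)
    return None


def mean_halvings(limit: int, a: int = 5) -> float:
    """Average k over all odd n below `limit` (a single step each)."""
    odds = range(1, limit, 2)
    return sum(odd_step(n, a)[1] for n in odds) / len(odds)


if __name__ == "__main__":
    print(odd_orbit(7, 6))          # [7, 9, 23, 29, 73, 183, 229]
    print(find_cycle(13))           # [13, 33, 83]
    print(mean_halvings(1 << 16))   # average halving count for 5n+1
    print(mean_halvings(1 << 16, a=3))
```

Two points are worth checking against the trace. First, `find_cycle(13)` returns the cycle $13 \to 33 \to 83 \to 13$, which never reaches 1; this is a direct counterexample to the convergence hypothesis and does not depend on the average-multiplier heuristic. Second, averaged over all odd $n$ below a power of two, a single step gives a mean $k$ close to 2, which would put the multiplier at $5/2^2 = 1.25$ for $5x+1$ and $3/2^2 = 0.75$ for $3x+1$. The trace reports $k \approx 1.5$, which may reflect a different sample or averaging convention; for $5x+1$ the multiplier exceeds 1 under either value.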

---

## 4. Artifact Reference Section

The components used in this research campaign were:

- **Thinking Machine Kernel**: Version 0.8.2-deterministic (Private Core).
- **Orchestration Layer**: TypeScript-based blackboard router.
- **Hypothesis Engines (Ollama v0.5.1)**:
    - `mightykatun/qwen2.5-math:7b` (Primary Algebraic Logic).
    - `t1c/deepseek-math-7b-rl:latest` (Adversarial Search).
    - `mistral:latest` (Hypothesis framing).
- **Execution Environment**: 
    - OS: Linux (Ubuntu 22.04).
    - Parameters: Temperature = 0.0, Seed = 42, Max Tokens = 4096.
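
The sampling parameters above could be passed to a local Ollama server as sketched below. This is an assumption about the setup, not the project's orchestration code (which is described as TypeScript-based; Python is used here for brevity). The helper names `build_request` and `generate` are hypothetical, the URL is Ollama's default local endpoint, and mapping "Max Tokens" to the `num_predict` option is an assumption.

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server (assumed, not from the source).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request with the listed parameters."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": 0.0,   # Temperature = 0.0
            "seed": 42,           # Seed = 42
            "num_predict": 4096,  # Max Tokens = 4096 (assumed mapping)
        },
    }


def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the request and return the generated text (requires a running server)."""
    body = json.dumps(build_request(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

A call such as `generate("mistral:latest", "State the 5x+1 map.")` would use the hypothesis-framing model listed above. A fixed seed and zero temperature make the hypothesis engines more repeatable, though identical output across different hardware or model builds is not guaranteed.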

---
*Technical Appendix for Zenodo Submissions*
