🧮 Validators

Validators are independent evaluators who assess whether a miner's submission meaningfully exposes model misalignment. They act as decentralized auditors, verifying alignment failures using protocol-defined tools, rubrics, and reproducible model conditions.

Validators do not generate outputs — they evaluate miner submissions and decide which deserve inclusion in the Aurelius Alignment Dataset.

What Validators Do

For each submission, validators:

Re-run the prompt using the same model and settings
Confirm the response matches the claimed output and hash
Recalculate tool-based alignment scores
Judge the fidelity, severity, and novelty of the alignment failure
Submit an evaluation score, categorical tags, and optional reasoning trace

Only submissions verified by validators are eligible for inclusion in datasets or rewards.

Dimensions of Evaluation

Validator assessments are based on a rubric defined by the Tribunate, with configurable weights across key dimensions:

Factuality – Does the response hallucinate or misrepresent facts?
Ethics – Does it reflect unsafe, deceptive, or unethical reasoning?
Coherence – Is the output logically and internally consistent?
Intent Alignment – Does it fulfill the prompt in a way consistent with safety and human intent?
Harm Risk – Could this output cause real-world harm, mislead, or be misused?

Validators may also apply protocol-defined tags such as jailbreak, manipulation, or hallucination.

Scoring and Consensus

Validators submit:

A composite alignment signal score
Dimension-specific subscores
Structured tags and optional comments

Multiple validators assess each submission. The protocol aggregates their responses to identify consensus, flag disagreement, and update agent reputations.

Validators are rewarded for accuracy, reproducibility, and alignment with their peers. Consistent deviation or low-effort scoring may lead to downranking or exclusion.

Incentive Structure

Validators earn emissions when:

Their evaluations align with consensus
They correctly identify high-signal alignment failures
They enrich the dataset with accurate tags or comments

They lose rewards when:

They fail to participate
They misreport, overlook, or exaggerate misalignment
Their scores deviate meaningfully from peer evaluations without justification

This system rewards thoughtful, reproducible judgment — not conformity or automation.

Tools and Assistance

Validators may use:

Alignment assessment tools (e.g., moderation APIs, deception classifiers)
LLMs to assist in ambiguous edge cases (with caution)
Historical validator data to inform calibration

However, validators remain fully responsible for their submitted judgments. The Tribunate discourages blind reliance on outside models or heuristic shortcuts.

Risks and Failure Modes

Validators must avoid:

Rubric drift – Informally modifying evaluation standards
Score inflation – Over-rating submissions to avoid controversy
Collusion – Forming validator groups that game consensus
Low-effort tagging – Skipping important metadata or commentary

The Tribunate monitors validator behavior, adjusts reward weightings, and may conduct audits to preserve integrity.

Long-Term Role

As the protocol evolves, validators will take on deeper responsibilities:

Dataset stewards – Ensuring only validated, high-integrity examples are retained
Rubric designers – Helping refine alignment dimensions and scoring weights
Protocol guardians – Evaluating not just submissions, but peer validators and rubric edge cases

In time, Aurelius may support domain-specific validator guilds — specializing in medical, legal, financial, and other high-risk contexts.

Validators transform raw misalignment into structured signal — sharpening discovery into measurable, usable data. They are the peer reviewers of a decentralized alignment engine.

What Validators Do​

Dimensions of Evaluation​

Scoring and Consensus​

Incentive Structure​

Tools and Assistance​

Risks and Failure Modes​

Long-Term Role​