Frequently Asked Questions


What is Aurelius?

Aurelius is a decentralized protocol for red-teaming AI models. It incentivizes miners to discover alignment failures and validators to verify and score them. The protocol produces structured, reproducible datasets that can be used to evaluate model risk, improve safety, and support alignment research.


What problem does Aurelius solve?

Today’s language models are often brittle under pressure. Most alignment testing is centralized, static, or narrow in scope. Aurelius creates an open, incentive-driven pipeline to continuously uncover, evaluate, and record misaligned behavior — especially in edge cases that escape traditional testing.


How are miners and validators rewarded?

Miners earn rewards for surfacing high-value failures: novel, severe, and clearly documented examples of misalignment. Validators earn rewards for accurate, consensus-aligned scoring and helpful annotations. All emissions are distributed based on contribution quality and reproducibility.
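As a rough illustration of how contribution-weighted rewards can work (the actual score components and weights used by Aurelius are not specified here, so everything in this sketch is an assumption), a submission's quality score might blend novelty, severity, and documentation quality, with the emission pool split pro rata:

```python
# Illustrative only: a toy emission split, not Aurelius's actual reward logic.
# The score components and weights below are assumptions for the example.
from dataclasses import dataclass

@dataclass
class Submission:
    miner: str
    novelty: float        # 0-1, how new the failure mode is
    severity: float       # 0-1, how harmful the observed behavior is
    documentation: float  # 0-1, clarity and reproducibility of the report

def submission_score(s: Submission,
                     w_novelty: float = 0.4,
                     w_severity: float = 0.4,
                     w_docs: float = 0.2) -> float:
    """Weighted quality score for one submission (hypothetical weights)."""
    return w_novelty * s.novelty + w_severity * s.severity + w_docs * s.documentation

def split_emissions(submissions: list[Submission], total_emission: float) -> dict[str, float]:
    """Distribute a fixed emission pool pro rata to submission scores."""
    scores = {s.miner: submission_score(s) for s in submissions}
    total = sum(scores.values()) or 1.0
    return {miner: total_emission * score / total for miner, score in scores.items()}
```

In the live protocol, validator consensus and reproducibility checks would also feed into these scores before any emissions are distributed.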


What kinds of models can be evaluated?

Aurelius initially targets open-source LLMs. Over time, it will support secure audits of closed-source models and offer interfaces for model developers who wish to benchmark their own systems.


How does Aurelius ensure validator quality?

Validator performance is measured using:

  • Agreement with peer validators (consensus)
  • Performance on known "control" test cases
  • Consistency in rubric application
  • Participation rate and review quality

Low-performing validators may be down-ranked or removed. High performers gain greater influence and rewards.
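A minimal sketch of how those four signals could be folded into a single validator quality score, assuming each signal is normalized to 0-1; the weights shown are placeholders, not the protocol's real values:

```python
# Hypothetical combination of the four validator-quality signals listed above.
# Weights are assumptions, not Aurelius's actual formula.
def validator_quality(consensus_agreement: float,
                      control_accuracy: float,
                      rubric_consistency: float,
                      participation: float) -> float:
    """Combine the four signals (each normalized to 0-1) into one score."""
    weights = {
        "consensus": 0.4,      # agreement with peer validators
        "controls": 0.3,       # performance on known control test cases
        "consistency": 0.2,    # stable rubric application over time
        "participation": 0.1,  # participation rate and review quality
    }
    return (weights["consensus"] * consensus_agreement
            + weights["controls"] * control_accuracy
            + weights["consistency"] * rubric_consistency
            + weights["participation"] * participation)
```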


What is the Tribunate?

The Tribunate is the protocol’s governing logic layer. It defines scoring rubrics, configures incentive logic, monitors validator behavior, and evolves the rules of evaluation. Initially centralized, it will transition to a contributor-driven governance process as the protocol matures.
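For illustration, a scoring rubric published by the Tribunate could look like the following configuration; the dimension names, scales, and weights are hypothetical:

```python
# A hypothetical rubric configuration; dimensions and weights are illustrative only.
SCORING_RUBRIC = {
    "severity":        {"weight": 0.35, "scale": "0-10", "desc": "Potential for real-world harm"},
    "novelty":         {"weight": 0.25, "scale": "0-10", "desc": "New failure mode vs. known cases"},
    "reproducibility": {"weight": 0.25, "scale": "0-10", "desc": "Failure recurs under the documented setup"},
    "clarity":         {"weight": 0.15, "scale": "0-10", "desc": "Quality of the written report"},
}
```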


What happens to the data?

Validated submissions — including prompts, responses, scores, tags, and reasoning traces — are compiled into structured, reproducible alignment datasets. These datasets support downstream use cases in research, evaluation, and model fine-tuning.
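To make that structure concrete, one validated submission might be stored as a record like the sketch below; the field names are illustrative, not the dataset's actual schema:

```python
# Hypothetical record layout for one validated submission (illustrative fields).
from dataclasses import dataclass, field

@dataclass
class AlignmentRecord:
    prompt: str                     # adversarial prompt submitted by the miner
    response: str                   # the model's output
    model_id: str                   # model and version under test
    scores: dict[str, float]        # rubric dimension -> validator score
    tags: list[str] = field(default_factory=list)  # e.g. ["deception", "jailbreak"]
    reasoning_trace: str = ""       # validator notes explaining the score
```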


Is the data publicly available?

Aurelius prioritizes open access for alignment researchers, academic groups, and nonprofits. Access to certain tools or datasets may be gated in the future, but the core alignment data will remain available for scientific use.


Can this improve models?

Yes. The protocol generates high-signal failure data that can be used to:

  • Retrain or fine-tune models for safety
  • Stress-test new releases
  • Benchmark alignment progress
  • Analyze failure types across model versions

Aurelius functions as an external, adversarial feedback loop for model refinement.
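As a sketch of that feedback loop, assuming records shaped like the hypothetical AlignmentRecord above (field names and the severity threshold are assumptions), a downstream user could filter severe failures and turn them into training pairs:

```python
# Illustrative downstream use: select severe failures from validated records
# and convert them into pairs for safety fine-tuning or stress testing.
def select_failures(records: list[dict], min_severity: float = 0.8) -> list[dict]:
    """Keep records scored as severe failures (threshold is a placeholder)."""
    return [r for r in records if r["scores"].get("severity", 0.0) >= min_severity]

def to_training_pairs(records: list[dict]) -> list[dict]:
    """Turn failure records into prompt/response pairs flagged as undesirable."""
    return [{"prompt": r["prompt"], "rejected": r["response"]} for r in records]
```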


Is this only for LLMs?

The initial focus is on text-based models, but the architecture is extensible. Over time, Aurelius may support alignment evaluation for vision models, agents, or other generative systems.


Is Aurelius open source?

Yes. The protocol code, validator interface, rubric logic, and documentation are all open source. Long-term governance will also move toward public participation and transparency.


How can I participate?

  • Run a miner: Check out our GitHub, then create adversarial prompts and surface misaligned outputs
  • Join as a validator: Check out our GitHub, then evaluate submissions and help refine the scoring rubric
  • Contribute to governance: Contact Aurelius.subnet@gmail.com to participate
  • Collaborate on research: Contact Aurelius.subnet@gmail.com to use Aurelius data in your own alignment projects

Generally, to get involved, visit the documentation or join the official community channels.