Data Access

Aurelius produces high-integrity alignment data by capturing adversarial prompts, model completions, and validator evaluations — all anchored by cryptographic hashes and enriched with structured metadata.

The resulting datasets are designed to support open research, reproducible evaluation, and practical alignment diagnostics.

What the Data Includes

Each validated record contains:

Prompt – The input used to elicit a model response
Response – The model’s unaltered completion
Validator Scores – Alignment evaluations across key dimensions
Tags – Categorical labels (e.g., toxicity, bias, jailbreak)
Reasoning Traces – Optional justifications from miners and validators
Mechanistic Metadata – When available, attention patterns, activation traces, or tool outputs
Hash Commitments – SHA-256 checksums ensuring full reproducibility

These artifacts form the foundation of the Aurelius Alignment Dataset — a living resource for alignment research and model auditing.

Access Methods

The protocol will support multiple modes of data access, tailored for different levels of technical and analytical use:

🔍 Public Dashboards

High-level summaries of alignment failures
Validator agreement trends
Dataset growth and category frequency over time

📦 Dataset Releases

Versioned exports of validated prompt–response pairs
Available in formats suitable for ML workflows (e.g., JSONL, CSV, Parquet)
Includes rubric metadata and schema documentation

⚙️ Programmatic Interfaces (API)

Query access for prompt/result pairs
Filtered access by tag, dimension, or rubric version
Rubric history and validator consensus lookups

Attribution and Licensing

All public data will include:

Versioning identifiers for traceability
Attribution guidelines for academic or commercial use
Clearly marked rubric versions and scoring standards at time of collection

Where appropriate, the protocol may adopt open data licenses that preserve integrity and ensure attribution without restricting research use.

Privacy and Model Confidentiality

For models under private or restricted evaluation:

Prompts and outputs may be encrypted or obfuscated
Model names, endpoints, and weights will not be exposed
Validator access will be restricted to essential scoring information
Audits will be conducted in isolated or secured compute environments

These safeguards protect model confidentiality while still producing alignment-relevant insights.

Summary

Aurelius is building a high-signal, high-integrity alignment dataset — not only for research, but for long-term transparency and safety across the AI ecosystem.

All data is reproducible, cryptographically verified, and schema-consistent
Access methods are tailored for both human and programmatic use
Privacy protections are in place for sensitive or private model evaluations
The dataset evolves as adversarial discovery and rubric logic mature

Alignment is not just a score — it’s a record. And that record belongs to the world.

What the Data Includes​

Access Methods​

🔍 Public Dashboards​

📦 Dataset Releases​

⚙️ Programmatic Interfaces (API)​

Attribution and Licensing​

Privacy and Model Confidentiality​

Summary​