
Supported formats

aevyra-verdict accepts JSONL and CSV files. For JSONL, the record format is auto-detected from the first record; pass format= explicitly to override. CSV is loaded with from_csv(), which takes configurable column names.
The native format, used by default, looks like this:
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "ideal": "The capital of France is Paris.",
  "metadata": {"category": "factual", "difficulty": "easy"}
}
Field     Required  Description
messages  Yes       Array of {role, content} objects. Roles: system, user, assistant.
ideal     No        Reference answer. Required for ROUGE, BLEU, and exact match metrics.
metadata  No        Arbitrary key-value pairs for filtering and grouping.
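
The field rules above can be checked with a few lines of plain Python before loading. This is an illustrative sketch using only the stdlib, not part of the aevyra-verdict API:

```python
import json

VALID_ROLES = {"system", "user", "assistant"}

def validate_record(line: str) -> dict:
    """Parse one JSONL line and check the native-format fields."""
    record = json.loads(line)
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("'messages' must be a non-empty array")
    for msg in messages:
        if msg.get("role") not in VALID_ROLES:
            raise ValueError(f"unknown role: {msg.get('role')!r}")
        if "content" not in msg:
            raise ValueError("each message needs a 'content' field")
    # 'ideal' and 'metadata' are optional, so no checks for them
    return record

line = '{"messages": [{"role": "user", "content": "Hi"}], "ideal": "Hello"}'
record = validate_record(line)
```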

Loading

from aevyra_verdict import Dataset

# JSONL — auto-detect format (default)
dataset = Dataset.from_jsonl("data.jsonl")

# JSONL — explicit format
dataset = Dataset.from_jsonl("sharegpt_data.jsonl", format="sharegpt")
dataset = Dataset.from_jsonl("alpaca_data.jsonl", format="alpaca")

# Arbitrary JSONL with custom field names
dataset = Dataset.from_jsonl("data.jsonl", input_field="question", output_field="answer")
dataset = Dataset.from_jsonl("data.jsonl", input_field="prompt", output_field=None)  # label-free

# CSV — default column names (input, ideal)
dataset = Dataset.from_csv("data.csv")

# CSV — custom column names
dataset = Dataset.from_csv("data.csv", input_field="article", output_field="summary")

# CSV — label-free (no reference answers)
dataset = Dataset.from_csv("data.csv", output_field=None)

print(dataset.summary())
# {'name': 'data', 'num_conversations': 50, 'has_ideals': True, 'metadata_keys': ['category']}
Or inline, without a file:
dataset = Dataset.from_list([
    {"messages": [{"role": "user", "content": "Hello"}], "ideal": "Hi there"},
])

# ShareGPT inline
dataset = Dataset.from_list(sharegpt_records, format="sharegpt")
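
The JSONL loaders above auto-detect the format from the first record, but the docs don't specify how. One plausible heuristic keys off the record's top-level field names; the ShareGPT and Alpaca keys below follow those formats' common conventions and are assumptions, not aevyra-verdict's actual logic:

```python
def detect_format(record: dict) -> str:
    """Guess a record's format from its top-level keys (illustrative only)."""
    if "messages" in record:
        return "native"
    if "conversations" in record:  # ShareGPT convention
        return "sharegpt"
    if "instruction" in record and "output" in record:  # Alpaca convention
        return "alpaca"
    raise ValueError(f"unrecognized record keys: {sorted(record)}")
```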

Label-free datasets

When your data has no reference answers, omit output_field (or pass output_field=None) and use LLMJudge or a CustomMetric. Reference-based metrics (RougeScore, BleuScore, ExactMatch) raise a clear error upfront rather than failing mid-run.
dataset = Dataset.from_csv("questions.csv", output_field=None)
print(dataset.has_ideals())  # False
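
The upfront check described above can be pictured as a guard like the following. This is a sketch of the behavior, not aevyra-verdict's actual implementation:

```python
# Metric names taken from the docs; the guard logic itself is assumed.
REFERENCE_BASED = {"RougeScore", "BleuScore", "ExactMatch"}

def check_metrics(metric_names, has_ideals: bool) -> None:
    """Fail fast when reference-based metrics meet a label-free dataset."""
    if has_ideals:
        return
    offending = [m for m in metric_names if m in REFERENCE_BASED]
    if offending:
        raise ValueError(
            f"dataset has no reference answers, but {offending} require them; "
            "use LLMJudge or a CustomMetric instead"
        )
```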

Filtering

Filter by any metadata field. Multiple filters are ANDed together.
hard = dataset.filter(difficulty="hard")
reasoning = dataset.filter(category="reasoning", difficulty="hard")
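
The AND semantics are equivalent to matching every keyword argument against each conversation's metadata. A plain-Python sketch of that behavior (the record shape here mirrors the native format but is otherwise an assumption):

```python
def filter_conversations(conversations, **criteria):
    """Keep conversations whose metadata matches every criterion (AND)."""
    return [
        conv for conv in conversations
        if all(conv.get("metadata", {}).get(k) == v for k, v in criteria.items())
    ]

data = [
    {"metadata": {"category": "reasoning", "difficulty": "hard"}},
    {"metadata": {"category": "factual", "difficulty": "hard"}},
]
# Only the first record matches both criteria.
hard_reasoning = filter_conversations(data, category="reasoning", difficulty="hard")
```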

CLI

Preview a dataset without running any models:
aevyra-verdict inspect examples/sample_data.jsonl