Dataset
Dataset.from_jsonl(path, name=None, format="auto")
Load a dataset from a JSONL file.
| Parameter | Default | Description |
|---|---|---|
path | — | Path to the JSONL file. |
name | filename stem | Display name for the dataset. |
format | "auto" | Input format: "auto", "openai", "sharegpt", or "alpaca". |
Dataset.from_list(items, name="inline", format="auto")
Create a dataset from a list of dicts (same schema as JSONL lines).
| Parameter | Default | Description |
|---|---|---|
items | — | List of records in OpenAI, ShareGPT, or Alpaca format. |
name | "inline" | Display name for the dataset. |
format | "auto" | Input format: "auto", "openai", "sharegpt", or "alpaca". |
dataset.summary()
Return a dict with name, sample count, whether ideals are present, and metadata keys.
dataset.has_ideals()
Return True if every conversation has an ideal field.
Conversation
Each item in a dataset is aConversation:
| Property | Type | Description |
|---|---|---|
messages | list[Message] | The conversation messages. |
ideal | str | None | The reference answer. |
metadata | dict | Arbitrary metadata. |
prompt_messages | list[dict] | Messages as plain dicts, ready to send to a provider. |
last_user_message | str | None | The last user message in the conversation. |