# OptimizationResult

Returned by PromptOptimizer.run(). Contains the optimized prompt, scores,
iteration history, and analysis.
## Properties
| Property | Type | Description |
|---|---|---|
| best_prompt | str | The highest-scoring prompt found |
| best_score | float | The score of the best prompt |
| iterations | list[IterationRecord] | All iteration records |
| converged | bool | Whether the score threshold was reached |
| baseline | EvalSnapshot \| None | Baseline eval snapshot (on held-out test set if split is enabled) |
| final | EvalSnapshot \| None | Final verification snapshot (on held-out test set if split is enabled) |
| train_size | int | Number of training examples used for optimization (0 if no split) |
| test_size | int | Number of held-out test examples used for baseline and final eval (0 if no split) |
| val_size | int | Number of validation examples tracked per-iteration (0 if val_ratio=0) |
| val_trajectory | list[float] | Val set mean score after each optimization iteration (empty if no val split) |
| early_stopped | bool | True if optimization was stopped early because the val score plateaued |
| batch_size | int | Per-iteration mini-batch size (0 = full training set was used) |
| p_value | float \| None | p-value from a paired significance test (Wilcoxon or t-test). None if fewer than 2 samples or scipy not installed |
| is_significant | bool \| None | True if p_value < 0.05. None when p_value is unavailable |
| total_eval_tokens | int | Total tokens used by the eval model across the run |
| total_reasoning_tokens | int | Total tokens used by the reasoning model across the run |
| strategy_name | str \| None | Strategy that was used |
| phase_history | list[dict] \| None | Auto mode phase breakdown |
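The is_significant flag is described above as a simple threshold on p_value, with None propagated when the test could not be run. A minimal sketch of that rule (the alpha parameter name and standalone function form are illustrative assumptions, not the library's code):

```python
def is_significant(p_value, alpha=0.05):
    """Threshold a paired-test p-value; None means the test was unavailable."""
    if p_value is None:
        return None  # fewer than 2 samples, or scipy not installed
    return p_value < alpha

print(is_significant(0.03))  # True
print(is_significant(None))  # None
```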
## Computed properties
| Property | Type | Description |
|---|---|---|
| score_trajectory | list[float] | Score at each iteration |
| improvement | float \| None | Absolute score improvement (final − baseline) |
| improvement_pct | float \| None | Improvement as a percentage of the baseline score |
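As a rough sketch of how these two computed properties relate to the baseline and final snapshots (this mirrors the descriptions above, not the library's actual code; the percentage is assumed to be taken relative to the baseline mean):

```python
def improvement(baseline_mean, final_mean):
    """Absolute score improvement: final mean minus baseline mean."""
    return final_mean - baseline_mean

def improvement_pct(baseline_mean, final_mean):
    """Improvement as a percentage of the baseline; None when baseline is 0."""
    if baseline_mean == 0:
        return None
    return 100.0 * (final_mean - baseline_mean) / baseline_mean
```

For example, a baseline mean of 0.50 and a final mean of 0.60 gives an improvement of 0.10 and an improvement_pct of 20%.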
## Methods

### summary()

Returns a formatted string with scores, trajectory, strategy analysis,
prompt diff, and before/after example.

### to_dict()

Serialize to a dictionary.

### to_json(path)

Save full results to a JSON file.

### save_best_prompt(path)

Write the optimized prompt to a text file.
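A rough sketch of what the two file-writing methods do; these standalone functions are illustrative stand-ins for the real methods, and the dictionary contents shown are hypothetical:

```python
import json
import tempfile
from pathlib import Path

def to_json(result_dict, path):
    """Sketch of to_json: dump the full result dictionary as JSON."""
    Path(path).write_text(json.dumps(result_dict, indent=2))

def save_best_prompt(best_prompt, path):
    """Sketch of save_best_prompt: write just the prompt as plain text."""
    Path(path).write_text(best_prompt)

with tempfile.TemporaryDirectory() as d:
    to_json({"best_score": 0.87, "converged": True}, f"{d}/result.json")
    save_best_prompt("You are a careful assistant.", f"{d}/best_prompt.txt")
    print(Path(f"{d}/best_prompt.txt").read_text())  # You are a careful assistant.
```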
# IterationRecord

A single optimization iteration.

| Property | Type | Description |
|---|---|---|
| iteration | int | Iteration number |
| system_prompt | str | The prompt used in this iteration |
| score | float | Overall score |
| scores_by_metric | dict[str, float] | Per-metric scores |
| reasoning | str | Agent’s reasoning for the change |
| eval_tokens | int | Tokens used by the eval model this iteration |
| reasoning_tokens | int | Tokens used by the reasoning model this iteration |
| change_summary | str | One-line description of what the agent changed (e.g. “Added output format constraints”) |
| val_score | float \| None | Validation set score for this iteration (None when val_ratio=0) |
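The per-iteration val_score values drive the early-stopping behavior noted under early_stopped above. A minimal sketch of one plausible plateau rule over the validation trajectory (the patience and min_delta names and defaults are assumptions, not the library's actual criterion):

```python
def plateaued(val_trajectory, patience=3, min_delta=0.001):
    """True if the best val score in the last `patience` iterations failed
    to beat the best earlier score by at least min_delta."""
    if len(val_trajectory) <= patience:
        return False
    best_before = max(val_trajectory[:-patience])
    return max(val_trajectory[-patience:]) < best_before + min_delta

print(plateaued([0.50, 0.62, 0.62, 0.62, 0.62]))  # True: no recent gain
print(plateaued([0.50, 0.55, 0.61, 0.68, 0.74]))  # False: still improving
```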
# EvalSnapshot

Scores from a single eval run (baseline or final).

| Property | Type | Description |
|---|---|---|
| mean_score | float | Mean score across all samples (and across runs when eval_runs > 1) |
| std_score | float | Std dev of per-run mean scores. 0.0 when eval_runs=1 |
| n_runs | int | Number of eval passes averaged to produce mean_score. 1 by default |
| scores_by_metric | dict[str, float] | Per-metric mean scores |
| system_prompt | str | The system prompt used |
| samples | list[SampleSnapshot] | Per-sample results (scores averaged across runs when eval_runs > 1) |
| total_tokens | int | Total tokens used by the eval model in this snapshot |
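The mean_score/std_score semantics above (mean across samples and runs; std dev of the per-run means, 0.0 for a single run) can be sketched as follows, assuming the raw data arrives as one list of per-sample scores per eval run — a hypothetical input shape for illustration:

```python
from statistics import mean, pstdev

def snapshot_stats(runs):
    """Aggregate per-run sample scores into (mean_score, std_score):
    the mean of the per-run means, and their population std dev
    (0.0 for a single run, matching eval_runs=1)."""
    run_means = [mean(r) for r in runs]
    std = pstdev(run_means) if len(run_means) > 1 else 0.0
    return mean(run_means), std

m, s = snapshot_stats([[0.8, 0.6], [0.9, 0.7]])  # run means 0.7 and 0.8
```

Here m is the average of the two run means (≈0.75) and s is their spread (≈0.05).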
# SampleSnapshot

A single sample’s input, output, and score.

| Property | Type | Description |
|---|---|---|
| input | str | The input prompt |
| response | str | The model’s response |
| ideal | str | The reference answer |
| score | float | Score for this sample |