AgentX

Drafting Framework

The drafting framework implements multi-model generation strategies that trade off speed, quality, and cost. Strategies are defined in drafting/drafting_strategies.yaml.

Drafting is disabled by default (AgentConfig.enable_drafting = False).

Strategies

Speculative Decoding

A fast draft model generates tokens that a stronger target model verifies, accepting or rejecting each batch.

sequenceDiagram
    participant D as Draft Model (fast)
    participant T as Target Model (strong)

    loop Until done or max iterations
        D->>D: Generate N draft tokens
        D->>T: Send draft for verification
        T->>T: Score each token
        T-->>D: Accept/reject (threshold)
    end

| Config | Description |
|---|---|
| draft_model | Fast model (e.g., gpt-3.5-turbo, llama3.2) |
| target_model | Strong model (e.g., gpt-4-turbo, claude-3.5-sonnet) |
| draft_tokens | Tokens per draft batch (20–30) |
| acceptance_threshold | Minimum score to accept (0.7–0.8) |
| max_iterations | Maximum draft-verify cycles |

Pre-configured strategies: fast_accurate, local_cloud, claude_fast
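
The draft-verify loop described above can be sketched roughly as follows. This is a minimal illustration, not the framework's implementation: `SpeculativeConfig`, `speculative_draft`, and the `draft`/`score`/`is_done` callables are hypothetical names, and the sketch assumes the target model returns one score per batch and that a rejected batch is simply discarded and redrafted.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-ins for real model clients (not part of the framework).
DraftFn = Callable[[List[str], int], List[str]]    # (context, n) -> draft tokens
ScoreFn = Callable[[List[str], List[str]], float]  # (context, batch) -> score in [0, 1]


@dataclass
class SpeculativeConfig:
    draft_tokens: int = 20             # tokens per draft batch
    acceptance_threshold: float = 0.7  # minimum score to accept a batch
    max_iterations: int = 5            # maximum draft-verify cycles


def speculative_draft(
    draft: DraftFn,
    score: ScoreFn,
    cfg: SpeculativeConfig,
    is_done: Callable[[List[str]], bool],
) -> Tuple[List[str], int]:
    """Return (accepted tokens, total number of tokens drafted)."""
    accepted: List[str] = []
    drafted = 0
    for _ in range(cfg.max_iterations):
        batch = draft(accepted, cfg.draft_tokens)
        if not batch:
            break  # the draft model has nothing more to propose
        drafted += len(batch)
        # Assumption: the whole batch is accepted or rejected on one score.
        if score(accepted, batch) >= cfg.acceptance_threshold:
            accepted.extend(batch)
        if is_done(accepted):
            break
    return accepted, drafted


# Toy models for demonstration only.
WORDS = "the quick brown fox jumps over the lazy dog".split()


def toy_draft(context: List[str], n: int) -> List[str]:
    return WORDS[len(context):len(context) + n]


def toy_score(context: List[str], batch: List[str]) -> float:
    return 0.9


cfg = SpeculativeConfig(draft_tokens=4, acceptance_threshold=0.7, max_iterations=5)
tokens, drafted = speculative_draft(
    toy_draft, toy_score, cfg, lambda t: len(t) >= len(WORDS)
)
print(" ".join(tokens), drafted)
```

The two counters returned here correspond loosely to the `draft_tokens` and `accepted_tokens` fields reported in the result structure.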

Pipeline

Multi-stage generation where each stage uses a different model with a specific role.

| Stage Role | Description |
|---|---|
| analyze / code | Initial generation |
| critique / review | Critical review |
| refine | Incorporate feedback |
| summarize | Final synthesis |

Each stage has its own model, system prompt, and temperature.

Pre-configured strategies: code_review (generate → review → refine), writing_pipeline (outline → draft → edit → polish), analysis_pipeline (decompose → research → synthesize)
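
As a rough illustration of how stages might chain, the sketch below passes each stage's output to the next. `Stage`, `run_pipeline`, and the `call_model` callable are hypothetical names introduced here, and the assumption that a stage sees only the previous stage's output (rather than the full history) is ours, not something the source states.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical model call: (model, system_prompt, temperature, input_text) -> output text
CallModel = Callable[[str, str, float, str], str]


@dataclass
class Stage:
    role: str            # e.g. "generate", "review", "refine"
    model: str           # model used for this stage
    system_prompt: str   # stage-specific instructions
    temperature: float   # stage-specific sampling temperature


def run_pipeline(
    stages: List[Stage], task: str, call_model: CallModel
) -> Tuple[str, int]:
    """Run stages in order; return (final text, number of stages completed)."""
    text = task
    completed = 0
    for stage in stages:
        # Assumption: each stage receives only the previous stage's output.
        text = call_model(stage.model, stage.system_prompt, stage.temperature, text)
        completed += 1
    return text, completed


# Placeholder model that records which model handled the text.
def echo_model(model: str, system_prompt: str, temperature: float, text: str) -> str:
    return f"{text} -> {model}"


code_review = [
    Stage("generate", "model-a", "Write the code.", 0.7),
    Stage("review", "model-b", "Review the code critically.", 0.2),
    Stage("refine", "model-a", "Apply the review feedback.", 0.3),
]
final_text, stages_completed = run_pipeline(code_review, "task", echo_model)
print(final_text, stages_completed)
```

The model names and prompts above are placeholders; the real values live in drafting_strategies.yaml.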

Candidate Generation

Generate multiple candidates and select the best using a scoring method.

| Scoring Method | Description |
|---|---|
| majority_vote | Most common answer wins |
| verifier | Separate model scores each candidate |
| length_preference | Prefer longer or shorter responses |

Pre-configured strategies: consensus (multi-model vote), best_of_n (N candidates + verifier), diverse_ensemble (varied models), self_consistency (same model, multiple samples)
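
The three scoring methods could look something like the sketch below. The function names and signatures are illustrative assumptions, as is the normalization (strip and lowercase) applied before voting; the verifier is modeled as any callable that returns a numeric score.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(candidates: List[str]) -> str:
    """Return the most common answer, comparing case- and whitespace-insensitively."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    normalized = [c.strip().lower() for c in candidates]
    winner, _ = Counter(normalized).most_common(1)[0]
    # Return the first original candidate that matches the winning answer.
    return candidates[normalized.index(winner)]


def select_with_verifier(
    candidates: List[str], verifier: Callable[[str], float]
) -> str:
    """Return the candidate the verifier scores highest."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    return max(candidates, key=verifier)


def length_preference(candidates: List[str], prefer: str = "longer") -> str:
    """Return the longest or shortest candidate."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    pick = max if prefer == "longer" else min
    return pick(candidates, key=len)


answers = ["42", " 42 ", "41"]
print(majority_vote(answers))
print(select_with_verifier(["a", "bbb", "cc"], verifier=len))
print(length_preference(["a", "bbb", "cc"], prefer="shorter"))
```

In practice the verifier would wrap a call to a separate scoring model; `len` is used here only so the example runs without one.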

Result Structure

DraftResult contains:

| Field | Type | Description |
|---|---|---|
| content | string | Final output |
| strategy | string | Strategy name |
| status | DraftStatus | "complete" or "failed" |
| draft_tokens | int | Tokens drafted |
| accepted_tokens | int | Tokens accepted (speculative) |
| models_used | list[string] | All models involved |
| stages_completed | int | Pipeline stages run |
| candidates_generated | int | Candidates produced |
| estimated_cost | float | Estimated USD cost |
| total_time_ms | float | Elapsed time in milliseconds |
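
For orientation, the fields above map naturally onto a dataclass like the one below. This is a sketch reconstructed from the field list, not the framework's actual definition: the default values and the `acceptance_rate` helper are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DraftStatus(str, Enum):
    COMPLETE = "complete"
    FAILED = "failed"


@dataclass
class DraftResult:
    content: str
    strategy: str
    status: DraftStatus
    # Defaults below are assumptions; the source lists only names and types.
    draft_tokens: int = 0
    accepted_tokens: int = 0
    models_used: List[str] = field(default_factory=list)
    stages_completed: int = 0
    candidates_generated: int = 0
    estimated_cost: float = 0.0  # USD
    total_time_ms: float = 0.0


def acceptance_rate(result: DraftResult) -> float:
    """Hypothetical helper: share of drafted tokens that were accepted."""
    if result.draft_tokens == 0:
        return 0.0
    return result.accepted_tokens / result.draft_tokens


result = DraftResult(
    content="example output",
    strategy="fast_accurate",
    status=DraftStatus.COMPLETE,
    draft_tokens=40,
    accepted_tokens=30,
    models_used=["draft-model", "target-model"],
)
print(result.status.value, acceptance_rate(result))
```

A ratio like this is one way to judge whether a speculative strategy's draft model is a good match for its target model.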

Task Defaults

The defaults section in drafting_strategies.yaml maps task types to strategies:

| Task | Strategy |
|---|---|
| general | fast_accurate |
| code | code_review |
| writing | writing_pipeline |
| analysis | analysis_pipeline |
| consensus | consensus |
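
A lookup over this mapping might be sketched as follows. The dictionary mirrors the table above, while `strategy_for` and its fallback to the `general` strategy for unknown task types are assumptions; the source does not say how unmapped tasks are handled.

```python
from typing import Dict

# Mirrors the defaults table; in the framework these come from drafting_strategies.yaml.
TASK_DEFAULTS: Dict[str, str] = {
    "general": "fast_accurate",
    "code": "code_review",
    "writing": "writing_pipeline",
    "analysis": "analysis_pipeline",
    "consensus": "consensus",
}


def strategy_for(task_type: str, defaults: Dict[str, str] = TASK_DEFAULTS) -> str:
    """Return the strategy for a task type.

    Assumption: unknown task types fall back to the "general" strategy.
    """
    return defaults.get(task_type, defaults["general"])


print(strategy_for("code"))
print(strategy_for("translation"))
```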