AI paper index
SePO: Self-Evolving Prompt Agent for System Prompt Optimization
One-line summary
An AI research paper on SePO: Self-Evolving Prompt Agent for System Prompt Optimization.
Engineering notes
Engineering notes will be added by the aipentium editorial team.
Chinese explanation / 中文解读
中文解读待补充:本站会优先为大语言模型、生成式AI、ChatGPT相关技术、计算机视觉、深度学习等高价值论文补充中文说明。
Original abstract
System prompt optimization improves agent behavior without modifying the underlying model, yielding human-readable, model-agnostic instructions. Existing methods build a prompt agent that refines task agents' system prompts, yet leave the prompt agent's own system prompt hand-engineered and fixed. We propose Self-Evolving Prompt Optimization (SePO), which treats the prompt agent's own system prompt as an optimization target alongside task agents' system prompts. SePO adopts a self-referential design. A single prompt agent improves both task agents' system prompts and its own under an open-ended evolutionary search that maintains an archive of candidate prompts as stepping stones. Training proceeds in two stages: pre-training evolves the prompt agent on a multi-task pool, and fine-tuning then applies it to a target task. Across five benchmarks spanning math (AIME'25), abstract reasoning (ARC-AGI-1), graduate-level science (GPQA), code generation (MBPP), and logic puzzles (Sudoku), SePO consistently outperforms Manual-CoT, TextGrad, and MetaSPO, improving the average accuracy by 4.49 points compared to Manual-CoT. The prompt optimization skill from pre-training also generalizes to tasks beyond the pre-training mixture, rather than memorizing per-task prompts.
Links and sources
Need this topic turned into a technical roadmap?
aipentium can prepare a custom AI literature review, code map, dataset map, and B2B technology assessment.
Request B2B AI research
Comments