System prompts are crucial for defining an agent's behavior, persona, and operational guidelines within an agent-architecture. They establish the initial instructions for a large language model (LLM), setting its role, constraints, and how it should process subsequent inputs. As a core component of context-engineering, system prompts significantly influence agent performance, reliability, and robustness.
Prompt Design and Optimization
System prompts can be systematically improved through automated optimization, specific techniques, and structured frameworks.
Automated and Online Optimization
Ctx2Skill is a self-evolving framework that autonomously discovers, refines, and selects context-specific skills for language models.
It uses a multi-agent self-play loop with a Challenger, Reasoner, and Judge to evolve skills.
Proposer and Generator agents analyze failure cases to synthesize targeted skill updates.
A Cross-time Replay mechanism helps prevent adversarial collapse and ensures robust skill evolution.
Prompting Techniques and Frameworks
Collab-REC is a multi-agent framework that uses LLM-based agents with distinct roles (e.g., Personalization, Popularity, Sustainability) to generate diverse recommendations.
A non-LLM moderator merges and refines proposals from these agents through iterative constrained refinement.
LLM-assisted reranking can operationalize nuanced objectives in recommender systems beyond traditional engagement or accuracy metrics.
Without constraints, naive prompts in reranking can inadvertently amplify exposure to ideologically extreme or conspiratorial content.
Lightweight prompt-level regularization can reduce the promotion of extreme content and increase ideological diversity with modest relevance loss.
LLMs rerank based on statistical regularities in language rather than a semantic understanding of ideology.
Prompt design should be considered a value-laden rather than a neutral default process.
Prompt Fidelity and Artifacts
When prompting small language models (SLMs) for psychometric assessments, prompt artifacts can frequently overpower the semantic signal.
Prompt artifacts include variations in personas, instructions, items, and option symbols.
In such cases, models predominantly reflect prompt compliance rather than simulated psychological traits.
A prompt variation framework can be used as a diagnostic tool to identify destructive artifacts and isolate semantic understanding.
Persona and Role Design
Anthropomorphization in LLM-based conversational agents involves ascribing human-like qualities to non-human entities.
LLMs routinely generate interactional and linguistic cues, such as first-person self-reference and affective expressions, which can increase user engagement.
While anthropomorphization raises ethical concerns like deception, overreliance, and exploitative relationship framing, some argue it can support autonomy, well-being, and inclusion.
Internal Coherence Maximization (ICM) can generate persona-specific in-context examples to steer a model toward a target group's values without human supervision.
More coherent examples generalize substantially better than incoherent ones, even when individual label accuracy is held constant.
LLM agents can be created with various roles and specialized prompts, accessing different information parts to generate synthetic data.
Listener LLM agents can be conditioned on finetuned conversation goals to cover diverse scenarios.
Model Capabilities and Calibration
LLM refusals to follow benign instructions, like adopting a political position or persona, can signal a capability deficit rather than just safety guardrails.
"Ideological depth" is proposed as a property comprising a model's ability to follow political instructions (steerability) and the feature richness of its internal political representations.
Models with higher ideological depth (more steerable) activate significantly more distinct political features.
Instruction-tuned LLMs are less calibrated than base pre-trained models, and the chat template further aggravates this issue.
LLMs exhibit an "ownership bias," being significantly more confident in their own answers than in identical answers provided by a user.
An inference-time strategy of framing the model's answer as user input during confidence elicitation can reduce overconfidence and improve calibration by up to 26%.
Skill Portability and Security
LLM agents increasingly rely on reusable skills, but these lack portability due to agent frameworks' sensitivity to prompt formatting.
SkCC is a compiler for LLM agents that introduces classical compilation design to skill development.
SkCC uses SkIR, a strongly-typed intermediate representation, to decouple skill semantics from framework-specific formatting, enabling portable deployment across agent frameworks.
A static Optimizer within SkCC enforces security constraints, blocking vulnerabilities before deployment.
SkCC improves pass rates on benchmarks like claude-code and Kimi CLI.
Security and Robustness: Prompt Injection
Prompt injection (PI) attacks pose a significant threat to LLM-based applications, including automatic grading systems.
Attackers can exploit PI vulnerabilities to manipulate systems into assigning artificially high scores, regardless of actual answer quality.
Current LLM-based automatic grading systems remain highly vulnerable to prompt injection attacks.