2. Benchmark initialization
Maps the upstream call chain, records honest baselines at every tier, picks metrics + thresholds for the concordance gate, profiles to find hotspots.
In a fresh coding-agent session from the task directory, paste:
Time: 20–60 minutes (depends on baseline cost; the agent runs the slow function across tiers).
What to watch for:
task.yaml::metricsfilled in with thresholds the agent justified from upstream behaviorreference_outputs/<tier>/populated for every tiermemory/discoveries.mdhas the profiling notes
Stop here and run the Auditor. Copy the initialization auditor prompt below and paste it into a fresh coding-agent session from this task directory. Continue only after the verdict is PASS or WEAK.
Optional: reflect before you move on. While this session is still open, run the Reflection agent right here (no fresh session) to report anything confusing or missing about initialization. Recommended before you contribute.
Next -> Auditor agent.