◆ Claude 4.5 Sonnet (Claude Code) · ChatGPT (GPT-5) + Cursor · Gemini 2.5 Pro Deep Think
Code Debugging + Refactoring.
Intent — Diagnose stack traces, refactor monster functions
The hottest LLM battleground. All three frontier models post strong SWE-bench numbers, but real-world usage diverges sharply: Claude Code can operate agentically on a repo, GPT-5 has the deepest Cursor integration, and Gemini 2.5 Pro digests an entire codebase before making architectural decisions.
01
Claude 4.5 Sonnet (Claude Code)
◆ Recommended prompt
Here's a TypeScript Next.js project. The /api/comments endpoint intermittently returns a 500; see the attached stack trace and the route handler. Diagnose the root cause, propose a fix, then implement it across all relevant files. Run the typecheck afterwards.
✓ Strengths
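- **Claude Code agentic mode**: reads, writes, and runs code directly, so there is no pasting code back and forth
- Deep grasp of code "intent": catches subtle bugs such as race conditions and TOCTOU (time-of-check to time-of-use) issues
- Strongest coherence on long-context refactors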
✗ Weaknesses
- Claude Code is a CLI / IDE plugin, not a built-in editing experience like Cursor
- Web search integration requires Computer Use
- The Pro subscription has message quota limits
When to use
Full-stack debugging, refactoring, and agentic tasks (let it run the tests and apply the fixes itself).
02
ChatGPT (GPT-5) + Cursor
◆ Recommended prompt
// In Cursor: select the failing function and use Cmd+K
Fix this race condition in the rate limiter: the SET-IF-NOT-EXISTS check has a TOCTOU window. Use atomic INCR with TTL instead.
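To make the fix requested in this prompt concrete, here is a minimal sketch of a fixed-window rate limiter built on a single increment-with-TTL call rather than a separate "check, then set" pair. The names `CounterStore`, `InMemoryCounterStore`, `incrWithTtl`, and `allowRequest` are hypothetical and not taken from any real library. The in-memory store only stands in for a shared store such as Redis, where the same operation would need to be made atomic on the server side (for example by running INCR and EXPIRE inside one script or transaction).

```typescript
// Hypothetical store interface: one call that increments the counter and
// starts the TTL on the first write, with no gap between check and update.
interface CounterStore {
  incrWithTtl(key: string, ttlMs: number, now?: number): number;
}

// In-memory stand-in (an assumption for illustration). The method is
// synchronous, so within a single Node.js process no other request can
// interleave between reading the counter and writing it back.
class InMemoryCounterStore implements CounterStore {
  private entries = new Map<string, { count: number; expiresAt: number }>();

  incrWithTtl(key: string, ttlMs: number, now: number = Date.now()): number {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      // First hit in a new window: start at 1 and set the expiry.
      this.entries.set(key, { count: 1, expiresAt: now + ttlMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Decide from the value returned by the increment itself. There is no
// earlier "does the key exist?" read whose answer could go stale.
function allowRequest(
  store: CounterStore,
  clientId: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  const count = store.incrWithTtl(`rate:${clientId}`, windowMs, now);
  return count <= limit;
}
```

The point of the design is that the limiter never reads state and then acts on it in a second step. Every request increments first and is judged by the number it gets back, which closes the TOCTOU window the prompt describes.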
✓ Strengths
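- **Deepest Cursor / Copilot integration**: inline completion, ghost text, and Cmd+K inside the IDE
- GPT-5 continues to lead on LiveCodeBench / SWE-bench
- The deep reasoning of o1 / o1 Pro is especially strong on complex logic problems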
✗ Weaknesses
- Long-context refactors are less coherent than Claude's
- Code Interpreter / Canvas sometimes over-engineers
- o1 Pro is available only on the Pro plan
When to use
Daily coding inside the IDE, inline completion, and short tasks: bug fixes, adding a feature flag, writing tests.
03
Gemini 2.5 Pro Deep Think
◆ Recommended prompt
[Attach: full 50k-line codebase via Repo extension]
This Next.js app is migrating from Pages Router to App Router. Read the entire codebase, identify all API routes, page components, and shared utilities, and produce: (1) a migration order with a dependency graph, (2) a list of breaking changes, (3) a per-file refactor checklist. Don't skip any file.
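One way to read "migration order with dependency graph" is as a topological sort: a file is migrated only after everything it imports has been migrated. The sketch below illustrates that idea under stated assumptions. The `DependencyGraph` type, the `migrationOrder` function, and the file paths in the usage note are hypothetical, and nothing here describes how Gemini produces its answer.

```typescript
// Maps each file to the files it imports (an assumed input shape).
type DependencyGraph = Record<string, string[]>;

// Depth-first topological sort: dependencies come before their dependents.
function migrationOrder(graph: DependencyGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (file: string): void => {
    if (state.get(file) === "done") return;
    if (state.get(file) === "visiting") {
      // Reaching a file that is still on the current path means a cycle.
      throw new Error(`Circular dependency involving ${file}`);
    }
    state.set(file, "visiting");
    for (const dep of graph[file] ?? []) visit(dep);
    state.set(file, "done");
    order.push(file);
  };

  for (const file of Object.keys(graph)) visit(file);
  return order;
}
```

For a graph in which `pages/index.tsx` imports `lib/api.ts` and `components/Comment.tsx`, and `components/Comment.tsx` imports `lib/api.ts`, the function returns the shared utility first, then the component, then the page. A circular import raises an error instead of producing an order, which is a useful signal to break the cycle before migrating.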
✓ Strengths
- **1M token context**: can read a complete 50k-line codebase before giving architectural advice
- Understands monorepo / cross-package dependencies better than the other two
- Deep Think mode helps on "think first, then answer" tasks
✗ Weaknesses
- Inline IDE integration is weaker than GPT + Cursor
- Agentic coding toolchain is less mature than Claude Code
- Slower responses (in Deep Think mode)
When to use
Large refactors, migrations, architecture reviews, and planning work that depends on reading the whole codebase first.
◆ Bottom line
Agentic repo operations → Claude Code. Cursor / Copilot inline completion → GPT-5. Refactoring a large codebase after full understanding → Gemini 2.5 Pro.