Claude Code interview questions are showing up at AI-forward engineering teams the same way "do you use Git?" used to. The format isn't standardized — you're graded on real fluency with Anthropic's terminal coding agent, not docs recitation.
This guide is the breakdown: what gets asked, what the interviewer is measuring, and how to prep in a week. For timed mocks, Interview Coder runs the same drills.
What Is Claude Code and Why Companies Are Asking
Claude Code is Anthropic's CLI agent for coding. It lives in your terminal, reads your repo, runs commands, edits files, and ships PRs without you leaving the shell. It's been around since early 2025 and adoption at engineering teams has gone vertical.
Why interviewers care: if your team pays for a Max plan or burns API credits on agentic workflows, they want engineers who can drive the thing instead of chatting with it like ChatGPT. A senior who knows hooks, subagents, and MCP ships a week of work in an afternoon. A junior who doesn't know /plan burns tokens and breaks prod.
When a hiring manager asks about Claude Code, they're checking three things: can you use it, do you understand how it thinks, and can you measure if it helps.
The 3 Question Categories
Some teams do this as one panel. Others split across a recruiter screen, a working session, and an architecture round.
Usage Proficiency: 10 Questions
Warmups. Fumble these and the interview is over before the design round.
1. Walk me through your .claude/settings.json
Answer: Settings lives at ~/.claude/settings.json globally and .claude/settings.json per-project. I set my default model, output style, and lock permissions.allow for safe commands like git status and bun lint. Per-project is where I add hooks and MCP servers.
What they're testing: Whether you've opened the file or clicked through install.
2. What does /plan mode do and when do you use it?
Answer: Plan mode makes Claude Code think through the change before touching code. It outputs a numbered plan, you edit it, then it executes. I use it for anything touching more than two files.
What they're testing: The cost of letting the agent run blind.
3. Explain hooks and give one you'd actually write
Answer: Hooks are shell scripts that run on lifecycle events (PreToolUse, PostToolUse, Stop). First one I write on every machine: a PreToolUse hook on Bash that blocks rm -rf, git push --force, and DROP TABLE. Two lines, saves a bad afternoon.
What they're testing: That you take agent safety seriously.
4. What is MCP and why does it matter?
Answer: MCP is Model Context Protocol, the open standard for connecting LLMs to external tools. MCP servers let Claude Code talk to Notion, Linear, your database, your browser. I run Notion + Postgres + browser MCP on most projects.
What they're testing: Whether you've extended the agent past built-in tools.
5. How do you customize slash commands?
Answer: Slash commands live in .claude/commands/ as markdown files. Filename becomes the command. I keep a /ship that runs lint, build, test, commit, push in one keystroke.
What they're testing: Whether you've made the tool yours or use it stock.
6. What's a subagent and when does Claude Code spawn one?
Answer: A subagent is a child Claude instance with its own context window. The main agent dispatches one when a task is independent (codebase search, long investigation, boilerplate across 5 files). Output comes back as a summary so the parent's context stays clean.
What they're testing: Context window economics.
7. How do you handle long tasks without blowing context?
Answer: Dispatch subagents for search-heavy work, use /compact when the conversation gets long, write intermediate state to a markdown file so I can /clear and restart fresh.
What they're testing: Real operational experience.
8. Difference between /clear and /compact?
Answer: /clear wipes the conversation. /compact summarizes it into a brief and keeps going. Compact when I'm 60% through a task. Clear when starting something unrelated.
What they're testing: Surface area knowledge.
9. How do you give Claude Code persistent project context?
Answer: CLAUDE.md at the project root. The agent reads it on every session start. I put dev commands, architectural notes, code conventions, and "don't touch these files" rules in there.
What they're testing: Persistence vs ephemeral context.
10. Has the agent ever done something you didn't expect?
Answer: First time I let it loose, it rm -rf'd a build dir I needed. I added a hook to gate destructive commands and switched default permissions from auto-accept to ask-first for bash(rm:*).
What they're testing: Humility and real usage.
Architecture Understanding: 8 Questions
Where seniors get separated from juniors. Answers reveal whether you've thought about the system or just used it.
11. How does Claude Code decide which tool to call?
Answer: The model gets a system prompt describing each tool and its schema. On each turn it picks zero or more tools based on context. There's no router — the model does tool selection in-line, which is why prompt clarity and tool descriptions matter.
What they're testing: Tool selection is part of generation, not a separate layer.
12. When does the agent dispatch a subagent vs handle inline?
Answer: Subagents get dispatched when the task is independent, output can be summarized cheaply, and the work would pollute the parent's context. Heuristic: if I'd spawn a worker thread in real code, the agent probably spawns a subagent.
What they're testing: Whether you can predict agent behavior.
13. How is context managed across a long session?
Answer: Every tool call output goes back into context. Long reads and verbose command output eat tokens fast. The agent doesn't auto-prune; you compact, clear, or write to disk. Subagents help because their transcript stays out of the parent.
What they're testing: That you've felt the context ceiling.
14. What happens when Claude Code hallucinates a file path or API?
Answer: It errors and usually self-corrects next turn. The real failure is when it edits a file on a wrong assumption and never re-reads. I keep PostToolUse hooks that run bun lint after every Edit so type errors catch bad edits before they compound.
What they're testing: That you've debugged a real failure.
15. How do you reason about cost on a session?
Answer: Two drivers: input tokens (every file read, every command output) and output tokens. Max plan is flat-rate. On the API, a long agentic session hits 200k input tokens easily — I watch /cost and compact when working on big repos.
What they're testing: Unit economics.
16. What are output styles and when do you change them?
Answer: Output styles control verbosity and format. Default is fine for most things. I switch to concise for rapid iteration, explanatory when onboarding a junior or learning a new codebase.
What they're testing: Past-defaults exploration.
17. How does Claude Code differ architecturally from Cursor or Copilot?
Answer: Cursor is an IDE wrapper with inline completions and a chat sidebar. Copilot is mostly inline autocomplete with an agentic chat bolted on. Claude Code is terminal-first, fully agentic, with first-class tool use and hooks. Mental model: Copilot autocompletes, Cursor pairs, Claude Code ships.
What they're testing: Articulating tradeoffs.
18. What's the security model when the agent runs shell commands?
Answer: Permissions are gated by settings.json — allow, ask, or deny patterns like bash(git:*) or read(.env). Hooks add another layer: a PreToolUse hook can inspect the command and exit non-zero to block. For prod-adjacent work I run in a sandboxed container with no real credentials.
What they're testing: Blast radius thinking.
Practical Scenarios: 7 Questions
Where candidates leak. The interviewer hands you a scenario and watches how you drive the agent.
19. "Refactor this 800-line auth handler into smaller modules." Walk me through it.
Answer: /plan first so we agree on the breakdown. I'd ask it to map current responsibilities, propose 4-6 modules, list test coverage gaps. Execute one module at a time with lint + tests between each. No bulk rewrites — each module ships as its own commit.
What they're testing: That you don't yolo a big refactor.
20. "Design a workflow for shipping a new Stripe webhook handler."
Answer: Add a CLAUDE.md note on our webhook conventions (idempotency, signature verification, error handling). /plan with test cases first. Let the agent generate handler and tests. Use my /ship slash command for lint, type check, tests, PR. 30-45 min vs 2 hours by hand.
What they're testing: Repeatable workflow design.
21. The agent keeps editing files you don't want it to touch.
Answer: Add a permissions.deny block for those paths. Softer: note in CLAUDE.md "never edit files under src/legacy/". Hardest gate: a PreToolUse hook on Edit that blocks the path.
What they're testing: Knowing the three layers of control.
22. Common pitfall when devs first adopt Claude Code?
Answer: Letting it run unattended on big changes. Second: not customizing it — they use it stock, decide it's mid, uninstall. The agent is a fraction as useful without CLAUDE.md, hooks, and slash commands.
What they're testing: Field experience.
23. The agent loops on a broken test. Debug it.
Answer: Stop it. Read the test myself. Either the test is wrong, or the agent has the wrong mental model of the codebase. If the latter, add missing context to CLAUDE.md and /clear. Loops are almost always a context problem.
What they're testing: Diagnostic instinct.
24. The agent ran out of context mid-task. What now?
Answer: Have it write current state to WIP.md. /clear. Start a new session that reads WIP.md first. Also a moment to ask whether the task should be split into smaller commits.
What they're testing: Operational reflex.
25. How do you keep Claude Code from over-engineering?
Answer: A line in CLAUDE.md: "no premature abstraction, no helpers until used twice, no new dependencies without asking." Plus /plan for anything non-trivial so I can strip gold-plating before it ships.
What they're testing: Taste.
Evaluation Questions: 5 Questions
These catch people off guard because they're not technical.
26. How do you measure if Claude Code makes your team faster?
Answer: Three metrics: PRs per week per dev, PR review cycle time, incident rate. First two should go up, third should stay flat. Qualitative pulse: ask devs how much of their last PR they'd attribute to the agent. Under 30% means they're not using it well yet.
What they're testing: Outcomes thinking, not vibes.
27. How does Claude Code compare to Cursor for a team?
Answer: Cursor wins for devs who live in their IDE and want pair-programming. Claude Code wins for terminal-first devs who want automation. On our team most seniors converged on Claude Code, most juniors started on Cursor.
What they're testing: A recommendation without dogma.
28. What's the ROI conversation with finance?
Answer: Max plan is ~$200/dev/month. One extra PR per month is 10-20x return at typical loaded engineering cost. Harder question is whether outputs hit the same quality bar — that's what review processes are for.
What they're testing: A budget conversation, not just a tech one.
29. Where does Claude Code make engineers worse?
Answer: It atrophies debugging skills if you always reach for it first. Juniors stop reading stack traces. Mitigation: a "no-agent Wednesday" to keep manual debugging instincts sharp.
What they're testing: Honesty. Most candidates pitch it as pure upside.
30. Given a week to evaluate Claude Code for our team, what would you do?
Answer: Day 1: install for 3 volunteers, write a starter CLAUDE.md. Day 2-4: each dev ships one real feature. Day 5: retro. Day 6: writeup with PR throughput data. Day 7: present. Concrete beats "we'll try it out."
What they're testing: Running a structured evaluation.
How to Prep: A 1-Week Plan
Don't read docs for a week. Use the thing.
Day 1 — Install and configure. Install Claude Code. Write a CLAUDE.md for a real project. Set up ~/.claude/settings.json with model, allowed commands, a couple denies. Read the quickstart once.
Day 2 — Ship a feature. Pick a small thing in your side project. Use /plan end to end. Note where it struggled and where it crushed.
Day 3 — Hooks and slash commands. Write a PreToolUse hook that blocks destructive shell commands. Write a /ship slash command for lint/test/commit/push.
Day 4 — MCP. Install at least one MCP server (Notion or browser are easy wins). Use it on a real task.
Day 5 — Subagents and context. Run a task big enough to hit your context limit. Force yourself to use /compact and subagent dispatch.
Day 6 — Compare. Same task in Cursor and Claude Code. Note the difference in time, output quality, cognitive load. That's your interview answer.
Day 7 — Narrate. Record yourself walking through your workflow out loud. Play it back. First time hurts. That's the point.
For structured mock sessions, Interview Coder runs timed agent-workflow drills.
FAQ
How much does Claude Code cost?
Max plan is ~$200/month for heavy use. Pro is $20/month with lower limits. Or run on API credits per token — fine for occasional use, expensive at scale. Anthropic pricing here.
Terminal vs IDE — does it matter?
For a lot of devs, yes. Terminal-first means Claude Code composes with everything in your shell — tmux, vim, scripts, CI. IDE agents are easier on day one and harder to extend on day thirty.
Max plan vs API — which?
Max plan if you use it daily. API if it's experimental. Most engineers hit Max plan economics within a week of real adoption.
How does it compare to Copilot Agent or Cursor Composer?
Copilot Agent is tied to the GitHub editor. Cursor Composer is the sharpest IDE agent right now. Claude Code wins on extensibility (hooks, MCP, custom skills) and terminal workflows. The Anthropic Engineering blog covers internal adoption patterns.
Will these questions become standard?
At AI-forward shops, they already are. At traditional enterprises, 12-18 months. Either way, "can you drive an agent well" matters for the next decade of hiring.
Related Reading
Want to practice agent-driven coding interviews with real-time feedback? Try Interview Coder free and run timed mocks that mirror what AI-forward teams ask.