Most people who claim to do TDD are actually doing something I’d call test-decorated development: they write the code, it works, then they write tests to confirm it works. This is fine as a confidence mechanism. It is not TDD.
Real TDD runs the cycle the other way: write a failing test first, then write code to make it pass. The failing test forces you to think about what you actually want the code to do - its public interface, its behavior, its edge cases - before you’ve committed to any implementation details.
But there’s a subtler mistake that even genuine TDD practitioners make, and it’s the difference between horizontal slicing and vertical slicing.
Horizontal vs Vertical is a distinction that cctually matters
Horizontal slicing means writing all the tests for a layer before implementing it. Write all the model tests. Then write all the service tests. Then write all the controller tests. Then wire them together. It sounds organized. It produces garbage.
Here’s why: when you write tests for a component in isolation before knowing how the other components will actually behave, you’re making assumptions. Those assumptions get baked into mocks. The mocks lie. The tests pass. The integration is broken. You’ve spent a week on a testing pyramid that doesn’t model reality.
Vertical slicing means taking one user-visible behavior and implementing the full stack to make it work — one test at a time, from the public interface inward. Not “test all the parsing logic” but “test that when a device connects and sends a telemetry frame, the session records a lap.” One behavior. Full stack. Working and tested.
The practical result is that you always have working software. After the first vertical slice, you have one thing that works end-to-end. After the second, you have two. At no point do you have a beautifully tested data layer that can’t actually be used because the service layer isn’t done yet.
For agent-driven development, vertical slicing is not optional - it’s load-bearing. When you hand an agent an issue, it needs to produce a working, tested vertical slice. If the instruction is implement the session parsing layer”, you get a pile of code that may or may not integrate correctly. If the instruction is “implement: when a session is synced from a device, it should be queryable from the sessions screen”, you get code that either works end-to-end or fails an integration test that tells you exactly why.
Tests become the contract between the issue description and the implementation. If the agent’s
code passes the tests, the behavior is correct. If not, the test output is the feedback loop.