I spent three months recording every coaching session with my clients' permission. Not because I thought I'd go back and listen to the recordings. I never do, and that's not the point. I wanted the transcript. Specifically, I wanted to be able to paste sections into Margaret and ask questions I couldn't ask in real time: What did she keep circling back to? Where did her language shift? What did she say at minute eight that she quietly contradicted by minute forty? That's the real case for AI transcription for coaches: not the recording itself, but what becomes possible when you have the text and a good AI layer on top of it.
Here's what I learned about which tools actually work, what to do with the output, and where I've drawn hard lines.
A transcript you never use is just a longer version of notes you never wrote
I want to start here because it's where most coaches stall out. You sign up for a transcription tool, you record a few sessions, you get the transcript in your inbox, you skim it, you feel vaguely responsible about having it, and then you never open it again. I know because that was my first three weeks.
The transcript by itself is almost useless. It's a wall of text with every "um" and false start included, and reading through a full session transcript takes nearly as long as the session itself. Nobody has time for that. What makes transcription worth doing is what you do with the transcript afterward, and for me that means running specific parts of it through Margaret with very specific questions.
I wrote about my session notes workflow in an earlier piece: the context sandwich prompt that takes raw notes and turns them into something I can actually use before the next call. Transcription feeds that same process, but with much richer raw material. Instead of working from my own shorthand and a sixty-second voice memo, I can pull the exact passage where a client's language changed and ask Margaret what she notices. The notes that come out the other side are noticeably more specific.
But it only matters if you build the workflow. The tool alone does nothing.
Getting consent right
This has to come first. Recording a coaching session without explicit, informed consent isn't just ethically wrong; it would destroy the trust that makes the work possible. I want to be specific about how I handle this because the legalistic version ("by continuing this session you consent to recording") is exactly the wrong posture for a coaching relationship.
Here's roughly what I say, usually in our first session or when I introduce the practice:
"I've started recording our sessions so I can capture things more accurately in my notes afterward. I don't listen back to the recordings. What I use is the transcript, and only to help me prepare for our next conversation. The recording is deleted after I've processed the notes. If you'd rather I not record, that's completely fine and it won't change anything about how we work together. And if you say yes now and change your mind later, just tell me."
That's it. No release form (though I do note their verbal consent in my records). No lengthy explanation of the technology. The key is the last two sentences: they make it easy to say no, and easy to change their mind later. About two out of every ten clients have declined. That's fine. For those sessions, I use my old approach: shorthand notes and a voice memo on my walk afterward.
One thing I've learned: clients are more comfortable when you tell them what you're NOT doing with the recording. "I don't listen back" and "the recording gets deleted" matter more to people than any explanation of what you are doing.
Fathom: the one I actually use
I'll tell you what I recommend and then I'll tell you what's wrong with it.
Fathom is the transcription tool I've settled on after testing four options over those ninety days. It runs quietly in the background during Zoom sessions, produces a transcript and an AI-generated summary, and stays out of the way. The summary is decent. Not as good as what I get when I run the transcript through Margaret with my own prompts, but surprisingly useful as a first pass.
What makes Fathom work for a coaching practice specifically: it doesn't announce itself loudly. Some transcription tools join the call as a visible participant, which creates an awkward "who's that?" moment. Fathom integrates into Zoom more quietly. My clients mostly forget it's there after the first session, which is what you want.
The free tier gives you enough to decide if it's useful. The paid plan (around $19/month as of early 2026) adds better search across past transcripts and longer storage, which matters if you want to reference something from three months ago.
Here's what's wrong with it: Fathom is Zoom-dependent. If you do phone sessions, it doesn't work. If you meet clients in person, it doesn't work. And the AI summary, while decent, tends to flatten the emotional texture of a conversation into something that reads like meeting minutes. It catches the topics. It misses the weight. That's why I don't rely on Fathom's summary. I use Fathom for the transcript and Margaret for the thinking.
I mentioned Fathom in my piece on coaching apps, and my position hasn't changed. It's the AI layer I'd add before any coaching-specific platform. For the cost and the time it saves, the math is simple.
Otter: the one you've probably already heard of
Otter.ai is the name that comes up most often when coaches ask about transcription. It's been around longer, it has broader brand recognition, and it works across more platforms than Fathom.
The transcription quality is comparable. Where Otter differs is in how it handles the output. Otter wants to be your meeting productivity platform. It generates action items, highlights, summaries, and it's built around the assumption that you're in meetings all day and need to extract tasks. For someone in a corporate role taking eight calls a day, that makes sense. For a coaching practice, most of those features are noise.
I tested Otter for about five weeks. The transcripts were good. The experience around the transcripts was cluttered. I kept having to ignore features that weren't relevant to what I was doing. And the AI summary was more aggressive about interpreting the conversation, which in a coaching context sometimes produced summaries that felt editorialized in ways I didn't want in a client record.
Otter is fine. It works. If you're already using it and you're happy, there's no strong reason to switch. But if you're starting fresh, Fathom is a cleaner fit for how coaching sessions actually work.
Fireflies: for the back-to-back days
Fireflies.ai is the most automation-oriented of the four tools I tested. It can automatically join your scheduled calls, record, transcribe, and file the output without you doing anything. For coaches who run six or seven sessions in a day and don't want to remember to start and stop a recording tool each time, that's a real benefit.
The transcription quality is solid. The summaries are roughly on par with Otter. Where Fireflies gets interesting is in its search and filtering: you can search across all your transcripts for a specific phrase or topic. I can see this being genuinely useful for a coach tracking themes across a caseload, though I haven't integrated it deeply enough to confirm that.
What gave me pause: Fireflies joins calls as a named participant ("Fireflies.ai Notetaker" or similar). Several of my clients noticed and asked about it, which interrupted the opening of the session in a way I didn't love. You can rename it, and you should, but the default experience signals "this is being recorded by a third party" in a way that works against the consent conversation I described above.
I used Fireflies for about three weeks before switching to Fathom. Not because it was bad. Because the automation solved a problem I don't actually have. I see four, maybe five clients a day. Clicking "record" isn't the bottleneck. The bottleneck is what happens after.
Notta: the budget option that's better than expected
Notta surprised me. It's less well-known, the interface is less polished, and the marketing doesn't have the venture capital sheen of the others. But the core transcription is genuinely good, and the free tier is generous enough to use for a small practice without paying anything.
Where Notta stands out: it handles audio file uploads well, which means if you record a session using your phone's voice memo app (for in-person sessions, or as a backup), you can upload that file to Notta and get a transcript. That flexibility matters for coaches who aren't exclusively on Zoom.
The AI summary features are basic compared to Fathom or Otter. But if you're using the transcript as raw material for Margaret anyway (which is what I recommend), that barely matters. You don't need the tool's summary to be brilliant. You need the transcript to be accurate and the price to be reasonable.
If you're on a tight budget and seeing fewer than ten clients a week, Notta is worth a serious look before you commit to a paid plan elsewhere.
What to actually do with the transcript
This is the part that matters more than which tool you pick. Here's my actual workflow, including the prompts I use with Margaret.
After a session, Fathom gives me a transcript. I don't read the whole thing. I skim it for the sections I remember feeling significant, usually the passages where a client went quiet, where their language shifted, or where something surfaced that surprised me. I copy those sections.
Then I open Margaret and use one of three prompts depending on what I need.
For pattern detection across sessions:
Here is a transcript excerpt from today's session with [initial]. Here are excerpts from our previous three sessions [pasted from context file]. What language patterns or themes show up repeatedly? Where do you notice shifts in how they talk about [specific topic]? Flag anything they said today that contradicts or evolves something from a previous session.
For generating my pre-session prep before our next meeting:
Based on this transcript excerpt and the running context file for [initial], what are the three most important threads to hold for next session? What question would be worth opening with? Write in my voice: direct, specific, first person. Don't use clinical language.
For catching what I missed:
Here's the transcript from today's session with [initial]. I noticed [specific thing]. What else stands out that I might not have caught in the moment? Focus on places where their language or energy seemed to shift.
That last one is the most useful and the most humbling. Margaret regularly surfaces things I missed because I was doing what coaches do during a session: being present, tracking emotion, holding the space. You can't do that and simultaneously catalog every word choice. The transcript lets me go back and catch the things that presence naturally trades away.
This feeds directly into the context sandwich I described in my session notes article. The transcript gives me richer raw material for the middle layer. Instead of "client mentioned board tensions" from my shorthand, I have the exact sentence, the hedging language, the way they started to say something and then backed off. That specificity makes the notes better, which makes the next session better.
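If you would rather not retype these prompts after every session, the fill-in-the-blanks step can be scripted. What follows is a minimal Python sketch, not part of any tool mentioned here: the names `PROMPTS` and `build_prompt` and the placeholder fields are illustrative assumptions, and the wording is adapted from two of the prompts above. It only produces text to paste into your AI assistant; it sends nothing anywhere.

```python
from string import Template

# Hypothetical templates adapted from the prompts above.
# Only the client's initial is used, never a full name.
PROMPTS = {
    "patterns": Template(
        "Here is a transcript excerpt from today's session with $initial.\n"
        "$excerpt\n\n"
        "Here are excerpts from our previous three sessions.\n"
        "$previous\n\n"
        "What language patterns or themes show up repeatedly? "
        "Where do you notice shifts in how they talk about $topic? "
        "Flag anything they said today that contradicts or evolves "
        "something from a previous session."
    ),
    "missed": Template(
        "Here's the transcript from today's session with $initial.\n"
        "$excerpt\n\n"
        "I noticed $noticed. What else stands out that I might not have "
        "caught in the moment? Focus on places where their language or "
        "energy seemed to shift."
    ),
}


def build_prompt(kind: str, **fields: str) -> str:
    """Fill one template; raises KeyError if a placeholder is left unfilled."""
    return PROMPTS[kind].substitute(**fields)
```

Calling `build_prompt("missed", initial="R", excerpt=selected_text, noticed="a long pause before she answered")` returns the finished prompt as a string. Because `substitute` refuses to run with a missing field, a half-filled prompt never gets pasted by accident.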
Where I've drawn the line
There are parts of a transcript I will not run through any AI tool, and this is a line I feel strongly about.
If a client discloses something deeply personal, something about their health, their marriage, their family, and it comes out in a moment of real vulnerability, I don't paste that section into any AI tool. I note it by hand in my own shorthand, in my encrypted files, and that's where it stays. The transcript for that session gets used around that moment, not through it.
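One way to make "around that moment, not through it" mechanical is to cut the sensitive stretch out of a local copy of the transcript before anything gets copied. The Python sketch below is a rough illustration under a loud assumption: that each line of the export starts with an `MM:SS` timestamp. Real exports differ by tool, so the pattern would need adjusting, and the function names are hypothetical.

```python
import re

# ASSUMED line format: "MM:SS Speaker: text". Adjust for your tool's export.
STAMP = re.compile(r"^(\d{1,3}:\d{2})\s")


def to_seconds(stamp: str) -> int:
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def drop_span(transcript: str, start: str, end: str) -> str:
    """Return the transcript without lines timestamped from start to end."""
    lo, hi = to_seconds(start), to_seconds(end)
    kept = []
    skipping = False
    for line in transcript.splitlines():
        match = STAMP.match(line)
        if match:
            skipping = lo <= to_seconds(match.group(1)) <= hi
        # Lines without a timestamp inherit the state of the line before them.
        if not skipping:
            kept.append(line)
    return "\n".join(kept)
```

For example, `drop_span(text, "11:00", "13:00")` keeps everything except what was said between minute eleven and minute thirteen. This runs entirely on your own machine, which is the point: the excluded passage never reaches a clipboard or a chat window.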
I also don't use transcription with the two clients I mentioned in my session notes article, the ones whose situations are sensitive enough that I keep everything offline. For them, it's notebook and voice memo only.
And I never, under any circumstances, let a client's full transcript sit in a tool's cloud storage longer than I need it. Once I've processed my notes, I delete the recording and the transcript from the platform. My notes live in Notion, where I control them. The raw material doesn't need to persist somewhere else.
The honest comparison
| | Fathom | Otter | Fireflies | Notta |
|---|---|---|---|---|
| Best for | Solo coaches on Zoom | Coaches already using it | High-volume, back-to-back days | Budget-conscious, mixed formats |
| Transcription quality | Strong | Strong | Strong | Good |
| AI summary quality | Good (but use Margaret instead) | Decent, sometimes editorialized | Decent | Basic |
| Intrusiveness during session | Low | Low | Medium (visible bot participant) | Low |
| Works with phone/in-person | No | Limited | No | Yes (audio upload) |
| Free tier | Usable | Usable | Limited | Generous |
| Paid plan | ~$19/mo | ~$17/mo | ~$19/mo | ~$14/mo |
| Where it falls short | Zoom-only | Cluttered features | Bot visibility | Less polished interface |
What I'm still working on
The thing I haven't cracked yet is real-time pattern detection. Right now I process transcripts after the session, which means the insights come between sessions, not during them. There's a version of this where something could surface a quiet note mid-session: "She used this same phrase three sessions ago in a completely different context." I don't want that in the room while I'm coaching. But I wonder about having it available on a screen I could glance at during a pause. I'm not sure if that would sharpen my presence or fracture it. Probably the latter, honestly.
I'm also still iterating on how much transcript to feed Margaret at once. A full sixty-minute transcript produces a lot of noise. The sections I select manually tend to produce better results than dumping everything in. But selecting the right sections requires me to remember what mattered, which is the problem transcription was supposed to solve in the first place. There's a circular quality to it that I haven't resolved.
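One partial way out of that circle is to let a crude filter propose candidate passages and then choose among them by hand. The Python sketch below is an assumption-heavy illustration, not a method from this workflow: the `HEDGES` list is a hypothetical starting point, matching is plain substring search, and it will miss anything that doesn't show up as a phrase (silence, tone, pace).

```python
# Hypothetical list of hedging phrases; tune it to how your clients talk.
HEDGES = ("i guess", "sort of", "kind of", "maybe", "i don't know", "never mind")


def shortlist(transcript: str, context: int = 1) -> list[str]:
    """Return passages around lines that contain a hedging phrase."""
    lines = [ln for ln in transcript.splitlines() if ln.strip()]
    keep = set()
    for i, line in enumerate(lines):
        if any(h in line.lower() for h in HEDGES):
            # Keep the matching line plus `context` lines on either side.
            keep.update(range(max(0, i - context), min(len(lines), i + context + 1)))
    passages, current = [], []
    for i, line in enumerate(lines):
        if i in keep:
            current.append(line)
        elif current:
            # A gap closes the current passage; neighbouring hits stay merged.
            passages.append("\n".join(current))
            current = []
    if current:
        passages.append("\n".join(current))
    return passages
```

The output is a shortlist to skim, not a verdict. It narrows a sixty-minute transcript to a handful of places worth a second look, and the judgment about which ones matter stays with the coach.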
What I do know: three months in, my session notes are meaningfully better. Not because the AI is insightful. Because the transcript holds what my memory can't, and Margaret organizes it in a way that serves the next conversation. The forty minutes I used to spend on Sunday trying to reconstruct what "board dynamics, loss of trust, consider 360?" meant? That's gone. And the notes I have instead actually help me show up more prepared, which is the whole point. The craft is still the hour in the room. Everything else is just making sure I'm ready for it.