How to Evaluate an AI Medical Scribe: 9 Questions
By Patient Square Team · 9 min read
Most AI scribe demos look identical. They all draft a clean note in two minutes on a scripted visit. The differences that matter sit underneath the demo: what the tool does with your audio, whether it drafts prescriptions and how it checks them, whether it suggests codes or pretends to bill them, which languages it handles, whether you can take your notes and leave, and what the price is after year one. Nine questions sort the field. This is the scorecard.
Take it into every demo. A vendor who answers all nine cleanly, in plain sentences, is rare, and that's the point.
Key takeaways
- Nine questions separate AI scribes faster than any feature grid: audio handling, Rx drafting and safety, coding honesty, languages, export rights, BAA/DPDP posture, pricing transparency, honest failure modes, and trial quality.
- The fastest disqualifier is the audio answer. A vendor who can't say where the recording lives and for how long, in one sentence, has answered the question.
- Primary-care physicians log a median of 36 minutes of EHR time per 30-minute visit, so the tool you pick runs your evenings, not just your notes.
- Make the scorecard your demo agenda. Ask the same nine of every vendor and compare the answers side by side.
9 questions that actually separate AI scribes, underneath an identical-looking demo
36 minutes of EHR time logged per 30-minute primary-care visit (AMA / JAMA Network Open)
1 sentence: how long a good vendor needs to answer "where does the audio live?"
The 9-question scorecard, at a glance
Score each vendor 0 to 2 per row: 0 if the answer is evasive or wrong, 1 if partial, 2 if it's a clean, specific, in-writing yes. A tool worth trialing clears most of these without a follow-up email.
| # | Question | What a strong answer sounds like |
|---|---|---|
| 1 | What happens to the visit audio, and when is it deleted? | "Processed in memory, discarded when the note drafts. No archive." |
| 2 | Does it draft prescriptions, and what checks them? | "Rx draft, run through a safety screener; you review and sign." |
| 3 | Does it claim to code or bill? | "ICD-10 suggestions you confirm. Not a coding or billing engine." |
| 4 | Which languages does it actually capture? | A named list that matches your patients, with notes in clean English. |
| 5 | Can I export and delete any visit, anytime? | "Yes and yes, self-serve, no fee, no support ticket." |
| 6 | Will you sign a BAA (US) / how do you map to DPDP (India)? | "BAA for any size." / "Consent-first, purpose-limited handling." |
| 7 | What's the price after year one? | The full ladder in writing: launch, annual, month-to-month. |
| 8 | What's the real failure mode on a noisy, multi-speaker visit? | An honest answer, not "it's perfect." |
| 9 | Can I trial it on my own visits before I commit? | "Yes, 7 days, your real clinic, no card." |
The rest of this guide is one section per question, so you know what a good answer looks like and where vendors quietly fail.
Question 1: what happens to the visit audio?
Ask this before you ask about price. A visit recording is more revealing than the note, because it holds everything that didn't make the note. Vendors handle it very differently. Some retain audio for days or weeks. Some let you opt out. Some are vague, which is its own answer.
This isn't theoretical. A proposed class action filed against Sharp HealthCare in late 2025 alleges patients were recorded by an ambient AI tool without consent, with more than 100,000 patients potentially affected. The cleanest defense against that whole category of risk is not retaining the audio at all.
Our position: audio is processed in memory and discarded the moment the note is drafted. No archive, not for us, not for the practice, not for anyone. What survives is the note you reviewed and signed. If you want the privacy and consent angle in depth, our psychiatry privacy guide walks the behavioral-health version of this question, where the stakes are highest.
Question 2: does it draft prescriptions, and what checks them?
A lot of scribes stop at the note. Some draft a prescription too, which saves real time, but a prescription draft an LLM wrote is still a draft an LLM wrote. The question isn't only "does it draft Rx," it's "what catches a bad one."
AI Scribe by Patient Square is an ambient AI medical scribe that listens during the visit and hands back a structured SOAP note, ICD-10 suggestions, and a prescription draft, ready to review and sign about two minutes after the visit. The prescription draft passes a deterministic safety screener that checks drug interactions, renal dosing, and pregnancy flags. It re-runs at sign time and hard-blocks unsafe combinations unless you override with an attestation. It's a draft, not e-prescribing, and you stay the prescriber. The mechanics are in our prescription draft safety explainer.
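To make "deterministic safety screener" concrete, here is a minimal Python sketch of the behavior described above: fixed rule lookups, a re-check at sign time, and a hard block that only an attestation lifts. It is an illustration under stated assumptions, not the product's implementation. The rule tables, the drug names (`drug_a` and so on), the eGFR threshold, and the function names `screen` and `sign` are all hypothetical placeholders and carry no clinical meaning.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional

# Placeholder rule tables for illustration only. These are NOT clinical data.
INTERACTING_PAIRS = {frozenset({"drug_a", "drug_b"})}
PREGNANCY_FLAGGED = {"drug_c"}
MIN_EGFR = {"drug_d": 30.0}  # hypothetical renal threshold per drug


@dataclass
class RxDraft:
    drugs: list
    pregnant: bool = False
    egfr: Optional[float] = None  # kidney function, if known


def screen(draft: RxDraft) -> list:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    unique_drugs = sorted(set(draft.drugs))
    # Interaction check: every unordered pair against the lookup table.
    for a, b in combinations(unique_drugs, 2):
        if frozenset({a, b}) in INTERACTING_PAIRS:
            findings.append(f"interaction: {a} + {b}")
    for drug in unique_drugs:
        if draft.pregnant and drug in PREGNANCY_FLAGGED:
            findings.append(f"pregnancy flag: {drug}")
        floor = MIN_EGFR.get(drug)
        if floor is not None and draft.egfr is not None and draft.egfr < floor:
            findings.append(f"renal dosing: {drug} below eGFR {floor}")
    return findings


def sign(draft: RxDraft, attestation: Optional[str] = None) -> dict:
    """Re-run the screen at sign time; block flagged drafts unless attested."""
    findings = screen(draft)
    if findings and not attestation:
        raise PermissionError("blocked: " + "; ".join(findings))
    return {"signed": True, "findings": findings, "attestation": attestation}
```

The same input always produces the same findings, because the checks are table lookups rather than model output. That is the property worth asking a vendor to demonstrate.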
Question 3: does it claim to code or bill your notes?
Here's a line that separates honest vendors from the rest. A scribe can suggest ICD-10 codes from the conversation. It should not claim to be a coding engine, an E&M leveler, or a billing system. Coding is a human-confirmed clinical-and-financial decision, and a tool that pretends otherwise is selling risk.
We offer ICD-10 suggestions to speed your coding. A coder or clinician confirms them. That's the whole claim. We don't ship CPT, E&M, or HCC automation, and we won't pretend to. The honest version of this distinction is the entire point of our ICD-10 suggestions explainer. If a vendor blurs "suggests codes" into "codes your notes," ask exactly what gets submitted and by whom.
Question 4: which languages does it actually capture?
Read the language list literally, then test it. Several popular US scribes publish language lists that name no Indian language at all, which is a gap for one of the largest bodies of clinical conversation on earth. "Multilingual" without a named list is marketing.
We built India-first: English, Hindi, and 20+ Indian languages, including mid-sentence code-mixing, with notes always returned in clean clinical English. Input can be multilingual; the output note is English. In low-signal clinics, capture works offline with on-device encryption and syncs later. If your patients switch between Hindi and English in the same sentence, that's the test case to run on day one of a trial, not after procurement.
Question 5: can you export and delete any visit, anytime?
This is the lock-in question, and it's easy to skip until you want to leave. Ask three things. Can I export any visit? Can I delete any visit? Do you sell or share clinical data? The answers you want are yes, yes, and no, and you want them in the contract, not the sales deck.
Our answer: notes belong to your practice, you can export or delete any visit at any time, and we never sell or share clinical data. If export needs a support ticket or carries a per-record fee, that's friction designed to keep you. For the full ownership and exit checklist, ask the same questions a switching practice asks.
Question 6: BAA in the US, DPDP posture in India
The compliance floor differs by region, and a vendor who can't meet it is disqualified regardless of how good the note looks.
In the US, the floor is a signed BAA, available to any practice size, plus encryption in transit and at rest and access you can audit. Software is never "HIPAA certified," because no such certificate exists, so treat that phrase as a red flag. We map our safeguards to the HIPAA Security Rule and offer a BAA to every US customer.
In India, the frame is the DPDP Act 2023: consent-first, purpose-limited handling with reasonable safeguards. We handle data to those standards, our SOC 2 Type II audit is underway, and ABDM integration is on our roadmap, not live today. We say roadmap because it's roadmap. A vendor claiming ABDM certification is overclaiming.
Question 7: what's the price after year one?
Get the full ladder, per clinician, in writing: the launch rate, the annual-commitment rate, and the month-to-month rate. Three numbers, no asterisks. Vendors that route a per-seat subscription to "contact sales" are telling you the renewal will be a negotiation.
We publish the whole ladder: $89 per clinician per month in the US on annual billing, ₹1,199 per clinician per month in India ex-GST (that's ₹1,415 with 18% GST), no feature gating between tiers. The full table is on our pricing page. Compare cost per visit, not the sticker: $89 across 400 visits a month is about $0.22 a visit. For the head-to-head across the self-serve field, our best AI medical scribes roundup lays out the honest comparison.
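The arithmetic behind those figures is easy to rerun with your own numbers. A minimal Python sketch follows; the function names are ours for illustration, and the 400 visits a month is the same assumption used above, so swap in your own volume.

```python
def cost_per_visit(monthly_price: float, visits_per_month: int) -> float:
    """Monthly per-clinician price spread across that clinician's visits."""
    if visits_per_month <= 0:
        raise ValueError("visits_per_month must be positive")
    return monthly_price / visits_per_month


def with_gst(price_ex_gst: float, gst_rate: float = 0.18) -> float:
    """Add GST to an ex-GST price; 18% is the rate quoted in this guide."""
    return price_ex_gst * (1 + gst_rate)


us_per_visit = cost_per_visit(89, 400)   # $89 a month over 400 visits
india_with_gst = with_gst(1199)          # Rs 1,199 ex-GST plus 18%

print(f"${us_per_visit:.2f} per visit")
print(f"Rs {india_with_gst:.0f} per month with GST")
```

Run the same two functions on every vendor's quoted rate, including the month-to-month rate, so the comparison is per visit rather than per sticker.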
Question 8: what's the honest failure mode?
Every scribe has one. The vendor who admits it is more trustworthy than the one who claims perfection. The two failure modes worth probing:
Noisy, multi-speaker rooms. A quiet US consult room is the easy case. A crowded OPD with a relative answering half the questions, or a pediatric visit with a caregiver in the room, is the real test, and quality varies sharply across products. If your setting is loud or multi-party, make the demo use a recording that sounds like your room.
Drafts contain errors. Models mishear drug names, compress two complaints into one, and occasionally write something plausible that didn't happen. This is why review-and-sign is load-bearing, why our Rx drafts pass a safety screen, and why you should distrust any pitch that implies you can skip reading the note.
Question 9: can you trial it on your own visits?
The trial is the decision. A scripted demo flatters every scribe equally; your Tuesday clinic doesn't. Run a 7-day trial across a normal clinic week, with your patient mix, your languages, your interruptions. Grade three real notes against what you'd have written. If the tool survives your worst visit of the week, it'll survive the rest.
We offer a 7-day free trial in both regions, no card to start. The structured way to run one, with a baseline day and a note-grading rubric, is in our guide to running a trial. And if you're about to roll a tool out across a small clinic, the 2-week implementation plan turns the trial into a rollout.
Turn the scorecard into a demo agenda
Don't watch nine demos and try to remember the differences. Bring these nine questions to each one, write the answers in a row, and compare. The vendor with the most 2s, especially on audio, Rx safety, coding honesty, and export, is the one to trial.
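If you keep the scores in a spreadsheet or a script, the tally is simple. Here is a minimal Python sketch, assuming the 0-to-2 scale from the scorecard and treating audio, Rx safety, coding honesty, and export as the priority rows. The question keys, vendor names, and scores are made-up placeholders, and ranking by priority 2s first is one reasonable reading of "most 2s, especially on" those four, not a fixed rule.

```python
# Short keys for the nine scorecard rows, in table order (names are ours).
QUESTIONS = [
    "audio", "rx_safety", "coding", "languages", "export",
    "compliance", "pricing", "failure_mode", "trial",
]
PRIORITY = {"audio", "rx_safety", "coding", "export"}


def summarize(scores: dict) -> dict:
    """Return the total, the count of 2s, and the count of 2s on priority rows."""
    missing = set(QUESTIONS) - set(scores)
    if missing:
        raise ValueError(f"unscored questions: {sorted(missing)}")
    if any(scores[q] not in (0, 1, 2) for q in QUESTIONS):
        raise ValueError("each score must be 0, 1, or 2")
    return {
        "total": sum(scores[q] for q in QUESTIONS),
        "twos": sum(1 for q in QUESTIONS if scores[q] == 2),
        "priority_twos": sum(1 for q in PRIORITY if scores[q] == 2),
    }


def rank(vendors: dict) -> list:
    """Order vendor names by priority 2s, then all 2s, then total score."""
    def key(name):
        s = summarize(vendors[name])
        return (s["priority_twos"], s["twos"], s["total"])
    return sorted(vendors, key=key, reverse=True)


# Placeholder scores for two hypothetical vendors.
vendors = {
    "Vendor A": dict(zip(QUESTIONS, [2, 2, 2, 1, 2, 1, 2, 1, 2])),
    "Vendor B": dict(zip(QUESTIONS, [0, 1, 2, 2, 1, 2, 2, 2, 2])),
}
for name in rank(vendors):
    print(name, summarize(vendors[name]))
```

In this made-up example both vendors earn six 2s, but Vendor A earns four of them on the priority rows and Vendor B only one, so A ranks first. A near-tie on total score can hide exactly that difference.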
When you're ready to see how AI Scribe by Patient Square answers all nine against your own visit type, book a demo. Then run the 7-day trial on a real clinic week. The scorecard tells you what to ask; the trial tells you what's true.
Common questions
What is the single most important question to ask an AI scribe vendor?
What happens to the visit audio, and when is it deleted? It separates vendors faster than price or features. Accept only a one-sentence answer with a timeline. AI Scribe by Patient Square processes audio in memory and discards it the moment the note is drafted, so there is no recording to retain, leak, or subpoena.
Should I trust a scribe that advertises an accuracy percentage?
Be skeptical. There is no standard benchmark for medical-note accuracy, so a published percentage is marketing, not measurement. A more honest answer is "test it on your own visits." Ask what the vendor does instead of quoting a number, then run a trial on your real accents, languages, and interruptions.
Does an AI scribe code or bill my notes?
It should suggest, not decide. A scribe can surface ICD-10 suggestions to speed your coding, but a person still confirms them. Be wary of any tool claiming to be a coding or billing engine. AI Scribe by Patient Square offers ICD-10 suggestions you review, not automated coding, E&M leveling, or claim submission.
How do I check whether the notes are really mine to keep?
Ask three things: can I export any visit, can I delete any visit, and do you sell or share clinical data. The answers should be yes, yes, and no. If export needs a support ticket or there is a per-record fee, that is lock-in. Read the contract, not the sales deck.
What does a 7-day trial actually prove?
A trial proves whether the note quality survives your real clinic, which a scripted demo never tests. Run it on a normal day, with your patient mix and your languages, and grade three notes against what you would have written. The trial, not the feature list, is the decision.
Is a free EHR-bundled scribe enough?
Sometimes. If you only need note generation and you live inside one EHR, a bundled free scribe may be the right call. A standalone scribe usually adds Rx drafting, ICD-10 suggestions, an Rx safety screener, and multilingual capture. Evaluate against the work you actually do, not the longest feature list.
How should I compare prices across vendors?
Get the whole ladder in writing: launch rate, annual commitment, and month-to-month, per clinician. Vendors that say "contact sales" for a per-seat subscription are telling you about the renewal conversation. Published list pricing keeps everyone honest. Compare cost per visit, not just the monthly sticker.
Sources
- American Medical Association: Primary care visits run a half hour. Time on the EHR? 36 minutes.
- Lukac P, et al. Ambient AI Scribes in Clinical Practice: A Randomized Trial (UCLA / Nabla). NEJM AI, 2025.
- Patient sues Sharp HealthCare over ambient AI use (consent class action, 2025-26).
- Freed: published pricing (fetched June 2026).
- EkaScribe: published India pricing (fetched June 2026).