<aside>

How to read this doc:

🆕 NEW RULE: added by the customer, not in the previous guidelines

⚠️ CHANGED: overrides a rule from the previous guidelines

</aside>

What's changed at a glance

Area	Previous	New	Status
Core philosophy	Not stated explicitly	Preserve Everything — when in doubt, keep it	🆕 NEW
Partial-word stutters (e.g., "Wh-wh-what")	Do not transcribe — drop the fragments → "What makes you think that?"	Keep the fragments with hyphens → "Wh-wh-what makes you think that?"	⚠️ CHANGED
Complete-word repetitions ("I I I went")	Keep without hyphens	Keep without hyphens	Unchanged
Hesitations (um, uh, er, ah, mm)	Treated as filler/crutch words — keep, comma-offset	Explicit category — keep all, comma-offset where natural	🆕 Clarified
Brand & product names	Concatenate — Xbox360, iPhone13Pro	Use official written form — Xbox 360, iPhone 13 Pro, Windows 11, PlayStation 5	⚠️ CHANGED
Arbitrary alphanumeric codes	Concatenate — D48AA	Concatenate — D48AA, RX7, B2B, 3DPrinting (regardless of pacing)	Unchanged (expanded examples)
Spoken "dash" inside a code	Not specified	Render as a literal dash → A-37B, KL-902	🆕 NEW
Personal data formats (phone, email, address, SSN, credit card, dates)	General "transcribe all PII" rule, no formatting spec	Use official/standard format regardless of how speaker paces it	🆕 NEW
Spelling out a word (e.g., a name)	Not specified	Capital letters separated by dashes → S-M-I-T-H	🆕 NEW
Unintentional repetitions ("the the problem")	Offset with commas	Keep all; tight repetitions may render without commas	⚠️ CHANGED
Background speech (TV, bystanders, etc.)	Not specified	Insert [BG SPEECH] tag in place of the speech	🆕 NEW
Filler words (like, you know, I mean)	Keep, comma-offset	Keep, comma-offset	Unchanged
False starts / self-corrections	Two hyphens (--) + space	Two hyphens (--) + space	Unchanged
Informal contractions (tryna, gonna, dunno)	Keep as spoken	Keep as spoken	Unchanged
Emphasized / elongated words	Don't change spelling	Don't change spelling	Unchanged
Grammar mistakes	Don't correct	Don't correct	Unchanged
Pronunciation mistakes	Correct	Correct	Unchanged
Speaker identification (Speaker 1, Speaker 2)	Required formatting	Required formatting	Unchanged
Speakers' sounds: (laughing), (gasping), etc.	Lower-case present continuous, in round brackets, in-line	Lower-case present continuous, in round brackets, in-line	Unchanged
Foreign language tagging: [foreign language 00:00:02]	Italicize uncommon foreign words; tag long stretches	Italicize uncommon foreign words; tag long stretches	Unchanged
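
Two of the formatting rules in the table are mechanical enough to illustrate in code. The following is a minimal Python sketch, not part of the customer's tooling: the function names `spell_out` and `render_code` are hypothetical, and upper-casing the letters in a code is an assumption based on the examples in the table (D48AA, A-37B, KL-902). It would not reproduce mixed-case forms such as 3DPrinting.

```python
def spell_out(word: str) -> str:
    """Render a word the speaker spells aloud: capital letters separated by dashes."""
    return "-".join(ch.upper() for ch in word if ch.isalpha())


def render_code(tokens: list[str]) -> str:
    """Concatenate the spoken pieces of an alphanumeric code.

    A spoken "dash" becomes a literal dash; everything else is joined
    with no spaces, regardless of how the speaker paced it.
    Upper-casing is an assumption taken from the table's examples.
    """
    return "".join("-" if t.lower() == "dash" else t.upper() for t in tokens)


print(spell_out("smith"))
print(render_code(["d", "4", "8", "a", "a"]))
print(render_code(["A", "dash", "3", "7", "B"]))
```

Under these assumptions, the three calls correspond to the table entries S-M-I-T-H, D48AA, and A-37B.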

General Information

What is verbatim for AI training?

Verbatim for AI training is a customer-specific transcription service level that differs from HappyScribe's standard verbatim guidelines.

<aside> 🧑‍🤝‍🧑

Quality tip! The following guidelines supplement the general guidelines for transcription formatting, sentence structure, symbols, italics, etc.

However, the instructions below must always be followed when working on these verbatim files unless otherwise specified by Admin.

</aside>

<aside> 🆕

NEW RULE — Core Philosophy: Preserve Everything. The goal is to retain as much of the target speaker's content as possible. Downstream the customer can always remove disfluencies, fillers, and hesitations — but they cannot recover information that was never transcribed. When in doubt, keep it.

</aside>

Languages and services

Transcriptions will be required in the following languages:

English 🇬🇧

Speaker identification

To make sure speakers can be distinguished while remaining anonymous:

Use Speaker 1, Speaker 2, etc. as labels. Do not use the speakers' names.

The name of a company entity or celebrity may still be written.
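
The labelling rule can be sketched as a small Python helper. This is an illustration only: the function name `label_speakers` is hypothetical, and numbering speakers in order of first appearance is an assumption, since the guidelines above do not state how the numbers are assigned.

```python
def label_speakers(turns):
    """Replace speaker names with anonymous labels.

    `turns` is a list of (name, text) pairs. Each distinct name gets
    "Speaker N", numbered by first appearance (an assumption).
    """
    labels = {}
    anonymized = []
    for name, text in turns:
        if name not in labels:
            labels[name] = f"Speaker {len(labels) + 1}"
        anonymized.append((labels[name], text))
    return anonymized


turns = [("Alice", "Hi."), ("Bob", "Hello."), ("Alice", "Bye.")]
for label, text in label_speakers(turns):
    print(f"{label}: {text}")
```

The same speaker keeps the same label across turns, so the transcript stays readable while the names are left out.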

Transcription Logic

Complete words

If it is a complete word, transcribe it.

Partial words / False starts