Echo Lab Style Guide
Hello, and congratulations! If you’re reading this, it means you’ve passed Echo Labs’
rigorous transcription exams with a score that puts you in the 99th percentile of
everyone who applied.
You are now part of a team that works to provide the highest accuracy with genuine
care for accessibility, and we value your feedback on everything from platform
features to functionality. Our work goes beyond traditional transcription to provide
quality captioning and ensure equitable access, which is why our contractors make
2-3x more than those in other transcription roles.
You have been offered a freelance contract with the potential to work up to 40 hours
a week as jobs become available. With more universities requesting
captioning/transcription services each week, jobs will continue to be added to the
dashboard daily after being initially transcribed through AI. The Development team
is continually working to increase the accuracy of AI captioning so you have a
complete transcription to work with.
The goal of the style guide is to proactively answer any questions you might have as
a new transcriber about formatting so, together, we can provide a consistent
experience to the schools we work with. We want them to feel that their 100th video
was captioned with the same care and attention as their very first - something that
requires tremendous coordination between all of us, at scale. If you have any
questions or need clarification on the items below, don’t hesitate to reach out to
the internal team directly on Discord or drop a question in the community channels.
TABLE OF CONTENTS
Legal Requirements
Quality Captioning
Speaker Identification
Atmospherics
Common Captions
Helpful Hacks
FAQ
Glossary
On the other hand, transcription involves converting spoken words from audio or
video content into written text. It aims to provide a textual representation of the
spoken content, making it accessible to individuals who may have difficulty hearing
or understanding the audio.
JOB RESPONSIBILITIES
HELPFUL RESOURCES
● Using a mouse and headset is HIGHLY recommended for syncing captions
with audio
● Google’s reverse image search can assist in identifying speakers or other
content displayed in the video
● Find and Replace extensions for your browser can improve editing speed
LEGAL REQUIREMENTS
Accuracy: Captions must relay the speaker’s exact words with correct spelling,
punctuation, and grammar with 99% accuracy and no paraphrasing. Captions must
honor the original tone and intent of the speaker. Captions must match background
noises and other sounds to the fullest extent possible.
Time Synchronization: Captions must align with their corresponding spoken words
and sounds to the greatest extent possible. Captions must not proceed too quickly
for the viewer to read.
Program Completeness: Captions must be included from the beginning to the end
of the program to the fullest extent possible.
QUALITY CAPTIONING
Accurate: We aim for errorless captions.
Equal: The meaning and intention are completely preserved to maintain equal
access.
● Caption: audio dialogue and sounds converted into text that appears on a
video and is synchronized with the audio to provide equitable access and
meaning
● Caption Segment/Line: a transcribed line of dialogue that is less than
64 characters
● Caption Stacking: Splitting a caption segment/line into two sections, each
32 characters or fewer in length.
● Color Block: The corresponding audio recording for each caption.
● Red Line: The red line indicates the 32-character line limit. Any text over this
line should be stacked or segmented appropriately. There should be NO text
beyond the red line at any time, for any reason.
4.
5. A score of 2-3 will result in a strike against your three chances. The grader
will add robust feedback for improvement, and you still have two more chances to
improve. Errors that are automatically scored as 2-3 include:
a. text or characters going over the red line (spaces do not count)
b. no or incorrect speaker tags
c. no atmospherics
d. inappropriate captioning (not captioning exactly what was said)
e. overuse of (unintelligible)
6. When a transcriber reaches three “strikes,” they will lose their freelance
contract with Echo Labs.
TURNAROUND TIME
1. Every video comes with an associated ‘time allotment’ which is the total
amount of time you have to complete a video. You will have a timer running
when accepting the job. We do this to ensure that all videos are delivered to
students within 24 hours of submission (See Time Chart).
2. If work is not finished within the time allotment (when the timer expires), any
work completed will be erased and partial payment is not possible. This is
why we’ve extended the time allotment to 600% of the video length itself.
3. You may request an extension for extenuating circumstances on the
appropriate Discord channel.
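As a quick illustration of the 600% rule above, here is a minimal sketch of how the time allotment works out for a given video length. The function name time_allotment and the constant ALLOTMENT_MULTIPLIER are made up for this example; the timer on the platform remains the source of truth.

```python
from datetime import timedelta

# Assumption for this sketch: "600% of the video length" means 6x the runtime.
ALLOTMENT_MULTIPLIER = 6


def time_allotment(video_length: timedelta) -> timedelta:
    """Return the total working time for a video (hypothetical helper)."""
    return video_length * ALLOTMENT_MULTIPLIER
```

For example, time_allotment(timedelta(minutes=10)) gives one hour of working time for a 10-minute video.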
To accept and complete a job with Echo Labs, follow these steps. We
recommend using a mouse and headset to simplify your editing.
SPEAKER IDENTIFICATION
Whether or not a speaker can be identified using visual cues, we must take the time
to include accurate speaker tags for students who rely on end transcripts as a guide.
Here’s the best way to prepare speaker tags for all users.
1. Speaker tags should always be enclosed in brackets with the speaker title
within it being written in full uppercase, e.g. [RACHEL ADAMS]
2. Please attempt to find the speaker’s full name by searching on the university
webpage, professional websites, or completing an internet search.
3. Maintain the same speaker tag throughout the video. Do not truncate the
speaker tag after the first use.
4. Please don’t use [SPEAKER ONE, TWO, THREE, ETC.] labels. This will result in
an automatic strike. Most of the time, you can see the names in the video,
glossary, or – if not – can infer from the video their position (e.g. STUDENT,
INSTRUCTOR, STUDENT ONE, TWO, etc.)
5. Use full names whenever possible, and for each speaker switch. Do not use
first names alone when first and last are available.
6. Speaker tags are required each time the speaker switches. If one speaker
continues talking across multiple caption segments, you only need to include
a speaker tag in the first caption segment of the running speech.
7. When numbering speakers, write out one through ten and then switch to
numerals for speakers 11+.
8. Follow this hierarchy for labeling speaker tags: 1) Full Name, 2) Role, whether
implied or inferred.
9. When there are several speakers in one video, tag each speaker according to
their identity, not their speaking order in the video. For example, if the first
speaker is named and tagged [DR. SMITH] and the second speaker is not named,
the second speaker is tagged with the number one, such as [STUDENT ONE].
Subsequent speakers are tagged either by their name, if given, or by the next
number.
10. Include given titles, if known, and use appropriate abbreviations, e.g. DR.,
MRS., MS., MR., PROF., etc.
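The formatting rules above (brackets, full uppercase, and numbers written out from one through ten) can be sketched in a few lines of code. This is only an illustration: speaker_tag and NUMBER_WORDS are hypothetical names, and the sketch does not look up names or decide which role applies.

```python
from typing import Optional

# Spelled-out forms for speaker numbers one through ten.
NUMBER_WORDS = ["ONE", "TWO", "THREE", "FOUR", "FIVE",
                "SIX", "SEVEN", "EIGHT", "NINE", "TEN"]


def speaker_tag(label: str, number: Optional[int] = None) -> str:
    """Build a bracketed, uppercase speaker tag (hypothetical helper).

    label  -- full name with title if known, otherwise a role such as "student"
    number -- optional number for unnamed speakers who share the same role
    """
    text = " ".join(label.split()).upper()  # collapse stray whitespace
    if number is not None:
        if number < 1:
            raise ValueError("speaker numbers start at one")
        # One through ten are written out; 11 and above use numerals.
        suffix = NUMBER_WORDS[number - 1] if number <= 10 else str(number)
        text = f"{text} {suffix}"
    return f"[{text}]"
```

With this sketch, speaker_tag("Rachel Adams") produces [RACHEL ADAMS], speaker_tag("student", 1) produces [STUDENT ONE], and speaker_tag("student", 11) produces [STUDENT 11].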
● [NARRATOR] ● [PROFESSOR]
● [HOST] ● [INSTRUCTOR]
● [COMMENTATOR] ● [LECTURER]
● [INTERVIEWER] ● [FACULTY]
● [INTERVIEWEE] ● [SCHOLAR]
● [ACADEMIC]
● [EDUCATOR]
● [TEACHER]
● [ADVISOR]
● [STUDENT] ● [MODERATOR]
● [LEARNER] ● [FACILITATOR]
● [PUPIL] ● [CHAIRPERSON]
● [SCHOLAR] ● [MEDIATOR]
● [UNDERGRAD] ● [COORDINATOR]
● [GRADUATE] ● [HOST]
● [APPRENTICE] ● [PRESENTER]
● [TRAINEE] ● [DISCUSSION LEADER]
● [PARTICIPANT] ● [CONVENOR]
● [EMCEE]
● [VOICE OVER]
Special Roles ● [NEWS ANCHOR]
● [GUEST]
● [EXPERT]
● [PANELIST]
● [CONSULTANT]
● [KEYNOTE SPEAKER]
● [TUTOR]
● [AUDIENCE MEMBER]
● As part of recent updates, we’ve made it easier to know how long each
segment should be: the red line indicates when you have reached
32 characters. If the caption is over the red line, it will need to be stacked
accordingly.
● When the caption moves beyond the red line, it must be split into two lines.
This is why we’ve created ‘caption stacking’, which you activate by pressing
SHIFT + ENTER while typing. The segment will be broken into two lines to
make it more readable. The caption segment should NOT be split into more
than two lines.
● Deciding where to stack captions is very important. The goal is to stack
captions so that whole phrases, nouns, sentences, and dialogue flow are
interrupted as minimally as possible.
● Readability is extremely important to segmenting and stacking. Make sure to
watch and review the video captions to ensure the segments and caption
stacks are not spread over too many screens, which can make them harder
for students to read.
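To make the red-line rule concrete, here is a minimal sketch that checks a stacked caption against the 32-character limit and lists the word-boundary breaks that would fit. The names fits_red_line and candidate_breaks are hypothetical, and the sketch assumes every character, including spaces, counts toward the limit, which is the conservative reading. It cannot judge whether a break is logical; that choice still belongs to the transcriber.

```python
from typing import List, Tuple

MAX_LINE = 32   # red-line limit for each stacked line
MAX_LINES = 2   # a segment is never split into more than two lines


def fits_red_line(lines: List[str]) -> bool:
    """True if the caption uses at most two lines, each within 32 characters."""
    return len(lines) <= MAX_LINES and all(len(line) <= MAX_LINE for line in lines)


def candidate_breaks(segment: str) -> List[Tuple[str, str]]:
    """List every word-boundary split where both lines stay within the limit."""
    words = segment.split()
    options = []
    for i in range(1, len(words)):
        first, second = " ".join(words[:i]), " ".join(words[i:])
        if fits_red_line([first, second]):
            options.append((first, second))
    return options
```

For the segment "Mom said I could have gone to the movies." the sketch returns several possible breaks, including ("Mom said I could have gone", "to the movies."). Picking the one that keeps phrases together is the human part of the job.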
When a sentence is broken into two lines of captions, it should be broken at a logical
point where speech normally pauses and follows standard grammar constructions.
The following guidelines provide examples of appropriate segmentation.
Inappropriate
Appropriate
Mark pushed
his black truck.
Inappropriate
Appropriate
Mary scampered
under the table.
● Do not break a person's name or a title from the name with which it is
associated. Example:
Inappropriate
Appropriate
Inappropriate
Appropriate
Inappropriate
Appropriate
Mom said I could have gone
to the movies.
● Never end a sentence and begin a new sentence on the same line
unless they are short, related sentences containing one or two words.
Example:
Inappropriate
This is an example.
You're not supposed to do this.
Appropriate
Inappropriate
Appropriate
Task's assigned,
and (smacks hands) done.1
1 Adapted from the “Captioning Key.” Described and Captioned Media Program. 2024.
2. Lightly edit spoken content to remove filler words, such as “uh,” “um,”
“you know,” or “well…”, and other non-essential information.
3. If the filler or sound word adds meaning, please caption it. The chart
below indicates common filler words that might communicate specific
meanings.
And Kind of
10. Googling with a bit of context from your video/audio, such as speakers’
names or the academic discipline, is also helpful.
11. URLs, hashtags, and social media tags should be captioned using common
convention: www.el.ai/#echolabscaptions/@echolabs
13. Use an appropriate atmospheric for the sound heard when a word is
censored, e.g. (beep).
● Comma (,): Used when listing, to separate clauses, after filler words,
before quotes, and when addressing someone. No space before and one space
after.
● Brackets ([]): Use for speaker tags.
● Period, exclamation point, question mark (.!?): Used at the end of whole
sentences. No space before and one space after.
● Apostrophe (‘): Used in contractions and to indicate a possessive. No
space before or after.
● Single quotation marks (‘’): Use for short quotes, answers, and media
titles. Also, for emphasis when a speaker is emphasizing a specific term,
phrase, or quote. Wrap words at the beginning and the end of the quote.
● Double quotation marks (“”): Use for long and direct quotes. Wrap words at
the beginning and end of the quote.
● Em-dash/double hyphen (–): Use for additions or asides, such as when a
speaker interrupts themselves or changes the direction of the conversation.
Use a space before and a space after.
● Single dash/hyphen (-): Use for hyphenated words.
● Ellipsis (…): Use when there is a significant pause (longer than 5 seconds)
or interruption by another speaker. Use only when necessary.
● Colon (:): Use to give emphasis, present dialogue, introduce lists or text,
and clarify composition titles.
● Semicolon (;): DO NOT USE.
COMMAS
Inappropriate
Appropriate
ELLIPSES
1. Use an ellipsis when a caption has a significant pause, longer than 5 seconds.
2. Do not use an ellipsis to indicate that the sentence continues into the next
caption.
QUOTATION MARKS
1. Use quotation marks for on-screen readings from a song, poem, book, play,
journal, or letter. However, use quotation marks and italics for offscreen readings
or voice-overs.
SPACING
1. Spaces should not be inserted before the ending punctuation, after opening
and before closing parentheses and brackets, or before/between/after the periods
of an ellipsis.
● A space should be inserted after the beginning music icon (♪) and
before the ending music icon(s). Example:
♪ There's a bad moon rising ♪
MUSIC
Incorrect
Correct
Correct Examples
[Ella Fitzgerald singing
“Old MacDonald Had a Farm”]
● Caption lyrics with music icons (♪). Use one music icon at the beginning
and end of each caption within a song, but use two music icons at the
end of the last line of a song. A space should be inserted after the
beginning music icon (♪) and before the ending music icon(s).
Correct Examples
Correct Examples
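The music icon rule can be sketched as follows: each lyric caption gets one icon at the start and one at the end, and the final line of the song ends with two. The name caption_lyrics is hypothetical, and placing the two closing icons side by side with no space between them is an assumption of this sketch.

```python
from typing import List

NOTE = "\u266a"  # the music icon (♪)


def caption_lyrics(lines: List[str]) -> List[str]:
    """Wrap each lyric caption in music icons; the last line ends with two."""
    captions = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        closing = NOTE * 2 if is_last else NOTE
        # One space after the opening icon and one before the closing icon(s).
        captions.append(f"{NOTE} {line.strip()} {closing}")
    return captions
```

Passing the two lines "Old MacDonald had a farm" and "E-I-E-I-O" yields "♪ Old MacDonald had a farm ♪" followed by "♪ E-I-E-I-O ♪♪".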
NUMBERS
● Unless otherwise specified below, spell out all numbers from one to
ten, but use numerals for all numbers beyond ten. Examples:
Inappropriate
Appropriate
Inappropriate
Appropriate
● Spell out any number that begins a sentence as well as any related
numbers. Example:
Inappropriate
50000
Appropriate
50,000
Inappropriate
Appropriate
Steven has 21 books, 11 oranges, and 3 cats.
Building 2 page 31
Channel 5 size 12
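The basic number rule (words for one to ten, numerals beyond ten) can be sketched as below. The name caption_number is hypothetical, and the sketch covers only plain counts: it does not handle the exceptions in this guide, such as numbers that begin a sentence, related numbers in a list, location numbers, or years. Using numerals for zero and negative values is an assumption, since the guide does not address them here.

```python
# Words for the numbers that are spelled out under the basic rule.
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]


def caption_number(n: int) -> str:
    """Spell out one through ten; use numerals with commas for everything else."""
    if 1 <= n <= 10:
        return WORDS[n - 1]
    return f"{n:,}"  # e.g. 50000 becomes "50,000"
```

So caption_number(3) gives "three", caption_number(11) gives "11", and caption_number(50000) gives "50,000".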
LOCATION NUMBERS
● Use numerals for location numbers such as phone numbers and zip
codes. Example:
Inappropriate
One, one, one, five, five five, one five, one five
Appropriate
(111) 555-1515
20048
DATES
● Use the numeral plus the lowercase "th," "st," or "nd" when a day of the
month is mentioned by itself (no month is referred to). Example:
Original Narration
"ninth"
Captioned As
● When the day precedes the month, use the numeral plus the lowercase
"th," "st," or "nd" if the ending is spoken. Example:
Original Narration
"seventeenth"
Captioned As
● Use the numeral alone when the day follows the month. Example:
Original Narration
"nine" or "ninth"
Captioned As
● When the month, day, and year are spoken, use the numeral alone for
the day, even if an ending ("th," "st," or "nd") is spoken. Example:
Original Narration
"six" or "sixth"
Captioned As
1907
Original Narration
June of 1999
Captioned As
June of ‘99
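The day-of-the-month rules above can be summarized in a short sketch. The names ordinal and caption_day are hypothetical. The sketch assumes the standard English endings, including "rd" for days such as the 3rd and 23rd, and it only formats the day itself, not the surrounding month or year.

```python
def ordinal(day: int) -> str:
    """Return the numeral plus its lowercase ending, e.g. 9 becomes "9th"."""
    if 10 <= day % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def caption_day(day: int, follows_month: bool, ending_spoken: bool = True) -> str:
    """Format a day of the month according to where it sits in the date.

    follows_month -- True when the day comes after the month ("June nine"),
                     including full month-day-year dates; numeral alone.
    ending_spoken -- whether the speaker said the ending ("ninth" vs. "nine").
    """
    if follows_month:
        return str(day)
    return ordinal(day) if ending_spoken else str(day)
```

For instance, caption_day(9, follows_month=False) gives "9th" for a day mentioned by itself, while caption_day(9, follows_month=True) gives "9" even if the speaker said "ninth".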
TIME
I awoke at 5:17.
I awoke at 4 o'clock.
PERIODS OF TIME
● A decade should be captioned as "the 1980s" (not "the 1980's") and "the
'50s" (not "the 50's").
● Do not use hyphens if a decade or century is in noun form. Example:
FRACTIONS
● Either spell out or use numerals for fractions, keeping this rule
consistent throughout the media. If using numerals, insert a space
between a whole number and its fraction. Example:
Numeral Used
● Do not mix numerals and spelled-out words within the same sentence.
Example:
Inappropriate
Appropriate
● If a fraction is used with "million," "billion," "trillion," etc., spell out the
fraction. Example:
Inappropriate
3/10ths
Appropriate
3/10
DOLLAR AMOUNTS
● Use the numeral plus "cents" or "¢" for amounts under one dollar.
Examples:
I need 15 cents.
● Use the dollar sign plus the numeral for dollar amounts under one
million. For whole-dollar amounts of one million and greater, spell out
"million," "billion," etc. Examples:
He owes $13,656,000.
● Use the word "dollar" only once for a range up to ten. Example:
● Use the dollar sign and numerals when captioning a range of currency
over ten dollars. Example:
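A minimal sketch of the single-amount rules above (not the range rules) is shown below. The name caption_dollars is hypothetical, and amounts are passed in cents to avoid rounding problems. The sketch assumes that "whole-dollar amounts of one million and greater" means exact multiples of a million, billion, or trillion, and that all other amounts keep their full numerals.

```python
# Scales that are spelled out for round amounts, largest first.
SCALES = [(10**12, "trillion"), (10**9, "billion"), (10**6, "million")]


def caption_dollars(cents: int) -> str:
    """Format a single currency amount given in cents (hypothetical helper)."""
    if cents < 0:
        raise ValueError("amount must not be negative")
    if cents < 100:
        # Amounts under one dollar use the numeral plus "cents".
        return f"{cents} cent" if cents == 1 else f"{cents} cents"
    dollars, remainder = divmod(cents, 100)
    if remainder == 0:
        for size, name in SCALES:
            if dollars >= size:
                # Assumption: only exact multiples of the scale are spelled out.
                if dollars % size == 0:
                    return f"${dollars // size} {name}"
                break
        return f"${dollars:,}"
    return f"${dollars:,}.{remainder:02d}"
```

With this sketch, caption_dollars(15) gives "15 cents", caption_dollars(1_365_600_000) gives "$13,656,000", and caption_dollars(1_300_000_000) gives "$13 million".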
MEASUREMENTS
I'm 5'8".
● For whole numbers, use numerals. For example, caption "3 cups of
sugar" instead of "three cups of sugar."
DECIMALS
● Spell out decimals, such as one point four or nine point nine seven,
unless the content is math specific. In that case, use numerals such as 1.4 or
9.97.
● Follow the conventions for ordinal place. For example: “Zeroth”, “first in
line”, “tenth place”, “21st century”, “100th time”.
● For anything that’s not a number / digit or percentage, write out the full
word instead of the symbol. Example:
GRAPHING TERMS
● Write it out as the speaker says it, following basic number conventions
and using numerals instead of writing out the words, e.g. (-10,3).
● Quadrants are labeled with Roman numerals, such as “quadrant IV”
● Axes and coordinate references are hyphenated as follows: x-coordinate,
y-axis.
ACRONYMS
Appropriate
Gigabytes
Inappropriate
gigs or GB
● If a speaker spells out a word, use the format “W-O-R-D”.
NON-LETTER SYMBOLS
Appropriate
2 pi r
2 times pi times r
Inappropriate
Twopir
● Spell out all units (joule, gram, ampere, volt, meter, pascal, kelvin, hertz,
coulomb, and newton).
● Spell out all functions, such as “f of x” instead of f(x). When referring to
notations such as dy/dx (and all other related derivative references) in calculus,
engineering, etc., have the captions reflect what the speaker says (e.g., “dy dx”,
including a space in between). Example:
● Treat a subscript or superscript term the same way you might another term
like pi, tau, or sine, all together. Denote these as the speaker reads them. Example:
x2
x sub 2
xj2
● Anything to a power should be written as spoken. Use the numeral plus the
lowercase "th," "st," or "nd" if the ending is spoken. Example:
9 squared
y to the 10th
12 to the 2nd
PROGRAMMING LANGUAGE
ATMOSPHERICS
Captions need to indicate sounds heard on screen. We call these identifiers
atmospherics. These are critical in providing visual indicators of non-verbal sounds to
viewers who may be hard of hearing.
2. The sound should add important context and meaning to the video such
as sounds that:
Convey emotion
Background music
3. Avoid adding atmospherics that are excessive and do not add meaning:
(lip smacking, mouse clicking, pen clicking, etc.).
4. For all atmospherics, use parentheses ( ) and lowercase, except for proper
nouns, e.g. (Erin sketching).
5. Describe the sound or sounds heard on screen by following this
convention: noun + descriptor/verb in the present tense form, e.g. (water
boiling), (door slams).
7. When a video also contains spoken words, only include background music
if there’s a significant time gap and it would benefit the viewer, e.g.
(background music).
UNCLAIMING A JOB
Currently, the only reason to unclaim a job is an extremely poor audio
recording. When you encounter this, unclaim the job and notify the IT
troubleshooter. When in doubt, please email support@el.ai first for guidance.
If there are repeated words, incorrect wording, or incorrect lyrics to music, please
correct and complete the job.
Syncing the audio accurately with the captions is vital for understanding. The color
blocks at the bottom of the screen align the sound waves with the caption. If the
caption extends beyond the red line, it needs to be stacked to the next line and the
color blocks will be adjusted so that each caption corresponds to the accurate color
block of sound.
● You can access the video timeline by right-clicking the video screen, and
selecting “show all controls.”
FAQ
Q - For long pauses, do you need to include an atmospheric tag such as (silence) or
(no audio)?
A - No, this is not necessary. The screen can be caption-free when there is no one
speaking and then have the captions resume when the speech resumes.
Q - For math lectures, should you write out the equations using words or the way a
professor is writing them out?
Q - Do we start the audio blocks when there is an uh or um, or exclude them from
the audio blocks and start with the typed text?
A - Start with the typed text and ignore the ums and uhs.
Q - Do we want the caption blocks to fill the whole bar or just where speaking is
occurring?
A - We only want to have captions appear where the speaker is talking. Try to match
the cadence of speech as much as possible.
Q - When the speaker stutters or repeats words, do we start the caption block where
they first start speaking or where they say it correctly?
A - Start where they say it correctly to optimize the readability for the students.
Q - When the speaker trails off and doesn't finish a sentence should we use ...?
A - We should use … to indicate that the sentence wasn't completed. This is being
updated in the AI soon. If it is just a pause in the sentence and they continue the
sentence, there is no need for ... or -, the screen can just be caption-free until they
resume speaking.
Q - If a video begins with music before the speaker, does that need to be captioned
or not?
A - Yes! And please be as specific as possible, e.g. whimsical piano music, rock music,
etc.
COMMON CAPTIONING ERRORS
Below are several examples of inappropriate captioning and common errors that
have been submitted for review.
● This is an example where the transcriptionist did not ensure the caption block
continued with the speaker before submitting.
QUICK GUIDE
ITALICS (As of 4/21, there is no ability to italicize but it will be coming soon!)
1. A voice-over reading of a poem, book, play, journal, letter, etc. (This is also
quoted material, so quotation marks are also needed.)
4. The first time a new word is being defined, but do not italicize the word
thereafter.
1. When an entire caption is already in italicized format, use Roman type to set
off a word that would normally be italicized.
2. If there is only one person speaking throughout the program (including the
narrator), whether onscreen or offscreen, use Roman type with no italics.
3. Do not italicize when a person who is offscreen is translating for a speaker who
is onscreen.