Hacker News
YouTube Transcript – read YouTube videos (youtubetranscript.com)
411 points by fragmede on Dec 18, 2022 | hide | past | favorite | 140 comments



If you want a CLI version of a similar idea, you can use yt-dlp and a little jq to pull down the captions for a video:

    curl `\
      yt-dlp -j "https://www.youtube.com/watch?v=aeWyp2vXxqA" | \
      jq -r '.automatic_captions.en[] | select(.ext=="json3") | .url'`


That gets me the subs in some unusual, complicated format? It's a bit of work to extract the actual text from that.

To get YouTube subs in .srt format, this gave me some limited success:

    yt-dlp --convert-subs=srt --write-auto-sub --write-sub --sub-lang "en,en-us,en-GB,automatic-caption-en" --skip-download "https://www.youtube.com/watch?v=1OfxlSG6q5Y"

Behind the scenes yt-dlp downloads the subs in .vtt format, then uses ffmpeg to convert them to .srt. Depending on your situation the original .vtt format might be fine.
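For what it's worth, pulling plain text out of that json3 payload is only a few lines. A rough sketch (the json3 shape shown here is from inspecting yt-dlp output and may drift):

```python
import json

def json3_to_text(raw):
    """Flatten a YouTube json3 captions payload into plain text.

    json3 is (roughly) {"events": [{"segs": [{"utf8": "..."}], ...}]};
    events without "segs" are styling/window records and get skipped.
    """
    data = json.loads(raw)
    parts = []
    for event in data.get("events", []):
        for seg in event.get("segs", []):
            parts.append(seg.get("utf8", ""))
    return "".join(parts).replace("\n", " ").strip()

sample = '{"events":[{"tStartMs":0,"segs":[{"utf8":"never gonna "},{"utf8":"give you up"}]},{"tStartMs":2000,"segs":[{"utf8":"\\nnever gonna let you down"}]}]}'
print(json3_to_text(sample))
```

Pipe the output of the curl command above into something like this and you get searchable text without the .srt detour.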


Or even better, yt-whisper, which uses OpenAI's Whisper speech-to-text. I guess it'd be better to first check whether the video has captions before Whispering, so maybe both your command and this one could be used together.

https://github.com/m1guelpf/yt-whisper


Not all YouTube videos with spoken text have automatic captions.




I am not a fan of this pattern - if I'm understanding correctly, you would have to part with all of yt-dlp's niceties like playlist/channel handling, quality selection, file naming, logging config etc.

Why not just use the whisper cli on yt-dlp CLI's output for videos with bad or no subtitles?


Sure you could do that too. yt-whisper uses yt-dlp underneath so there might be a way to pass arguments to the inner yt-dlp instance. Or if not you can modify the source directly, it seems to be a simple wrapper. Or again you can do what you were saying, using the Whisper CLI. All good options, I just mentioned this one since it's easier if I just want to download a video with subs.


    ... | split_sentences | grep -viE 'vpn'


I apologize for the question, but I am not entirely clear where "split_sentences" is. Is it a separate script? I have been looking for something with that sort of functionality for a while, very often for this very purpose, splitting transcripts.


Sorry for the late answer, but yes, it would have to be a separate script or command. It's purely fictional; I made it up because the joke made more sense with it, and people might otherwise have pointed out that my grep would have filtered out too much context.

I'm sure there are many unix-y tools for this purpose, but I don't know of any. If you're looking for something that's installed everywhere, maybe a very big awk or sed regex with multiline wizardry could do the trick for most easy-to-parse Latin languages and you'd just have to copypaste it around. It prolly becomes harder for regexes once you start working with right-to-left languages like Arabic, and languages with different punctuation, so it might not be i18n-friendly.

Related Stack Overflow question: https://stackoverflow.com/questions/33704443/python-regexp-s...
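The regex approach from that thread can be sketched in a few lines of Python (Latin-script text only, as noted above; abbreviations like "e.g." will still trip it up):

```python
import re

def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace
    # and an uppercase letter. Fine for easy-to-parse Latin text;
    # abbreviations ("e.g.", "Dr.") and quotes will confuse it.
    return re.split(r'(?<=[.!?])\s+(?=[A-Z])', text.strip())

print(split_sentences("This video is sponsored. Try our VPN today! Now back to the content."))
```

One sentence per list entry means `grep -v` on the result filters out whole sentences rather than whole caption lines.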


`| grep -viE 'skillshare'`


Fwiw YouTube already has a feature for this. Click the "..." next to Share and click Show Transcript. There are also extensions like https://chrome.google.com/webstore/detail/youtube-captions-s... that make it easy to search them in a popup.


They seem to have moved the functionality to the end of the description, and there you can find the "Show captions" button.

The extension I made to export the transcript was based on this YouTube functionality, I should update the instructions now.

https://chrome.google.com/webstore/detail/youtube2anki/boebb...


Regular "find in page" works to search the transcript on YouTube. I use it often.


Thank you. Do you use it regularly? I run a problem validation forum and someone was asking for exactly that[1].

[1]: Searching YouTube videos with transcript - https://needgap.com/problems/88-searching-youtube-videos-wit....


This is a great idea; I really enjoy all these "two channels simultaneously" (side-by-side translations, video with subtitles, and in this case video with a readable transcript, where you can scroll in the video or scroll in the transcript, and be synchronized).

I had done something like this a couple of years ago for some specific set of videos (e.g. https://shreevatsa.net/tex/program/videos/s10/ — compare with https://youtubetranscript.com/?v=_0Cv1G_s4gQ for the same video), but never got around to making it general; glad someone has done it. It takes just a few lines of JavaScript, using the YouTube API, to do this, i.e. keeping the video and text in sync (just view source on either page to see the JS at the bottom).

Something like this can also help with audio recordings (generating the alignment automatically is called "forced alignment" and there are tools like "aeneas" for this). In case anyone's interested or wants to help (for Sanskrit texts): see https://github.com/shreevatsa/web-align-audio-text deployed at https://shreevatsa.net/ramayana/sarga/ and better version at https://github.com/avinashvarna/audio_alignment deployed at https://avinashvarna.github.io/audio_alignment/


This is cool!

We're doing forced alignment with audio recordings i.e. podcasts. Here's an example from a test client: https://www.withfanfare.com/p/seldon-crisis/search-by-the-fo...

Grateful for any feedback you might have.

Also, if you run a knowledge-dense podcast, or know somebody who does, I would love to talk to you/them. For example, I'm considering linking places, people and things in a transcript.


Lovely, thanks for the example. I really like how the underline is subtle yet effective. The animation/effect where it seems to inch forward is also intriguing; how does that work? (Also, this is subjective but have you considered making double-clicking on the text open the audio player even when it's not been started yet? Would make it more discoverable, though I imagine some could get annoyed. Another minor thing: when the audio is paused and you double click somewhere, the audio position changes but this is not reflected in the player; only takes effect when you unpause.)

You're doing cool work, good luck with your service! It looks very useful and if someone I know is running a podcast I'll recommend it for sure; it looks really well-done and polished (at least the example you shared).


Thank you! I wrote your points down. We're still learning how to make it most usable.

We know the start and end timestamps on a word level and we know the current player time. So all we do is set a CSS class on the words that are currently playing. We not only highlight the current word, but also the words that are close in time. This is what generates the effect.

We just released autoscroll, which is something that was requested often.
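In pseudocode-ish Python, the idea is roughly this (the names and the time window here are illustrative, not our actual code):

```python
def words_near(words, t, window=0.3):
    """Given (word, start, end) tuples and the current player time t,
    return the words to highlight: the one playing now plus any whose
    timestamps fall within `window` seconds of t. The "inching forward"
    effect comes from this window sliding as t advances."""
    return [w for (w, start, end) in words
            if start - window <= t <= end + window]

words = [("hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 1.5, 1.9)]
print(words_near(words, 0.6))
```

The frontend just toggles a CSS class on whatever this returns, once per player tick.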


Wow, this looks incredible! Love how the underline travels and guides the eye. Very impressive!

Excuse the question, but "forced alignment" is when you don't have timestamps (like the ones in WebVTT)?


Thank you!

Yes, exactly. We do forced alignment when you edit your transcript. The new words don't have any timestamps, so we need to align them. For short sections we use interpolation. If we need to align whole sections we use Gentle[^1].

[^1]: https://github.com/lowerquality/gentle
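The interpolation part is simple enough to sketch (illustrative Python, weighting by character counts; the real thing would weight by phonemes and is more careful):

```python
def interpolate_timestamps(words, start, end):
    """Spread edited words across the known [start, end] gap,
    proportional to word length, returning (word, start, end) tuples.
    This is the short-section case; long sections go to a real
    forced aligner like Gentle instead."""
    total = sum(len(w) for w in words)
    times, t = [], start
    for w in words:
        dur = (end - start) * len(w) / total
        times.append((w, round(t, 3), round(t + dur, 3)))
        t += dur
    return times

print(interpolate_timestamps(["hello", "there", "world"], 0.0, 1.5))
```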


Thank you, this is really interesting! :)


A supremely useful site that searches YouTube transcripts is https://youglish.com. It shows you pronunciations in context for any word or name.


Thanks for the link! This site actually has a database of youtube transcripts unlike OP. Shame you can't search fixed strings, like two words in exact order. Though it seems genuinely useful for learning pronunciation as advertised.


I found a site for seriously searching within youtube subtitles

https://filmot.com


This script for whisper.cpp works really well

https://github.com/ggerganov/whisper.cpp/blob/master/example...

for my purposes I changed the output from subtitles to txt (so I could pipe the result into chatgpt)


That's doing speech recognition on the YouTube video's audio and then embedding the result as subtitles? Is the idea that this is superior to YouTube's own automatic subtitles? Though if the youtube video actually has manually crafted subtitles from the creator, as a lot of popular science content does, it seems a shame not to just use those subtitles directly and avoid doing speech recognition. Or just rely on YouTube's automatic captions which are pretty decent in my experience.

See my comment on how to just download the subtitles YouTube provides with `yt-dlp` here: https://news.ycombinator.com/item?id=34040342


> so I could pipe the result into chatgpt

Tell us more :)


Nothing too exciting, just “summarize this” followed by the transcript in quotes. It works very well.


Maybe this is a bit off topic, but does anyone know the legal footing of having a business with another business's name in it? For instance, this tool uses the word "YouTube" in its name, though it is used as only a part of it, and it is not a competitor. I've always wondered how this works.


Broadly speaking, it would be trademark infringement if it is used in a way that may confuse others about the source of the product. It doesn’t necessarily have to be a specific product that Alphabet has a direct competitor for.


https://en.wikipedia.org/wiki/Nominative_use

> is a legal doctrine that provides an affirmative defense to trademark infringement as enunciated by the United States Ninth Circuit, by which a person may use the trademark of another as a reference to describe the other product, or to compare it to their own.


Nominative use is explicitly ruled out in some other jurisdictions. Do not rely upon it.


Most corporations regularly search for such domains, and submit cease-and-desist. I received one related to an eBay-related domain, but in my case, I hadn't built a business around it so it was easy enough to just take the site offline.


Not sure about YouTube but WordPress does not allow the use of the name. WP in your (e.g.) domain name is ok. WordPress is not.

I'd imagine it's very similar for others. Often a company will pursue a violation if only to be consistent in showing the courts they actively defend their copyright.


> Not sure about YouTube but WordPress does not allow the use of the name.

They may not like it, but they don't have the power to disallow you from using their name to refer to them. That's allowed.


Actually, they do. It's copyright. Plenty of legal precedent. They defend WordPress, but are willing to allow WP.

The law is on their side.


> Actually, they do.

No, they don't.

> It's copyright.

No, it isn't. It's trademark. There is no such concept as copyright in a single word.

> The law is on their side.

Again, no, it isn't.

What was the point of your comment? Why talk if you're not worried about whether what you're saying is true or false?


Yes. Trademark not copyright. My mistake.

Otherwise: https://wordpressfoundation.org/trademark-policy/

FFS Relax. It was an oversight on my part. It's minor at best in larger scheme of things. Maybe you're having a tough day? God bless you. But this is nothing to go to the mat over.

You could have just said, "Oh. Maybe you're confusing trademark and copyright?". The last thing the world and HN needs is another high-strung belligerent asshole. That doesn't help anyone, or anything, sans your ego. It's not a good look.


> Yes. Trademark not copyright. My mistake.

That is your most minor mistake. The much larger one is that you are claiming that the law supports the trademark policy you link to. It doesn't. Everyone remains free to use a trademark belonging to wordpress in order to refer to wordpress.

You are not free to use any such mark - the WordPress Foundation would be on very solid ground in telling you to stop using their graphical logo. But they can't stop you from using "WordPress" in the name of your for-profit company, which you'll note is quite explicitly contrary to their stated policy.


These companies tend to rename themselves when they become popular.


Hook this up to a language model, and maybe a user could instantly get the one sentence worth of information that the YouTube video creator buries in 10 minutes of monetized noise.

And also save yourself time when the creator teases that they provide the info, but it turns out they don't, they're just trying to get views.


YouTube created that problem by incentivizing longer videos. And now we have videos with tons of fluff.

Similarly Google incentivizes longer webpages, so now we have recipes that start with a novella about grandma’s cooking before showing the actual recipe.

It used to be nice to see a video’s thumbs up to thumbs down ratio to know if you’ve been click baited or not before watching the whole video. But that signal has been removed now too.


As a

recipe reader

I want to

dismiss cookies, have a video ad follow me down the page, and read why this cake conjures up memories of the author's childhood, before reaching the actual recipe

so that

I feel connected to the author, before fully committing to mixing ingredients


That "user story" is like a tragically misinterpreted comment by someone at a prospective customer, speaking of a special time with their grandmother, but garbled through N layers of field sales, marketing, product managers, engineering hierarchy, and Agile task management.

Including the part about declining more cookies offered (to save room for grandma's lasagna).


Tom Redman had this idea but he took feedback from Twitter. https://digg.com/2021/one-main-character-tom-redman-recipeas...


That is definitely true. I actually try and make my videos as short as possible but I noticed that over time the longer ones are definitely prioritized by the algorithm, yet I get lots of comments saying how people appreciate my very short videos.


As a hack, I wonder if tacking 10 minutes of ambient video onto the end (after a "And now, 10 minutes of ambient video" warning) would help.


I am not so sure because YouTube keeps track of watch time. If no one watches past 2 minutes, then I think that would also be penalized, though I have no insight into the algorithm so I am not sure.


Do you know if YouTube measures wall time or percent of video watched? If percent, then a hack might be to slow-mo the content and encourage the audience to watch at 2x speed.


I think it is percent of video watched, because if you check out the analytics in YouTube Studio, it says how many people are still watching by a certain point. So your hack makes sense, but I'm not sure it would work because it is VERY hard to keep anyone's attention on YouTube. The moment it's something even slightly less than interesting, the vast majority of people go onto the next video.


Nice. Worth trying.

Or maybe just repeat the same content a few times.


I blame it on automated advertising. When you had to work to get someone to pay you, you made good content. Now anyone can get paid as long as they're digestible. It's created the interesting situation where the average quality of demonetized content is far higher than monetized content.


Are you sure Google prefers longer pages? I find (annoyingly) that Google likes the search version of my pages for lots of things. E.g. for a page called "best x of the y", the page for searching comments on it, called "best x of y search", where the only text is the title and a search input, will rank really well.


Try searching for recipes :) I also see long novels which seem to disguise the ridiculous amount of ads, which Google seems to like as well (these are mostly provided by none other than themselves!).


...or, just follow decent creators.

No snark intended, but I just gave up with the dross. And even some of them, of late, are getting a bit crafty. But creators get one chance from me now: give me decent content, or even with the fancy chapters, you're not getting my eyeballs past two minutes. What I have found is that leaving the decent stuff on, what auto-plays after is 'generally' of similar quality. A quick set of back-buttoning and bookmarking has fairly often got me some interesting results.


Good idea, but I don't follow anyone on YouTube. I was thinking about searching the Web for a bit of info; the search hits include YouTube videos (but no finer resolution than "this entire video").

A search engine could narrow in on the few sentences of AV in the video that it thinks correspond to what I was searching for, summarize that, and also link me to the AV start timepoint in case I also want to watch the video.

This might change the economics of some YouTube video content creation.
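The "link to the start timepoint" part at least is already easy, since YouTube watch URLs accept a t offset parameter:

```python
def timestamped_url(video_id, seconds):
    """Build a YouTube link that starts playback at `seconds`.
    The t= query parameter is YouTube's standard deep-link offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"

print(timestamped_url("dQw4w9WgXcQ", 42.7))
```

A search engine that knows which transcript line matched could emit one of these per hit.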


Google does exactly that: if a video shows up in the search results, it shows you only the relevant small part.


I've never seen this before now, but I just got a Google search result video page with a kind of table-of-contents index on one of the video hits. (These TOC entries don't correspond to the marked segments on the timeline. I don't know whether this is something YouTube is doing, or something the content creator did.)

Is this what you mean? (Pardon if I'm not familiar with the latest Google Search features; I've mostly been using DDG lately, so I don't have occasion to see all the features that show up only occasionally.)


Sadly enough, I have the same issue with written content: BS websites add a ton of fluff to their texts as their deplorable approach to SEO and making some space for ads.

"Streaming your phone screen to a TV is something many people at some point want to do. But seeing the picture of your smartphone on a television or other device can be a daunting task. Here we list multiple ways how you can achieve the goal of sharing a mobile screen of an Android or an iOS device on a different device. It works for any manufacturer, like Samsung Smart TVs or Apple TV."

(goes on for another 15 paragraphs before presenting a non-solution)

It's hard to put in words the amount of hate I feel for the authors of such pages. I'm desperately looking forward to the day Google comes up with better models for detecting useful content and those trash piles can burn in hell.


Or Google has sorted the results by how many advert placement sites you'll have been exposed to on that page before it's predicted you'll stop reading, so having the paydirt right at the end looks great, and it doesn't matter if it's fool's gold.


It sucks because time on page, how much you scroll, and the fact that once you find your answer you don’t click on any more results all signal to Google that you were satisfied.

I’m just not convinced the preambles are needed.


Sad thing is, the next generation of AI models is probably being trained on that type of content.


I put something like this together to collect transcripts for uni videos. It dumps all transcripts into a directory, with URL links, so I can just search the whole directory to find the keyword I need.

Helped a lot with take home exams.


YouTube-dl had the ability to rip just subtitles. I once used this to grep for some information I wanted after downloading all of the transcripts.


I built an entire website based on that, there's a few other ones out there too.


I've had luck with pasting the transcript into ChatGPT and asking it for a seven-bullet list of main talking points.


Or hook it up with SponsorBlock and remove all the non-content timestamps


Pretty nice. The sliver of content still worth watching on YouTube doesn't have repetitive stuff or padding to make it to the 10 min mark though.

If you go to the homepage with clear cookies it's just endless amounts of utterly dogshit cookie cutter content. Same clickbait thumbnails with a person pulling an idiotic expression. Even the videos masquerading as educational are entertainment at best. If I had kids I'd do everything in my power to keep them away from YouTube.


The burning hate I feel for all information to be locked away in a YouTube video. This will solve that real world problem. I love reading (or, skimming) through a long read.


> I feel for all information to be locked away in a YouTube video.

Google Search actually indexes transcripts of a video and shows you some YouTube results based on that even though the title/description of the video doesn't match the search query.


I had a huge backlog of tech videos, so I wrote me this (also to play a bit with Haskell, the base idea can be replicated easily in any language though): https://github.com/rberenguel/glancer


This is great. Sharing like this is what I love about HN. Do you have any other features planned?

I’m considering a master keyword list to index against any text that comes in.


The only thing I have in my "someday" list of tasks for Glancer is the possibility of adding/using the whisper binary to get captions when/if unavailable. Aside from that I just keep it more or less working by using it myself (and to be fair on this one, I wish I wrote it in another language, Haskell can be a bit finicky to build if you do it sparingly).

Any addition that is on the "view" layer (the generated HTML) is very easy to add, just needs to go into the template file, at some point I might tweak that area but currently have no outstanding idea/requirement. The rest (i.e. the bulk of the code) is just a very bare-bones parser for captions that should be pretty stable and need no additions (crossing fingers here).


Whisper is neat. OpenAI probably open-sourced it for a reason and it’s probably because they have something much quicker lol.

I’ve seen vosk used on device and it’s decently quick too on a recent Apple chip.


Heh, this basically makes a storyboard


I've published almost 1,800 video diaries and this is a game changer for me. I've been wanting to do more with the back catalog, but don't have transcripts.


Not sure if you know this, but YouTube has had a transcript feature for years now. It's somewhat hidden in the interface, but lets you search with ctrl-F (or command-F) in the transcript.


I use this for city council meetings to figure out who said what. It's not easy, but it's better than nothing. YouTube doesn't appear to do so well with multiple speakers.


I’ve tried to do this recently. Any suggestions on tools or workflows to dissect into different speakers?


Yeah, this website just extracts the transcript that exists and displays it alongside the video. It's nice, but it's not doing the transcribing itself.


Just checked that Google also includes YouTube captions in search results.


The ratio of information to misinformation on Youtube seems pretty bad.

To make transcripts easier to access might create more problems than it solves.

Granted I can't make a bullet-proof argument; there's no clear way to quantify that ratio.


My only complaint is with the layout of the site - could you please make the transcripts span across the whole width of the page, not just to the right of the video?

My one gripe with Youtube's own transcript box is that it is too narrow, so it is a shame that a website designed to specifically make the transcripts more readable also displays the transcripts in a narrow box.


What I've wanted is search by transcript of past videos I've watched. With something like this it seems reasonable to imagine a setup where every video you navigate to gets transcribed and the text is indexed for later search.


Just tried rickroll and many lines seem to be missing.

https://youtubetranscript.com/?v=dQw4w9WgXcQ


This UI and Youtube's UI for transcripts are really nice. When I'm looking for a particular piece of information I can just Ctrl+F and click on the match to play from there. Youtube used to auto-generate subtitles, now it also formats subtitles as transcripts. I wish offline media players had this functionality, if I get distracted for a few seconds I don't have to watch those seconds again, I can speedread over the past couple lines.


Call it "panoramic subtitles"


this will be dead soon due to having youtube in the name


Don't worry, the website solved this issue:

> "Probably Won't Fail: Featuring the latest build of an undocumented API."

This will work as long as YouTube doesn't change anything. And since when has YouTube changed anything?


People can switch domains


Hehe, they might need to switch cloud provider as well. The domain and the underlying content are currently served by none other than Google Cloud.


I see your transcript, and I raise you my ChatGPT summarized transcript extension!

https://github.com/ricklamers/ChatGPT-YouTube-summarizer

All jokes aside, I love the automated transcripts from YouTube. Videos are just so inefficient to consume as a format.


Why is it hard-coded to English? When I try to transcribe a video in any other language it throws the error:

> No transcripts were found for any of the requested language codes: ('en',) For this video ([...]) transcripts are available in the following languages: [...]

It even knows what language is available, so why not dump that instead?


Probably because it’s a hackathon style project that was slapped together and isn’t intended to support every use case. I’d recommend reaching out to the author with your feedback


How can this be so fast? I tried it with two random urls, and the transcripts were instant, like less than 100ms.


It appears to be using the YouTube auto-generated captions. The output, spacing, and punctuation are identical.


YouTube already creates transcripts for accessibility and for feeding into other ML models.


Likely cached. Try with a long video with few views.

Edit: after reading other comments it seems this may be using an undocumented api to retrieve the data.


Here is how to extract a YouTube video transcript to an Excel file with Robomotion:

https://demo.robomotion.io/designer/shared/6j984jBCQqYVBCaQk...


    #!/bin/sh

    ttml2srt()
    {
     x=$(echo x|tr x '\34');
     tr -d '\34' \
     |sed -n "/<p begin/{
      s/<p begin=\"//;
      s/\" end=\"/ --> /;
      s/\" style=\"s2\">/$x/;
      s#</p>##p;}" \
     |sed = \
     |sed "/$x/!s/^/$x/" \
     |tr '\34' '\12' \
     |sed '/[ ]-->[ ]/s/\./,/g'
    }

    read x;
    case $x in 
      https://www.youtube.com/watch?v=??????????\
      |https://www.youtube.com/watch?v=???????????\
      |https://youtu.be/???????????\
      |https://youtu.be/??????????)
    f=${x##*=};f=${f##*/};case ${#f} in 10|11)
    curl -4o $f.mp4 $x 
    video=$(tr \{ '\12' < $f.mp4|sed -n "/itag=22/{s/u0026/\&/g;s/%3D/=/g;s/%2C/,/g;s/%26/\&/g;s/.*url\":\"//;s/\".*//p;}"|tr -d '\134')
    test $video||exit
    ttml=$(tr \{ '\12' < $f.mp4 |sed -n '/timedtext/{s/u0026/\&/g;s/.*:\"//;s/\".*//;s/$/\&fmt=ttml/p;q;}'|tr -d '\134')
    test $ttml||exit
    curl -s4 $ttml|ttml2srt > $f.srt
    exec ffmpeg -v quiet -y -i $video -vf subtitles=$f.srt $f.mp4 
    esac
    esac

    exit
The script above, "1.sh", can be used as follows.

    echo https://www.youtube.com/watch?v=aeWyp2vXxqA | sh 1.sh

It will download the captions as .srt and then "hardsub" them into the video as it downloads the .mp4. NB. This is slow YouTube downloading without using yt-dl/yt-dlp. Obviously, it will not work with commercial videos.

The .srt file is saved as [YouTube ID].srt and the video as [YouTube ID].mp4, where [YouTube ID] is a 10 or 11 ASCII character string.

The video format is itag=22, i.e., mp4/720p. Not all videos will have 22 of course. I usually try itag=18, mp4/360p, if 22 is not available. Change the format to whatever is preferred.

Looking around for a .ttml/.vtt/.srv[1-3] to srt converter I found solutions that required installing Python or some other large scripting language. On GitHub I found a project called "astisub" that will convert from ttml or vtt to srt. It is a 3.8M Go binary. I wrote a shell function instead.


Take video > transcribe > ask gpt to summarize > be genius in 2 mins


Expect an email from Google lawyers early this week about the domain name.


I think "transcriptsforyoutube" would be passable? I remember something about a case using "for" and being ok but not any details.


They generally don't get into nuance. If someone's trademark is in your domain name, expect a C&D.


This is great and works well. What is the copyright status of transcripts?


Not sure on the transcript front, but the owner may want to consider removing ‘youtube’ from their name.


They are owned by the copyright owner of the underlying audio.


but for example, is it fair use to reproduce? what about indexing?


Depends on why it is being done.


This is amazing! The speed and simplicity makes me happy. Thank you!


https://youtubetranscript.com/?v=DvxxdZpMFHg

"Error: transcripts disabled for that video"

Why?


Youtube didn't generate captions for that video


Anything like that for podcasts? I cannot waste time listening to casual dialogue. Text reduced to the point is much more effective.


Fun fact, this is how youtube does "manual reviews". A manual review means someone has read the automated transcript...


Slightly related: https://youglish.com/


Who built this?

We want to partner with you on a tool that autogenerates clips of any video based on the topic start and end


Something like this would be nice to be able to search local videos for specific keywords spoken too.


I’ve been dreaming about something like this for years. Huge deal for me. Thank you for your work!


Is it done with OpenAI's Whisper? Now it would be cool to have it summarized by ChatGPT.


As funny as that landing page is, I would like to see at least some information on the stack.


Tried it. It works! I always wanted this. Thank you YoutubeTranscript.com person(s)!



Funny, was just looking for a tool like this.

Any chance timestamps could be added?


With youtube-dl you can download the subtitle tracks, which should have timestamps. Though last time I checked they were broken (showing the whole text on the first timestamp), but perhaps they fixed it.


For PowerPoint and screenshare based videos, a screenshot every 15 seconds or so would be great.

Often enough I'd rather read than watch. Reading is faster. Having corresponding visuals would be a big plus.


Can it calculate words per minute too? That would be helpful.
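With timestamps in hand that's trivial. A sketch (the (text, start_seconds) tuple shape is just an assumption about how the transcript lines come back):

```python
def words_per_minute(entries, duration_seconds):
    """entries: transcript lines as (text, start_seconds) tuples.
    Counts whitespace-separated tokens over the video's duration."""
    word_count = sum(len(text.split()) for text, _ in entries)
    return 60.0 * word_count / duration_seconds

entries = [("never gonna give you up", 0.0), ("never gonna let you down", 2.0)]
print(words_per_minute(entries, 4.0))
```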


The copy on your website is pure fire my dude.


Is this utilizing whisper to transcribe?


Youtube already auto-generates transcripts that you can see in the ... menu in most videos. This website just seems like an alternative frontend?


Or maybe it processes the video with its own backend? How do you tell?


Just minutes ago, I compared two transcripts for the same video and they were the exact same. Also on YouTubeTranscript.com swearing was redacted with [_], which is something I've only ever seen on youtube captions.


First indication is the processing speed - there's no known machine in the world that could transcribe videos at such speed.


How about a cluster in parallel?


The simplest explanation is often the most probable one.

Why would you reach for a cluster of machines working in parallel, when you could retrieve the already auto-created transcript from YouTube servers?

Also, other comments have pointed out that the transcripts are identical with the ones created by YouTube, which would be unlikely to happen if this service was creating transcripts of their own.


Apparently not. I compared the result of the linked site to a Whisper transcription of the same video. While the word-by-word accuracy of the two transcriptions was about the same, the Whisper transcription was punctuated nearly perfectly and therefore much easier to read.


youglish.com is a similar type of website, but you can read the words as they're spoken.


Nice! Did something similar a couple of years ago: https://you-tldr.com/

Summary remains the elusive hard part...both to do and to serve at scale. But we get closer by the day.


Nice work. I actually found your site thru Google by trying to find this HN post again. Out of curiosity, do you get a lot of paid subscribers? What do revenue figures look like for a project like this?



