Hacker News
YouTube Transcript – read YouTube videos (youtubetranscript.com)
411 points by fragmede on Dec 18, 2022 | hide | past | favorite | 140 comments



If you want a CLI version of a similar idea, you can use yt-dlp and a little jq to pull down the captions for a video:

    curl `\
      yt-dlp -j "https://www.youtube.com/watch?v=aeWyp2vXxqA" | \
      jq -r '.automatic_captions.en[] | select(.ext=="json3") | .url'`


That gets me the subs in some unusual, complicated format? It's a bit of work to extract the actual text from that.

To get YouTube subs in .srt format, this gave me some limited success:

    yt-dlp --convert-subs=srt --write-auto-sub --write-sub --sub-lang "en,en-us,en-GB,automatic-caption-en" --skip-download "https://www.youtube.com/watch?v=1OfxlSG6q5Y"

Behind the scenes yt-dlp downloads the subs in .vtt format, then uses ffmpeg to convert them to .srt. Depending on your situation the original .vtt format might be fine.
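For what it's worth, pulling plain text out of that json3 payload is only a few lines. A rough sketch (the json3 shape shown here is from inspecting yt-dlp output and may drift):

```python
import json

def json3_to_text(raw):
    """Flatten a YouTube json3 captions payload into plain text.

    json3 is (roughly) {"events": [{"segs": [{"utf8": "..."}], ...}]};
    events without "segs" are styling/window records and get skipped.
    """
    data = json.loads(raw)
    parts = []
    for event in data.get("events", []):
        for seg in event.get("segs", []):
            parts.append(seg.get("utf8", ""))
    return "".join(parts).replace("\n", " ").strip()

sample = '{"events":[{"tStartMs":0,"segs":[{"utf8":"never gonna "},{"utf8":"give you up"}]},{"tStartMs":2000,"segs":[{"utf8":"\\nnever gonna let you down"}]}]}'
print(json3_to_text(sample))
```

Pipe the output of the curl command above into something like this and you get searchable text without the .srt detour.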


Or even better, yt-whisper, which uses OpenAI's Whisper speech-to-text. I guess it'd be better to first check whether the video has captions before Whispering, so maybe both your command and this one could be used together.

https://github.com/m1guelpf/yt-whisper


Not all YouTube videos with spoken text have automatic captions.




I am not a fan of this pattern - if I'm understanding correctly, you would have to part with all of yt-dlp's niceties like playlist/channel handling, quality selection, file naming, logging config etc.

Why not just use the whisper cli on yt-dlp CLI's output for videos with bad or no subtitles?


Sure you could do that too. yt-whisper uses yt-dlp underneath so there might be a way to pass arguments to the inner yt-dlp instance. Or if not you can modify the source directly, it seems to be a simple wrapper. Or again you can do what you were saying, using the Whisper CLI. All good options, I just mentioned this one since it's easier if I just want to download a video with subs.


    ... | split_sentences | grep -viE 'vpn'


I apologize for the question, but I am not entirely clear where "split_sentences" is. Is it a separate script? I have been looking for something with that sort of functionality for a while, very often for this very purpose, splitting transcripts.


Sorry for the late answer, but yes, it would have to be a separate script or command. It's purely fictional; I made it up because the joke made more sense with it, and people might otherwise have pointed out that my grep would have filtered out too much context.

I'm sure there are many unix-y tools for this purpose, but I don't know of any. If you're looking for something that's installed everywhere, maybe a very big awk or sed regex with multiline wizardry could do the trick for most easy-to-parse Latin languages and you'd just have to copypaste it around. It prolly becomes harder for regexes once you start working with right-to-left languages like Arabic, and languages with different punctuation, so it might not be i18n-friendly.

Related Stack Overflow question: https://stackoverflow.com/questions/33704443/python-regexp-s...
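The regex approach from that thread can be sketched in a few lines of Python (Latin-script text only, as noted above; abbreviations like "e.g." will still trip it up):

```python
import re

def split_sentences(text):
    # Naive splitter: break after ., ! or ? when followed by whitespace
    # and an uppercase letter. Fine for easy-to-parse Latin text;
    # abbreviations ("e.g.", "Dr.") and quotes will confuse it.
    return re.split(r'(?<=[.!?])\s+(?=[A-Z])', text.strip())

print(split_sentences("This video is sponsored. Try our VPN today! Now back to the content."))
```

One sentence per list entry means `grep -v` on the result filters out whole sentences rather than whole caption lines.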


`| grep -viE 'skillshare'`


Fwiw YouTube already has a feature for this. Click the "..." next to Share and click Show Transcript. There are also extensions like https://chrome.google.com/webstore/detail/youtube-captions-s... that make it easy to search them in a popup.


They seem to have moved the functionality to the end of the description, and there you can find the "Show captions" button.

The extension I made to export the transcript was based on this YouTube functionality, I should update the instructions now.

https://chrome.google.com/webstore/detail/youtube2anki/boebb...


Regular "find in page" works to search the transcript on YouTube. I use it often.


Thank you. Do you use it regularly? I run a problem validation forum and someone was asking for exactly that[1].

[1]: Searching YouTube videos with transcript - https://needgap.com/problems/88-searching-youtube-videos-wit....


This is a great idea; I really enjoy all these "two channels simultaneously" (side-by-side translations, video with subtitles, and in this case video with a readable transcript, where you can scroll in the video or scroll in the transcript, and be synchronized).

I had done something like this a couple of years ago for some specific set of videos (e.g. https://shreevatsa.net/tex/program/videos/s10/ — compare with https://youtubetranscript.com/?v=_0Cv1G_s4gQ for the same video), but never got around to making it general; glad someone has done it. It takes just a few lines of JavaScript, using the YouTube API, to do this, i.e. keeping the video and text in sync (just view source on either page to see the JS at the bottom).

Something like this can also help with audio recordings (generating the alignment automatically is called "forced alignment" and there are tools like "aeneas" for this). In case anyone's interested or wants to help (for Sanskrit texts): see https://github.com/shreevatsa/web-align-audio-text deployed at https://shreevatsa.net/ramayana/sarga/ and better version at https://github.com/avinashvarna/audio_alignment deployed at https://avinashvarna.github.io/audio_alignment/


This is cool!

We're doing forced alignment with audio recordings i.e. podcasts. Here's an example from a test client: https://www.withfanfare.com/p/seldon-crisis/search-by-the-fo...

Grateful for any feedback you might have.

Also, if you run a knowledge-dense podcast, or know somebody who does, I would love to talk to you/them. For example, I'm considering linking places, people and things in a transcript.


Lovely, thanks for the example. I really like how the underline is subtle yet effective. The animation/effect where it seems to inch forward is also intriguing; how does that work? (Also, this is subjective but have you considered making double-clicking on the text open the audio player even when it's not been started yet? Would make it more discoverable, though I imagine some could get annoyed. Another minor thing: when the audio is paused and you double click somewhere, the audio position changes but this is not reflected in the player; only takes effect when you unpause.)

You're doing cool work, good luck with your service! It looks very useful and if someone I know is running a podcast I'll recommend it for sure; it looks really well-done and polished (at least the example you shared).


Thank you! I wrote your points down. We're still learning how to make it most usable.

We know the start and end timestamps on a word level and we know the current player time. So all we do is set a CSS class on the words that are currently playing. We not only highlight the current word, but also the words that are close in time. This is what generates the effect.

We just released autoscroll, which is something that was requested often.
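In pseudocode-ish Python, the idea is roughly this (the names and the time window here are illustrative, not our actual code):

```python
def words_near(words, t, window=0.3):
    """Given (word, start, end) tuples and the current player time t,
    return the words to highlight: the one playing now plus any whose
    timestamps fall within `window` seconds of t. The "inching forward"
    effect comes from this window sliding as t advances."""
    return [w for (w, start, end) in words
            if start - window <= t <= end + window]

words = [("hello", 0.0, 0.4), ("world", 0.5, 0.9), ("again", 1.5, 1.9)]
print(words_near(words, 0.6))
```

The frontend just toggles a CSS class on whatever this returns, once per player tick.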


Wow, this looks incredible! Love how the underline travels and guides the eye. Very impressive!

Excuse the question, but "forced alignment" is when you don't have timestamps (like the ones in WebVTT)?


Thank you!

Yes, exactly. We do forced alignment when you edit your transcript. The new words don't have any timestamps, so we need to align them. For short sections we use interpolation. If we need to align whole sections we use Gentle[^1].

[^1]: https://github.com/lowerquality/gentle
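The interpolation part is simple enough to sketch (illustrative Python, weighting by character counts; the real thing would weight by phonemes and is more careful):

```python
def interpolate_timestamps(words, start, end):
    """Spread edited words across the known [start, end] gap,
    proportional to word length, returning (word, start, end) tuples.
    This is the short-section case; long sections go to a real
    forced aligner like Gentle instead."""
    total = sum(len(w) for w in words)
    times, t = [], start
    for w in words:
        dur = (end - start) * len(w) / total
        times.append((w, round(t, 3), round(t + dur, 3)))
        t += dur
    return times

print(interpolate_timestamps(["hello", "there", "world"], 0.0, 1.5))
```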


Thank you, this is really interesting! :)


A supremely useful site that searches YouTube transcripts is https://youglish.com. It shows you pronunciations in context for any word or name.


Thanks for the link! This site actually has a database of youtube transcripts unlike OP. Shame you can't search fixed strings, like two words in exact order. Though it seems genuinely useful for learning pronunciation as advertised.


I found a site for seriously searching within youtube subtitles

https://filmot.com


This script for whisper.cpp works really well

https://github.com/ggerganov/whisper.cpp/blob/master/example...

for my purposes I changed the output from subtitles to txt (so I could pipe the result into chatgpt)


That's doing speech recognition on the YouTube video's audio and then embedding the result as subtitles? Is the idea that this is superior to YouTube's own automatic subtitles? Though if the youtube video actually has manually crafted subtitles from the creator, as a lot of popular science content does, it seems a shame not to just use those subtitles directly and avoid doing speech recognition. Or just rely on YouTube's automatic captions which are pretty decent in my experience.

See my comment on how to just download the subtitles YouTube provides with `yt-dlp` here: https://news.ycombinator.com/item?id=34040342


> so I could pipe the result into chatgpt

Tell us more :)


Nothing too exciting, just “summarize this” followed by the transcript in quotes. It works very well.


Maybe this is a bit off topic, but does anyone know the legal footing of having a business with another business's name in it? For instance, this tool uses the word "YouTube" in its name, though it is used as only a part of it, and it is not a competitor. I've always wondered how this works.


Broadly speaking, it would be trademark infringement if it is used in a way that may confuse others about the source of the product. It doesn’t necessarily have to be a specific product that Alphabet has a direct competitor for.


https://en.wikipedia.org/wiki/Nominative_use

> is a legal doctrine that provides an affirmative defense to trademark infringement as enunciated by the United States Ninth Circuit, by which a person may use the trademark of another as a reference to describe the other product, or to compare it to their own.


Nominative use is explicitly ruled out in some other jurisdictions. Do not rely upon it.


Most corporations regularly search for such domains, and submit cease-and-desist. I received one related to an eBay-related domain, but in my case, I hadn't built a business around it so it was easy enough to just take the site offline.


Not sure about YouTube but WordPress does not allow the use of the name. WP in your (e.g.) domain name is ok. WordPress is not.

I'd imagine it's very similar for others. Often a company will pursue a violation if only to be consistent in showing the courts they actively defend their copyright.


> Not sure about YouTube but WordPress does not allow the use of the name.

They may not like it, but they don't have the power to disallow you from using their name to refer to them. That's allowed.


Actually, they do. It's copyright. Plenty of legal precedent. They defend WordPress, but are willing to allow WP.

The law is on their side.


> Actually, they do.

No, they don't.

> It's copyright.

No, it isn't. It's trademark. There is no such concept as copyright in a single word.

> The law is on their side.

Again, no, it isn't.

What was the point of your comment? Why talk if you're not worried about whether what you're saying is true or false?


Yes. Trademark not copyright. My mistake.

Otherwise: https://wordpressfoundation.org/trademark-policy/

FFS Relax. It was an oversight on my part. It's minor at best in larger scheme of things. Maybe you're having a tough day? God bless you. But this is nothing to go to the mat over.

You could have just said, "Oh. Maybe you're confusing trademark and copyright?". The last thing the world and HN needs is another high-strung belligerent asshole. That doesn't help anyone, or anything, sans your ego. It's not a good look.


> Yes. Trademark not copyright. My mistake.

That is your most minor mistake. The much larger one is that you are claiming that the law supports the trademark policy you link to. It doesn't. Everyone remains free to use a trademark belonging to wordpress in order to refer to wordpress.

You are not free to use any such mark - the WordPress Foundation would be on very solid ground in telling you to stop using their graphical logo. But they can't stop you from using "WordPress" in the name of your for-profit company, which you'll note is quite explicitly contrary to their stated policy.


These companies tend to rename themselves when they become popular.


Hook this up to a language model, and maybe a user could instantly get the one sentence worth of information that the YouTube video creator buries in 10 minutes of monetized noise.

And also save yourself time when the creator teases that they provide the info, but it turns out they don't, they're just trying to get views.


YouTube created that problem by incentivizing longer videos. And now we have videos with tons of fluff.

Similarly Google incentivizes longer webpages, so now we have recipes that start with a novella about grandma’s cooking before showing the actual recipe.

It used to be nice to see a video’s thumbs up to thumbs down ratio to know if you’ve been click baited or not before watching the whole video. But that signal has been removed now too.


As a

recipe reader

I want to

dismiss cookies, have a video ad follow me down the page, and read why this cake conjures up memories of the author's childhood, before reaching the actual recipe

so that

I feel connected to the author, before fully committing to mixing ingredients


That "user story" is like a tragically misinterpreted comment by someone at a prospective customer, speaking of a special time with their grandmother, but garbled through N layers of field sales, marketing, product managers, engineering hierarchy, and Agile task management.

Including the part about declining more cookies offered (to save room for grandma's lasagna).


Tom Redman had this idea but he took feedback from Twitter. https://digg.com/2021/one-main-character-tom-redman-recipeas...


That is definitely true. I actually try and make my videos as short as possible but I noticed that over time the longer ones are definitely prioritized by the algorithm, yet I get lots of comments saying how people appreciate my very short videos.


As a hack, I wonder if tacking 10 minutes of ambient video onto the end (after a "And now, 10 minutes of ambient video" warning) would help.


I am not so sure because YouTube keeps track of watch time. If no one watches past 2 minutes, then I think that would also be penalized, though I have no insight into the algorithm so I am not sure.


Do you know if YouTube measures wall time or percent of video watched? If percent, then a hack might be to slow-mo the content and encourage the audience to watch at 2x speed.


I think it is percent of video watched, because if you check out the analytics in YouTube Studio, it says how many people are still watching by a certain point. So your hack makes sense, but I'm not sure it would work because it is VERY hard to keep anyone's attention on YouTube. The moment it's something even slightly less than interesting, the vast majority of people go onto the next video.


Nice. Worth trying.

Or maybe just repeat the same content a few times.


I blame it on automated advertising. When you had to work to get someone to pay you, you made good content. Now anyone can get paid as long as they're digestible. It's created the interesting situation where the average quality of demonetized content is far higher than monetized content.


Are you sure Google prefers longer pages? I find (annoyingly) that Google likes the search version of my pages for lots of things. E.g. for a page called "best x of the y", the page for searching comments on it, called "best x of y search", where the only text is the title and a search input, will rank really well.


Try searching for recipes :) I also see long novels which seem to disguise the ridiculous amount of ads, which Google seems to like as well (these are mostly provided by none other than themselves!).


...or, just follow decent creators.

No snark intended, but I just gave up with the dross. And even some of them, of late, are getting a bit crafty. But creators get one chance from me now: give me decent content, or even with the fancy chapters, you're not getting my eyeballs past two minutes. What I have found is that leaving the decent stuff on, what auto-plays after is 'generally' of similar quality. A quick set of back-buttoning and bookmarking has fairly often got me some interesting results.


Good idea, but I don't follow anyone on YouTube. I was thinking about searching the Web for a bit of info; the search hits include YouTube videos (but no finer resolution than "this entire video").

A search engine could narrow in on the few sentences of AV in the video that it thinks correspond to what I was searching for, summarize that, and also link me to the AV start timepoint in case I also want to watch the video.

This might change the economics of some YouTube video content creation.
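The "link to the start timepoint" part at least is already easy, since YouTube watch URLs accept a t offset parameter:

```python
def timestamped_url(video_id, seconds):
    """Build a YouTube link that starts playback at `seconds`.
    The t= query parameter is YouTube's standard deep-link offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(seconds)}s"

print(timestamped_url("dQw4w9WgXcQ", 42.7))
```

A search engine that knows which transcript line matched could emit one of these per hit.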


Google does exactly that: if a video shows up in the search results, it shows you only the relevant small part.


I've never seen this before now, but I just got a Google search result video page with a kind of table-of-contents index on one of the video hits. (These TOC entries don't correspond to the marked segments on the timeline. I don't know whether this is something YouTube is doing, or something the content creator did.)

Is this what you mean? (Pardon if I'm not familiar with the latest Google Search features; I've mostly been using DDG lately, so I don't have occasion to see all the features that show up only occasionally.)


Sadly enough, I have the same issue with written content: BS websites add a ton of fluff to their texts as their deplorable approach to SEO and making some space for ads.

"Streaming your phone screen to a TV is something many people at some point want to do. But seeing the picture of your smartphone on a television or other device can be a daunting task. Here we list multiple ways how you can achieve the goal of sharing a mobile screen of an Android or an iOS device on a different device. It works for any manufacturer, like Samsung Smart TVs or Apple TV."

(goes on for another 15 paragraphs before presenting a non-solution)

It's hard to put in words the amount of hate I feel for the authors of such pages. I'm desperately looking forward to the day Google comes up with better models for detecting useful content and those trash piles can burn in hell.


Or Google has sorted the results by how many advert placement sites you'll have been exposed to on that page before it's predicted you'll stop reading, so having the paydirt right at the end looks great, and it doesn't matter if it's fool's gold.


It sucks because time on page, how much you scroll, and the fact that once you find your answer you don’t click on any more results all signal to Google that you were satisfied.

I’m just not convinced the preambles are needed.


Sad thing is, the next generation of AI models is probably being trained on that type of content.


I put something like this together to collect transcripts for uni videos. It dumps all transcripts into a directory, with URL links, so I can just search the whole directory to find the keyword I need.

Helped a lot with take home exams.


YouTube-dl had the ability to rip just subtitles. I once used this to grep for some information I wanted after downloading all of the transcripts.


I built an entire website based on that, there's a few other ones out there too.


I've had luck with pasting the transcript into ChatGPT and asking it for a seven-bullet list of main talking points.


Or hook it up with SponsorBlock and remove all the non-content timestamps


Pretty nice. The sliver of content still worth watching on YouTube doesn't have repetitive stuff or padding to make it to the 10 min mark though.

If you go to the homepage with clear cookies it's just endless amounts of utterly dogshit cookie cutter content. Same clickbait thumbnails with a person pulling an idiotic expression. Even the videos masquerading as educational are entertainment at best. If I had kids I'd do everything in my power to keep them away from YouTube.


The burning hate I feel for all information to be locked away in a YouTube video. This will solve that real world problem. I love reading (or, skimming) through a long read.


> I feel for all information to be locked away in a YouTube video.

Google Search actually indexes transcripts of a video and shows you some YouTube results based on that even though the title/description of the video doesn't match the search query.


I had a huge backlog of tech videos, so I wrote me this (also to play a bit with Haskell, the base idea can be replicated easily in any language though): https://github.com/rberenguel/glancer


This is great. Sharing like this is what I love about HN. Do you have any other features planned?

I’m considering a master keyword list to index against any text that comes in.


The only thing I have in my "someday" list of tasks for Glancer is the possibility of adding/using the whisper binary to get captions when/if unavailable. Aside from that I just keep it more or less working by using it myself (and to be fair on this one, I wish I wrote it in another language, Haskell can be a bit finicky to build if you do it sparingly).

Any addition that is on the "view" layer (the generated HTML) is very easy to add, just needs to go into the template file, at some point I might tweak that area but currently have no outstanding idea/requirement. The rest (i.e. the bulk of the code) is just a very bare-bones parser for captions that should be pretty stable and need no additions (crossing fingers here).


Whisper is neat. OpenAI probably open-sourced it for a reason and it’s probably because they have something much quicker lol.

I’ve seen vosk used on device and it’s decently quick too on a recent Apple chip.


Heh, this basically makes a storyboard


I've published almost 1,800 video diaries and this is a game changer for me. I've been wanting to do more with the back catalog, but don't have transcripts.


Not sure if you know this, but YouTube has had a transcript feature for years now. It's somewhat hidden in the interface, but lets you search with ctrl-F (or command-F) in the transcript.


I use this for city council meetings to figure out who said what. It's not easy, but it's better than nothing. YouTube doesn't appear to do so well with multiple speakers.


I’ve tried to do this recently. Any suggestions on tools or workflows to dissect into different speakers?


Yeah, this website just extracts the transcript that exists and displays it alongside the video. It's nice, but it's not doing the transcribing itself.


Just checked that Google also includes YouTube captions in search results.


The ratio of information to misinformation on Youtube seems pretty bad.

To make transcripts easier to access might create more problems than it solves.

Granted I can't make a bullet-proof argument; there's no clear way to quantify that ratio.


My only complaint is with the layout of the site - could you please make the transcripts span across the whole width of the page, not just to the right of the video?

My one gripe with Youtube's own transcript box is that it is too narrow, so it is a shame that a website designed to specifically make the transcripts more readable also displays the transcripts in a narrow box.


What I've wanted is search by transcript of past videos I've watched. With something like this it seems reasonable to imagine a setup where every video you navigate to gets transcribed and the text is indexed for later search.


Just tried rickroll and many lines seem to be missing.

https://youtubetranscript.com/?v=dQw4w9WgXcQ


This UI and Youtube's UI for transcripts are really nice. When I'm looking for a particular piece of information I can just Ctrl+F and click on the match to play from there. Youtube used to auto-generate subtitles, now it also formats subtitles as transcripts. I wish offline media players had this functionality, if I get distracted for a few seconds I don't have to watch those seconds again, I can speedread over the past couple lines.


Call it "panoramic subtitles"


this will be dead soon due to having youtube in the name


Don't worry, the website solved this issue:

> "Probably Won't Fail: Featuring the latest build of an undocumented API."

This will work as long as YouTube doesn't change anything. And since when has YouTube changed anything?


People can switch domains


Hehe, they might need to switch cloud provider as well. The domain and the underlying content are currently served by none other than Google Cloud.


I see your transcript, and I raise you my ChatGPT summarized transcript extension!

https://github.com/ricklamers/ChatGPT-YouTube-summarizer

All jokes aside, I love the automated transcripts from YouTube. Videos are just so inefficient to consume as a format.


Why is it hard-coded to English? When I try to transcribe a video in any other language it throws the error:

> No transcripts were found for any of the requested language codes: ('en',) For this video ([...]) transcripts are available in the following languages: [...]

It even knows what language is available, so why not dump that instead?


Probably because it’s a hackathon style project that was slapped together and isn’t intended to support every use case. I’d recommend reaching out to the author with your feedback


How can this be so fast? I tried it with two random urls, and the transcripts were instant, like less than 100ms.


It appears to be using the YouTube auto-generated captions. The output, spacing, and punctuation are identical.


YouTube already creates transcripts for accessibility and for feeding into other ML models.


Likely cached. Try with a long video with few views.

Edit: after reading other comments it seems this may be using an undocumented api to retrieve the data.


Here is how to extract a YouTube video transcript to an Excel file with Robomotion:

https://demo.robomotion.io/designer/shared/6j984jBCQqYVBCaQk...


    #!/bin/sh

    ttml2srt()
    {
     x=$(echo x|tr x '\34');
     tr -d '\34' \
     |sed -n "/<p begin/{
      s/<p begin=\"//;
      s/\" end=\"/ --> /;
      s/\" style=\"s2\">/$x/;
      s#</p>##p;}" \
     |sed = \
     |sed "/$x/!s/^/$x/" \
     |tr '\34' '\12' \
     |sed '/[ ]-->[ ]/s/\./,/g'
    }

    read x;
    case $x in 
      https://www.youtube.com/watch?v=??????????\
      |https://www.youtube.com/watch?v=???????????\
      |https://youtu.be/???????????\
      |https://youtu.be/??????????)
    f=${x##*=};f=${f##*/};case ${#f} in 10|11)
    curl -4o $f.mp4 $x 
    video=$(tr \{ '\12' < $f.mp4|sed -n "/itag=22/{s/u0026/\&/g;s/%3D/=/g;s/%2C/,/g;s/%26/\&/g;s/.*url\":\"//;s/\".*//p;}"|tr -d '\134')
    test $video||exit
    ttml=$(tr \{ '\12' < $f.mp4 |sed -n '/timedtext/{s/u0026/\&/g;s/.*:\"//;s/\".*//;s/$/\&fmt=ttml/p;q;}'|tr -d '\134')
    test $ttml||exit
    curl -s4 $ttml|ttml2srt > $f.srt
    exec ffmpeg -v quiet -y -i $video -vf subtitles=$f.srt $f.mp4 
    esac
    esac

    exit
The script above, "1.sh", can be used as follows.

    echo https://www.youtube.com/watch?v=aeWyp2vXxqA | sh 1.sh

It will download the captions as .srt and then "hardsub" them into the video as it downloads the .mp4. NB. This is slow YouTube downloading without using yt-dl/yt-dlp. Obviously, it will not work with commercial videos.

The .srt file is saved as [YouTube ID].srt and the video as [YouTube ID].mp4, where [YouTube ID] is a 10 or 11 ASCII character string.

The video format is itag=22, i.e., mp4/720p. Not all videos will have 22 of course. I usually try itag=18, mp4/360p, if 22 is not available. Change the format to whatever is preferred.

Looking around for a .ttml/.vtt/.srv[1-3] to srt converter I found solutions that required installing Python or some other large scripting language. On GitHub I found a project called "astisub" that will convert from ttml or vtt to srt. It is a 3.8M Go binary. I wrote a shell function instead.


Take video > transcribe > ask gpt to summarize > be genius in 2 mins


Expect an email from Google lawyers early this week about the domain name.


I think "transcriptsforyoutube" would be passable? I remember something about a case using "for" and being ok but not any details.


They generally don't get into nuance. If someone's trademark is in your domain name, expect a C&D.


This is great and works well. What is the copyright status of transcripts?


Not sure on the transcript front, but the owner may want to consider removing ‘youtube’ from their name.


They are owned by the copyright owner of the underlying audio.


but for example, is it fair use to reproduce? what about indexing?


Depends on why it is being done.


This is amazing! The speed and simplicity makes me happy. Thank you!


https://youtubetranscript.com/?v=DvxxdZpMFHg

"Error: transcripts disabled for that video"

Why?


Youtube didn't generate captions for that video


Anything like that for podcasts? I cannot waste time listening to casual dialogue. Text reduced to the point is much more effective.


Fun fact, this is how youtube does "manual reviews". A manual review means someone has read the automated transcript...


Slightly related: https://youglish.com/


Who built this?

We want to partner with you on a tool that autogenerates clips of any video based on the topic start and end


Something like this would be nice to be able to search local videos for specific keywords spoken too.


I’ve been dreaming about something like this for years. Huge deal for me. Thank you for your work!


Is it done with OpenAI's Whisper? Now it would be cool to have it summarized by ChatGPT.


As funny as that landing page is, I would like to see at least some information on the stack.


Tried it. It works! I always wanted this. Thank you YoutubeTranscript.com person(s)!



Funny, was just looking for a tool like this.

Any chance timestamps could be added?


With youtube-dl you can download the subtitle tracks, which should have timestamps. Though last time I checked they were broken (showing the whole text on the first timestamp), but perhaps they fixed it.


For PowerPoint and screenshare based videos, a screenshot every 15 seconds or so would be great.

Often enough I'd rather read than watch. Reading is faster. Having corresponding visuals would be a big plus.


Can it calculate words per minute too? That would be helpful.
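With timestamps in hand that's trivial. A sketch (the (text, start_seconds) tuple shape is just an assumption about how the transcript lines come back):

```python
def words_per_minute(entries, duration_seconds):
    """entries: transcript lines as (text, start_seconds) tuples.
    Counts whitespace-separated tokens over the video's duration."""
    word_count = sum(len(text.split()) for text, _ in entries)
    return 60.0 * word_count / duration_seconds

entries = [("never gonna give you up", 0.0), ("never gonna let you down", 2.0)]
print(words_per_minute(entries, 4.0))
```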


The copy on your website is pure fire my dude.


Is this utilizing whisper to transcribe?


Youtube already auto-generates transcripts that you can see in the ... menu in most videos. This website just seems like an alternative frontend?


Or maybe it processes the video with its own backend? How do you tell?


Just minutes ago, I compared two transcripts for the same video and they were the exact same. Also on YouTubeTranscript.com swearing was redacted with [_], which is something I've only ever seen on youtube captions.


First indication is the processing speed - there's no known machine in the world that could transcribe videos at such speed.


How about a cluster in parallel?


The simplest explanation is often the most probable one.

Why would you reach for a cluster of machines working in parallel, when you could retrieve the already auto-created transcript from YouTube servers?

Also, other comments have pointed out that the transcripts are identical with the ones created by YouTube, which would be unlikely to happen if this service was creating transcripts of their own.


Apparently not. I compared the result of the linked site to a Whisper transcription of the same video. While the word-by-word accuracy of the two transcriptions was about the same, the Whisper transcription was punctuated nearly perfectly and therefore much easier to read.


youglish.com is a similar type of website, but you can read the words as they're spoken.


Nice! Did something similar a couple of years ago: https://you-tldr.com/

Summary remains the elusive hard part...both to do and to serve at scale. But we get closer by the day.


Nice work. I actually found your site thru Google by trying to find this HN post again. Out of curiosity, do you get a lot of paid subscribers? What do revenue figures look like for a project like this?



