I Wrote an A.I. Program to Watch YouTube, Transcribe the Video and Create a Summary
  • Jack Lau

Happy Year of the Rabbit, everyone.


Over the past few months, one of the hottest stories in technology has been the rise of the A.I. chatbot, in particular a service called ChatGPT created by OpenAI.


The technology is so hot that Microsoft is reportedly investing US$10 billion in OpenAI ("ChatGPT: Microsoft to invest billions in chatbot maker OpenAI", https://www.bbc.com/news/technology-64374283). Its ability to generate text, essays, software code, and conversation is now touted as one of the most revolutionary developments in years. In fact, the implications are so great that Google has reportedly called in both of its founders to discuss how the company should respond. Rumor has it that the technology could become a direct challenge to Google's search engine. (See: "Google Calls In Help From Larry Page and Sergey Brin for A.I. Fight", https://www.nytimes.com/2023/01/20/technology/google-chatgpt-artificial-intelligence.html)


First, what can ChatGPT do? Simply put, it is a "brain" that has absorbed a massive amount of text, which you can tap by asking questions the way you would ask a person. With a search engine like Google, you type a question and Google shows you the relevant web pages that contain the information you need. With ChatGPT, on the other hand, the machine "has already read the web pages" and gives the response directly to you.

More impressively, the information can come from several data sources; ChatGPT organizes it and gives you a human-like response.


Some questions you can ask, for instance:



I could also ask it to show me the programming code for certain tasks.


Some Limitations Though

Impressive as it is, there are several limitations. First and foremost, the data the ChatGPT engine was trained on only goes up to around 2021. So you cannot really ask, for instance, what the closing price of Microsoft is today, in the year 2023.

Secondly, it seems to be "legally" cautious. That sounds funny, right? But if you ask a question about, say, stock prices, it will tell you something that sounds more like a legal disclaimer than a straight answer. It would say something like, well, I am not sure, since you did not specify the time, and so on. Of course, if you asked your stock broker, she would have given you at least a rough estimate.

Despite that, ChatGPT (and its cousin, GPT-3) can provide you with a lot of "insights". For instance, I use it to help me draft software code. I can also ask it to help me "summarize" a reading.

A Program to Watch YouTube and Summarize the Content

Over the holiday, while eating a lot of festival cakes and bloating like a silly piggy, I thought how nice it would be if I could create an assistant who (or should it be "which"?) could watch all the informative YouTube videos for me and give me a summary.


I am not sure if this is true for others, but one of my biggest problems is that watching a full YouTube video can take too long. I would very much like to get a "Reader's Digest" version of some videos. Then, of course, I could choose to watch the full video later.


And, on top of that, I sometimes just want to know the "hot" headlines from time to time.

So, what I did was something like this:


Step 1: Write a program that reads ("watches") any YouTube video, then use the program to produce a transcript of what is said. (It is true that YouTube can now generate a transcript as well. But I think that would not be so automatic for my next step.)

Step 2: Continue the program and pass the transcript to ChatGPT (or GPT-3) to produce a summary of the key points.

Step 3: Save the result on Google Sheets, so I can read it anytime.
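
The three steps above can be sketched roughly as follows. This is only a minimal illustration and not the program described in this post: `transcribe` and `complete` are hypothetical callables standing in for whatever speech-to-text tool and GPT-3/ChatGPT call you use, the 1,500-word chunk size is an arbitrary assumption to keep each prompt short, and a local CSV file stands in for Google Sheets.

```python
import csv
from datetime import date
from typing import Callable, List


def split_into_chunks(text: str, max_words: int = 1500) -> List[str]:
    """Split a transcript into word-bounded chunks so each fits in one prompt.

    The 1500-word default is an assumption, not a documented model limit.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_prompt(chunk: str) -> str:
    """Wrap one transcript chunk in a plain-language summarization request."""
    return "Summarize the key points of this video transcript:\n\n" + chunk


def summarize_video(
    url: str,
    transcribe: Callable[[str], str],  # hypothetical: video URL -> transcript (Step 1)
    complete: Callable[[str], str],    # hypothetical: prompt -> model reply (Step 2)
    max_words: int = 1500,
) -> str:
    """Transcribe the video, summarize each chunk, and join the partial summaries."""
    transcript = transcribe(url)
    partial = [complete(build_prompt(c)) for c in split_into_chunks(transcript, max_words)]
    return "\n".join(partial)


def save_row(path: str, url: str, summary: str) -> None:
    """Step 3 stand-in: append date, URL and summary to a CSV file.

    A CSV file can be imported into Google Sheets; writing to Sheets directly
    would need a Sheets client library, which is not shown here.
    """
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([date.today().isoformat(), url, summary])
```

To try it, pass in your own two functions, for example `summary = summarize_video(url, my_transcriber, my_gpt_call)` followed by `save_row("summaries.csv", url, summary)`, where `my_transcriber` and `my_gpt_call` are placeholders you would write yourself. Keeping the transcription and summarization steps as plug-in functions means either one can be swapped out without touching the rest.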

Before I show the program itself, I must say, I have never had so much fun writing a piece of code. Actually, part of the code was written with help from ChatGPT itself, which showed me how to write the code for a task. Granted, modifications were necessary, but hey, I felt helped. And this must sound weird, but I actually felt less "lonely" writing the code. OK. This must sound really weird and downright "eerie": a robo-friend? More on that some other time.


Here I use this video as a test.






Well, I hope this is fun for you to read. Happy Year of the Rabbit, everyone. May fortune and health be with you and your loved ones.

