OpenAI's GPT-4 using YouTube transcripts to improve its language model
By MYBRANDBOOK
OpenAI has used more than a million hours of YouTube video transcriptions to improve GPT-4, its sophisticated language model. Despite being aware of possible legal repercussions, OpenAI defended its conduct as fair use aimed at improving its model's understanding of the world. OpenAI's President, Greg Brockman, was directly involved in the selection of training videos.
OpenAI says it uses "numerous sources including publicly available data and partnerships for non-public data." The company is also contemplating the creation of its own synthetic data. It had trained its models using data such as computer code from GitHub, chess move databases, and educational content from Quizlet. After other resources were depleted, it considered using transcriptions from YouTube videos, podcasts, and audiobooks.
The report also mentions that OpenAI had exhausted its useful data sources by 2021. OpenAI continues to seek out new data to improve its AI models.
Google's representative, Matt Bryant, stated that the company has "seen unconfirmed reports" about OpenAI's use of YouTube transcripts. He said that Google's guidelines prohibit unauthorized scraping or downloading of YouTube content.
YouTube CEO Neal Mohan made similar comments this week regarding OpenAI's potential use of YouTube data to train its Sora video-generating model. Bryant also highlighted that Google takes "technical and legal measures" to prevent unauthorized use when there is a clear legal or technical justification.