Symbolic Knowledge Distillation: from General Language Models to Commonsense Models (Explained)
#gpt3 #knowledge #symbolic
Symbolic knowledge models are usually trained on human-generated corpora that are cumbersome and expensive to create. Such corpora consist of structured triples of symbolic knowledge. This paper takes a different approach and attempts to generate such a corpus by prompting GPT-3. Results show that clever prompting, combined with small, targeted critic models trained on human ratings, can outperform both human-generated data and the teacher model (GPT-3) itself. The results of this paper give a general recipe for automatically building corpora for various NLP tasks by extracting samples from large language models.
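The recipe described above (prompt a large model for candidate triples, then keep only those a critic rates highly) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Triple` class, the prompt layout in `build_prompt`, the `filter_with_critic` helper, and the `toy_critic` stand-in are all hypothetical names invented for this example. A real critic would be a small classifier trained on human ratings, and the prompt would be sent to a language model, which is not done here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Triple:
    """One symbolic knowledge triple: (event, relation, inference)."""
    event: str
    relation: str
    inference: str


def build_prompt(examples: List[Triple], event: str, relation: str) -> str:
    """Format a few-shot prompt: solved examples, then an open slot to complete.

    The layout is an assumption made for illustration only.
    """
    blocks = [f"Event: {t.event}\n{t.relation}: {t.inference}\n" for t in examples]
    blocks.append(f"Event: {event}\n{relation}:")
    return "\n".join(blocks)


def filter_with_critic(
    candidates: Iterable[Triple],
    critic: Callable[[Triple], float],
    threshold: float,
) -> List[Triple]:
    """Keep only the candidates whose critic score reaches the threshold."""
    return [t for t in candidates if critic(t) >= threshold]


def toy_critic(t: Triple) -> float:
    """Hypothetical stand-in for a trained critic.

    It only rejects empty inferences and inferences that repeat the event.
    """
    return 0.1 if t.inference.strip() in {"", t.event} else 0.9


if __name__ == "__main__":
    few_shot = [Triple("X pays for Y's coffee", "xIntent", "to be kind")]
    print(build_prompt(few_shot, "X goes jogging", "xIntent"))

    # Pretend these came back from the teacher model.
    generated = [
        Triple("X goes jogging", "xIntent", "to stay fit"),
        Triple("X goes jogging", "xIntent", ""),
    ]
    for triple in filter_with_critic(generated, toy_critic, threshold=0.5):
        print(triple)
```

Raising the threshold trades corpus size for quality: fewer triples survive, but a larger share of them should be acceptable, which is the lever discussed in the critic sections of the video.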
OUTLINE:
0:00 - Intro & Overview
2:30 - Sponsor: Weights & Biases
4:15 - Commonsense Knowledge Graphs
7:50 - ATOMIC dataset
10:00 - Generating the corpus from a model
13:00 - Prompting GPT-3
15:30 - Generating Events
18:40 - Generating Inferences
23:00 - Evaluating the created dataset
26:45 - Introducing the critic
31:25 - Using the critic to filter the data
36:30 - Training a student on the generated data
41:00 - Key Findings
44:45 - Comments & Conclusion
Paper: https://arxiv.org/abs/2110.07178
Code & Corpus: https://github.com/peterwestai2/symbo...
Sponsor: Weights & Biases
https://wandb.com
https://community.wandb.ai/
Abstract:
The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. Our study leads to a new framework, Symbolic Knowledge Distillation. As with prior art in Knowledge Distillation (Hinton et al., 2015), our approach uses larger models to teach smaller models. A key difference is that we distill knowledge symbolically, as text, in addition to the neural model. We also distill only one aspect, the commonsense of a general language model teacher, allowing the student to be a different type, a commonsense model. Altogether, we show that careful prompt engineering and a separately trained critic model allow us to selectively distill high-quality causal commonsense from GPT-3, a general language model. Empirical results demonstrate that, for the first time, a human-authored commonsense knowledge graph is surpassed by our automatically distilled variant in all three criteria: quantity, quality, and diversity. In addition, it results in a neural commonsense model that surpasses the teacher model's commonsense capabilities despite its 100x smaller size. We apply this to the ATOMIC resource, and share our new symbolic knowledge graph and commonsense models.
Authors: Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, Yejin Choi
Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
Minds: https://www.minds.com/ykilcher
Parler: https://parler.com/profile/YannicKilcher
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/1824646584
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: https://www.subscribestar.com/yannick...
Patreon: https://www.patreon.com/yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n