news · Hacker News · Trust 72 · Community · Published 4d ago · Live · 4d ago
Knowledge Distillation of Black-Box Large Language Models (2024)
115 points · 22 comments
Covers
paper: NuclearQAv2: A Structured Benchmark for Evaluating Domain-Science Competence in Large Language Models
paper: How Surprising Is Historical Italian to Language Models? Tokenization Tax, Comprehension Tax, and a Simple Mitigation
repo: engineering87/llm-atlas
model: deepseek-ai/DeepSeek-V3
model: deepseek-ai/DeepSeek-V3-0324
glossary_term: Transformer
Covers (incoming)
paper: Little Brains, Big Feats: Exploring Compact Language Models
paper: Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs
paper: Grounding LLM Reasoning under Incomplete Graph Evidence
paper: The Model Organism Lottery: Model Organism Interpretability Strongly Depends on Training Methodology
repo: chrisliu298/awesome-on-policy-distillation
repo: chrisliu298/awesome-llm-unlearning
Related across the graph
repo: chrisliu298/awesome-llm-unlearning
paper: Little Brains, Big Feats: Exploring Compact Language Models
repo: chrisliu298/awesome-on-policy-distillation
glossary_term: Transformer
model: deepseek-ai/DeepSeek-V3
repo: engineering87/llm-atlas
paper: Grounding LLM Reasoning under Incomplete Graph Evidence
paper: Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs
paper: How Surprising Is Historical Italian to Language Models? Tokenization Tax, Comprehension Tax, and a Simple Mitigation
model: deepseek-ai/DeepSeek-V3-0324
paper: The Model Organism Lottery: Model Organism Interpretability Strongly Depends on Training Methodology
paper: NuclearQAv2: A Structured Benchmark for Evaluating Domain-Science Competence in Large Language Models
