Ep. 257 - June 7, 2024
Description
ArXiv NLP research for Friday, June 07, 2024.
00:19: Key-Element-Informed sLLM Tuning for Document Summarization
01:22: Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models
02:42: Large Language Model-guided Document Selection
04:13: More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play
05:24: DiNeR: a Large Realistic Dataset for Evaluating Compositional Generalization
06:43: MATTER: Memory-Augmented Transformer Using Heterogeneous Knowledge Sources
08:01: Mixture-of-Agents Enhances Large Language Model Capabilities
09:09: AICoderEval: Improving AI Domain Code Generation of Large Language Models
11:00: CRAG -- Comprehensive RAG Benchmark
13:04: CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models
14:52: Think out Loud: Emotion Deducing Explanation in Dialogues
16:43: WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
18:46: SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals
19:58: BERTs are Generative In-Context Learners
20:43: Annotating FrameNet via Structure-Conditioned Language Generation
21:49: Revisiting Catastrophic Forgetting in Large Language Model Tuning
22:43: FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
24:33: Do Language Models Exhibit Human-like Structural Priming Effects?
25:27: Uncertainty Aware Learning for Language Model Alignment
26:50: The Russian Legislative Corpus
27:24: ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
28:53: HateDebias: On the Diversity and Variability of Hate Speech Debiasing
30:29: A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
32:00: Sexism Detection on a Data Diet
33:18: XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model
34:21: Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
35:32: LLM-based speaker diarization correction: A generalizable approach
36:52: TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models
38:10: BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
39:10: Quantifying Geospatial in the Common Crawl Corpus
40:14: MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
41:47: Language models emulate certain cognitive profiles: An investigation of how predictability measures interact with individual differences
43:19: Compositional Generalization with Grounded Language Models
44:26: Scenarios and Approaches for Situated Natural Language Explanations
46:04: Are Large Language Models More Empathetic than Humans?
47:38: SUMIE: A Synthetic Benchmark for Incremental Entity Summarization
48:52: Multi-Head RAG: Solving Multi-Aspect Problems with LLMs
50:33: An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models