CSE 561A: Large Language Models (2024 Fall)

Course Overview

This is an advanced, research-oriented course covering the fundamentals of Large Language Models (model architectures and training frameworks) as well as their capabilities, applications, and open issues. We will present and discuss state-of-the-art papers on large language models.
Pre-requisites: Students are expected to understand core machine learning concepts (CSE 417T/517A).

Course Grading

Paper Presentation

Grading Criteria:

Each student is also required to submit a preview question for a paper, one day before its presentation, three times over the semester (the three questions must be for three different class sessions, none of which may be the day you present). You are also encouraged to raise your question in class. Preview questions cannot be trivial ones such as "what is the aim of the paper?" or "what is the difference between this method and traditional NLP methods?"

Final Project (2-3 students per group)

Project Requirement: There are typically two types of projects.

  1. Designing a novel algorithm to train a medium-sized language model (e.g., BERT, GPT-2) for a problem that interests you.
  2. Designing a novel algorithm for inference with large language models (white-box models such as LLaMA-2, or black-box models such as GPT-4, Claude, etc.) to solve some class of complex problems, and analyzing its limitations.

Project Presentation: Dates: 12/3 and 12/5. You will need to sign up for a time slot near the end of the semester. Students will also need to submit feedback scores for other groups' presentations (through a Google Form).

Office Hour

Office hours are held on demand: if you need to discuss course materials or have questions at any point, feel free to send an email requesting an office hour. Based on these requests, we will organize time slots for students to schedule appointments.

Teaching Assistant

Chengsong Huang (chengsong@wustl.edu)

Syllabus (Dates are tentative and may shift to accommodate guest lectures.)

Date | Topic | Readings | Slides
Large Language Model Basics
8/27 | Course Overview
Distributed Representations of Words and Phrases and their Compositionality (Word2Vec)
Enriching Word Vectors with Subword Information
Attention Is All You Need (Transformer)
Slides
8/29 | Language Model Architectures
Language Models are Unsupervised Multitask Learners (GPT-2)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Slides
9/3 | Prompting and In-Context Learning
Language Models are Few-Shot Learners (GPT-3)
Emergent Abilities of Large Language Models
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers
Slides
9/5 | Language Model Instruction Tuning
Multitask Prompted Training Enables Zero-Shot Task Generalization
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
Self-Instruct: Aligning Language Models with Self-Generated Instructions
LIMA: Less Is More for Alignment
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources
Slides
-----Student Presentation Starts-----
Large Language Model Capabilities
9/10 | Language Model Reasoning (I)
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Slides
9/12 | Language Model Reasoning (II)
Large Language Models Can Self-Improve
Progressive-Hint Prompting Improves Reasoning in Large Language Models
Large Language Models are Better Reasoners with Self-Verification
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
Slides
-----Project Proposal Deadline: 9/16 11:59pm-----
9/17 | Language Model Calibration
Teaching models to express their uncertainty in words
SLiC-HF: Sequence Likelihood Calibration with Human Feedback
Navigating the Grey Area: Expressions of Overconfidence and Uncertainty in Language Models
Just Ask for Calibration
Slides
9/19 | LLM Hallucination and Solutions
Improving Factuality and Reasoning in Language Models through Multiagent Debate
How Language Model Hallucinations Can Snowball
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding
Hallucination Detection for Generative Large Language Models by Bayesian Sequential Estimation
Slides
9/24 | Retrieval-Augmented Generation
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
REPLUG: Retrieval-Augmented Black-Box Language Models
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Slides
9/26 | Reinforcement Learning from Human Feedback
Training language models to follow instructions with human feedback
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
SimPO: Simple Preference Optimization with a Reference-Free Reward
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Slides
Advanced Methods for Large Language Models
10/1 | Efficient Fine-Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Parameter-Efficient Transfer Learning for NLP
LoRA: Low-Rank Adaptation of Large Language Models
DoRA: Weight-Decomposed Low-Rank Adaptation
Slides
10/3 | Efficient Inference
Fast Inference from Transformers via Speculative Decoding
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models
Adapting Language Models to Compress Contexts
Slides
10/8 | Fall Break (No Class)
10/10 | Long-Context Language Models
LongNet: Scaling Transformers to 1B Tokens
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Lost in the Middle: How Language Models Use Long Contexts
Memorizing Transformers
Slides
10/15 | Guest Lecture: "Effective Pretraining and Finetuning: Methods for optimizing your data" by Shayne Longpre (MIT)
Large Language Model Applications
10/17 | Code Language Models
Code Llama: Open Foundation Models for Code
Planning with Large Language Models for Code Generation
Teaching Large Language Models to Self-Debug
SelfEvolve: A Code Evolution Framework via Large Language Models
Slides
-----Project Mid-Term Report Deadline: 10/21 11:59pm-----
10/22 | Multimodal Language Models
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
Visual Instruction Tuning
NExT-GPT: Any-to-Any Multimodal LLM
Evaluating Object Hallucination in Large Vision-Language Models
Slides
10/24 | Language Models as Agents
Toolformer: Language Models Can Teach Themselves to Use Tools
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
ART: Automatic multi-step reasoning and tool-use for large language models
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
Slides
10/29 | Language Models and Knowledge Graphs
GNN-LM: Language Modeling based on Global Contexts via GNN
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering
KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning
Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?
Slides
10/31 | Language Models for Specialized Domains
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
SciBERT: A Pretrained Language Model for Scientific Text
Large Language Models Encode Clinical Knowledge
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
Slides
Large Language Model Analysis
11/5 | Evaluation of Language Models
Proving Test Set Contamination in Black Box Language Models
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Large Language Models are not Fair Evaluators
Holistic Evaluation of Language Models
Slides
11/7 | Detection of LLM Generation
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature
GPT-who: An Information Density-based Machine-Generated Text Detector
A Watermark for Large Language Models
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content
Slides
11/12 | Language Model Bias
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Whose Opinions Do Language Models Reflect?
“Kelly is a Warm Person, Joseph is a Role Model”: Gender Biases in LLM-Generated Reference Letters
Red Teaming Language Models with Language Models
Slides
11/14 | Language Model Privacy & Security
Multi-step Jailbreaking Privacy Attacks on ChatGPT
Jailbreaking Black Box Large Language Models in Twenty Queries
Quantifying Memorization Across Neural Language Models
Poisoning Language Models During Instruction Tuning
Slides
11/19 | Guest Lecture: "Breaking the Curse of Multilinguality in Language Models" by Terra Blevins (Incoming Asst. Prof. at Northeastern Univ.)
11/21 | No Class
11/26 | No Class
-----Project Presentation Deadline: 12/2 11:59pm-----
12/3 | Final Project Presentation I
12/5 | Final Project Presentation II
-----Project Final Report Deadline: 12/13 11:59pm-----