NLP Seminar Autumn 2024

Time: Fridays, 10:15 to 12:00
Location: Room 105 in the main building of the University of Bern, Hochschulstrasse 4, Bern
ILIAS Course: https://ilias.unibe.ch/goto_ilias3_unibe_crs_3102245.html
KSL Entry: https://ksl.unibe.ch/KSL/kurzansicht?28&stammNr=471397&semester=HS2024&lfdNr=0

Responsible for the seminar: PD Dr. Matthias Stürmer
Main lecturer of the seminar: Luca Rolshoven

This seminar offers a conceptual and practical introduction to modern Natural Language Processing (NLP). It covers recent advances such as transformer architectures, BERT, Large Language Models (LLMs), and Retrieval-Augmented Generation (RAG), alongside foundational methods like Bag-of-Words (BoW), TF-IDF, and word2vec. Some lectures feature guest speakers from academia or industry, who offer students additional perspectives.

Before each session, students study the reading material and prepare questions for discussion. This preparation deepens their understanding of the topics and fosters analytical skills. In addition, participants carry out an NLP project, which they present at the end of the seminar.

This seminar is mandatory for all students conducting a bachelor's or master's thesis at the Research Center for Digital Sustainability.

Upon successful completion of this course, you will …

know the most important methods of NLP
be able to understand papers in the field of NLP
have planned, executed and presented a project in the field of NLP
have developed your presentation skills
be able to understand and critically comment on the presentations of your fellow students

Schedule 2024

Date	Topic	Reading Material	Guest Speaker
20 September 2024	Introduction to NLP	None
27 September 2024	Text Preprocessing and Language Basics	NLP Course - Tokenizers (Hugging Face NLP Course); Tokenizers Quicktour (Hugging Face Tokenizers Library)	Veton Matoshi, Researcher at Bern University of Applied Sciences
4 October 2024	Classical Machine Learning for NLP	Analyzing Documents with TF-IDF (Tutorial); Sentiment Analysis Using Naive Bayes (Course Notes)
11 October 2024	Word Embeddings and Vector Space Models	Chapter 8: Distributional Semantics and Word Embeddings, in Text Analytics for Corpus Linguistics and Digital Humanities: Simple R Scripts and Tools (access using university login). Additional reading material (optional): Efficient Estimation of Word Representations in Vector Space (word2vec Paper); Global Vectors for Word Representation (GloVe Paper); Deep Contextualized Word Representations (ELMo Paper)	Prof. Dr. Gerold Schneider, Professor of Computational Linguistics and Head of LiRI NLP group at University of Zürich
18 October 2024	Transformer Architecture	Attention is All You Need (Transformer Paper); The Illustrated Transformer (Blog Post)
25 October 2024	Introduction to Student Project	None
1 November 2024	Pre-Training	LLaMA: Open and Efficient Foundation Language Models. Additional reading material (optional): Improving Language Understanding by Generative Pre-Training (GPT Paper); BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Paper)	Leandro von Werra, Chief Loss Officer at Hugging Face
8 November 2024	LLM Inference	A Guide to Quantization in LLMs (Blog Post); Fast Inference from Transformers via Speculative Decoding (Paper). Additional reading material (optional): Flash Attention 3 (Blog Post); vLLM (Blog Post); Text Generation Inference (TGI, Hugging Face Library)
15 November 2024	Fine-Tuning	FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning (Paper); Understanding Parameter-Efficient Finetuning of Large Language Models: From Prefix Tuning to LLaMA-Adapters (Blog Post); Understanding Mixed Precision Training (Blog Post). Additional links: SFT Trainer (Hugging Face TRL Library)	Dr. Joel Niklaus, Research Scientist at Harvey
22 November 2024	Alignment	Hugging Face Blog Post about RLHF; Direct Preference Optimization (Paper); Everything you need to know about Fine-tuning and Merging LLMs: Maxime Labonne (Video)	Lewis Tunstall, LLM Engineering & Research at Hugging Face
29 November 2024	NLP in Industry	None	Flurin Gishamer, Senior Data Scientist at Open Systems
6 December 2024	Large Language Models and Applications	A Comprehensive Overview of Large Language Models (paper)	Prof. Dr. Marcel Gygli, Professor for AI in the Public Sector at Bern University of Applied Sciences
13 December 2024	Emerging Trends & Student Project Presentations	None
20 December 2024	Student Project Presentations	None
