Course: Natural Language Processing (Επεξεργασία Φυσικής Γλώσσας), MSc CS & MSc ISDS
Course code: INF210
Instructor: Ion Androutsopoulos
Root directory: slides_2025_26
The slides of 2025-26.
| Contents | Size | Date |
|---|---|---|
| Introduction and course organization. | 2.12 MB | 10/1/25, 1:36 PM |
| n-gram language models, estimating probabilities from corpora, entropy, cross-entropy, perplexity, edit distance, context-aware spelling correction, beam-search decoding. | 2.92 MB | 10/14/25, 10:03 AM |
| Representing documents as bags of words. Boolean and TF-IDF features. Feature selection and extraction using information gain and SVD. Obtaining word embeddings from PMI scores. Word and text clustering with k-means. Text classification with k nearest neighbors. Linear and logistic regression, stochastic gradient descent. Evaluating classifiers with precision, recall, F1, ROC AUC. Practical advice and diagnostics for text classification with supervised machine learning. | 3.67 MB | 10/23/25, 1:20 PM |
| Multi-Layer Perceptrons (MLPs) and backpropagation. Dropout, batch and layer normalization. MLPs for text classification, regression, token classification (e.g., for POS tagging, named entity recognition). Pre-training word embeddings, Word2Vec. Advice for training large neural networks. | 2.28 MB | 10/23/25, 1:20 PM |
| Recurrent neural networks (RNNs), GRUs/LSTMs. Bidirectional and stacked RNNs. RNNs with self-attention or global max-pooling. RNNs in text and token classification. RNN language models. Obtaining word embeddings from character-based RNNs. Hierarchical RNNs. Sequence-to-sequence RNN models with attention, applications in machine translation. Optional slides: Variational dropout. Universal sentence encoders, LASER. Pre-training RNN language models, ELMo. | 3.38 MB | 11/12/25, 3:17 PM |
| Quick background on Convolutional Neural Networks (CNNs) in Computer Vision. Text processing with CNNs. Image-to-text generation with CNN encoders and RNN decoders. | 2.31 MB | 11/12/25, 3:16 PM |
| Transformer encoders, BERT. Encoder-decoder Transformers, BART, T5. Decoder-only Transformers, GPT-x. Prompting, supervised fine-tuning, RLHF, DPO. Parameter-efficient training, LoRA. Retrieval-augmented generation (RAG), LLMs with tools, agents, ReACT. Adding vision to LLMs, LLaVA, InstructBLIP. Optional slides: Data augmentation for NLP. | 6.39 MB | 11/22/25, 8:57 PM |
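As a quick taste of the n-gram lecture's topics (estimating probabilities from corpora, cross-entropy, perplexity), here is a minimal sketch of a Laplace-smoothed bigram model; the toy corpus, the held-out sentence, and all function names are assumptions for illustration only, not material from the slides:

```python
import math
from collections import Counter

# Toy training corpus (an assumption, purely for illustration).
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Estimate counts from the corpus.
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab = len(unigrams)

def bigram_prob(w1, w2, alpha=1.0):
    """Laplace-smoothed estimate of P(w2 | w1)."""
    return (bigrams[(w1, w2)] + alpha) / (unigrams[w1] + alpha * vocab)

def perplexity(tokens):
    """Perplexity = exp(per-token cross-entropy) over the bigrams of `tokens`."""
    log_prob = sum(math.log(bigram_prob(w1, w2))
                   for w1, w2 in zip(tokens, tokens[1:]))
    n = len(tokens) - 1  # number of bigram predictions made
    return math.exp(-log_prob / n)

# Lower perplexity on held-out text indicates a better language model.
pp = perplexity("the cat sat on the rug .".split())
print(f"held-out perplexity: {pp:.2f}")
```

Smoothing (here add-one) keeps unseen bigrams from receiving zero probability, which would otherwise make the cross-entropy, and hence the perplexity, infinite.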