Course: M35209F/Μ36209P - Text Analytics (MSc Data Science)
Course code: INF312
Root directory: ta_slides_2025_26
The slides for 2025-26.
| Description | Size | Date |
|---|---|---|
| Recurrent neural networks (RNNs), GRUs/LSTMs. Bidirectional and stacked RNNs. RNNs with self-attention or global max-pooling. RNNs in text and token classification. RNN language models. Obtaining word embeddings from character-based RNNs. Hierarchical RNNs. Sequence-to-sequence RNN models with attention, applications in machine translation. Optional slides: Variational dropout. Universal sentence encoders, LASER. Pre-training RNN language models, ELMo. | 3.35 MB | 2/10/26, 10:24 AM |
| Multi-Layer Perceptrons (MLPs) and backpropagation. Dropout, batch and layer normalization. MLPs for text classification, regression, token classification (e.g., for POS tagging, named entity recognition). Pre-training word embeddings, Word2Vec. Advice for training large neural networks. | 2.34 MB | 1/27/26, 11:50 AM |
| Representing documents as bags of words. Boolean and TF-IDF features. Feature selection and extraction using information gain and SVD. Obtaining word embeddings from PMI scores. Word and text clustering with k-means. Text classification with k nearest neighbors. Linear and logistic regression, stochastic gradient descent. Evaluating classifiers with precision, recall, F1, ROC AUC. Practical advice and diagnostics for text classification with supervised machine learning. | 3.63 MB | 1/16/26, 10:40 PM |
| n-gram language models, estimating probabilities from corpora, entropy, cross-entropy, perplexity, edit distance, context-aware spelling correction, beam-search decoding. | 2.89 MB | 1/9/26, 5:34 PM |
| Introduction and course organization. | 2.18 MB | 1/9/26, 5:34 PM |