Open eClass of the Athens University of Economics and Business | M35209F/Μ36209P - Text Analytics... | Documents

M35209F/Μ36209P - Text Analytics (MSc Data Science)

INF312 - Ion Androutsopoulos

Documents

Root directory / ta_slides_2025_26: The slides of 2025-26.

Name	Description	Size	Date
ta_slides_part00_introduction.pdf	Introduction and course organization.	2.18 MB	1/9/26, 5:34 PM
ta_slides_part01_ngrams.pdf	n-gram language models, estimating probabilities from corpora, entropy, cross-entropy, perplexity, edit distance, context-aware spelling correction, beam-search decoding.	2.89 MB	1/9/26, 5:34 PM
ta_slides_part02_text_classification_with_mostly_linear_models.pdf	Representing documents as bags of words. Boolean and TF-IDF features. Feature selection and extraction using information gain and SVD. Obtaining word embeddings from PMI scores. Word and text clustering with k-means. Text classification with k nearest neighbors. Linear and logistic regression, stochastic gradient descent. Evaluating classifiers with precision, recall, F1, ROC AUC. Practical advice and diagnostics for text classification with supervised machine learning.	3.63 MB	1/16/26, 10:40 PM
ta_slides_part03_text_classification_with_mlps.pdf	Multi-Layer Perceptrons (MLPs) and backpropagation. Dropout, batch and layer normalization. MLPs for text classification, regression, token classification (e.g., for POS tagging, named entity recognition). Pre-training word embeddings, Word2Vec. Advice for training large neural networks.	2.34 MB	1/27/26, 11:50 AM
ta_slides_part04_nlp_with_rnns.pdf	Recurrent neural networks (RNNs), GRUs/LSTMs. Bidirectional and stacked RNNs. RNNs with self-attention or global max-pooling. RNNs in text and token classification. RNN language models. Obtaining word embeddings from character-based RNNs. Hierarchical RNNs. Sequence-to-sequence RNN models with attention, applications in machine translation. Optional slides: Variational dropout. Universal sentence encoders, LASER. Pre-training RNN language models, ELMo.	3.35 MB	2/10/26, 10:24 AM