Course: Natural Language Processing (MSc CS & MSc ISDS)
Course code: INF210
INF210 - Ion Androutsopoulos
The slides of 2025-26.
Description | Size | Date
---|---|---
Text classification with (mostly) linear models: Representing texts as bags of words. Boolean and TF-IDF features. Feature selection and extraction using information gain and SVD. Obtaining word embeddings from PMI scores. Word and text clustering with k-means. Text classification with k nearest neighbors. Linear and logistic regression, stochastic gradient descent. Evaluating classifiers with precision, recall, F1, ROC AUC. Practical advice and diagnostics for text classification with supervised machine learning. | 3.7 MB | 10/8/25, 4:43 PM
n-gram language models, estimating probabilities from corpora, entropy, cross-entropy, perplexity, edit distance, context-aware spelling correction, beam-search decoding. | 2.94 MB | 10/1/25, 1:36 PM
Introduction and course organization. | 2.12 MB | 10/1/25, 1:36 PM