The course is concerned with algorithms, models, and systems that can be used to "understand" and generate natural language text. Natural language processing methods are used, for example, in sentiment analysis and opinion mining, information extraction from documents, search engines and question answering systems. They are particularly important in corporate information systems, where knowledge is often expressed in natural language (e.g., minutes, reports, regulations, contracts, product descriptions, manuals, patents). Companies also interact with their customers mostly in natural language (e.g., via e-mail, call centers, web pages describing products, blogs and social media).
- N-gram language models; estimating probabilities from corpora; entropy, cross-entropy, perplexity.
- Edit distance; spelling correction and text normalization.
- Bag-of-words text representations; feature selection and extraction with information gain and SVD.
- Text classification with k-nearest neighbours and Naive Bayes.
- Word embeddings with PMI scores; word and text clustering with k-means.
- Linear and logistic regression; stochastic gradient descent.
- Lexicon-based features; constructing and using sentiment lexica.
- Perceptrons, multi-layer perceptrons, and backpropagation, for text classification, text regression, and sequence labelling with sliding windows.
- Pre-training word embeddings: Word2Vec, FastText.
- Recurrent neural networks (RNNs); GRU and LSTM cells; RNNs with self-attention; bidirectional, stacked, and hierarchical RNNs; applications in language modelling, text classification, and sequence labelling.
- Sequence-to-sequence RNN models with attention; machine translation.
- Universal sentence encoders.
- Pre-training language models; context-aware embeddings; ELMo.
- Convolutional neural networks (CNNs) and applications in NLP.
- Query-key-value attention; multi-head attention; Transformers; BERT.
- Grammars and parse trees; transition-based and graph-based dependency parsing with deep learning.
- Relation extraction with deep learning; joint parsing and relation extraction.
- Question answering for document collections.
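To give a flavour of the first topics above (n-gram language models, smoothed probability estimates, cross-entropy, perplexity), here is a minimal illustrative sketch, not course code: a bigram language model with add-one (Laplace) smoothing, evaluated by perplexity on a sentence. The toy corpus and function names are our own choices for illustration.

```python
# Minimal sketch (not course code): bigram language model with add-one
# (Laplace) smoothing, plus cross-entropy and perplexity on a sentence.
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams; each sentence is a list of tokens."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]  # sentence boundary markers
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def prob(prev, word):
        # Add-one smoothed estimate of P(word | prev).
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

    return prob

def perplexity(prob, sentence):
    tokens = ["<s>"] + sentence + ["</s>"]
    # Per-token cross-entropy in bits; perplexity = 2 ** cross-entropy.
    log_probs = [math.log2(prob(p, w)) for p, w in zip(tokens, tokens[1:])]
    cross_entropy = -sum(log_probs) / len(log_probs)
    return 2 ** cross_entropy

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
lm = train_bigram_lm(corpus)
print(perplexity(lm, ["the", "cat", "sat"]))
```

With smoothing, even bigrams never seen in the corpus receive non-zero probability, so the perplexity of any sentence over the vocabulary stays finite.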
There is no required textbook. Extensive notes in the form of slides are provided.
The course is mainly based on the books:
- Speech and Language Processing, by D. Jurafsky and J.H. Martin, 2nd edition, Pearson, 2009. The 3rd edition (in preparation) is freely available (http://web.stanford.edu/~jurafsky/slp3/).
- Neural Network Methods for Natural Language Processing, by Y. Goldberg, Morgan & Claypool, 2017.
Both books are available in AUEB’s library.
Basic knowledge of calculus, linear algebra, and probability theory is assumed. Programming experience is required for the programming assignments; students may implement the assignments in any language, but Python is strongly recommended. An introduction to natural language processing and machine learning libraries (e.g., NLTK, scikit-learn, Keras, PyTorch) will be provided, and students will have the opportunity to use these libraries in the course’s assignments. For assignments that require training neural networks, cloud virtual machines with GPUs (e.g., in Google’s Colab) can be used.
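As a taste of the kind of library use the assignments involve, here is a minimal sketch (our own illustration, not an assignment) of a bag-of-words Naive Bayes sentiment classifier built with scikit-learn; the toy data is invented for the example.

```python
# Minimal sketch (not an assignment): bag-of-words features and a
# Naive Bayes text classifier, chained in a scikit-learn pipeline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy sentiment data; real assignments would use a proper corpus.
train_texts = ["great movie, loved it", "wonderful acting, great plot",
               "terrible film, boring", "awful plot, hated it"]
train_labels = ["pos", "pos", "neg", "neg"]

# CountVectorizer turns each text into word-count features;
# MultinomialNB fits a Naive Bayes model over those counts.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_texts, train_labels)
print(clf.predict(["loved the acting", "boring and awful"]))
```

A few lines of pipeline code like this replace a fair amount of manual feature extraction and probability estimation, which is why such libraries are introduced early in the course.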
In each part of the course, study exercises are provided (solved and unsolved, some requiring programming); one or two exercises per part are handed in as assignments. Students are graded on the assignments (50%) and their performance in the final exam (50%).
Instructor: Ion Androutsopoulos (http://www.aueb.gr/users/ion/contact.html)
Labs and assignments assistant: Vasiliki Kougia (vasilikvasilikikou at gmail com).