The course is concerned with algorithms, models, and systems that can be used to "understand" and generate natural language text. Natural language processing methods are used, for example, in sentiment analysis and opinion mining, information extraction from documents, search engines and question answering systems. They are particularly important in corporate information systems, where knowledge is often expressed in natural language (e.g., minutes, reports, regulations, contracts, product descriptions, manuals, patents). Companies also interact with their customers mostly in natural language (e.g., via e-mail, call centers, web pages describing products, blogs and social media).
- N-gram language models; entropy, cross-entropy, perplexity; spelling correction.
- Bag-of-words text representations; feature selection and extraction.
- Text classification with k nearest neighbors and Naive Bayes; clustering words and texts with k-means.
- Logistic regression, stochastic gradient descent, multi-layer perceptrons, and backpropagation for text classification.
- Pre-trained word embeddings: Word2Vec, FastText.
- Recurrent neural networks (RNNs); GRU and LSTM cells; RNNs with self-attention; bidirectional, stacked, and hierarchical RNNs, with applications to language modeling, text classification, and sequence labeling.
- Sequence-to-sequence RNN models and machine translation.
- Pre-trained RNN language models (ELMo).
- Convolutional neural networks and their applications to text processing.
- Transformers (BERT, BART, T5) and their use in text classification, sequence labeling, and sequence-to-sequence tasks.
- Syntactic dependency parsing and relation extraction with deep learning models.
- Question answering systems for document collections.
- Discourse processing and dialogue systems.
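As a small taste of the language-modeling topics above, cross-entropy and perplexity can be computed directly from the per-token probabilities a model assigns to a text. The function below is an illustrative sketch, not course material:

```python
import math

def perplexity(token_probs):
    """Perplexity of a token sequence, given the probability the model
    assigns to each token. Perplexity = 2 ** cross-entropy, where
    cross-entropy is the average negative log2 probability per token."""
    cross_entropy = -sum(math.log2(p) for p in token_probs) / len(token_probs)
    return 2 ** cross_entropy

# Toy example: a model that assigns probability 0.25 to each of 4 tokens
# is, intuitively, as uncertain as a fair 4-sided die at every step.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
```

Lower perplexity on held-out text indicates a better language model, a criterion used throughout the course when comparing n-gram and neural models.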
There is no required textbook. Extensive notes in the form of slides are provided. The following books are recommended as supplementary reading:
- Speech and Language Processing, Daniel Jurafsky and James H. Martin, Pearson Education, 2nd edition, 2009, ISBN-13: 978-0135041963. A draft of the 3rd edition is freely available (https://web.stanford.edu/~jurafsky/slp3/).
- Neural Network Methods for Natural Language Processing, Yoav Goldberg, Morgan & Claypool Publishers, 2017, ISBN-13: 978-1627052986.
- Introduction to Natural Language Processing, Jacob Eisenstein, MIT Press, 2019, ISBN-13: 978-0262042840.
- Foundations of Statistical Natural Language Processing, Christopher D. Manning and Hinrich Schütze, MIT Press, 1999, ISBN-13: 978-0262133609.
- An Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008, ISBN-13: 978-0521865715.
Basic knowledge of calculus, linear algebra, and probability theory is assumed. For the programming assignments, programming experience in Python is required. An introduction to natural language processing and machine learning libraries (e.g., NLTK, spaCy, scikit-learn, TensorFlow/Keras, PyTorch) will be provided, and students will have the opportunity to use these libraries in the course’s assignments. For assignments that require training neural networks, cloud virtual machines with GPUs (e.g., on Google Colab) can be used.
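To give a sense of the expected Python level, the multinomial Naive Bayes classifier from the syllabus can be written in a few lines of plain Python; the toy sketch below (with add-one smoothing, on an invented two-document-per-class sentiment example) does by hand what the libraries above automate:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Train a multinomial Naive Bayes classifier with add-one smoothing.
    docs: list of token lists; labels: parallel list of class labels."""
    vocab = {tok for doc in docs for tok in doc}
    class_tokens = defaultdict(list)
    for doc, label in zip(docs, labels):
        class_tokens[label].extend(doc)
    # Class log-priors: relative frequency of each class in the training data.
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in class_tokens}
    # Per-class token log-likelihoods with add-one (Laplace) smoothing.
    log_lik = {}
    for c, toks in class_tokens.items():
        counts = Counter(toks)
        total = len(toks) + len(vocab)
        log_lik[c] = {t: math.log((counts[t] + 1) / total) for t in vocab}
    return log_prior, log_lik, vocab

def predict_nb(model, doc):
    """Return the class maximizing log P(c) + sum of log P(token | c)."""
    log_prior, log_lik, vocab = model
    scores = {c: log_prior[c] + sum(log_lik[c][t] for t in doc if t in vocab)
              for c in log_prior}
    return max(scores, key=scores.get)

# Toy sentiment data (hypothetical, for illustration only).
docs = [["great", "movie"], ["awful", "movie"],
        ["great", "fun"], ["awful", "boring"]]
labels = ["pos", "neg", "pos", "neg"]
model = train_nb(docs, labels)
print(predict_nb(model, ["great", "boring", "fun"]))  # pos
```

In the assignments, scikit-learn's ready-made vectorizers and classifiers would typically replace hand-written code like this.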
Each part of the course comes with study exercises (solved and unsolved, some requiring programming), some of which are handed in as assignments. The final grade is the average of the final examination grade (50%) and the grade of the submitted study and programming exercises (50%), provided that the final examination grade is at least 5/10; otherwise, the final grade equals the final examination grade.
Instructor: Ion Androutsopoulos (http://www.aueb.gr/users/ion/contact.html)
Labs/tutorials assistant (2021-22): Stratos Xenouleas (stratosxen at gmail com)