Natural Language Processing (MSc CS & MSc ISDS)

Ion Androutsopoulos

Description

This course is part of the MSc in Computer Science and the MSc in Information Systems Development and Security of the Department of Informatics, Athens University of Economics and Business. The course covers algorithms, models and systems that allow computers to "understand" and generate natural language text.

Course Objectives/Goals

The course is concerned with algorithms, models, and systems that can be used to "understand" and generate natural language text. Natural language processing methods are used, for example, in sentiment analysis and opinion mining, information extraction from documents, search engines and question answering systems. They are particularly important in corporate information systems, where knowledge is often expressed in natural language (e.g., minutes, reports, regulations, contracts, product descriptions, manuals, patents). Companies also interact with their customers mostly in natural language (e.g., via e-mail, call centers, web pages describing products, blogs and social media).

Course Syllabus

• N-gram language models, entropy, cross-entropy, perplexity, context-aware spelling correction, beam-search decoding.
• Boolean and TF-IDF features, information gain, SVD.
• k-NN, Naive Bayes; precision, recall, F1, AUC; k-means.
• Linear and logistic regression, stochastic gradient descent.
• Perceptrons, Multi-Layer Perceptrons (MLPs), backpropagation; dropout, batch/layer normalization.
• Pre-training word embeddings, Word2Vec.
• Recurrent neural networks (RNNs), GRUs/LSTMs, RNN language models, RNNs with self-attention, bidirectional, stacked and hierarchical RNNs, encoder-decoder RNNs.
• Text processing with Convolutional Neural Networks (CNNs).
• Transformer encoders and decoders; pre-trained Transformers, fine-tuning, prompting, BERT, BART, T5, GPT-3, InstructGPT, ChatGPT.
• Retrieval-augmented generation (RAG).
• Data augmentation in NLP.
• Applications in spelling correction, sentiment analysis, information extraction, machine translation, image captioning, question answering, dialogue systems.
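
For a concrete flavour of the first topic above, the short Python sketch below estimates an add-one-smoothed bigram language model from a toy corpus and computes the cross-entropy and perplexity of a held-out sentence. It is only an illustration; the corpus, sentences, and names are invented and are not material from the course slides.

    import math
    from collections import Counter

    # Hypothetical toy corpus with sentence-boundary markers.
    train = [
        ["<s>", "the", "cat", "sat", "on", "the", "mat", "</s>"],
        ["<s>", "the", "dog", "sat", "on", "the", "rug", "</s>"],
    ]
    held_out = ["<s>", "the", "cat", "sat", "on", "the", "rug", "</s>"]

    unigrams = Counter(w for sent in train for w in sent)
    bigrams = Counter(pair for sent in train for pair in zip(sent, sent[1:]))
    V = len(unigrams)  # vocabulary size, used for add-one (Laplace) smoothing

    def bigram_prob(w1, w2):
        # Laplace-smoothed estimate of P(w2 | w1).
        return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

    # Cross-entropy (bits per predicted token) and perplexity on the held-out sentence.
    log_prob = sum(math.log2(bigram_prob(w1, w2))
                   for w1, w2 in zip(held_out, held_out[1:]))
    cross_entropy = -log_prob / (len(held_out) - 1)
    print(f"cross-entropy: {cross_entropy:.2f} bits, perplexity: {2 ** cross_entropy:.2f}")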

After successfully completing the course, students will be able to:
• describe a wide range of possible applications of Natural Language Processing,
• describe Natural Language Processing algorithms that can be used in particular applications,
• select and implement appropriate Natural Language Processing algorithms for particular applications,
• evaluate the effectiveness and efficiency of Natural Language Processing methods and systems.

Bibliography

There is no required textbook. Extensive notes in the form of slides are provided.

Recommended books:

  • Speech and Language Processing, Daniel Jurafsky and James H. Martin, Pearson Education, 2nd edition, 2009, ISBN-13: 978-0135041963. A draft of the 3rd edition is freely available (https://web.stanford.edu/~jurafsky/slp3/).
  • Neural Network Methods for Natural Language Processing, Yoav Goldberg, Morgan & Claypool Publishers, 2017, ISBN-13: 978-1627052986.
  • Introduction to Natural Language Processing, Jacob Eisenstein, MIT Press, 2019, ISBN-13: 978-0262042840.
  • Foundations of Statistical Natural Language Processing, Christopher D. Manning and Hinrich Schütze, MIT Press, 1999, ISBN-13: 978-0262133609.
  • Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008, ISBN-13: 978-0521865715.

Prerequisites/Prior Knowledge

Basic knowledge of calculus, linear algebra, and probability theory is assumed. For the programming assignments, programming experience in Python is required. An introduction to natural language processing and machine learning libraries (e.g., NLTK, spaCy, scikit-learn, TensorFlow/Keras, PyTorch) will be provided, and students will have the opportunity to use these libraries in the course’s assignments. For assignments that require training neural networks, cloud virtual machines with GPUs (e.g., in Google’s Colab) can be used.
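
As an indication of the kind of library use meant above, the short sketch below builds a TF-IDF plus logistic regression sentiment classifier with scikit-learn. The tiny dataset, labels, and expected output are invented for illustration and are not part of the course's assignments.

    # TF-IDF features feeding a logistic regression classifier (both appear in the syllabus).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical toy training data: 1 = positive review, 0 = negative review.
    texts = [
        "great film, loved the acting",
        "boring plot and terrible pacing",
        "a wonderful, moving story",
        "I hated every minute of it",
    ]
    labels = [1, 0, 1, 0]

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(texts, labels)
    print(clf.predict(["what a wonderful film"]))  # likely [1], i.e. positive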

Assessment Methods

For each part of the course, study exercises are provided (solved and unsolved, some requiring programming); some of them are handed in as assignments. The final grade is the average of the final examination grade (50%) and the grade of the submitted study and programming exercises (50%), provided that the final examination grade is at least 5/10; otherwise, the final grade equals the final examination grade.
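
For clarity, this rule can also be written as the short sketch below; the function and variable names are illustrative only, not official regulations.

    def final_grade(exam: float, exercises: float) -> float:
        # Grades on a 0-10 scale: exam and exercises each count 50%,
        # but only if the exam grade is at least 5/10.
        if exam >= 5.0:
            return 0.5 * exam + 0.5 * exercises
        return exam

    print(final_grade(exam=6.0, exercises=8.0))  # 7.0
    print(final_grade(exam=4.0, exercises=9.0))  # 4.0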

Instructors

Instructor: Ion Androutsopoulos (http://www.aueb.gr/users/ion/contact.html)