Files for the in-class demos of 2023-24.
Type | Filename | Down | Size | Date
Lab 1 (N-gram Language Models)
Introduction to the NLTK library. Tokenization and N-gram Language Models (LMs). Cross-entropy and perplexity of LMs. Part-of-speech (POS) tagging and stemming. Beam-search decoding.
570.51 KB | 4/11/24
Lab 2 (mostly Linear Models)
Introduction to the scikit-learn library. Text classification with linear and non-linear classifiers. The Lazypredict library. Learning curves. Pipelines and hyper-parameter tuning via grid or randomized search.
622.97 KB | 4/23/24
Lab 3 (MLPs)
Introduction to Keras and Keras Tuner. Text classification with MLPs using tf-idf features and centroids of pretrained word2vec embeddings. Example of word2vec word embeddings using gensim.
152.75 KB | 4/24/24
Lab 4 (RNNs)
Text classification with RNNs in Keras. Linear and deep self-attention mechanisms.
178.38 KB | 2/18/24
Lab 5 (CNNs)
Text classification with (multi-filter) CNNs in Keras.
611.4 KB | 2/22/24
Lab 6 (Transformers)
Text classification with CNNs in Keras. Fine-tuning BERT for text classification and NER (token classification) tasks using the transformers library.
75.31 KB | 2/29/24
Lab 7 (Prompting)
Short introduction to open-source LLMs and prompt engineering.
1.4 MB | 3/27/24
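
For readers who want a feel for the Lab 1 topics before downloading the files, here is a minimal sketch of a bigram language model with cross-entropy and perplexity. It is not taken from the lab material: it uses plain Python instead of NLTK, and the function names (`train_bigram_lm`, `bigram_prob`, `cross_entropy`, `perplexity`), the add-alpha smoothing, and the `<s>` / `</s>` padding tokens are all assumptions made for illustration. It also assumes every evaluated word appears in the training vocabulary.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count contexts and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> (hypothetical marker tokens).
    """
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])                  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous word, word) pairs
    # Words that can be predicted: all training tokens plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab, alpha=1.0):
    """Add-alpha smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + alpha) / (contexts[prev] + alpha * len(vocab))


def cross_entropy(sentences, contexts, bigrams, vocab, alpha=1.0):
    """Average negative log2 probability per predicted token (bits per token)."""
    log_sum, n = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded[:-1], padded[1:]):
            log_sum += math.log2(bigram_prob(prev, word, contexts, bigrams, vocab, alpha))
            n += 1
    return -log_sum / n


def perplexity(sentences, contexts, bigrams, vocab, alpha=1.0):
    """Perplexity is 2 raised to the cross-entropy measured in bits."""
    return 2 ** cross_entropy(sentences, contexts, bigrams, vocab, alpha)


# Toy corpus, invented for this sketch.
train = [["a", "b"], ["a", "c"]]
contexts, bigrams, vocab = train_bigram_lm(train)
print(bigram_prob("<s>", "a", contexts, bigrams, vocab))
print(perplexity([["a", "b"]], contexts, bigrams, vocab))
```

A lower perplexity on held-out text means the model assigns that text higher probability. The add-alpha smoothing keeps unseen bigrams from receiving zero probability, which would otherwise make the cross-entropy infinite.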