Files for the in-class demos of 2023-24.
Type  Filename  Down  Size  Date
Lab 1 (N-gram Language Models)
Introduction to the NLTK library. Tokenization and N-gram Language Models (LMs). Cross-entropy and perplexity of LMs. Part-of-speech (POS) tagging, stemming. Beam-search decoding.
570.51 KB  4/11/24
Lab 2 (mostly Linear Models)
Introduction to the scikit-learn library. Text classification with both linear & non-linear classifiers. Lazypredict library. Learning curves. Pipelines and hyper-parameter tuning via grid or randomized search.
622.97 KB  4/23/24
Lab 3 (MLPs)
Introduction to Keras and Keras Tuner. Text classification with MLPs using tf-idf features and centroids of pretrained word2vec embeddings. Example of word2vec word embeddings with gensim.
152.75 KB  4/24/24
Lab 4 (RNNs)
Text classification with RNNs in Keras. Linear / deep self-attention mechanisms.
203.23 KB  5/16/24
Lab 5 (CNNs)
Text classification with (multi-filter) CNNs in Keras.
159.36 KB  5/22/24
Lab 6 (Transformers)
Text classification with transformers (BERT) in Keras (custom layers on top of BERT, freezing BERT layers). Fine-tuning BERT for text classification and NER (token classification) tasks using the transformers library.
85.03 KB  5/30/24
Lab 7 (LLMs)
Prompt templates, chatbots with memory, and agents with tools using LangChain. LangChain Expression Language (LCEL). Zero-shot NER extraction with GPT-4. Parameter-efficient fine-tuning with LoRA. Faster inference with TGI. Serving LLM apps with Gradio.
87.57 KB  6/13/24
Lab 7 (Prompting)
Short introduction to open-source LLMs and prompt engineering.
1.4 MB  3/27/24