Filename | Description | Size | Date
---|---|---|---
ta_slides_part07_speech_recognition.pdf | Introduction to automatic speech recognition (ASR). Encoding speech frames with pre-trained Transformers, wav2vec, HuBERT. ASR models: encoder/decoder models, encoder-only models. ASR evaluation measures. Optional older material: MFCC vectors, HMM models. | 2.82 MB | 12/30/24
ta_slides_part06_nlp_with_transformers.pdf | Key-query-value attention, multi-head attention, Transformer encoders and decoders. Pre-trained Transformers and Large Language Models (LLMs), BERT, SMITH, BART, T5, GPT-3, InstructGPT, ChatGPT, and open-source alternatives, fine-tuning them, prompting them. Parameter efficient training, LoRA. Retrieval-augmented generation (RAG), LLMs with tools. Data augmentation for NLP. Adding vision to LLMs, LLaVA, InstructBLIP. | 4.93 MB | 12/30/24
ta_slides_part05_nlp_with_cnns.pdf | Quick background on Convolutional Neural Networks (CNNs) in Computer Vision. Text processing with CNNs. Image to text generation with CNN encoders and RNN decoders. | 2.34 MB | 12/30/24
ta_slides_part04_nlp_with_rnns.pdf | Recurrent neural networks (RNNs), GRUs/LSTMs. Applications in token classification (e.g., named entity recognition). RNN language models. RNNs with self-attention or global max-pooling, and applications in text classification. Bidirectional and stacked RNNs. Obtaining word embeddings from character-based RNNs. Hierarchical RNNs. Sequence-to-sequence RNN models with attention, applications in machine translation. Optional slides: Universal sentence encoders, LASER. Pre-training RNN language models, ELMo. | 3.45 MB | 12/30/24
ta_slides_part03_text_classification_with_mlps.pdf | Perceptrons, training them with SGD, limitations. Multi-Layer Perceptrons (MLPs) and backpropagation. Dropout, batch and layer normalization. MLPs for text classification, regression, token classification (e.g., for POS tagging, named entity recognition). Pre-training word embeddings, Word2Vec. Advice for training large neural networks. | 2.7 MB | 12/30/24
ta_slides_part02_text_classification_with_mostly_linear_models.pdf | Text classification with (mostly) linear models: Representing texts as bags of words. Boolean and TF-IDF features. Feature selection and extraction using information gain and SVD. Obtaining word embeddings from PMI scores. Word and text clustering with k-means. Text classification with k nearest neighbors. Linear and logistic regression, stochastic gradient descent. Evaluating classifiers with precision, recall, F1, ROC AUC. Practical advice and diagnostics for text classification with supervised machine learning. Optional slides: Naive Bayes, semi-supervised classification with Expectation Maximization (EM), lexicon-based features, sentiment lexica, Support Vector Machines (SVMs) and kernels. | 5.69 MB | 12/30/24
ta_slides_part01_ngrams.pdf | n-gram language models, estimating probabilities from corpora, entropy, cross-entropy, perplexity, edit distance, context-aware spelling correction, beam-search decoding. | 3.26 MB | 12/30/24
ta_slides_part00_introduction.pdf | Introduction and course organization. | 2.26 MB | 12/30/24