Umar Jamil Transcripts

Date | Title | Duration | Whisper Transcript | Transcript Only
2024-11-13 | Flash Attention derived and coded from first principles with Triton (Python) | 458 min | Whisper Transcript | Transcript Only
2024-08-07 | Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation | 346 min | Whisper Transcript | Transcript Only
2024-05-21 | ML Interpretability: feature visualization, adversarial example, interp. for language models | 60 min | Whisper Transcript | Transcript Only
2023-04-16 | Wav2Lip (generate talking avatar videos) - Paper reading and explanation | 6 min | Whisper Transcript | Transcript Only
2023-04-24 | CLIP - Paper explanation (training and inference) | 14 min | Whisper Transcript | Transcript Only
2023-05-25 | Coding a Transformer from scratch on PyTorch, with full explanation, training and inference. | 179 min | Whisper Transcript | Transcript Only
2023-05-28 | Attention is all you need (Transformer) - Model explanation (including math), Inference and Training | 58 min | Whisper Transcript | Transcript Only
2023-06-07 | Variational Autoencoder - Model, ELBO, loss function and maths explained easily! | 27 min | Whisper Transcript | Transcript Only
2023-07-04 | How diffusion models work - explanation and code! | 21 min | Whisper Transcript | Transcript Only
2023-07-15 | LongNet: Scaling Transformers to 1,000,000,000 tokens: Python Code + Explanation | 29 min | Whisper Transcript | Transcript Only
2023-07-25 | LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch | 26 min | Whisper Transcript | Transcript Only
2023-08-14 | Segment Anything - Model explanation with code | 42 min | Whisper Transcript | Transcript Only
2023-08-24 | LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU | 70 min | Whisper Transcript | Transcript Only
2023-09-03 | Coding LLaMA 2 from scratch in PyTorch - KV Cache, Grouped Query Attention, Rotary PE, RMSNorm | 184 min | Whisper Transcript | Transcript Only
2023-09-27 | Coding Stable Diffusion from scratch in PyTorch | 303 min | Whisper Transcript | Transcript Only
2023-10-26 | BERT explained: Training, Inference, BERT vs GPT/LLamA, Fine tuning, [CLS] token | 54 min | Whisper Transcript | Transcript Only
2023-11-27 | Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW) | 49 min | Whisper Transcript | Transcript Only
2023-12-11 | Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training | 50 min | Whisper Transcript | Transcript Only
2023-12-19 | Distributed Training with PyTorch: complete tutorial with cloud infrastructure and code | 72 min | Whisper Transcript | Transcript Only
2023-12-27 | Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer | 86 min | Whisper Transcript | Transcript Only
2024-01-07 | Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math | 74 min | Whisper Transcript | Transcript Only
2024-02-27 | Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code. | 135 min | Whisper Transcript | Transcript Only
2024-04-14 | Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math | 48 min | Whisper Transcript | Transcript Only
2024-05-11 | Kolmogorov-Arnold Networks: MLP vs KAN, Math, B-Splines, Universal Approximation Theorem | 75 min | Whisper Transcript | Transcript Only