Date | Title | Duration | Whisper Transcript | Transcript Only |
---|---|---|---|---|
2024-11-13 | Flash Attention derived and coded from first principles with Triton (Python) | 458 min | Whisper Transcript | Transcript Only |
2024-08-07 | Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation | 346 min | Whisper Transcript | Transcript Only |
2024-05-21 | ML Interpretability: feature visualization, adversarial example, interp. for language models | 60 min | Whisper Transcript | Transcript Only |
2023-04-16 | Wav2Lip (generate talking avatar videos) - Paper reading and explanation | 6 min | Whisper Transcript | Transcript Only |
2023-04-24 | CLIP - Paper explanation (training and inference) | 14 min | Whisper Transcript | Transcript Only |
2023-05-25 | Coding a Transformer from scratch on PyTorch, with full explanation, training and inference. | 179 min | Whisper Transcript | Transcript Only |
2023-05-28 | Attention is all you need (Transformer) - Model explanation (including math), Inference and Training | 58 min | Whisper Transcript | Transcript Only |
2023-06-07 | Variational Autoencoder - Model, ELBO, loss function and maths explained easily! | 27 min | Whisper Transcript | Transcript Only |
2023-07-04 | How diffusion models work - explanation and code! | 21 min | Whisper Transcript | Transcript Only |
2023-07-15 | LongNet: Scaling Transformers to 1,000,000,000 tokens: Python Code + Explanation | 29 min | Whisper Transcript | Transcript Only |
2023-07-25 | LoRA: Low-Rank Adaptation of Large Language Models - Explained visually + PyTorch code from scratch | 26 min | Whisper Transcript | Transcript Only |
2023-08-14 | Segment Anything - Model explanation with code | 42 min | Whisper Transcript | Transcript Only |
2023-08-24 | LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU | 70 min | Whisper Transcript | Transcript Only |
2023-09-03 | Coding LLaMA 2 from scratch in PyTorch - KV Cache, Grouped Query Attention, Rotary PE, RMSNorm | 184 min | Whisper Transcript | Transcript Only |
2023-09-27 | Coding Stable Diffusion from scratch in PyTorch | 303 min | Whisper Transcript | Transcript Only |
2023-10-26 | BERT explained: Training, Inference, BERT vs GPT/LLamA, Fine tuning, [CLS] token | 54 min | Whisper Transcript | Transcript Only |
2023-11-27 | Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW) | 49 min | Whisper Transcript | Transcript Only |
2023-12-11 | Quantization explained with PyTorch - Post-Training Quantization, Quantization-Aware Training | 50 min | Whisper Transcript | Transcript Only |
2023-12-19 | Distributed Training with PyTorch: complete tutorial with cloud infrastructure and code | 72 min | Whisper Transcript | Transcript Only |
2023-12-27 | Mistral / Mixtral Explained: Sliding Window Attention, Sparse Mixture of Experts, Rolling Buffer | 86 min | Whisper Transcript | Transcript Only |
2024-01-07 | Mamba and S4 Explained: Architecture, Parallel Scan, Kernel Fusion, Recurrent, Convolution, Math | 74 min | Whisper Transcript | Transcript Only |
2024-02-27 | Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code. | 135 min | Whisper Transcript | Transcript Only |
2024-04-14 | Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math | 48 min | Whisper Transcript | Transcript Only |
2024-05-11 | Kolmogorov-Arnold Networks: MLP vs KAN, Math, B-Splines, Universal Approximation Theorem | 75 min | Whisper Transcript | Transcript Only |