1. DeBERTa: Decoding-enhanced BERT with Disentangled Attention (Machine Learning Paper Explained)
2. Multimodal Neurons in Artificial Neural Networks (w/ OpenAI Microscope, Research Paper Explained)
3. [ML News] AI-generated patent approved | Germany gets an analog to OpenAI | ML cheats video games
4. Decision Transformer: Reinforcement Learning via Sequence Modeling (Research Paper Explained)
5. The hidden dangers of loading open-source AI models (ARBITRARY CODE EXPLOIT!)
6. MLP-Mixer: An all-MLP Architecture for Vision (Machine Learning Research Paper Explained)
7. [ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion
8. Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)
9. This ChatGPT Skill will earn you $10B (also, AI reads your mind!) | ML News
10. Linear Transformers Are Secretly Fast Weight Memory Systems (Machine Learning Paper Explained)
11. [ML News] DeepMind tackles Math | Microsoft does more with less | Timnit Gebru launches DAIR
12. Learning Rate Grafting: Transferability of Optimizer Tuning (Machine Learning Research Paper Review)
13. Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (+Author)
14. Gradients are Not All You Need (Machine Learning Research Paper Explained)
15. GPT-NeoX-20B - Open-Source huge language model by EleutherAI (Interview w/ co-founder Connor Leahy)
16. PonderNet: Learning to Ponder (Machine Learning Research Paper Explained)
17. Fastformer: Additive Attention Can Be All You Need (Machine Learning Research Paper Explained)
18. This is a game changer! (AlphaTensor by DeepMind explained)
19. Grokking: Generalization beyond Overfitting on small algorithmic datasets (Paper Explained)
20. Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (Paper Explained)
21. [ML News] Microsoft trains 530B model | ConvMixer model fits into single tweet | DeepMind profitable
22. Recipe AI suggests FATAL CHLORINE GAS Recipe
23. Parti - Scaling Autoregressive Models for Content-Rich Text-to-Image Generation (Paper Explained)
24. The Man behind Stable Diffusion
25. Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos (Paper Explained)