Argmax

Mixture of Experts

Vahe Hagopian, Taka Hasegawa, Farrukh Rahman

In this episode we talk about the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean.