xLSTM

Extended form of Long Short-Term Memory (LSTM), integrating enhancements for scalability and efficiency in deep learning models.

xLSTM, or Extended Long Short-Term Memory, revises the traditional LSTM architecture with new gating mechanisms and memory structures designed to address the limitations of LSTMs on large-scale data. It introduces exponential gating with normalization and stabilization, together with two modified memory cells: sLSTM, which retains a scalar memory and adds a new form of memory mixing, and mLSTM, which uses a matrix memory and can be fully parallelized. These innovations allow xLSTM to compete favorably with more recent architectures such as Transformers, especially in large language models, where scalability and performance are crucial.
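
As a rough illustration of how exponential gating with stabilization can work, the sketch below implements a single step of an sLSTM-style cell in plain NumPy. The function name `slstm_step`, the parameter layout, and the dimensions are illustrative assumptions for this article, not the reference implementation from the xLSTM paper.

```python
# Minimal sketch of an sLSTM-style recurrent step with exponential gating
# and a stabilizer state (assumed formulation, loosely following the
# stabilized update described for xLSTM by Beck et al., 2024).
import numpy as np

def slstm_step(x, h_prev, c_prev, n_prev, m_prev, params):
    """One step of a stabilized sLSTM-style cell.

    x: input vector; h_prev: previous hidden state; c_prev: cell state;
    n_prev: normalizer state; m_prev: stabilizer (running log-scale) state.
    """
    W_z, R_z, b_z = params["z"]   # candidate cell input
    W_i, R_i, b_i = params["i"]   # input gate (exponential)
    W_f, R_f, b_f = params["f"]   # forget gate (exponential)
    W_o, R_o, b_o = params["o"]   # output gate (sigmoid)

    z_t = np.tanh(W_z @ x + R_z @ h_prev + b_z)            # candidate cell input
    i_pre = W_i @ x + R_i @ h_prev + b_i                   # input-gate pre-activation
    f_pre = W_f @ x + R_f @ h_prev + b_f                   # forget-gate pre-activation
    o_t = 1.0 / (1.0 + np.exp(-(W_o @ x + R_o @ h_prev + b_o)))

    # Stabilization: track the running log-scale m_t and rescale both
    # exponential gates so the recurrence stays numerically bounded.
    m_t = np.maximum(f_pre + m_prev, i_pre)
    i_t = np.exp(i_pre - m_t)                               # stabilized exp input gate
    f_t = np.exp(f_pre + m_prev - m_t)                      # stabilized exp forget gate

    c_t = f_t * c_prev + i_t * z_t                          # cell state update
    n_t = f_t * n_prev + i_t                                # normalizer state update
    h_t = o_t * (c_t / n_t)                                 # normalized hidden state
    return h_t, c_t, n_t, m_t


# Tiny usage example with random weights (dimensions are arbitrary).
rng = np.random.default_rng(0)
d_in, d_hid = 4, 8
def init(shape):
    return rng.normal(0.0, 0.1, shape)
params = {k: (init((d_hid, d_in)), init((d_hid, d_hid)), np.zeros(d_hid))
          for k in ("z", "i", "f", "o")}
h = c = n = m = np.zeros(d_hid)
for t in range(5):
    h, c, n, m = slstm_step(rng.normal(size=d_in), h, c, n, m, params)
```

The normalizer state divides the cell state before it is emitted, which keeps the output well scaled even though the input gate can grow exponentially; this is one way the exponential gating can be made stable in practice.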

The concept of xLSTM was introduced in 2024, as part of efforts to scale LSTMs to the realm of billions of parameters, a domain typically dominated by Transformers. This development reflects the ongoing evolution in deep learning architectures aimed at improving efficiency and model capacity.

The xLSTM model was developed by a team including Maximilian Beck, Korbinian Pöppel, and Sepp Hochreiter. Hochreiter is notably a co-inventor of the original LSTM, lending significant historical continuity and expertise to this advancement.
