paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 14m ago

Self-Gating Attention for Efficient Time Series Forecasting

Transformer architectures have shown strong potential in time series forecasting, where multi-head self-attention is widely used to capture temporal dependencies across historical timestamps. However, standard self-attention has quadratic time and memory complexity with respect to the look-back length. This cost may limit its use in resource-constrained or high-throughput forecasting systems, where fast and memory-efficient inference is important. Through qualitative and quantitative analyses, we observe that self-attention maps in time series forecasting often contain redundant patterns acros

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author Dezheng Wang

  • Linked via arXiv author Tong Chen

  • Linked via arXiv author Wei Yuan

  • Linked via arXiv author Congyan Chen

  • Linked via arXiv author Shihua Li

  • Linked via arXiv author Hongzhi Yin
