Siyuan Liang

Long-context modeling & recurrent architectures


Siyuan Liang (梁思远) works on long-context modeling and recurrent architectures for sequence models. He proposed the Truncated Recurrent Transformer, which achieves strong length extrapolation in train-short, test-long settings.
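For readers unfamiliar with TBPTT (truncated backpropagation through time), the training scheme named in the paper title below, here is a minimal PyTorch-style sketch: a long token sequence is processed in fixed-size chunks, and the recurrent state is detached between chunks so gradients flow only within each truncation window. The model(inputs, state) API, chunk length, and all names are illustrative assumptions, not code from llama2RNN.c or the paper.

import torch
import torch.nn as nn

def train_tbptt(model: nn.Module, optimizer, tokens: torch.Tensor, chunk_len: int = 128):
    """One TBPTT pass over a (batch, seq_len) tensor of token ids.

    Hypothetical sketch: assumes `model(inputs, state)` returns
    (logits, new_state); this API is illustrative, not from llama2RNN.c.
    """
    state = None  # recurrent state carried across truncation windows
    for start in range(0, tokens.size(1) - 1, chunk_len):
        end = min(start + chunk_len, tokens.size(1) - 1)
        inputs = tokens[:, start:end]            # current window
        targets = tokens[:, start + 1:end + 1]   # next-token targets
        logits, state = model(inputs, state)
        loss = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
        )
        optimizer.zero_grad()
        loss.backward()  # gradients stop at the window boundary
        optimizer.step()
        # Detach so the next window's backward pass cannot reach into this
        # one; the state's *values* still carry long-range context forward.
        if state is not None:
            state = (tuple(s.detach() for s in state)
                     if isinstance(state, tuple) else state.detach())

The key point for length extrapolation is that training cost is bounded by chunk_len rather than full sequence length, while the carried state lets the model see contexts far longer than any single window.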

Previously, he was an algorithm researcher at Megvii in Beijing, where he delivered production algorithms for fingerprint and face liveness detection, display demura, and XR hand tracking.

He received his M.S. in Electronic and Communication Engineering from Xidian University, where his research centered on deep-learning-based radar anti-jamming target detection and intelligent electromagnetic games.


llama2RNN.c — Truncated Recurrent Transformer implementations in C

LEDiT — PyTorch Implementation, NeurIPS 2025

SimpleDG — Training and test code for the NICO challenge at the ECCV 2022 workshop


selected publications

  1. Truncated Recurrent Transformer: Unlocking Length Extrapolation via TBPTT
     Siyuan Liang
     arXiv preprint, 2025
  2. LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding
     Shen Zhang, Siyuan Liang, Yaning Tan, and 9 more authors
     In Advances in Neural Information Processing Systems, 2025
  3. An End-to-End Anti-Jamming Target Detection Method Based on CNN
     Yu Zhang, Bo Jiu, Penghui Wang, and 2 more authors
     IEEE Sensors Journal, 2021