A One-Layer Decoder-Only Transformer is a Two-Layer RNN, With an Application to Certified Robustness

Yuhao Zhang
Ph.D. student in Computer Science