(Offline reinforcement learning)
TTO performs on par with or better than the best prior
offline reinforcement learning algorithms on the D4RL benchmark suite. Results for TTO are
the mean over 15 random seeds (5 independently trained Transformers, with 3 trajectories per
Transformer); error bars show the standard deviation across runs. The sources of the
reported performance for the other methods are detailed in Appendix C, and the same results
are listed in tabular form in Appendix E.
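
The aggregation described in the caption can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `aggregate_runs` and the placeholder scores are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the caption does not say which estimator is used.

```python
from statistics import mean, stdev


def aggregate_runs(scores_per_model):
    """Pool every rollout score from every model and summarize.

    scores_per_model: list of lists, one inner list of rollout
    scores per independently trained model.
    Returns (mean, standard deviation, number of runs).
    """
    # Treat each (model, rollout) pair as one run, as in the caption.
    runs = [score for model_scores in scores_per_model for score in model_scores]
    # Assumption: sample standard deviation; swap in statistics.pstdev
    # if the population estimator is intended.
    return mean(runs), stdev(runs), len(runs)


# Hypothetical placeholder scores, NOT results from the paper:
# 5 models x 3 rollouts each = 15 runs.
placeholder_scores = [[float(10 * m + r) for r in range(3)] for m in range(5)]

avg, spread, n_runs = aggregate_runs(placeholder_scores)
print(f"runs={n_runs} mean={avg:.2f} std={spread:.2f}")
```

Pooling all 15 runs into one list mixes two sources of variation, model training and rollout sampling, into a single error bar, which matches the caption's description of deviation "between runs".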