emailweixu authored
* dropout for TransformerBlock

Include dropout to conform with the original implementation in "Attention is all you need".
It does, however, appear to hurt the unit tests and the ppo_babyai.py example.
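The dropout placement described in "Attention is all you need" can be sketched as follows. This is an illustrative single-head numpy version, not ALF's actual TransformerBlock: dropout is applied to each sublayer's output before the residual add, and all weight names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p, train=True):
    # Inverted dropout: zero each unit with probability p,
    # rescale survivors so the expected value is unchanged.
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def transformer_block(x, Wq, Wk, Wv, Wo, W1, W2, p=0.1, train=True):
    # Self-attention sublayer; dropout on the sublayer output
    # before the residual add, as in the original paper.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    att = softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v
    x = layer_norm(x + dropout(att @ Wo, p, train))
    # Feed-forward sublayer with ReLU; dropout again before the residual.
    h = np.maximum(0.0, x @ W1) @ W2
    return layer_norm(x + dropout(h, p, train))

d, dff, T = 8, 32, 5
Ws = [rng.standard_normal(s) * 0.1 for s in
      [(d, d), (d, d), (d, d), (d, d), (d, dff), (dff, d)]]
x = rng.standard_normal((T, d))
y = transformer_block(x, *Ws, p=0.1, train=True)
print(y.shape)  # (5, 8)
```

Setting `train=False` (or `p=0.0`) disables dropout at evaluation time.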

* Configurable activation for TransformerBlock

Also, use the position embedding only for the first transformer block of TransformerNetwork, which is the common practice.
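Both changes can be sketched roughly as below. This is illustrative numpy, not ALF's actual TransformerNetwork: the block internals are stubbed out, and the point is that the activation is passed in as a parameter and the position embedding is added once, before the first block only, rather than at every block.

```python
import numpy as np

def positional_encoding(T, d):
    # Sinusoidal position embedding from "Attention is all you need".
    pos = np.arange(T)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / np.power(10000.0, 2 * (i // 2) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def make_block(activation=np.tanh):
    # Activation is configurable rather than hard-coded.
    def block(x):
        return x + activation(x)  # stand-in for attention + feed-forward
    return block

def transformer_network(x, blocks):
    # Position embedding is added once, before the first block only.
    T, d = x.shape
    x = x + positional_encoding(T, d)
    for b in blocks:
        x = b(x)
    return x

x = np.zeros((4, 6))
y = transformer_network(x, [make_block(np.tanh), make_block(np.maximum.outer(0.0, np.array(0.0)).__class__ and np.tanh)])
print(y.shape)  # (4, 6)
```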

* Slight refactoring of trainer for supervised learning

Also adds an alf conf example of a language modeling task for supervised learning.

* Fix hypernetwork_algorithm_test

* Fix policy_trainer_test

* Address review comments
6ff31512