emailweixu authored
* dropout for TransformerBlock

  Include dropout to conform with the original implementation in "Attention Is All You Need", though it seems to hurt the unit test and the ppo_babyai.py example.

* Configurable activation for TransformerBlock

  Also, only use the position embedding for the first transformer block of TransformerNetwork, which is the common practice.

* Slight refactoring of the trainer for supervised learning

  Also adds an ALF conf example of a language-modeling task for supervised learning.

* Fix hypernetwork_algorithm_test

* Fix policy_trainer_test

* Address review comments
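The dropout and position-embedding changes described above can be sketched as follows. This is a simplified, hypothetical illustration, not ALF's actual TransformerBlock API: the function and parameter names are made up, and attention is omitted so the sketch can focus on where inverted dropout is applied and on adding the position embedding only before the first block.

```python
import random

def dropout(x, p, training=True, rng=None):
    """Inverted dropout: zero each element with probability p and
    scale survivors by 1/(1-p), so the expected value is unchanged.
    At eval time (training=False) this is the identity."""
    if not training or p == 0.0:
        return list(x)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in x]

def relu(v):
    return [max(0.0, u) for u in v]

def transformer_block(x, activation=relu, dropout_p=0.1,
                      training=True, rng=None):
    # Hypothetical, heavily simplified block: attention is omitted.
    h = activation(x)                     # configurable activation
    h = dropout(h, dropout_p, training, rng)
    return [a + b for a, b in zip(x, h)]  # residual connection

def transformer_network(x, num_blocks, pos_emb, activation=relu,
                        dropout_p=0.1, training=True, rng=None):
    # Position embedding is added only before the first block,
    # which the commit notes is the common practice.
    x = [a + p for a, p in zip(x, pos_emb)]
    for _ in range(num_blocks):
        x = transformer_block(x, activation, dropout_p, training, rng)
    return x
```

At eval time the dropout call is a no-op, so the same network function can be used for both training and inference by flipping the `training` flag.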