Commits · gin-search-path · Philipp Sauer / alf

Aug 03, 2021

Add the directory of the entry point gin file to search path (#914) · 4e477d45
Break Yang authored 3 years ago
```
This allows running alf train, play and other programs from
anywhere (no longer restricted to `alf/examples/`)
```
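The idea can be sketched without gin itself: remember the directory of the entry-point config file and consult it when resolving relative file names. The helper names below (`add_search_path`, `resolve_config`, `load_entry_point`) are hypothetical and are neither ALF's nor gin's API; this is only a minimal illustration under that assumption.

```python
import os

# Hypothetical module-level list of extra directories (illustration only).
_search_paths = []


def add_search_path(path):
    """Register a directory to consult when resolving relative config names."""
    path = os.path.abspath(path)
    if path not in _search_paths:
        _search_paths.append(path)


def resolve_config(name):
    """Return the first existing file for ``name``.

    The name is tried as given (relative to the current working directory),
    then under each registered search path.
    """
    candidates = [name] + [os.path.join(p, name) for p in _search_paths]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return os.path.abspath(candidate)
    raise FileNotFoundError(f"{name} not found; searched {candidates}")


def load_entry_point(entry_file):
    """Register the entry file's directory, then resolve the entry file."""
    add_search_path(os.path.dirname(os.path.abspath(entry_file)))
    return resolve_config(entry_file)
```

With this layout, a file that sits next to the entry-point file resolves from any working directory once `load_entry_point` has been called.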

4e477d45
Convert ppo_cart_pole to python configuration · 8753b8bb
Break Yang authored 3 years ago

8753b8bb

Neale Ratzlaff authored 3 years ago

This commit adds GPVI to alf. The main changes to alf/algorithms/generator.py and tests are:
* Add _create_mvp_network method, which instantiates an encoding network to compute a matrix-vector
        product, used for computing the inverse Jacobian vector product in GPVI.
* Add InverseMVPAlgorithm to specify a training step of the above InverseMVP network.
* Add rkhs_func_grad function to generator.py.
    This is the main update routine for GPVI, and is related to the InverseMVPAlgorithm.
    I have not added a function value version of GPVI, nor have I added a minmax version.
    ``rkhs_func_grad`` is called whenever the generator argument ``functional_gradient`` is set to ``rkhs``.
    Right now, ``functional_gradient`` only supports a ReluMLP generator, as it relies on the fast ``compute_vjp`` function rather than autograd.
* Add GPVI tests to generator_test.py.
* Add GPVI tests to hypernetwork_test.py.

--Small Changes--
* ReluMLP method ``compute_vjp`` now returns the output of the forward evaluation, which is needed for the GPVI update.
* Added arguments for GPVI to generator.py, such as force_fullrank and fullrank_diag_weight
* Added arguments for pinverse_network to hypernetwork_algorithm.py
* Added arguments for GPVI to hypernetwork_algorithm.py

* addressing comments in PR discussion
* renamed pinverse network to InverseMVPNetwork (MVP stands for matrix-vector product)
* added a test for InverseMVPNetwork that explicitly computes the inverse Jacobian vector product of an MLP with respect to some random input. This is evaluated against the solution found by training an InverseMVPNetwork.
* Implemented fixes and suggestions to generator and hypernetwork files.
* Changed naming of Pinverse network to the new InverseMVP network.

* Addressing PR comments.
* Removed InverseMVPNetwork file. Opted for EncodingNetwork as suggested by Wei.
* Refactored InverseMVP test accordingly, to show that the idea still works.
* Added helper function to generator.py, to create this network
* Added/changed docstrings to generator.py as suggested.

1bfcf5fc

Introduce PerProcessContext to adjust DDP per process behavior (#955) · 774f51f2
Break Yang authored 3 years ago

774f51f2

Aug 02, 2021
- support multi-dim reward for AC and PPO (#952) · d9f35898
  Haonan Yu authored 3 years ago
  
  * support multi-dim reward for AC and PPO
  * address comments
  * more updates
  d9f35898
Jul 30, 2021
- Enable DDP for on policy RLTrainer (#913) (#951) · 4317e0d9
  Break Yang authored 3 years ago
  
  * Add UnrollPerformer as the module being wrapped by DistributedDataParallel
  * Enable DDP for on policy RLTrainer
  4317e0d9
- Add UnrollPerformer as the module being wrapped by DistributedDataParallel (#950) · 7f541762
  Break Yang authored 3 years ago
  
  7f541762
Jul 29, 2021
- Disable optimizer's mnemonic naming in algorithm.state_dict() (#949) · 4eca2e16
  Break Yang authored 3 years ago
  
  * Disable optimizer's mnemonic naming in algorithm.state_dict()
  * Remove unnecessary commented out class
  4eca2e16
- Handle keyboard interrupt nicely and quietly (#947) · 1c59ec71
  Break Yang authored 3 years ago
  
  1c59ec71
Jul 28, 2021
- Force using fork start method to create single process environment (#946) · bcfdeb41
  Break Yang authored 3 years ago
  
  bcfdeb41
Jul 27, 2021

chapter 2 of tutorial (#945) · b4be8d22

Haonan Yu authored 3 years ago

* chapter 2 of tutorial

* proper cross referencing

* fix syntax error

* address comments

b4be8d22

[REFACTOR] train.py to consolidate common logic for both single GPU and multi... · 704e5c9d

Break Yang authored 3 years ago

[REFACTOR] train.py to consolidate common logic for both single GPU and multi GPU training (#913) (#944)

* [REFACTOR] train.py to consolidate common logic for both single GPU and multi GPU training

* Address Wei's comments

* Address Haonan's comments

* Specify authoritative url and port as well

* Remove unused Optional typing

704e5c9d

Jul 25, 2021
- Import alf-nix-devenv as the nix-based development environment. (#922) · b75b40be
  Break Yang authored 3 years ago
  
  b75b40be
Jul 23, 2021
- fix suite_babyai TimeLimit wrapper (#936) · 23646987
  Haonan Yu authored 3 years ago
  
  23646987
Jul 22, 2021

update oac algorithm with latest alf updates (#943) · 7acb5d56

Qinxun Bai authored 3 years ago

* update oac algorithm with latest alf updates

* update oac test hyperparameters so the test passes more stably

7acb5d56

Extract worker logic out of ProcessEnvironment #913 (#939) · f2d378ad

Break Yang authored 3 years ago

This is part of the effort to unblock #913. Two reasons for this change

1. `worker` definitely does not rely on `ProcessEnvironment` at all, and therefore it is cleaner to make it independent of `ProcessEnvironment`.
2. If it stays as a member method of `ProcessEnvironment`, `multiprocessing.Process` gets stuck on `start()` when the parent process is itself a `multiprocessing.Process`. The cause is still unknown (investigated, but not yet figured out).
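
A minimal sketch of the resulting layout, assuming a pipe-based protocol: the worker is a module-level function that receives only a connection and an environment constructor, so it holds no reference to the class that launches it. `_worker`, `TinyProcessEnv`, and `CounterEnv` are hypothetical names for illustration, not ALF's actual code, and the `fork` start method is assumed to be available (POSIX). The sketch shows the structure only; it does not reproduce the hang described above.

```python
import multiprocessing as mp


class CounterEnv:
    """Toy stand-in for an environment (hypothetical)."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += action
        return self.t


def _worker(conn, env_ctor):
    """Module-level worker: depends only on its arguments, not on any class."""
    env = env_ctor()
    try:
        while True:
            cmd, payload = conn.recv()
            if cmd == "step":
                conn.send(env.step(payload))
            elif cmd == "close":
                break
    finally:
        conn.close()


class TinyProcessEnv:
    """Launches ``_worker`` in a child process and talks to it over a pipe."""

    def __init__(self, env_ctor):
        ctx = mp.get_context("fork")  # assumption: POSIX platform
        self._conn, child_conn = ctx.Pipe()
        self._process = ctx.Process(
            target=_worker, args=(child_conn, env_ctor))
        self._process.start()
        child_conn.close()  # the parent keeps only its own end

    def step(self, action):
        self._conn.send(("step", action))
        return self._conn.recv()

    def close(self):
        self._conn.send(("close", None))
        self._process.join()
        self._conn.close()
```

Usage would look like `env = TinyProcessEnv(CounterEnv)`, followed by `env.step(2)` and finally `env.close()`.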

f2d378ad

first section of ALF tutorial (#935) · 5b277769
Haonan Yu authored 3 years ago
```
* first section of ALF tutorial

* address comments
```
5b277769

Extract message types as MessageType Enum in ProcessEnvironment (#938) · d1d1cc61

Break Yang authored 3 years ago

This is part of the effort to address #913. A sub-task requires extracting the worker logic out of the class (for some reason, keeping it inside prevents `multiprocessing` from working correctly). Without this change, `multiprocessing.Process` just gets stuck on `start()`.
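
As a rough illustration of why an enum helps here: once message kinds are a standalone `Enum`, a worker defined outside the class can dispatch on them without referring to the class at all. Only the name `MessageType` comes from the commit title; the members (`STEP`, `RESET`, `CLOSE`) and the `handle` function are hypothetical.

```python
from enum import Enum, auto


class MessageType(Enum):
    """Kinds of messages exchanged with a worker (members are hypothetical)."""
    STEP = auto()
    RESET = auto()
    CLOSE = auto()


def handle(message_type, payload, state):
    """Dispatch on the message type and return ``(reply, keep_running)``."""
    if message_type is MessageType.STEP:
        state["t"] += payload
        return state["t"], True
    if message_type is MessageType.RESET:
        state["t"] = 0
        return 0, True
    if message_type is MessageType.CLOSE:
        return None, False
    raise ValueError(f"unknown message type: {message_type!r}")
```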

d1d1cc61

Jul 20, 2021
- add config num_summaries and num_evals (#937) · 6389e07f
  Haonan Yu authored 3 years ago
  
  * add config num_summaries and num_evals
  * fix the docstring format
  6389e07f
- set untransformed field in _step() (#941) · 6a677a2d
  Haonan Yu authored 3 years ago
  
  6a677a2d
Jul 18, 2021

Load BSuite Environment (#933) · 25b23774

pd-perry authored 3 years ago

* added suite load for bsuite environment

* fixed observation shape bug in suite_bsuite and added alf_bsuite_wrapper

* Revert "Correct relative path import in py configurations (#928)"

This reverts commit 33c0ba21.

* removed alf_bsuite_wrapper and added method to suite_bsuite instead

* removed alf_bsuite_wrapper import from suite_bsuite file

* change loadbsuite to bsuitewrapper and edit docstrings

* edited description

* fixed PR review changes

* added check for max steps and change copyright year to this year

25b23774

Jul 13, 2021
- Correct relative path import in py configurations (#928) · 33c0ba21
  Break Yang authored 3 years ago
  
  33c0ba21
- write pre configs to conf file when grid searching (#925) · 8e27c6db
  hnyu authored 3 years ago
  
  * write pre configs to conf file when grid searching
  * correct message
  8e27c6db
- Print stdout and stderr when run_cmd fails during train_play_test (#929) · 9ccfabaf
  Break Yang authored 3 years ago
  
  * Print stdout and stderr when run_cmd fails during train_play_test
  * Make cmd on its own line
  9ccfabaf
Jul 12, 2021
- Add additional time option to Carla (#916) · bac5c1da
  Haichao Zhang authored 3 years ago
  
  bac5c1da
Jul 10, 2021
- fix suite_carla_test (#926) · 72a029b6
  hnyu authored 3 years ago
  
  72a029b6
- Convert ac_breakout to using python configurable (#924) · b4d4dc9f
  Break Yang authored 3 years ago
  
  * Convert ac_breakout to using python configurable
  * Remove unnecessary print
  b4d4dc9f
Jul 08, 2021
- Update .gitignore to respect nix and direnv artifacts (#923) · f440c769
  Break Yang authored 3 years ago
  
  f440c769
Jun 28, 2021
- A number of typo fixes. (#912) · 4449181b
  Haichao Zhang authored 3 years ago
  
  4449181b
Jun 19, 2021
- Move parallel_network flag from algorithm to network (#911) · 26b3a26b
  Haichao Zhang authored 3 years ago
  
  * Remove use_parallel_network flag from alg
  * Move flag to network side
  26b3a26b
Jun 18, 2021

Optimistic actor critic (#899) · 61b31de4

Qinxun Bai authored 3 years ago

* add OacAlgorithm and replicate paper results on HalfCheetah

* add oac_humannoid_conf and oac_algorithm_test

* minor updates in oac_algorithm_test

* address code reviews

* add oac halfcheetah and humanoid result figures

* update oac_algorithm test

* address further code reviews

* address further code reviews

* remove the unroll_with_grad

* update oac_halfcheetah_conf for better performance

* minor update of oac_halfcheetah_conf

61b31de4

add the option of creating parallel env of N=1 for eval (#910) · c4b5a7bf
hnyu authored 3 years ago

c4b5a7bf

config_util improvement (#909) · addcaab7

emailweixu authored 3 years ago

* Minor improvement to config_util

1. Environment variable for not using gin. gin's wrapper is very complicated, which can make debugging unfriendly and slow down execution.
2. Error report for misuse of alf.config()

* comment

* Fix message

addcaab7

Jun 17, 2021

Compositional FC (#907) · 50b3286c

Haichao Zhang authored 3 years ago

* Compositional FC

* Use bmm for weighted combination

* Better support to layer chaining

50b3286c

lean_function (#908) · b2efe2fc

emailweixu authored 3 years ago

Wrap a function to save memory for backward. The returned function performs the same
computation as ``func``, but saves memory by discarding intermediate results.
It calculates the gradient by recomputing ``func`` using the same input during backward.
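
The trade-off described here (often called gradient checkpointing) can be sketched with a custom `torch.autograd.Function`. This is a minimal illustration and not ALF's `lean_function` implementation: `lean` and `_Recompute` are hypothetical names, the sketch handles a single tensor input, and it only returns the gradient with respect to that input (parameters captured by ``func`` are ignored).

```python
import torch


class _Recompute(torch.autograd.Function):
    """Run ``func`` without storing intermediates; recompute it in backward."""

    @staticmethod
    def forward(ctx, func, x):
        ctx.func = func
        ctx.save_for_backward(x)  # keep only the input
        with torch.no_grad():
            return func(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        with torch.enable_grad():
            # Rebuild the graph from a fresh leaf, then differentiate it.
            x_ = x.detach().requires_grad_(True)
            y = ctx.func(x_)
        (grad_x,) = torch.autograd.grad(y, x_, grad_out)
        return None, grad_x  # no gradient for the ``func`` argument


def lean(func):
    """Return a wrapper computing ``func(x)`` with recomputation in backward."""

    def wrapped(x):
        return _Recompute.apply(func, x)

    return wrapped
```

The wrapped function trades one extra forward pass during backward for lower peak memory. PyTorch's own `torch.utils.checkpoint` offers a more general version of this idea.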

b2efe2fc

Jun 16, 2021
- assert current path is not ALF root when using snapshot (#906) · bf93b593
  hnyu authored 3 years ago
  
  bf93b593
- add snapshot section (#905) · a7de949b
  hnyu authored 3 years ago
  
  a7de949b
Jun 15, 2021
- replace all gin.configurable with alf.configurable (#904) · 2c0a8cbf
  hnyu authored 3 years ago
  
  * replace all gin.configurable with alf.configurable
  * remove import gin
  * fix failed test case
  2c0a8cbf
Jun 12, 2021
- Fix Timelimit wrapper (#902) · 3b902aef
  Haichao Zhang authored 3 years ago
  
  3b902aef
Jun 10, 2021
- Fix issue that breaks the sac test sometimes due to randperm (#901) · 8aed77cb
  Haichao Zhang authored 3 years ago
  
  * Fix issue that breaks sac test
  * use convert device
  8aed77cb