Consider the actions on this page: https://davidkmarzagao.github.io/RL-bandits-exercise/
Use a random algorithm to select actions.
How do you use the system? There are 5 actions on the page. Clicking on one samples one of its values. To decide which action to select you may need a random number generator, and the page provides a button for exactly that. Using it ensures that you get the “random” numbers you are supposed to get.
Made a mistake? Refresh the page to restart the sequence of random numbers and action values.
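As a rough illustration of what a random algorithm could look like, the sketch below maps a random number to one of the 5 actions with equal probability. It assumes the number is uniform in [0, 1), which may not match what the page's button produces; the name pick_action is hypothetical. For the exercise itself, use the numbers from the page's button rather than your own generator.

```python
import random

N_ACTIONS = 5  # the page offers 5 actions


def pick_action(u: float, n_actions: int = N_ACTIONS) -> int:
    """Map a number u in [0, 1) to an action index in 0..n_actions-1.

    Assumption: u is uniform in [0, 1), so every action is equally likely.
    """
    if not 0.0 <= u < 1.0:
        raise ValueError("u must lie in [0, 1)")
    return int(u * n_actions)


if __name__ == "__main__":
    # Stand-in for the page's random number button (an assumption).
    u = random.random()
    print("random number:", u, "-> action index:", pick_action(u))
```

With 5 actions, each index covers an interval of width 0.2, so for example any number in [0.4, 0.6) selects index 2.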
Your question is: what is the average of the first 8 rewards you obtain? Round this number to 1 decimal place; it is the password for the next page:
Next page: https://kohan.uk/rl-2
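The password calculation is just a mean followed by rounding. A minimal sketch of it is below. The reward values are made up for illustration and are not the ones the page produces, and the function name average_first_rewards is hypothetical.

```python
def average_first_rewards(rewards, n=8):
    """Return the mean of the first n rewards, rounded to 1 decimal place."""
    if len(rewards) < n:
        raise ValueError(f"need at least {n} rewards, got {len(rewards)}")
    return round(sum(rewards[:n]) / n, 1)


if __name__ == "__main__":
    # Made-up rewards for illustration only; the ninth value is ignored.
    example_rewards = [1.0, 2.0, 0.5, 3.0, 1.5, 2.5, 0.0, 1.5, 9.0]
    print(average_first_rewards(example_rewards))  # prints 1.5
```

Note that Python's round works on the binary floating-point value, so a mean that lies exactly halfway between two one-decimal values may not round the way it would by hand; in that case, check the result manually.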
