Support for contextual bandits with different number of actions to choose from on each time-step

shabib · June 6, 2022, 2:19am

I am new to working with contextual bandits in TensorFlow, and I wanted to ask whether TensorFlow supports changing continuous actions. In my application, the number of actions can range from 2 to 100 across time-steps. We are currently using Vowpal Wabbit, which does not provide a lot of flexibility, and we want to port our setup to TensorFlow.

Aniket_Dubey · September 27, 2024, 10:47am

Hi @shabib ,

You can implement contextual bandits with continuous actions using Deep Deterministic Policy Gradient (DDPG), which is specifically designed for continuous action spaces.

You can refer to the official documentation for a better understanding.
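
For the part of the question about the number of available actions changing between 2 and 100 per time-step, one common workaround is action masking: keep a fixed-size action space of 100 slots and pass a 0/1 mask that marks which slots are valid at the current step. The sketch below is plain Python and does not use TensorFlow or TF-Agents; the names `build_mask` and `masked_epsilon_greedy` are hypothetical helpers made up for illustration, and epsilon-greedy is just a stand-in for whatever policy you actually use. As far as I know, TF-Agents bandit agents accept an `observation_and_action_constraint_splitter` argument for this kind of masking, but please verify that against the documentation.

```python
import random

MAX_ACTIONS = 100  # upper bound taken from the question (2 to 100 actions)


def build_mask(num_available, max_actions=MAX_ACTIONS):
    """Return a 0/1 mask marking the first `num_available` slots as valid."""
    if not 1 <= num_available <= max_actions:
        raise ValueError("num_available must be between 1 and max_actions")
    return [1] * num_available + [0] * (max_actions - num_available)


def masked_epsilon_greedy(scores, mask, epsilon=0.1, rng=random):
    """Pick an action index among the slots where mask == 1.

    With probability `epsilon` a valid action is drawn uniformly at random;
    otherwise the valid action with the highest score is returned.
    """
    valid = [i for i, allowed in enumerate(mask) if allowed]
    if not valid:
        raise ValueError("mask must allow at least one action")
    if rng.random() < epsilon:
        return rng.choice(valid)
    return max(valid, key=lambda i: scores[i])


# Example: only 3 of the 100 slots are available at this time-step.
scores = [0.0] * MAX_ACTIONS
scores[1] = 0.7   # best score among the valid slots
scores[50] = 9.9  # higher score, but this slot is masked out
mask = build_mask(3)
action = masked_epsilon_greedy(scores, mask, epsilon=0.0)
```

With `epsilon=0.0` the choice is purely greedy, so the masked-out slot 50 is never picked even though its score is the highest. In a real setup the scores would come from a reward model conditioned on the context.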

Thank you.
