I want to use REINFORCE and I’m creating an environment. I want to train the model to reach the top of a mountain; it has 4 hikers that it can move to explore the map (it doesn’t know the mountain). Can I define the observation spec like this?
self._observation_spec = array_spec.BoundedArraySpec(shape=(4,), dtype=dict, name='observation')
The idea is that the observation the environment outputs looks like this:
{"hiker1": (x,y,z,dx,dy), "hiker2": (x,y,z,dx,dy), "hiker3": (x,y,z,dx,dy), "hiker4": (x,y,z,dx,dy)}
Or could it be given like this instead?
self._observation_spec = array_spec.BoundedArraySpec(shape=(4,), dtype=np.array, name='observation')
np.array([(x,y,z,dx,dy), (x,y,z,dx,dy), (x,y,z,dx,dy), (x,y,z,dx,dy)])
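To make the two layouts concrete, here is a minimal NumPy-only sketch of how the dictionary form could be packed into the array form. It does not call TF-Agents at all. The values are made up, the helper name `pack_hikers` is hypothetical, and it assumes the array layout would be one row per hiker, i.e. shape (4, 5) with dtype float32, which may not be what the spec above actually needs.

```python
import numpy as np

# Made-up example values; each tuple is (x, y, z, dx, dy) for one hiker.
obs_dict = {
    "hiker1": (0.0, 0.0, 10.0, 1.0, 0.0),
    "hiker2": (5.0, 2.0, 12.5, 0.0, 1.0),
    "hiker3": (3.0, 7.0, 8.0, -1.0, 0.0),
    "hiker4": (9.0, 9.0, 20.0, 0.0, -1.0),
}


def pack_hikers(obs):
    """Stack the per-hiker tuples into one float32 array, rows in sorted key order.

    Hypothetical helper, only for illustrating the second layout.
    """
    keys = sorted(obs)  # fixed ordering so row i always means the same hiker
    return np.array([obs[k] for k in keys], dtype=np.float32)


obs_array = pack_hikers(obs_dict)
print(obs_array.shape)  # (4, 5)
print(obs_array.dtype)  # float32
```

In this sketch the row order carries the hiker identity that the dictionary keys carried before, so the ordering has to stay fixed between steps.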