Story and Project
Hi, I am Aeneas, a beginner in Reinforcement Learning. I found that open-source DQN agents for the Gymnasium CarRacing-v3 environment are rare (at least I could not find a repo that uses domain_randomize=True). So I set up a Summer Side Project to test the potential of a DQN-based model using Keras/TensorFlow (I know most people choose PPO or implement in PyTorch).
The average score over 100 episodes is ~830, without fine-tuning.
A friend suggested that I release my private Jupyter notebook for everyone, so I want to share the model and the details of the whole notebook.
Brief Tech Summary
In brief, I used the following techniques (two small sketches of how the pieces fit together follow the list):
- Multiple CNN branches
- Residual CNN blocks
- Visual-cortex-inspired image preprocessing
- Frame stacking
- Multiple Q-heads (5 variants, with dropout to mimic NoisyNet)
- Ensemble of Q-heads (a cheaper stand-in for distributional Q; not very rigorous, but it seems to work)
- Double DQN
- Dueling architecture
- Prioritized replay with multi-step returns
- And other small tricks
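To make the list concrete, here is a minimal Keras sketch of how residual CNN branches, a dueling head, and an ensemble of dropout Q-heads can be wired together. The notebook in the repo is the reference; the layer sizes, the two-branch layout, the dropout rate, and the 4-frame/5-action shapes below are my illustrative assumptions, not the notebook's exact values.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def residual_block(x, filters):
    """Small residual CNN block: two convs plus a skip connection."""
    shortcut = x
    if shortcut.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.ReLU()(layers.Add()([shortcut, y]))

def build_dqn(input_shape=(96, 96, 4), n_actions=5, n_heads=5):
    """Dueling DQN with two residual conv branches and an ensemble of dropout Q-heads."""
    inputs = layers.Input(shape=input_shape)  # stack of preprocessed frames

    # Two parallel CNN branches with different receptive fields.
    b1 = layers.Conv2D(32, 8, strides=4, activation="relu")(inputs)
    b1 = residual_block(b1, 64)
    b1 = layers.GlobalAveragePooling2D()(b1)

    b2 = layers.Conv2D(32, 4, strides=2, activation="relu")(inputs)
    b2 = residual_block(b2, 64)
    b2 = layers.GlobalAveragePooling2D()(b2)

    trunk = layers.Dense(256, activation="relu")(layers.Concatenate()([b1, b2]))

    # Ensemble of dueling Q-heads; dropout stays active at inference
    # (training=True) as a cheap stand-in for NoisyNet-style exploration.
    heads = []
    for _ in range(n_heads):
        h = layers.Dropout(0.1)(trunk, training=True)
        value = layers.Dense(1)(layers.Dense(128, activation="relu")(h))
        adv = layers.Dense(n_actions)(layers.Dense(128, activation="relu")(h))
        # Dueling combination: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        q = layers.Lambda(
            lambda va: va[0] + va[1] - tf.reduce_mean(va[1], axis=1, keepdims=True)
        )([value, adv])
        heads.append(q)

    # Ensemble estimate: average the heads' Q-values.
    return Model(inputs, layers.Average()(heads))

model = build_dqn()
model.summary()
```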
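And here is a sketch of how Double DQN combines with multi-step returns and prioritized replay when computing targets. Again, the notebook is the reference; gamma = 0.99, n = 3, and the function and argument names are assumptions for illustration.

```python
import numpy as np

GAMMA, N_STEP = 0.99, 3  # assumed discount and multi-step horizon

def double_dqn_targets(online_model, target_model,
                       states, actions, rewards, next_states, dones):
    """n-step Double DQN targets plus |TD error| priorities for the replay buffer.

    `rewards` are assumed to be pre-summed n-step returns
    r_t + GAMMA * r_{t+1} + ... + GAMMA**(N_STEP-1) * r_{t+N_STEP-1},
    and `next_states` are the states N_STEP steps ahead.
    """
    # Double DQN: the online net selects the action, the target net evaluates it.
    best_actions = np.argmax(online_model.predict(next_states, verbose=0), axis=1)
    next_q = target_model.predict(next_states, verbose=0)
    bootstrap = next_q[np.arange(len(actions)), best_actions]

    targets = rewards + (GAMMA ** N_STEP) * bootstrap * (1.0 - dones)

    # |TD error| becomes the sample's priority in the prioritized buffer.
    q_now = online_model.predict(states, verbose=0)
    td_errors = targets - q_now[np.arange(len(actions)), actions]
    return targets, np.abs(td_errors) + 1e-6
```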
The performance of the latest released agent, evaluated over 100 episodes:
- Average score: 831.53
- Max score: 935.10
- Min score: -42.86
- Std of scores: 108.53
Resources
Link: GitHub - AeneasWeiChiHsu/CarRacing-v3-DQN-: An avg 800 Deep Q-Network agent for OpenAI CarRacing-v3 (domain_randomize=True)
Notebook: CarRacing-v3-DQN-/Safe[10_Q_Head]Randomized_Contrast_Enhance_Double_Residual_Prioritized_Multi_Step_Double_Dual_Head_DQN_CarRacing_RL_Project_Model.ipynb at main · AeneasWeiChiHsu/CarRacing-v3-DQN- · GitHub
In the repo, you can find my original notebook (I decided to preserve my personal notes and comments; some are meditation notes, some are philosophical reflections). I believe it is important to note that a project is a journey. I keep everything transparent and try to build a beginner-friendly version for learners.
How to use:
- You can download the notebook and run it on your local laptop
- You can download the trained model if you don't want to train it yourself
- You can modify individual blocks (I modularised the model)
- You can find a link to the whole GIF folder
- You can test the model by running the evaluation cell (a minimal stand-alone sketch follows this list)
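If you would rather not open the notebook first, here is a minimal sketch of what such an evaluation loop looks like. The model filename, the plain-grayscale preprocessing (the notebook's visual-cortex preprocessing is richer), and the 4-frame stack depth are assumptions; match them to whatever the repo actually ships.

```python
import gymnasium as gym
import numpy as np
import tensorflow as tf

# Hypothetical filename -- point this at the model file shipped in the repo.
model = tf.keras.models.load_model("car_racing_dqn.keras")

env = gym.make("CarRacing-v3", continuous=False, domain_randomize=True)

def preprocess(frame):
    """Grayscale + normalize; a simplified stand-in for the notebook's preprocessing."""
    return (np.dot(frame[..., :3], [0.299, 0.587, 0.114]) / 255.0).astype(np.float32)

obs, _ = env.reset()
stack = [preprocess(obs)] * 4          # assumed frame-stack depth of 4
total_reward, done = 0.0, False
while not done:
    state = np.stack(stack, axis=-1)[np.newaxis]                 # shape (1, 96, 96, 4)
    action = int(np.argmax(model.predict(state, verbose=0)[0]))  # greedy policy
    obs, reward, terminated, truncated, _ = env.step(action)
    stack = stack[1:] + [preprocess(obs)]
    total_reward += reward
    done = terminated or truncated
print("Episode score:", total_reward)
```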
Because I am a new user, I could not embed the evaluation figure here, but you can find it in the repo. Sorry about that.
Note:
I will write a short report on some interesting topics related to this agent and its architecture; I have planned a schedule for further analysis. This Summer Side Project is still ongoing. I hope it helps learners see that DQN can still work and that Keras/TensorFlow can be a viable platform for RL tasks.
If you find anything wrong or run into issues, please let me know. And if you have insights or suggestions, I am happy to hear them.
This is for educational purposes and is open-source. Please treat the agent kindly. It is a cute agent.
If you extend training beyond roughly 25,000 episodes, reward collapse can occur (I encountered this once). And if you run a real test, please note the possibility of performance collapse (a visual-cortex preprocessing failure, but it rarely happens).
In this project I focused only on the model and architecture, so some parts were coded with AI assistance (Gemini or GPT).
If you want to train it yourself, expect a few days up to a week (it took about one week on my MacBook Air M2).