[Open Source] Avg 800+ DQN agent on the CarRacing-v3 randomized environment

Story and Project

Hi, I am Aeneas, a beginner in reinforcement learning. I found that open-source DQN agents for the Gymnasium CarRacing-v3 environment are rare (at least I could not find a repo that uses domain_randomize=True). So I set up a summer side project to test the potential of a DQN-based model using Keras/TensorFlow (I know most people choose PPO or implement in PyTorch).

The average score over 100 episodes is ~830, without fine-tuning.

My friend suggested that I release my private Jupyter notebook, so I want to share the model and the details of the whole notebook with everyone.

Brief Tech summary

In brief, I used the following techniques (rough sketches of two of them follow the list):

  • Multiple CNN branches
  • Residual CNN blocks
  • Visual-cortex-inspired image preprocessing
  • Frame stacking
  • Multiple Q-heads (five head types, with dropout to mimic NoisyNet)
  • Ensemble of Q-heads (a cheaper way to approximate distributional Q; not very rigorous, but it seems to work)
  • Double DQN
  • Dueling architecture
  • Prioritized replay with multi-step returns
  • And other small tricks
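
To make the list concrete, here is a minimal sketch (not my exact notebook code) of how a residual CNN backbone, a dueling decomposition, and several dropout-regularized Q-heads can fit together in Keras. The layer sizes, head count, 4-frame stack, and 5-action discrete space are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_ACTIONS = 5   # CarRacing-v3 discrete action set (continuous=False)
NUM_HEADS = 5     # number of Q-heads in the ensemble (assumption)

def residual_block(x, filters):
    """Two 3x3 convolutions with a skip connection."""
    shortcut = x
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    if shortcut.shape[-1] != filters:  # match channels for the skip path
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    return layers.ReLU()(layers.Add()([x, shortcut]))

def build_q_network(input_shape=(96, 96, 4)):  # 4 stacked preprocessed frames
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 8, strides=4, activation="relu")(inp)
    x = residual_block(x, 64)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)

    q_heads = []
    for _ in range(NUM_HEADS):
        h = layers.Dropout(0.2)(x)  # dropout as a cheap stand-in for NoisyNet
        value = layers.Dense(1)(h)            # state value V(s)
        adv = layers.Dense(NUM_ACTIONS)(h)    # advantages A(s, a)
        # Dueling combine: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)
        q = layers.Lambda(
            lambda t: t[0] + t[1] - tf.reduce_mean(t[1], axis=1, keepdims=True)
        )([value, adv])
        q_heads.append(q)

    # Ensemble: average the heads, a crude proxy for a distributional view.
    return Model(inp, layers.Average()(q_heads))

model = build_q_network()
model.summary()
```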
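
Likewise, "prioritized replay with multi-step returns" just means the TD target bootstraps after n steps instead of one. A hedged sketch of that target combined with Double-DQN action selection (the constants and function shape are my own, not from the notebook):

```python
import numpy as np

GAMMA = 0.99   # discount factor (assumption)
N_STEPS = 3    # multi-step horizon (assumption)

def n_step_target(rewards, q_online_next, q_target_next, done):
    """Double-DQN n-step target for one transition.

    rewards:       the n rewards r_t .. r_{t+n-1}
    q_online_next: online-network Q-values for s_{t+n}
    q_target_next: target-network Q-values for s_{t+n}
    done:          True if the episode ended within the n steps
    """
    # Discounted sum of the intermediate rewards.
    g = sum(GAMMA**i * r for i, r in enumerate(rewards))
    if not done:
        # Double DQN: the online net picks the action,
        # the target net evaluates it.
        a_star = int(np.argmax(q_online_next))
        g += GAMMA ** len(rewards) * q_target_next[a_star]
    return g
```

In a prioritized buffer, the absolute TD error |g - Q(s_t, a_t)| then sets the transition's sampling priority.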

The performance of the latest released agent (evaluated over 100 episodes):

  • Average score: 831.53
  • Max score: 935.10
  • Min score: -42.86
  • Std of scores: 108.53

Resource

Link: GitHub - AeneasWeiChiHsu/CarRacing-v3-DQN-: An avg 800 Deep Q-Network agent for OpenAI CarRacing-v3 (domain_randomize=True)
Notebook: CarRacing-v3-DQN-/Safe[10_Q_Head]Randomized_Contrast_Enhance_Double_Residual_Prioritized_Multi_Step_Double_Dual_Head_DQN_CarRacing_RL_Project_Model.ipynb at main · AeneasWeiChiHsu/CarRacing-v3-DQN- · GitHub

In the repo, you can find my original notebook (I decided to preserve my personal notes and comments: some are meditation notes, some are philosophical reflections). I believe it is important to note that a project is a journey. I keep everything transparent and try to build a beginner-friendly version for learners.

How to use:

  • You can download the notebook and run it on your local machine.
  • You can download the pretrained model if you don’t want to train it yourself.
  • You can modify individual blocks (I modularised the model).
  • You can find the link to the full GIF folder.
  • You can test the model by running the evaluation cell (a standalone sketch follows this list).
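
For context, a standalone evaluation loop looks roughly like the sketch below. The checkpoint filename, the grayscale preprocessing, and the 4-frame stack are placeholders; the notebook's evaluation cell uses its own preprocessing and model file:

```python
from collections import deque

import gymnasium as gym
import numpy as np
import tensorflow as tf

# Hypothetical filename; substitute the checkpoint from the repo/Kaggle page.
model = tf.keras.models.load_model("dqn_carracing_v3.keras")

env = gym.make("CarRacing-v3", domain_randomize=True, continuous=False)

def preprocess(frame):
    # Placeholder grayscale + rescale; the notebook uses its own
    # visual-cortex-inspired preprocessing instead.
    gray = np.dot(frame[..., :3], [0.299, 0.587, 0.114]) / 255.0
    return gray.astype(np.float32)

obs, _ = env.reset()
frames = deque([preprocess(obs)] * 4, maxlen=4)  # 4-frame stack (assumption)
total, done = 0.0, False
while not done:
    state = np.stack(frames, axis=-1)[np.newaxis]          # (1, 96, 96, 4)
    action = int(np.argmax(model.predict(state, verbose=0)[0]))
    obs, reward, terminated, truncated, _ = env.step(action)
    frames.append(preprocess(obs))
    total += reward
    done = terminated or truncated
print(f"Episode score: {total:.2f}")
```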

(Embedded GIF: episode_100-926)

Because I am a new user, I could not embed the evaluation figure here, but you can find it in my repo. Sorry about that.

Note:

I will write a short report on some interesting topics related to this agent and its architecture; I have planned a schedule for some analyses. This summer side project is still ongoing. I hope it helps learners see that DQN can still work and that Keras/TensorFlow can be a viable platform for RL tasks.

If you find anything wrong or run into issues, please let me know. And if you have insights or suggestions, I am happy to hear them :smiley:

This is for educational purposes and is open source. Please treat the agent kindly. It is a cute agent.

In some cases, if you extend training beyond 25,000 episodes, reward collapse can occur (I encountered this once). And if you run your own tests, please note the possibility of performance collapse (a visual-cortex preprocessing failure, though it rarely happens).

In this project, I focused only on the model and architecture, so some parts were coded with AI assistance (Gemini or GPT).

If you want to train it yourself, it may take a few days to a week (it took about a week on my MacBook Air M2).


Hi everyone, I just want to post an update on this open-source agent.
The average score over 1,000 evaluation episodes is 828~829:

1000-Episode Evaluation GIF log is available for everyone:
1000-Episode-Kaggle-Link

The model is now available to download from Kaggle:
DQN_agent
You can find the evaluation notebook in the GitHub repo or on the model’s Kaggle page.

:smiley: