I really enjoy using Colab, but step-by-step debugging is hard there, so I often use PyCharm for that. The problem is getting a GPU to work well: once I have a PyCharm environment on a PC with a GPU, and TF works happily with that GPU, I’m in coding heaven.
Colab Pro is my favorite for day-to-day development needs, but I also love keeping a VS Code workspace always open and ready to quickly test ideas.
And my general workflow: I develop and verify that my model can learn with the setup described above, then I clean up and convert my code into a Python package using VS Code and push it to a private repo. From there I can easily go serverless for training with Cloud AI Platform (and Cloud TPU for giant networks). I always try to make the TensorFlow I/O API the only dependency for my data pipeline, which boosts portability across filesystems and accelerators.
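The portability point can be sketched concretely: because tf.data accepts local paths and gs:// URIs alike, only the path prefix has to change between environments. A minimal sketch of the idea (the helper name `shard_paths` and the file pattern are my own placeholders, not from the post):

```python
def shard_paths(prefix: str, n_shards: int) -> list[str]:
    """Build sharded TFRecord paths. The same pipeline code then runs
    unchanged whether the prefix is a local directory ("data") or a
    GCS bucket ("gs://my-bucket/data"), since the TensorFlow I/O layer
    handles both filesystems behind the same API."""
    return [f"{prefix}/train-{i:05d}-of-{n_shards:05d}.tfrecord"
            for i in range(n_shards)]

# The reading side would stay identical across filesystems, e.g.:
#   ds = tf.data.TFRecordDataset(shard_paths("gs://my-bucket/data", 8))
local_files = shard_paths("data", 2)
remote_files = shard_paths("gs://my-bucket/data", 2)
```

Swapping the prefix is the only change needed to move between a laptop and a cloud accelerator, which is the portability boost described above.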
I do a lot of prototyping in Colab, including capability uplift work, but switch to VS Code as quickly as I can. Very, very rarely do I work with a client that has a greenfield setup; it’s always an existing mix of technologies from AWS, GCP, and Azure. Emacs for all my hobby work.
I used to be on Ubuntu; now I’m coding on my own GPU workstation with Windows + WSL2 Ubuntu. Both systems share the same disk storage, so I can write code in VS Code on Windows and run projects from the Linux command line. CUDA is also supported natively, and TF seems to be working well so far!
When I need access to TPUs, I usually create TFRecords (if the dataset is fairly large) inside an AI Platform Notebook and then consume them from a Colab notebook to use its free TPUs. Using this approach, I have been able to keep costs sane and still train large-scale models.
To clean up the work and modularize things, I defer to PyCharm.
Fun fact: I don’t like setting things up manually. This is why I LOVE working with Colab and AI Platform Notebooks so much.
I mostly use a mixture of PyCharm and IntelliJ, depending on what language I’m working in. It’s a pity that neither PyCharm nor IntelliJ understands C/C++ code, as it means I have to switch to CLion or VS Code to be able to trace something through all of TF or TF-Java.
When developing models I prefer to do that in an IDE as well; the mutability of notebooks and their incompatibility with source control mean that they tend to cause issues. I admit I’m not that familiar with Colab, as other clouds are not allowed at work. Does it have good version control solutions?
1. Keep a bookmark to the one and only Colab notebook, Untitled.ipynb, for everything. Prototype new ideas there, and nowhere else.
2. Decide from time to time: a) delete notebook cells because they’re rubbish, or b) keep them.
3. If 2b), start adding docstrings and dream up some possible unit tests in Colab (e.g. checking dimensions of intermediate results).
4. Start coding an actual Python package with unit tests, setup.py, and so forth.
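A dimension check of the kind mentioned above (verifying the shapes of intermediate results) can be as small as a couple of asserts. A sketch with NumPy, where the toy `forward` function is purely hypothetical:

```python
import numpy as np

def forward(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Toy layer: (batch, features) @ (features, classes)."""
    return x @ w

def test_forward_shapes():
    batch, features, classes = 4, 8, 3
    x = np.zeros((batch, features))
    w = np.zeros((features, classes))
    out = forward(x, w)
    # The intermediate-dimension check: fail fast if shapes drift.
    assert out.shape == (batch, classes)

test_forward_shapes()
```

Checks like this migrate almost verbatim from notebook cells into a proper test suite later.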
Step 1 is what I would call “data science” or “ML research”. With steps 3 and 4 we enter software development or “ML engineering” territory. IMO it’s best to stay as long as possible in steps 1 and 3, where the creativity (= changes) happens; that’s exactly what you don’t really want in step 4 anymore.
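The packaging stage (“an actual Python package with unit tests, setup.py, and so forth”) might start from a skeleton like this; all names here (`my-model`, the layout) are placeholders of my own:

```python
# setup.py — minimal sketch. A matching layout could be:
#   my_model/__init__.py
#   tests/test_shapes.py
#   setup.py
from setuptools import find_packages

# Metadata kept in a dict so the sketch can be inspected standalone.
METADATA = dict(
    name="my-model",                 # placeholder package name
    version="0.1.0",
    packages=find_packages(),
    install_requires=["tensorflow"],
)

# In the real setup.py this would end with:
#   from setuptools import setup
#   setup(**METADATA)
# and be built with: python setup.py sdist
```

Once this exists, the notebook’s sketched unit tests move into `tests/` and run under any standard test runner.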
I have Ubuntu under WSL, so I use VS Code + Jupyter notebooks to work locally: I can work, leave, and pick up later without worrying about losing a session.
However, I do not have a local GPU, so for some tasks I rely on Colab.
If it weren’t for sessions expiring, I would probably be doing everything on Colab.