Hello Community
I have just developed an algorithm (really, a family of algorithms) that converges quickly while maintaining generalization.
I would appreciate any help and advice.
I want to join a company as a researcher in the field of artificial intelligence, with access to very expensive hardware (like GPUs) and more…
Here are the results of one of my algorithms on the MNIST and IMDB datasets (run on a personal computer).
This seems interesting. I do observe, however, that at least for a low number of epochs, your algorithm has lower accuracy than the industry standards. Perhaps your approach has some other benefits that are not represented in these graphs? (Time-wise?)
Hi there
As we know, there are two kinds of complexity: time and memory requirements.
With this algorithm I managed to achieve both: less memory than Adam, Adagrad, and RMSProp, and high accuracy in finite time, even though I trained it over some time-consuming epochs on a personal computer.
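For context on the memory comparison, here is a rough sketch (my own illustration, not the poster's algorithm) of the extra per-parameter state each standard optimizer keeps: Adam tracks two moment tensors per parameter, Adagrad and RMSProp track one, and plain SGD keeps none.

```python
# Extra optimizer state, counted in floats, on top of the weights themselves.
# Illustrative only; ignores scalar hyperparameters and implementation details.
def optimizer_state_floats(num_params, optimizer):
    """Number of extra floats of per-parameter state an optimizer keeps."""
    extra_tensors = {"sgd": 0, "adagrad": 1, "rmsprop": 1, "adam": 2}
    return extra_tensors[optimizer] * num_params

# For a 110M-parameter model, Adam needs ~220M extra floats of state:
print(optimizer_state_floats(110_000_000, "adam"))  # 220000000
```

So an optimizer that needs less than one full extra tensor per parameter is a real memory win at scale.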
I also reduced the number of hyperparameters used in many algorithms.
Do you suggest any further research in this field?
cheers
SM3 (Square-root of Minima of Sums of Maxima of Squared-gradients Method): [1901.11150] Memory-Efficient Adaptive Optimization (2019). It is memory-efficient and adaptive, designed to reduce memory overhead, especially with large Transformer-based models like BERT.
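For anyone curious how SM3 saves memory: instead of a full per-entry second-moment accumulator (as in Adagrad), it keeps one accumulator per row and one per column of a 2-D parameter, so optimizer state drops from O(m*n) to O(m+n). A minimal NumPy sketch of the SM3-I update, based on my reading of the paper (function and variable names are my own):

```python
import numpy as np

def sm3_step(w, grad, row_acc, col_acc, lr=0.1, eps=1e-8):
    """One SM3-I-style update for a 2-D parameter (illustrative sketch).

    row_acc has shape (m,), col_acc has shape (n,): O(m+n) state
    instead of Adagrad's O(m*n) per-entry accumulator.
    """
    # Per-entry estimate: tightest (min) of the covering accumulators,
    # plus the fresh squared gradient.
    nu = np.minimum(row_acc[:, None], col_acc[None, :]) + grad ** 2
    # Adagrad-style step using the per-entry estimate nu.
    w = w - lr * grad / (np.sqrt(nu) + eps)
    # Fold nu back into the compact accumulators via row/column maxima.
    row_acc = nu.max(axis=1)
    col_acc = nu.max(axis=0)
    return w, row_acc, col_acc
```

The min-then-max structure is the key trick: the compact accumulators always upper-bound the true per-entry sums of squared gradients, which keeps the step sizes conservative.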