Hello Community
I have just developed an algorithm (really, a family of algorithms) that converges quickly while maintaining generalization.
I would appreciate any help and advice.
I want to join a company as a researcher in the field of artificial intelligence, with access to very expensive hardware (like GPUs) and more…
Here are the results of one of my algorithms on the MNIST and IMDB datasets (run on a personal computer).
This seems interesting. I do observe, however, that at least for a low number of epochs, your algorithm has lower accuracy than the industry standards. Perhaps your approach has some other benefits that are not represented in these graphs? (Time-wise?)
Hi there
As we know, there are two kinds of complexity: time and memory requirements.
With this algorithm I managed to achieve both: less memory than Adam, Adagrad, and RMSProp, and high accuracy in finite time, even though I trained it over some time-consuming epochs on a personal computer.
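For context on the memory comparison, here is a rough sketch (my own illustration, not the poster's algorithm) of the extra per-parameter state each standard optimizer keeps: Adam tracks two moment tensors per parameter, Adagrad and RMSProp track one, and plain SGD keeps none.

```python
# Extra optimizer state, counted in floats, on top of the weights themselves.
# Illustrative only; ignores scalar hyperparameters and implementation details.
def optimizer_state_floats(num_params, optimizer):
    """Number of extra floats of per-parameter state an optimizer keeps."""
    extra_tensors = {"sgd": 0, "adagrad": 1, "rmsprop": 1, "adam": 2}
    return extra_tensors[optimizer] * num_params

# For a 110M-parameter model, Adam needs ~220M extra floats of state:
print(optimizer_state_floats(110_000_000, "adam"))  # 220000000
```

So an optimizer that needs less than one full extra tensor per parameter is a real memory win at scale.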
I also reduced the number of hyperparameters used in many algorithms.
Do you suggest any further research in this field?
cheers
SM3 (Square-root of Minima of Sums of Maxima of Squared-gradients Method): [1901.11150] Memory-Efficient Adaptive Optimization (2019). It is memory-efficient and adaptive, designed to reduce memory overhead, especially with large Transformer-based models like BERT.
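For anyone curious how SM3 saves memory: instead of a full per-entry second-moment accumulator (as in Adagrad), it keeps one accumulator per row and one per column of a 2-D parameter, so optimizer state drops from O(m*n) to O(m+n). A minimal NumPy sketch of the SM3-I update, based on my reading of the paper (function and variable names are my own):

```python
import numpy as np

def sm3_step(w, grad, row_acc, col_acc, lr=0.1, eps=1e-8):
    """One SM3-I-style update for a 2-D parameter (illustrative sketch).

    row_acc has shape (m,), col_acc has shape (n,): O(m+n) state
    instead of Adagrad's O(m*n) per-entry accumulator.
    """
    # Per-entry estimate: tightest (min) of the covering accumulators,
    # plus the fresh squared gradient.
    nu = np.minimum(row_acc[:, None], col_acc[None, :]) + grad ** 2
    # Adagrad-style step using the per-entry estimate nu.
    w = w - lr * grad / (np.sqrt(nu) + eps)
    # Fold nu back into the compact accumulators via row/column maxima.
    row_acc = nu.max(axis=1)
    col_acc = nu.max(axis=0)
    return w, row_acc, col_acc
```

The min-then-max structure is the key trick: the compact accumulators always upper-bound the true per-entry sums of squared gradients, which keeps the step sizes conservative.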