Given the same model, I found that the FLOPs calculated in PyTorch and TensorFlow are different. I used keras_flops (keras-flops · PyPI) in TensorFlow and ptflops (ptflops · PyPI) in PyTorch to calculate FLOPs. The PyTorch number is close to my own hand calculation. Does TensorFlow apply some tricks to speed up the computation, so that fewer FLOPs are measured? My model in TensorFlow:
from tensorflow.keras.layers import Input, Conv2D, PReLU, Conv2DTranspose
from tensorflow.keras.models import Model

d = 56
s = 12
inp = Input((750, 750, 1))
x = Conv2D(d, (5, 5), padding='same')(inp)
x = PReLU()(x)
x = Conv2D(s, (1, 1), padding='valid')(x)
x = PReLU()(x)
x = Conv2D(s, (3, 3), padding='same')(x)
x = PReLU()(x)
x = Conv2D(s, (3, 3), padding='same')(x)
x = PReLU()(x)
x = Conv2D(s, (3, 3), padding='same')(x)
x = PReLU()(x)
x = Conv2D(s, (3, 3), padding='same')(x)
x = PReLU()(x)
x = Conv2D(d, (1, 1), padding='same')(x)
x = PReLU()(x)
out = Conv2DTranspose(1, (9, 9), strides=(4, 4), padding='same', output_padding=3)(x)
model = Model(inp, out)
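For reference, here is a minimal sketch of the hand calculation I compare against (assumptions: one MAC per multiply-add, biases and PReLU ignored, and the transposed convolution counted as a dense 9x9 convolution over the 4x-upsampled 3000x3000 output grid):

def conv2d_macs(h, w, c_in, c_out, k):
    # Each output pixel needs k*k*c_in multiply-adds per output channel.
    return h * w * c_in * c_out * k * k

H = W = 750
d, s = 56, 12
macs  = conv2d_macs(H, W, 1, d, 5)          # 5x5 feature extraction, ~0.79 GMac
macs += conv2d_macs(H, W, d, s, 1)          # 1x1 shrink,             ~0.38 GMac
macs += 4 * conv2d_macs(H, W, s, s, 3)      # four 3x3 mappings,      ~2.92 GMac
macs += conv2d_macs(H, W, s, d, 1)          # 1x1 expand,             ~0.38 GMac
macs += conv2d_macs(4 * H, 4 * W, d, 1, 9)  # transposed conv,        ~40.82 GMac
print(f"{macs / 1e9:.2f} GMac -> {2 * macs / 1e9:.2f} GFLOPs")  # ~45.28 GMac

(If the transposed conv is instead counted once per input pixel, that layer comes to ~2.55 GMac, i.e. roughly the 5.10b float_ops that TensorFlow attributes to Conv2DBackpropInput below.)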
The FLOPs report from TensorFlow:
node name | # float_ops
Conv2D 8.92b float_ops (100.00%, 61.95%)
Conv2DBackpropInput 5.10b float_ops (38.05%, 35.44%)
Neg 180.00m float_ops (2.61%, 1.25%)
BiasAdd 105.75m float_ops (1.36%, 0.73%)
Mul 90.00m float_ops (0.63%, 0.63%)
======================End of Report==========================
The total is 14.3 GFLOPs.
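For completeness, this is how I obtained the TensorFlow number (a sketch; keras-flops wraps the TF profiler and reports total float_ops for a batch size of 1):

from keras_flops import get_flops

flops = get_flops(model, batch_size=1)
print(f"The FLOPs is: {flops / 1e9:.1f} GFLOPs")  # prints ~14.3 GFLOPs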
However, the FLOPs in PyTorch are:
Model_1(
  0.013 M, 100.000% Params, 45.486 GMac, 100.000% MACs,
  (begin): Sequential(
    0.002 M, 11.804% Params, 0.851 GMac, 1.870% MACs,
    (0): Conv2d(0.001 M, 11.367% Params, 0.819 GMac, 1.801% MACs, 1, 56, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (1): PReLU(0.0 M, 0.437% Params, 0.032 GMac, 0.069% MACs, num_parameters=56)
  )
  (middle): Sequential(
    0.007 M, 52.775% Params, 3.803 GMac, 8.360% MACs,
    (0): Conv2d(0.001 M, 5.340% Params, 0.385 GMac, 0.846% MACs, 56, 12, kernel_size=(1, 1), stride=(1, 1))
    (1): PReLU(0.0 M, 0.094% Params, 0.007 GMac, 0.015% MACs, num_parameters=12)
    (2): Conv2d(0.001 M, 10.212% Params, 0.736 GMac, 1.618% MACs, 12, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): PReLU(0.0 M, 0.094% Params, 0.007 GMac, 0.015% MACs, num_parameters=12)
    (4): Conv2d(0.001 M, 10.212% Params, 0.736 GMac, 1.618% MACs, 12, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): PReLU(0.0 M, 0.094% Params, 0.007 GMac, 0.015% MACs, num_parameters=12)
    (6): Conv2d(0.001 M, 10.212% Params, 0.736 GMac, 1.618% MACs, 12, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): PReLU(0.0 M, 0.094% Params, 0.007 GMac, 0.015% MACs, num_parameters=12)
    (8): Conv2d(0.001 M, 10.212% Params, 0.736 GMac, 1.618% MACs, 12, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): PReLU(0.0 M, 0.094% Params, 0.007 GMac, 0.015% MACs, num_parameters=12)
    (10): Conv2d(0.001 M, 5.684% Params, 0.409 GMac, 0.900% MACs, 12, 56, kernel_size=(1, 1), stride=(1, 1))
    (11): PReLU(0.0 M, 0.437% Params, 0.032 GMac, 0.069% MACs, num_parameters=56)
  )
  (final): ConvTranspose2d(0.005 M, 35.420% Params, 40.833 GMac, 89.770% MACs, 56, 1, kernel_size=(9, 9), stride=(4, 4), padding=(4, 4), output_padding=(3, 3))
)
Computational complexity: 45.49 GMac
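And this is how I obtained the PyTorch number (a sketch; Model_1 stands for my equivalent nn.Module, which is not shown here). Note that ptflops reports multiply-accumulates, and at roughly 2 FLOPs per MAC, 45.49 GMac corresponds to ~91 GFLOPs, so the gap to TensorFlow's 14.3 GFLOPs is even larger than it looks:

from ptflops import get_model_complexity_info

macs, params = get_model_complexity_info(
    Model_1(), (1, 750, 750),   # (C, H, W) of one input sample
    as_strings=True,
    print_per_layer_stat=True,  # prints the per-layer table above
)
print(f"Computational complexity: {macs}")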