Why does setting trainable to False after compiling the model affect model.summary()?

Why do the d_A and d_B summary() outputs change to show no trainable params, even though the models were compiled before trainable was set to False?

Code as follows:

# Assumes tf.keras (TF >= 2.11 for optimizer.build). DataLoader,
# build_discriminator and build_generator are defined elsewhere in the notebook.
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam


class CycleGAN():
    def __init__(self):
        # Input shape
        self.img_rows = 128
        self.img_cols = 128
        self.channels = 3
        self.img_shape = (self.img_rows, self.img_cols, self.channels)

        # Configure data loader
        self.dataset_name = 'apple2orange'
        # Use the DataLoader object to import a preprocessed dataset
        self.data_loader = DataLoader(dataset_name=self.dataset_name,
                                      img_res=(self.img_rows, self.img_cols))

        # Calculate output shape of D (PatchGAN)
        patch = int(self.img_rows / 2**4)
        self.disc_patch = (patch, patch, 1)

        # Number of filters in the first layer of G and D
        self.gf = 32
        self.df = 64

        # Loss weights
        self.lambda_cycle = 10.0                    # Cycle-consistency loss
        self.lambda_id = 0.9 * self.lambda_cycle    # Identity loss

        optimizerA = Adam(0.0002, 0.5)
        optimizerB = Adam(0.0002, 0.5)
        optimizerC = Adam(0.0002, 0.5)
        
        # Build and compile the discriminators
        self.d_A = self.build_discriminator()
        self.d_B = self.build_discriminator()
        optimizerA.build(self.d_A.trainable_variables)
        self.d_A.compile(loss='mse',
                         optimizer=optimizerA,
                         metrics=['accuracy'])
        optimizerB.build(self.d_B.trainable_variables)
        self.d_B.compile(loss='mse',
                         optimizer=optimizerB,
                         metrics=['accuracy'])
        
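        # Note: summary() prints its table and returns None, which is why
        # "None" appears between the tables in the printout below.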
        print(self.d_A.summary())
        print(self.d_B.summary())

        #-------------------------
        # Construct Computational
        #   Graph of Generators
        #-------------------------

        # Build the generators
        self.g_AB = self.build_generator()
        self.g_BA = self.build_generator()

        # Input images from both domains
        img_A = Input(shape=self.img_shape)
        img_B = Input(shape=self.img_shape)

        # Translate images to the other domain
        fake_B = self.g_AB(img_A)
        fake_A = self.g_BA(img_B)
        # Translate images back to original domain
        reconstr_A = self.g_BA(fake_B)
        reconstr_B = self.g_AB(fake_A)
        # Identity mapping of images
        img_A_id = self.g_BA(img_A)
        img_B_id = self.g_AB(img_B)

        # For the combined model we will only train the generators
        self.d_A.trainable = False
        self.d_B.trainable = False

        # Discriminators determines validity of translated images
        valid_A = self.d_A(fake_A)
        valid_B = self.d_B(fake_B)

        # Combined model trains generators to fool discriminators
        self.combined = Model(inputs=[img_A, img_B],
                              outputs=[valid_A, valid_B,
                                       reconstr_A, reconstr_B,
                                       img_A_id, img_B_id])
        
        optimizerC.build(self.combined.trainable_variables)
        self.combined.compile(loss=['mse', 'mse',
                                    'mae', 'mae',
                                    'mae', 'mae'],
                              loss_weights=[1, 1,
                                            self.lambda_cycle, self.lambda_cycle,
                                            self.lambda_id, self.lambda_id],
                              optimizer=optimizerC)
        
        
        print(self.d_A.summary())
        print(self.d_B.summary())

a = CycleGAN()

Printout:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 128, 128, 3)]     0         
                                                                 
 conv2d (Conv2D)             (None, 64, 64, 64)        3136      
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 64, 64, 64)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 128)       131200    
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 32, 32, 128)       0         
                                                                 
 group_normalization (Group  (None, 32, 32, 128)       256       
 Normalization)                                                  
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 256)       524544    
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 16, 16, 256)       0         
                                                                 
 group_normalization_1 (Gro  (None, 16, 16, 256)       512       
 upNormalization)                                                
                                                                 
 conv2d_3 (Conv2D)           (None, 8, 8, 512)         2097664   
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 8, 8, 512)         0         
                                                                 
 group_normalization_2 (Gro  (None, 8, 8, 512)         1024      
 upNormalization)                                                
                                                                 
 conv2d_4 (Conv2D)           (None, 8, 8, 1)           8193      
                                                                 
=================================================================
Total params: 2766529 (10.55 MB)
Trainable params: 2766529 (10.55 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 128, 128, 3)]     0         
                                                                 
 conv2d_5 (Conv2D)           (None, 64, 64, 64)        3136      
                                                                 
 leaky_re_lu_4 (LeakyReLU)   (None, 64, 64, 64)        0         
                                                                 
 conv2d_6 (Conv2D)           (None, 32, 32, 128)       131200    
                                                                 
 leaky_re_lu_5 (LeakyReLU)   (None, 32, 32, 128)       0         
                                                                 
 group_normalization_3 (Gro  (None, 32, 32, 128)       256       
 upNormalization)                                                
                                                                 
 conv2d_7 (Conv2D)           (None, 16, 16, 256)       524544    
                                                                 
 leaky_re_lu_6 (LeakyReLU)   (None, 16, 16, 256)       0         
                                                                 
 group_normalization_4 (Gro  (None, 16, 16, 256)       512       
 upNormalization)                                                
                                                                 
 conv2d_8 (Conv2D)           (None, 8, 8, 512)         2097664   
                                                                 
 leaky_re_lu_7 (LeakyReLU)   (None, 8, 8, 512)         0         
                                                                 
 group_normalization_5 (Gro  (None, 8, 8, 512)         1024      
 upNormalization)                                                
                                                                 
 conv2d_9 (Conv2D)           (None, 8, 8, 1)           8193      
                                                                 
=================================================================
Total params: 2766529 (10.55 MB)
Trainable params: 2766529 (10.55 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 128, 128, 3)]     0         
                                                                 
 conv2d (Conv2D)             (None, 64, 64, 64)        3136      
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 64, 64, 64)        0         
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 128)       131200    
                                                                 
 leaky_re_lu_1 (LeakyReLU)   (None, 32, 32, 128)       0         
                                                                 
 group_normalization (Group  (None, 32, 32, 128)       256       
 Normalization)                                                  
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 256)       524544    
                                                                 
 leaky_re_lu_2 (LeakyReLU)   (None, 16, 16, 256)       0         
                                                                 
 group_normalization_1 (Gro  (None, 16, 16, 256)       512       
 upNormalization)                                                
                                                                 
 conv2d_3 (Conv2D)           (None, 8, 8, 512)         2097664   
                                                                 
 leaky_re_lu_3 (LeakyReLU)   (None, 8, 8, 512)         0         
                                                                 
 group_normalization_2 (Gro  (None, 8, 8, 512)         1024      
 upNormalization)                                                
                                                                 
 conv2d_4 (Conv2D)           (None, 8, 8, 1)           8193      
                                                                 
=================================================================
Total params: 2766529 (10.55 MB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 2766529 (10.55 MB)
_________________________________________________________________
None
Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 128, 128, 3)]     0         
                                                                 
 conv2d_5 (Conv2D)           (None, 64, 64, 64)        3136      
                                                                 
 leaky_re_lu_4 (LeakyReLU)   (None, 64, 64, 64)        0         
                                                                 
 conv2d_6 (Conv2D)           (None, 32, 32, 128)       131200    
                                                                 
 leaky_re_lu_5 (LeakyReLU)   (None, 32, 32, 128)       0         
                                                                 
 group_normalization_3 (Gro  (None, 32, 32, 128)       256       
 upNormalization)                                                
                                                                 
 conv2d_7 (Conv2D)           (None, 16, 16, 256)       524544    
                                                                 
 leaky_re_lu_6 (LeakyReLU)   (None, 16, 16, 256)       0         
                                                                 
 group_normalization_4 (Gro  (None, 16, 16, 256)       512       
 upNormalization)                                                
                                                                 
 conv2d_8 (Conv2D)           (None, 8, 8, 512)         2097664   
                                                                 
 leaky_re_lu_7 (LeakyReLU)   (None, 8, 8, 512)         0         
                                                                 
 group_normalization_5 (Gro  (None, 8, 8, 512)         1024      
 upNormalization)                                                
                                                                 
 conv2d_9 (Conv2D)           (None, 8, 8, 1)           8193      
                                                                 
=================================================================
Total params: 2766529 (10.55 MB)
Trainable params: 0 (0.00 Byte)
Non-trainable params: 2766529 (10.55 MB)
_________________________________________________________________
None

Hi @kindaichinisan,

Let me know if my understanding is correct: you want to know why the model's parameters change to non-trainable even though you set the trainable attribute to False only after compiling the model.

If that is the case: the main purpose of compiling the model is to set the training configuration, such as the optimizer, loss function, and metrics. summary(), however, always reports the current value of the trainable attribute, so even though trainable was set to False after compiling, the parameters are shown as non-trainable. Thank you.
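A minimal sketch of this behaviour (assuming tf.keras; the one-layer Dense model is purely illustrative):

import tensorflow as tf

# Toy model, just to show how summary() reacts to the trainable flag.
m = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(2,))])
m.compile(optimizer='adam', loss='mse')
m.summary()          # Trainable params: 12, Non-trainable params: 0

m.trainable = False  # flipped *after* compile()
m.summary()          # Trainable params: 0 -- summary() reads the flag live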

Hi Kiran,

Thanks for replying.

For information, I am working on CycleGAN, which trains the discriminators as well as a combined model of discriminator and generator. I want the discriminator training to update the discriminator weights, whereas the combined model training should freeze the discriminator weights and only update the generator weights.

The GitHub notebook I am following is: gans-in-action/chapter-9/Chapter9_CycleGAN.ipynb at master · GANs-in-Action/gans-in-action · GitHub

I thought that when self.d_A.compile is called, the discriminator would be configured with its weights as trainable. Then, when self.combined.compile is called, the combined model would be configured with the discriminator weights as non-trainable, since self.d_A.trainable has been set to False by that point. But I did not expect setting self.d_A.trainable to False to affect self.d_A itself, since it had already been compiled.
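To make my expectation concrete, here is a minimal sketch (assuming tf.keras, with toy Dense layers standing in for the real networks) of the behaviour I expected, where each compile() snapshots the trainable state for that model:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model, Input

# Hypothetical stand-ins for the discriminator and generator.
d = tf.keras.Sequential([layers.Dense(1, input_shape=(2,))])
d.compile(optimizer='adam', loss='mse')          # snapshot 1: d trainable

d.trainable = False                              # flip the flag...
inp = Input(shape=(2,))
combined = Model(inp, d(layers.Dense(2)(inp)))
combined.compile(optimizer='adam', loss='mse')   # snapshot 2: d frozen here

x = np.random.rand(4, 2).astype('float32')
y = np.ones((4, 1), dtype='float32')
w_before = d.get_weights()[0].copy()
combined.train_on_batch(x, y)                    # expected: generator only
print(np.allclose(w_before, d.get_weights()[0])) # expected True: d untouched
d.train_on_batch(x, y)                           # expected: d still trains
print(np.allclose(w_before, d.get_weights()[0])) # expected False: d updated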

Qn1: After compiling a model, can its weights change from trainable to non-trainable by modifying the trainable attribute?
Qn2: How do I achieve my objective of having the discriminator weights trainable when training the discriminator, but non-trainable when training the combined model?

Any response to the two questions I posed?