| Gemma 4 27B-A4B-it (MoE) on Vertex AI: vLLM Dependency Triangles, LoRA, and Vision Blockers | 1 | 172 | April 27, 2026 |
| tf.keras.Model and model class adaptation | 1 | 320 | February 23, 2026 |
| OpenAI’s GPT-2 param run model | 1 | 1294 | September 25, 2025 |
| Optimizing seq2seq decoding script | 1 | 536 | July 25, 2025 |
| Tensorflow ver 2.17.0 | 1 | 73 | July 21, 2025 |
| Help! When building or training model get error: "ValueError: The first argument to `Layer.call` must always be passed. " | 1 | 2759 | May 23, 2025 |
| Keras Transformer Implementation Error | 1 | 90 | January 16, 2025 |
| from tensorflow.keras.wrappers.scikit_learn import KerasClassifier | 6 | 13403 | April 15, 2024 |
| Fine-tuning GPT2 for text summary | 1 | 836 | December 27, 2024 |
| I have been training a decoder based transformer for word generation. But it keeps generating the same words over and over again | 1 | 679 | December 20, 2024 |
| create_padding_mask in Transformer code uses encoder input sequence for creating padding mask in 2nd attention block of the decoder | 1 | 1297 | December 12, 2024 |
| Apply a trained model with tensorflow on transformer pipeline pop out error | 1 | 861 | November 22, 2024 |
| How to calculate BLEU score, precision, recall, calibration, confusion matrix for transformer? | 1 | 365 | October 11, 2024 |
| GPT NEO-For COVID-19 Question Answering | 1 | 1607 | October 10, 2024 |
| Exception encountered when calling layer 'softmax' (type Softmax) | 1 | 505 | September 19, 2024 |
| Masking propagation through layers | 1 | 771 | September 19, 2024 |
| T5 fine-tuned model: one method ignores min_target_length parameter while one does not | 1 | 381 | August 23, 2024 |
| Issue with Deserializing a Custom Transformer Model in TensorFlow | 1 | 539 | May 20, 2024 |
| RESOURCE_EXHAUSTED when running TimeDistributed on MultiHeadAttention | 1 | 467 | January 29, 2024 |
| How do I use sentence-transformers/all-MiniLM-L6-v2 tflite model in android studio (kotlin) | 1 | 1703 | January 23, 2024 |
| Getting very less accuracy in vision transformer | 0 | 425 | October 17, 2023 |
| What is the model suitable for time series forecasting? | 2 | 632 | October 13, 2023 |
| Call Tensorflow Model in a loop leaks memory | 1 | 1459 | September 25, 2023 |
| Issue with HuggingFace push_to_hub | 1 | 977 | June 20, 2023 |
| Though Training accuracy is high performance on training data during inference in transformer translation is poor | 0 | 638 | June 9, 2023 |
| How Hugging Face improved Text Generation performance with XLA | 1 | 995 | June 8, 2023 |
| How to extract body of a transformer like models and fine tune with that body on different data | 2 | 531 | June 5, 2023 |
| Does TransformerEncoder layer accept built-in mask? | 1 | 803 | May 8, 2023 |
| Save and restore transformer model | 1 | 1143 | March 18, 2023 |
| Main transformers use-cases and insights | 0 | 755 | February 7, 2023 |