I built a text summarization model by fine-tuning a T5 model, and while running inference I ran into a question.
The following two methods were given the same input text and the same min_target_length and max_target_length, but they yield summaries of different lengths.
Method 1 respects my parameter settings: if I set min_target_length to 64 and max_target_length to 256 (the same settings I used for training), it generates a summary about 180 tokens long.
Method 2, however, always generates a one-sentence summary, no matter how I set the parameters.
Does anyone know why this is happening? Thanks.
Method 1
from transformers import pipeline
summarizer = pipeline("summarization", model=model, tokenizer=tokenizer, framework="tf")
summarizer(
    raw_datasets["train"][0]["document"],
    min_length=MIN_TARGET_LENGTH,
    max_length=MAX_TARGET_LENGTH,
)
Method 2
import nltk  # needed for sent_tokenize below

# text: the raw document to summarize (defined elsewhere)
inputs = ["summarize: " + text]
inputs = tokenizer(inputs, max_length=MAX_INPUT_LENGTH, truncation=True, return_tensors="tf")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=64, max_length=256)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
# keep only the first sentence of the decoded summary
predicted_summary = nltk.sent_tokenize(decoded_output.strip())[0]
print(predicted_summary)