What is the recommended audio length for MEDASR transcription? Or is it recommended to use a 20 s sliding window with a 2 s overlap for all audio?
Hi @kamalkraj ,
Yes, that’s a solid approach. It is generally recommended to split long-form audio into ~20-second chunks with a 2-second overlap (15–20 s chunks with a 2–3 s stride also work well).
The overlap ensures words spanning chunk boundaries are captured fully and helps maintain transcription continuity across segments.
Also, make sure the audio is resampled to 16 kHz mono before processing.
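In case it helps, here is a minimal sketch of how the chunk boundaries could be computed for a 20 s window with a 2 s overlap. It assumes the audio is already 16 kHz mono and only returns sample index ranges. The function name `chunk_spans` and its parameters are made up for illustration and are not part of any MedASR API.

```python
def chunk_spans(num_samples, sample_rate=16_000, chunk_s=20.0, overlap_s=2.0):
    """Return (start, end) sample index pairs for overlapping chunks.

    Hypothetical helper: assumes 16 kHz mono input by default.
    """
    chunk = int(chunk_s * sample_rate)      # chunk length in samples
    overlap = int(overlap_s * sample_rate)  # samples shared by neighbouring chunks
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    step = chunk - overlap  # how far the window advances each time
    spans = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        spans.append((start, end))
        if end == num_samples:  # last chunk reached the end of the audio
            break
        start += step
    return spans


if __name__ == "__main__":
    # 50 seconds of audio at 16 kHz
    for start, end in chunk_spans(50 * 16_000):
        print(f"{start / 16_000:.1f}s -> {end / 16_000:.1f}s")
```

Each span can then be sliced out of the waveform (for example `audio[start:end]`) and transcribed separately. Merging the text from the overlapping regions is not shown here and depends on how you run the model.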
Thank you!