Issue with Language Specific Transcription Using txtai and Whisper #593

Nondzu · 2023-11-03T21:54:48Z

Environment

txtai version: 6.2.0
whisper version:
Python version: 3.11.5
Operating System:
Description: Linux Mint 21.2
Release: 21.2
Codename: victoria

Description

I'm attempting to transcribe Polish audio using the Whisper model within txtai, and while I am able to get transcriptions, they appear to be in English rather than the native language of the audio.

Here's a snippet of the code I'm using:

from txtai.transcription import Transcription

transcribe = Transcription("openai/whisper-large-v2")
for text in transcribe(files):
    print(text)

Questions

Does txtai's transcription feature automatically translate the text to English, or is it supposed to return text in the language of the audio?
How can I disable any automatic translation feature or specify the language of the audio to ensure that the transcription is in Polish?

Any guidance or suggestions on this matter would be greatly appreciated.

Thank you!

Nondzu · 2023-11-03T21:58:01Z

davidmezzetti · 2023-11-04T01:54:23Z

It's possible Whisper runs the translation task by default. Here's an idea to test out using code from the model page.

from transformers import WhisperProcessor
from txtai.transcription import Transcription

transcribe = Transcription("openai/whisper-large-v2")

# Test transcribe only
transcribe.pipeline.model.config.forced_decoder_ids = WhisperProcessor.get_decoder_prompt_ids(language="polish", task="transcribe")

for text in transcribe(files):
    print(text)

If that works, I can add in a change that makes this more streamlined.
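
For context, Whisper picks its task from special tokens at the start of the decoder sequence, and `forced_decoder_ids` pins those tokens. The sketch below mirrors that prompt layout with plain strings instead of the real tokenizer. The `decoder_prompt` helper and the two-entry language map are illustrative assumptions, not part of txtai or transformers.

```python
# Illustrative only: reproduces the layout of Whisper's decoder prompt as strings.
# Real token ids come from the processor's get_decoder_prompt_ids method.
LANGUAGE_CODES = {"polish": "pl", "english": "en"}  # small subset, for illustration


def decoder_prompt(language, task="transcribe", timestamps=False):
    """Return the special tokens that steer Whisper toward a language and task."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unknown task: {task}")

    tokens = [
        "<|startoftranscript|>",
        f"<|{LANGUAGE_CODES[language]}|>",  # language of the audio
        f"<|{task}|>",  # transcribe keeps the source language, translate targets English
    ]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


print(decoder_prompt("polish", "transcribe"))
print(decoder_prompt("polish", "translate"))
```

The two prompts differ only in the task token, which is why forcing `task="transcribe"` should be enough to keep the output in Polish.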

Nondzu · 2023-11-04T06:06:50Z

@davidmezzetti thank you for the help. After a small modification, this code works fine:

from transformers import WhisperProcessor
from txtai.pipeline import Transcription

# from txtai.transcription import Transcription
# model = "openai/whisper-large-v2"
model = "bardsai/whisper-large-v2-pl-v2"

transcribe = Transcription(model)
processor = WhisperProcessor.from_pretrained(model)

# Test transcribe only
transcribe.pipeline.model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="polish", task="transcribe")

for text in transcribe(files):
    print(text)

davidmezzetti · 2023-11-04T13:20:44Z

Thanks for confirming. I'll keep this issue open and add an argument to disable automatic translation.
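
Until such an argument exists, the workaround confirmed in this thread could be wrapped in a small helper. This is a sketch, not txtai's API: `force_transcription` is a hypothetical name. It assumes the Transcription object exposes `pipeline.model.config` as in the snippets in this thread, and that the processor instance provides `get_decoder_prompt_ids(language=..., task=...)`. Behavior may differ across transformers versions.

```python
def force_transcription(transcribe, processor, language, task="transcribe"):
    """Pin Whisper's language and task on an existing pipeline (hypothetical helper)."""
    # Ask the processor for the decoder prompt ids matching the language and task
    prompt_ids = processor.get_decoder_prompt_ids(language=language, task=task)

    # Store them on the model config so generation starts from these tokens
    transcribe.pipeline.model.config.forced_decoder_ids = prompt_ids
    return transcribe


# Possible usage, assuming txtai and transformers are installed:
#
#   from transformers import WhisperProcessor
#   from txtai.pipeline import Transcription
#
#   model = "openai/whisper-large-v2"
#   transcribe = force_transcription(
#       Transcription(model), WhisperProcessor.from_pretrained(model), "polish"
#   )
```

Returning the pipeline lets the call be chained at construction time. The helper changes the model config in place, so the setting persists for later calls.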
