When my script batch-processes a set of audio files using the approach you gave me (passing a list of files and their settings), a single file failing for any reason prevents all of the files from being transcribed. As a workaround, I pass each file to the transcribe_with_vad method individually (each call using its own tqdm) and added error handling, which works. Is there a way to use your most efficient approach and still have error handling for a specific audio file? Here is the original script and a comparison with the single audio file processing with error handling:
import os
from PySide6.QtCore import QThread, Signal
from pathlib import Path
import whisper_s2t
import time

class Worker(QThread):
    finished = Signal(str)
    progress = Signal(str)

    def __init__(self, directory, recursive, output_format, device, size, quantization, beam_size, batch_size, task):
        super().__init__()
        self.directory = directory
        self.recursive = recursive
        self.output_format = output_format
        self.device = device
        self.size = size
        self.quantization = quantization
        self.beam_size = beam_size
        self.batch_size = batch_size
        self.task = task.lower()

    def run(self):
        directory_path = Path(self.directory)
        patterns = ['*.mp3', '*.wav', '*.flac', '*.wma']
        audio_files = []
        if self.recursive:
            for pattern in patterns:
                audio_files.extend(directory_path.rglob(pattern))
        else:
            for pattern in patterns:
                audio_files.extend(directory_path.glob(pattern))

        max_threads = os.cpu_count()
        cpu_threads = max((2 * max_threads) // 3, 4) if max_threads is not None else 4

        model_identifier = f"ctranslate2-4you/whisper-{self.size}-ct2-{self.quantization}"
        model = whisper_s2t.load_model(model_identifier=model_identifier, backend='CTranslate2', device=self.device, compute_type=self.quantization, asr_options={'beam_size': self.beam_size}, cpu_threads=cpu_threads)

        audio_files_str = [str(file) for file in audio_files]
        output_file_paths = [str(file.with_suffix(f'.{self.output_format}')) for file in audio_files]
        lang_codes = 'en'
        tasks = self.task
        initial_prompts = None

        start_time = time.time()
        if audio_files_str:
            self.progress.emit(f"Processing {len(audio_files_str)} files...")
            out = model.transcribe_with_vad(audio_files_str, lang_codes=lang_codes, tasks=tasks, initial_prompts=initial_prompts, batch_size=self.batch_size)
            whisper_s2t.write_outputs(out, format=self.output_format, op_files=output_file_paths)
            for original_audio_file, output_file_path in zip(audio_files, output_file_paths):
                self.progress.emit(f"{tasks.capitalize()} {original_audio_file} to {output_file_path}")

        processing_time = time.time() - start_time
        self.finished.emit(f"Total processing time: {processing_time:.2f} seconds")
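For illustration, here is a rough sketch of one possible middle ground between the batched call above and the per-file workaround: try the whole batch once, and only fall back to file-by-file calls with error handling if the batch raises. The helper name `transcribe_with_fallback` and its `transcribe` parameter are hypothetical and not part of whisper_s2t. The sketch also assumes that a bad file makes `transcribe_with_vad` raise an exception, and that the call returns one result per input file in order.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def transcribe_with_fallback(
    files: Sequence[str],
    transcribe: Callable[[List[str]], list],
) -> Tuple[Dict[str, object], Dict[str, Exception]]:
    """Try one batched call; if it raises, retry each file on its own.

    `transcribe` is any callable that takes a list of paths and returns
    one result per path (hypothetical wrapper, see the comment below).
    Returns (results by path, exceptions by path).
    """
    files = list(files)
    results: Dict[str, object] = {}
    failures: Dict[str, Exception] = {}
    if not files:
        return results, failures

    try:
        # Fast path: a single batched call for every file.
        outputs = transcribe(files)
        results.update(zip(files, outputs))
        return results, failures
    except Exception:
        # Assumption: one bad file made the whole batch raise.
        pass

    # Slow path: isolate the failure by processing one file per call.
    for path in files:
        try:
            results[path] = transcribe([path])[0]
        except Exception as exc:
            failures[path] = exc
    return results, failures


# Possible wiring inside Worker.run(), reusing the names from the script:
#
# def transcribe(batch):
#     return model.transcribe_with_vad(batch, lang_codes=lang_codes, tasks=tasks,
#                                      initial_prompts=initial_prompts,
#                                      batch_size=self.batch_size)
#
# results, failures = transcribe_with_fallback(audio_files_str, transcribe)
```

With this shape the batched speed is kept whenever every file is valid, and the cost of a failure is one wasted batch attempt plus a per-file pass. Only the entries in `results` would then be handed to `whisper_s2t.write_outputs`, and each entry in `failures` could be reported through `self.progress.emit`.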
Here's the final version that I ended up incorporating into my latest release to avoid the issue, but I would still be very interested in knowing whether there's a way to prevent a single file from causing the entire batch of files to fail...