faster-whisper reimplements OpenAI's Whisper model on top of CTranslate2, a fast inference engine for Transformer models. The overall speedup is substantial, provided a GPU is available.
Below is a simple example that generates subtitles. First install faster_whisper and pysubs2.
transcribe without progress bar
# pip install faster_whisper pysubs2
from faster_whisper import WhisperModel
import pysubs2

# pass device="cuda", compute_type="float16" to force GPU inference
model = WhisperModel('large-v2')
segments, _ = model.transcribe(audio='audio.mp3')

# pysubs2.load_from_whisper() expects a list of segment dicts
results = []
for s in segments:
    segment_dict = {'start': s.start, 'end': s.end, 'text': s.text}
    results.append(segment_dict)

subs = pysubs2.load_from_whisper(results)
subs.save('output.srt')  # save SRT file
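As a side note, pysubs2 is what converts the float-second timestamps into SRT's `HH:MM:SS,mmm` format. Purely for illustration, here is a stdlib-only sketch of that conversion; the helper name `srt_timestamp` is ours, not part of either library:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = round(seconds * 1000)          # total milliseconds, rounded
    h, rem = divmod(ms, 3_600_000)      # hours
    m, rem = divmod(rem, 60_000)        # minutes
    s, ms = divmod(rem, 1000)           # seconds and leftover milliseconds
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
```

In practice you never call this yourself; `subs.save('output.srt')` does the equivalent for every segment.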
We can rewrite it as follows to display a progress bar via tqdm.
transcribe with progress bar
from faster_whisper import WhisperModel
from tqdm import tqdm
import pysubs2

model = WhisperModel('large-v2')
# keep the returned info object: info.duration is the total audio length in seconds
segments, info = model.transcribe(audio='audio.mp3')

# Prepare results for SRT file format
results = []
timestamps = 0.0  # end time of the last processed segment, for the progress bar
with tqdm(total=info.duration, unit=" audio seconds") as pbar:
    for seg in segments:
        segment_dict = {'start': seg.start, 'end': seg.end, 'text': seg.text}
        results.append(segment_dict)
        # Update progress bar based on segment duration
        pbar.update(seg.end - timestamps)
        timestamps = seg.end
    # Handle silence at the end of the audio
    if timestamps < info.duration:
        pbar.update(info.duration - timestamps)

subs = pysubs2.load_from_whisper(results)
subs.save('output.srt')  # save SRT file
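Because `transcribe()` returns a lazy generator, the bar is advanced incrementally: each segment contributes `seg.end - timestamps`, and a final update covers any trailing silence. That arithmetic can be isolated into a hypothetical helper, `progress_updates`, which yields exactly the increments the loop above feeds to `pbar.update`:

```python
def progress_updates(segment_ends, total_duration):
    """Yield the increment the progress bar receives for each segment,
    plus a final increment covering silence at the end of the audio."""
    last = 0.0
    for end in segment_ends:
        yield end - last  # only the newly transcribed span, not the absolute position
        last = end
    if last < total_duration:
        yield total_duration - last  # trailing silence produces no segment
```

The increments always sum to `total_duration`, which is why the bar ends at exactly 100%.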
Here is the accompanying Dockerfile as well.
Dockerfile
# Use the official NVIDIA CUDA image as the base image
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04
ARG DEBIAN_FRONTEND=noninteractive
# Install necessary dependencies
RUN apt-get update && apt-get install -y \
    wget \
    python3 \
    python3-pip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
# Set the working directory inside the container
WORKDIR /app
# Install required Python packages
RUN pip install faster_whisper pysubs2
# Create directories to store the models
RUN mkdir -p /models/faster-whisper-medium
# Download the medium model using wget to the specified directory
RUN wget -O /models/faster-whisper-medium/config.json https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/config.json && \
    wget -O /models/faster-whisper-medium/model.bin https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/model.bin && \
    wget -O /models/faster-whisper-medium/tokenizer.json https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/tokenizer.json && \
    wget -O /models/faster-whisper-medium/vocabulary.txt https://huggingface.co/guillaumekln/faster-whisper-medium/resolve/main/vocabulary.txt
COPY app.py /app/
# Run script
CMD ["python3", "app.py"]
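A typical build-and-run sequence might look like the following sketch; the image tag `faster-whisper` is our choice, and `--gpus all` assumes the NVIDIA Container Toolkit is installed on the host. Mounting the current directory over /app assumes it contains both app.py and audio.mp3, so the generated output.srt lands back on the host:

```shell
# build the image (run from the directory containing the Dockerfile and app.py)
docker build -t faster-whisper .

# run with GPU access, sharing the current directory with the container
docker run --rm --gpus all -v "$(pwd)":/app faster-whisper
```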
Source Code: https://github.com/taka-wang/docker-whisper