ShareGPT4Video/captioner at master · ShareGPT4Omni/ShareGPT4Video

Batch Inference with ShareGPT4Video

We support two sampling strategies for video inference: (1) Slide Caption, which samples frames at a fixed interval, and (2) Fast Caption, which samples a fixed number of frames. Slide Caption captions the sampled frames iteratively, conditioning each step on the previously generated caption, and then summarizes the results. This strategy produces detailed captions but runs slowly. Fast Caption concatenates 16 sampled frames into one image and captions it in a single pass. It runs much faster, but the caption quality may be lower than with Slide Caption.
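The difference between the two strategies can be sketched in a few lines. This is an illustration only, not the code in fast_captioner_lmdeploy.py or slide_captioner_lmdeploy.py: the function names, the 2-second interval in the test, the midpoint-of-segment sampling, and the 4-column grid are all assumptions, and `describe_fn` / `summarize_fn` are hypothetical stand-ins for the model calls.

```python
def fixed_interval_indices(total_frames, fps, interval_sec):
    """Slide-style sampling: one frame index every `interval_sec` seconds."""
    step = max(1, round(fps * interval_sec))
    return list(range(0, total_frames, step))


def fixed_count_indices(total_frames, num_frames=16):
    """Fast-style sampling: `num_frames` indices spread evenly over the video."""
    if total_frames <= 0:
        return []
    if total_frames <= num_frames:
        return list(range(total_frames))
    # Assumption: take the midpoint of each of `num_frames` equal segments.
    seg = total_frames / num_frames
    return [int(seg * i + seg / 2) for i in range(num_frames)]


def grid_positions(num_frames=16, cols=4):
    """(row, col) cell of each frame when tiling frames into one image.

    Assumption: a row-major grid with `cols` columns (4x4 for 16 frames).
    """
    return [divmod(i, cols) for i in range(num_frames)]


def slide_caption(frames, describe_fn, summarize_fn):
    """Iterative captioning: each step sees the previous caption, then summarize.

    `describe_fn(frame, previous_caption)` and `summarize_fn(captions)` are
    hypothetical callables wrapping the captioning model.
    """
    captions = []
    previous = None
    for frame in frames:
        previous = describe_fn(frame, previous)
        captions.append(previous)
    return summarize_fn(captions)
```

The slide loop makes one model call per sampled frame plus a summary call, which is why it is slower than the single call over one tiled image.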

Usage

We use lmdeploy to speed up the inference.

pip install lmdeploy

List your video paths in videos_to_describe.json.
Run the inference code.

# fast caption (fixed sampling frames)
python fast_captioner_lmdeploy.py --batch-size 2
# slide caption (fixed sampling interval)
python slide_captioner_lmdeploy.py --batch-size 2
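For a large collection, the video list can be generated rather than typed by hand. The sketch below assumes that videos_to_describe.json is a flat JSON list of path strings; that schema is not stated here, so check the videos_to_describe.json shipped in this directory before relying on it. The helper names and the extension set are hypothetical.

```python
import json
from pathlib import Path

# Assumption: which file extensions count as videos.
VIDEO_EXTS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}


def collect_videos(root):
    """Recursively collect video file paths under `root`, sorted for stable output."""
    root = Path(root)
    return sorted(
        str(p) for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_EXTS
    )


def write_video_list(paths, out_file="videos_to_describe.json"):
    """Write the paths as a JSON list (assumed schema, see the note above)."""
    with open(out_file, "w", encoding="utf-8") as f:
        json.dump(list(paths), f, indent=2, ensure_ascii=False)
```

For example, `write_video_list(collect_videos("my_videos"))` would overwrite videos_to_describe.json in the current directory, where `my_videos` is a placeholder directory name.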
