Supported Models

FastDeploy currently supports the following models, which can be downloaded automatically during deployment. Set the model parameter to a model name from the table below to download the model weights automatically (all downloads are resumable). Three download sources are supported: AIStudio, ModelScope, and HuggingFace.

When using automatic download, the default download source is AIStudio. To change it, set the FD_MODEL_SOURCE environment variable to "AISTUDIO", "MODELSCOPE", or "HUGGINGFACE". The default download path is ~/ (i.e., the user's home directory). To change it, set the FD_MODEL_CACHE environment variable, e.g.:

export FD_MODEL_SOURCE=AISTUDIO # "AISTUDIO", "MODELSCOPE" or "HUGGINGFACE"
export FD_MODEL_CACHE=/ssd1/download_models
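
As a minimal sketch of how the download source setting could be checked before a deployment is started, the helper below falls back to the documented default when FD_MODEL_SOURCE is unset and rejects any value other than the three listed above. The function name `fd_check_model_source` is hypothetical and not part of FastDeploy, and the sketch assumes the value must match the uppercase spelling exactly.

```shell
# Hypothetical helper (not part of FastDeploy): resolve and validate the
# download source from the FD_MODEL_SOURCE environment variable.
fd_check_model_source() {
    # Use the documented default, AISTUDIO, when the variable is unset or empty.
    local source_name="${FD_MODEL_SOURCE:-AISTUDIO}"
    case "$source_name" in
        AISTUDIO|MODELSCOPE|HUGGINGFACE)
            # Supported value: print it so the caller can capture it.
            printf '%s\n' "$source_name"
            ;;
        *)
            # Assumption: any other spelling is treated as an error.
            printf 'Unsupported FD_MODEL_SOURCE: %s\n' "$source_name" >&2
            return 1
            ;;
    esac
}

# Export the resolved value only if it passed validation.
if resolved_source="$(fd_check_model_source)"; then
    export FD_MODEL_SOURCE="$resolved_source"
fi
```

Running the check before launch surfaces a misspelled source name early, instead of during the model download.
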

| Model Name | Context Length | Quantization | Minimum Deployment Resources | Notes |
| --- | --- | --- | --- | --- |
| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT4 | 4*80G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT8 | 8*80G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-300B-A47B-Paddle | 32K/128K | WINT4 | 4*64G GPU VRAM/600G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-300B-A47B-Paddle | 32K/128K | WINT8 | 8*64G GPU VRAM/600G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-300B-A47B-2Bits-Paddle | 32K/128K | WINT2 | 1*141G GPU VRAM/600G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-300B-A47B-W4A8C8-TP4-Paddle | 32K/128K | W4A8C8 | 4*64G GPU VRAM/160G RAM | Fixed 4-GPU setup, Chunked Prefill recommended |
| baidu/ERNIE-4.5-300B-A47B-FP8-Paddle | 32K/128K | FP8 | 8*64G GPU VRAM/600G RAM | Chunked Prefill recommended, only supports PD Disaggregated Deployment with EP parallelism |
| baidu/ERNIE-4.5-300B-A47B-Base-Paddle | 32K/128K | WINT4 | 4*64G GPU VRAM/600G RAM | Chunked Prefill recommended |
| baidu/ERNIE-4.5-300B-A47B-Base-Paddle | 32K/128K | WINT8 | 8*64G GPU VRAM/600G RAM | Chunked Prefill recommended |
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | 32K | WINT4 | 1*24G GPU VRAM/128G RAM | Chunked Prefill required |
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | 128K | WINT4 | 1*48G GPU VRAM/128G RAM | Chunked Prefill required |
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | 32K/128K | WINT8 | 1*48G GPU VRAM/128G RAM | Chunked Prefill required |
| baidu/ERNIE-4.5-21B-A3B-Paddle | 32K/128K | WINT4 | 1*24G GPU VRAM/128G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-21B-A3B-Paddle | 32K/128K | WINT8 | 1*48G GPU VRAM/128G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-21B-A3B-Base-Paddle | 32K/128K | WINT4 | 1*24G GPU VRAM/128G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-21B-A3B-Base-Paddle | 32K/128K | WINT8 | 1*48G GPU VRAM/128G RAM | Chunked Prefill required for 128K |
| baidu/ERNIE-4.5-0.3B-Paddle | 32K/128K | BF16 | 1*6G/12G GPU VRAM/2G RAM | |
| baidu/ERNIE-4.5-0.3B-Base-Paddle | 32K/128K | BF16 | 1*6G/12G GPU VRAM/2G RAM | |
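
To illustrate how a row of the table might map onto a launch command, the sketch below only prints a command line and does not run anything. The helper `fd_print_launch_cmd` is hypothetical. The module path `fastdeploy.entrypoints.openai.api_server` and the flags `--model`, `--quantization`, and `--tensor-parallel-size` are assumptions about the FastDeploy command line that this page does not confirm, so check them against the deployment documentation before use.

```shell
# Hypothetical dry-run helper: prints a launch command for one table row.
# Assumption: the module path and flag names follow the FastDeploy CLI;
# this page does not confirm them.
fd_print_launch_cmd() {
    # $1: model name from the table
    # $2: quantization, lowercase (e.g. wint4)
    # $3: number of GPUs from the Minimum Deployment Resources column
    local model="$1" quant="$2" gpus="$3"
    printf 'python -m fastdeploy.entrypoints.openai.api_server --model %s --quantization %s --tensor-parallel-size %s\n' \
        "$model" "$quant" "$gpus"
}

# Example row: ERNIE-4.5-300B-A47B with WINT4 on 4 GPUs.
fd_print_launch_cmd baidu/ERNIE-4.5-300B-A47B-Paddle wint4 4
```

Printing the command instead of executing it keeps the example runnable on a machine without FastDeploy or GPUs, and makes it easy to review the arguments first.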

Support for more models is in progress. You can request support for a new model via GitHub Issues.