FastDeploy CLI User Guide
Introduction
FastDeploy CLI is a command-line tool provided by the FastDeploy inference framework, designed for running, deploying, and testing AI model inference tasks. It allows developers to quickly perform model loading, API calls, service deployment, performance benchmarking, and environment information collection directly from the command line.
With FastDeploy CLI, you can:
- đ Run and validate model inference: Generate chat responses or text completions directly in the command line (
chat,complete). - đ§Š Deploy models as services: Start an OpenAI-compatible API service with a single command (
serve). - đ Perform performance and evaluation tests: Conduct latency, throughput, and task benchmarks (
bench). - âī¸ Collect environment information: Output system, framework, GPU, and FastDeploy version information (
collect-env). - đ Run batch inference tasks: Supports batch input/output from files or URLs (
run-batch). - đĄ Manage model tokenizers: Encode/decode text and tokens, or export vocabulary (
tokenizer).
View Help Information
fastdeploy --help
Available Commands
fastdeploy {chat, complete, serve, bench, collect-env, run-batch, tokenizer}
| Command Name | Description | Detailed Documentation |
|---|---|---|
chat |
Run interactive chat generation tasks in the command line to verify chat model inference results | View chat command details |
complete |
Perform text completion tasks and test various language model outputs | View complete command details |
serve |
Launch a local inference service compatible with the OpenAI API protocol | View serve command details |
bench |
Evaluate model performance (latency, throughput) and accuracy | View bench command details |
collect-env |
Collect and print system, GPU, dependency, and FastDeploy environment information | View collect-env command details |
run-batch |
Run batch inference tasks with file or URL input/output | View run-batch command details |
tokenizer |
Encode/decode text and tokens, and export vocabulary | View tokenizer command details |