PaddleOCR MCP Server

This project provides a lightweight Model Context Protocol (MCP) server designed to integrate the powerful capabilities of PaddleOCR into a compatible MCP Host.

Key Features

  • Currently Supported Pipelines
    • OCR: Performs text detection and recognition on images and PDF files.
    • PP-StructureV3: Recognizes and extracts text blocks, titles, paragraphs, images, tables, and other layout elements from an image or PDF file, converting the input into a Markdown document.
  • Supports the following working modes:
    • Local: Runs the PaddleOCR pipeline directly on your machine using the installed Python library.
    • AI Studio: Calls cloud services provided by the Paddle AI Studio community.
    • Self-hosted: Calls a PaddleOCR service that you deploy yourself (serving).

Table of Contents

1. Installation
2. Quick Start
3. Configuration
4. Parameter Reference
5. Configuration Examples

1. Installation

# Install the wheel
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/PaddleX3.0/mcp/paddleocr_mcp/releases/v0.1.0/paddleocr_mcp-0.1.0-py3-none-any.whl

# Or, install from source
# git clone https://github.com/PaddlePaddle/PaddleOCR.git
# cd PaddleOCR
# pip install -e mcp_server

Some working modes require additional dependencies; for example, the local mode depends on the paddleocr inference package (see 3.2 Working Modes Explained).
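
After installing, you can verify that the paddleocr_mcp console script is available. Assuming it supports the standard --help flag, this prints the options listed in 4. Parameter Reference:

# Confirm the executable is on your PATH and inspect its options
paddleocr_mcp --help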

2. Quick Start

This section guides you through a quick setup using Claude Desktop as the MCP Host and the AI Studio mode. This mode is recommended for new users as it does not require complex local dependencies. Please refer to 3. Configuration for other working modes and more configuration options.

  1. Prepare the AI Studio Service

    • Visit the Paddle AI Studio community and log in.
    • In the "PaddleX Pipeline" section under "More" on the left, navigate to [Create Pipeline] - [OCR] - [General OCR] - [Deploy Directly] - [Text Recognition Module, select PP-OCRv5_server_rec] - [Start Deployment].
    • Once deployed, obtain your Service Base URL (e.g., https://xxxxxx.aistudio-hub.baidu.com).
    • Get your Access Token from this page.
  2. Locate the MCP Configuration File. For details, refer to the Official MCP Documentation.

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
    • Linux: ~/.config/Claude/claude_desktop_config.json
  3. Add the MCP Server Configuration. Open the claude_desktop_config.json file and add the configuration by referring to 5.1 AI Studio Service Configuration.

    Note:
    • Do not leak your Access Token.
    • If paddleocr_mcp is not in your system's PATH, set command to the absolute path of the executable (you can find it with the shell lookup shown after these steps).

  4. Restart the MCP Host. Restart Claude Desktop; the new paddleocr-ocr tool should now be available in the application.
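
Regarding the note in step 3, a standard shell lookup finds the absolute path of the executable:

# macOS / Linux
which paddleocr_mcp

# Windows (Command Prompt)
where paddleocr_mcp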

3. Configuration

3.1. MCP Host Configuration

In the Host's configuration file (e.g., claude_desktop_config.json), define how to start the tool server. The key fields are:

• command: paddleocr_mcp (if the executable is in your PATH) or the absolute path to the executable.
• args: Optional command-line arguments, e.g., ["--verbose"]. See 4. Parameter Reference.
• env: Optional environment variables. See 4. Parameter Reference.
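
Putting these fields together, a minimal skeleton looks as follows; the env block here is a placeholder, and complete mode-specific versions are given in 5. Configuration Examples:

{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": ["--verbose"],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR"
      }
    }
  }
}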

3.2. Working Modes Explained

You can configure the MCP server to run in different modes based on your needs.

Mode 1: AI Studio Service (aistudio)

This mode calls services from the Paddle AI Studio community.

• Use Case: Ideal for quickly trying out features, validating solutions, and no-code development scenarios.
• Procedure: Please refer to 2. Quick Start.
• In addition to the platform's preset model solutions, you can also train and deploy custom models on the platform.

Mode 2: Local Python Library (local)

This mode runs the pipeline directly on your local machine using the installed paddleocr inference package, so it places certain demands on the local environment and on machine performance.

• Use Case: Suitable for offline usage and scenarios with strict data privacy requirements.
• Procedure:
  1. Refer to the PaddleOCR Installation Guide to install the PaddlePaddle framework and PaddleOCR (a typical sequence is sketched after this list). It is strongly recommended to install them in a separate virtual environment to avoid dependency conflicts.
  2. Refer to 5.2 Local Python Library Configuration to modify the claude_desktop_config.json file.
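
A typical sequence for step 1, assuming a CPU-only PaddlePaddle build (GPU builds and version requirements are covered in the installation guide):

# Create and activate an isolated virtual environment
python -m venv paddleocr-env
source paddleocr-env/bin/activate  # Windows: paddleocr-env\Scripts\activate

# Install the framework and the inference package
pip install paddlepaddle  # CPU build; see the installation guide for GPU variants
pip install paddleocr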

Mode 3: Self-hosted Service (self_hosted)

This mode calls a PaddleOCR inference service that you have deployed yourself, corresponding to the Serving solutions provided by PaddleX.

• Use Case: Offers the advantages of service-oriented deployment and high flexibility, making it well suited for production environments, especially scenarios that require custom service configurations.
• Procedure:
  1. Refer to the PaddleOCR Installation Guide to install the PaddlePaddle framework and PaddleOCR.
  2. Refer to the PaddleOCR Serving Deployment Guide to run the server (a sketch follows this list).
  3. Refer to 5.3 Self-hosted Service Configuration to modify the claude_desktop_config.json file.
  4. Set your service address in PADDLEOCR_MCP_SERVER_URL (e.g., "http://127.0.0.1:8080").
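
As an illustration of step 2, the serving guide's basic approach is to serve a pipeline over HTTP via the PaddleX CLI; treat the exact flags below as an assumption to verify against the guide:

# Assumed basic serving command; check the Serving Deployment Guide for exact flags
paddlex --serve --pipeline OCR --port 8080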

4. Parameter Reference

You can control the server's behavior via environment variables or command-line arguments.

| Environment Variable | Command-line Argument | Type | Description | Options | Default |
|---|---|---|---|---|---|
| PADDLEOCR_MCP_PIPELINE | --pipeline | str | The pipeline to run | "OCR", "PP-StructureV3" | "OCR" |
| PADDLEOCR_MCP_PPOCR_SOURCE | --ppocr_source | str | The source of PaddleOCR capabilities | "local", "aistudio", "self_hosted" | "local" |
| PADDLEOCR_MCP_SERVER_URL | --server_url | str | Base URL of the underlying service (required for aistudio or self_hosted mode) | - | None |
| PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN | --aistudio_access_token | str | AI Studio authentication token (required for aistudio mode) | - | None |
| PADDLEOCR_MCP_TIMEOUT | --timeout | int | Request timeout for the underlying service, in seconds | - | 30 |
| PADDLEOCR_MCP_DEVICE | --device | str | Device for inference (local mode only) | - | None |
| PADDLEOCR_MCP_PIPELINE_CONFIG | --pipeline_config | str | Path to the PaddleX pipeline configuration file (local mode only) | - | None |
| - | --http | bool | Use HTTP transport instead of stdio (for remote deployment and multiple clients) | - | False |
| - | --host | str | Host address for HTTP mode | - | "127.0.0.1" |
| - | --port | int | Port for HTTP mode | - | 8080 |
| - | --verbose | bool | Enable verbose logging for debugging | - | False |
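
The two mechanisms compose. For example, the following starts the server with HTTP transport and verbose logging, using environment variables for the pipeline settings and flags for the transport options:

PADDLEOCR_MCP_PIPELINE=PP-StructureV3 \
PADDLEOCR_MCP_PPOCR_SOURCE=local \
paddleocr_mcp --http --host 0.0.0.0 --port 8000 --verbose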

5. Configuration Examples

Below are complete configuration examples for different working modes. You can copy and modify them as needed.

5.1 AI Studio Service Configuration

{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "aistudio",
        "PADDLEOCR_MCP_SERVER_URL": "<your-server-url>", 
        "PADDLEOCR_MCP_AISTUDIO_ACCESS_TOKEN": "<your-access-token>"
      }
    }
  }
}

Note:
• Replace <your-server-url> with your AI Studio Service Base URL, e.g., https://xxxxx.aistudio-hub.baidu.com. Do not include endpoint paths (such as /ocr).
• Replace <your-access-token> with your Access Token.

5.2 Local Python Library Configuration

{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "local"
      }
    }
  }
}

Note: PADDLEOCR_MCP_PIPELINE_CONFIG is optional. If it is not set, the default pipeline configuration is used. To adjust settings, such as changing models, refer to the PaddleOCR and PaddleX documentation, export a pipeline configuration file, and set PADDLEOCR_MCP_PIPELINE_CONFIG to its absolute path.
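
For instance, a local configuration that pins an exported pipeline configuration file and an inference device might look like this; the file path is a placeholder, and the gpu:0 device string assumes a GPU-enabled PaddlePaddle build:

{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "local",
        "PADDLEOCR_MCP_PIPELINE_CONFIG": "/absolute/path/to/OCR.yaml",
        "PADDLEOCR_MCP_DEVICE": "gpu:0"
      }
    }
  }
}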

5.3 Self-hosted Service Configuration

{
  "mcpServers": {
    "paddleocr-ocr": {
      "command": "paddleocr_mcp",
      "args": [],
      "env": {
        "PADDLEOCR_MCP_PIPELINE": "OCR",
        "PADDLEOCR_MCP_PPOCR_SOURCE": "self_hosted",
        "PADDLEOCR_MCP_SERVER_URL": "<your-server-url>"
      }
    }
  }
}

Note: Replace <your-server-url> with the base URL of your underlying service (e.g., http://127.0.0.1:8080).
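
Before pointing the MCP Host at the service, you can check that it is reachable. This sketch assumes the default PaddleX serving REST API, in which the general OCR pipeline exposes an /ocr endpoint accepting a JSON body with a file URL or Base64-encoded file; verify the exact request schema against the Serving Deployment Guide:

# Assumed endpoint and payload shape; verify against the serving guide
curl -s http://127.0.0.1:8080/ocr \
  -H "Content-Type: application/json" \
  -d '{"file": "https://example.com/sample.jpg"}'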