Interface Change Documentation

1.1 Model Configuration Files

Storage Directory Change: module configuration files have moved from paddlex/configs to paddlex/configs/modules.
Module Name Changes (the paths of the related configuration files have changed accordingly):
anomaly_detection updated to image_anomaly_detection
face_recognition updated to face_feature
general_recognition updated to image_feature
multilabel_classification updated to image_multilabel_classification
pedestrian_attribute updated to pedestrian_attribute_recognition
structure_analysis updated to layout_detection
table_recognition updated to table_structure_recognition
text_detection_seal updated to seal_text_detection
vehicle_attribute updated to vehicle_attribute_recognition
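
For projects that reference module configuration files by path, the renames above can be applied mechanically. The following is a minimal sketch and not part of PaddleX: the helper name migrate_module_config_path and the file name example.yaml are hypothetical, and the code assumes that paths use forward slashes and start with paddlex/configs.

```python
from pathlib import PurePosixPath

# Old module name -> new module name, as listed above.
RENAMED_MODULES = {
    "anomaly_detection": "image_anomaly_detection",
    "face_recognition": "face_feature",
    "general_recognition": "image_feature",
    "multilabel_classification": "image_multilabel_classification",
    "pedestrian_attribute": "pedestrian_attribute_recognition",
    "structure_analysis": "layout_detection",
    "table_recognition": "table_structure_recognition",
    "text_detection_seal": "seal_text_detection",
    "vehicle_attribute": "vehicle_attribute_recognition",
}


def migrate_module_config_path(old_path: str) -> str:
    """Map an old module config path to the new layout (hypothetical helper)."""
    parts = PurePosixPath(old_path).parts
    if len(parts) < 3 or parts[:2] != ("paddlex", "configs"):
        raise ValueError(f"not a module config path: {old_path}")
    if parts[2] in ("modules", "pipelines"):
        return old_path  # already in the new layout
    # Modules that were not renamed keep their name and only gain "modules/".
    module = RENAMED_MODULES.get(parts[2], parts[2])
    return str(PurePosixPath("paddlex", "configs", "modules", module, *parts[3:]))


# "example.yaml" is a placeholder file name.
print(migrate_module_config_path("paddlex/configs/anomaly_detection/example.yaml"))
```

Modules whose names did not change are only moved under paddlex/configs/modules.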

1.2 Module Inference

1. `create_model()`

Parameter Change:
model_name: Now accepts only a model name.
New Parameters:
- model_dir: Specifies the local directory for inference model files, defaults to None, which means automatically downloading and using the official model.
- batch_size: Specifies the batch size during inference, defaults to 1.
- Inference hyperparameters: common model inference hyperparameters can now be specified. The available parameters depend on the module and are detailed in the module tutorial documentation. For example, the image classification module supports topk.
- use_hpip and hpi_params: Enable and configure high-performance inference, which is disabled by default.
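
The parameters above might be combined as follows. This is a sketch under stated assumptions: the model name PP-LCNet_x1_0 is only an illustration, both helper functions are hypothetical, and the path check is one possible reading of "accepts only a model name". The call to create_model() is deferred into a function so that the argument handling can be tried without PaddleX installed.

```python
from typing import Any, Dict, Optional


def build_create_model_kwargs(
    model_name: str,
    model_dir: Optional[str] = None,
    batch_size: int = 1,
    **hyperparams: Any,
) -> Dict[str, Any]:
    """Collect keyword arguments for create_model() (hypothetical helper)."""
    # Assumption: a value containing a path separator is a path, not a model name.
    if "/" in model_name or "\\" in model_name:
        raise ValueError("model_name must be a model name; pass a local path via model_dir")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    kwargs: Dict[str, Any] = {
        "model_name": model_name,
        "model_dir": model_dir,  # None means: download and use the official model
        "batch_size": batch_size,
    }
    kwargs.update(hyperparams)  # module-specific hyperparameters, e.g. topk
    return kwargs


def load_model(**kwargs: Any):
    """Create the model; this step requires PaddleX to be installed."""
    from paddlex import create_model  # imported lazily on purpose

    return create_model(**kwargs)


kwargs = build_create_model_kwargs("PP-LCNet_x1_0", batch_size=2, topk=5)
print(kwargs)
```

With PaddleX installed, load_model(**kwargs) would pass these arguments on to create_model().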
Function Updates:
PDF files are now supported as input samples for CV modules.
Prediction results are still of dict type, but the format has changed from {'key': val} to {"res": {'key': val}}: the original result data is now the value stored under the key "res".
When the save_to_xxx() method is used to save prediction results and save_path is a directory, the names of the stored files have changed. For example, results saved in JSON format are named {input_file_prefix}_res.json, and results saved in image format are named {input_file_prefix}_res_img.{input_file_extension}.
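
Code that consumes prediction results or looks for saved files may need to adapt to the two changes above. The sketch below is illustrative only: both helpers are hypothetical, it assumes that a dict stored under "res" marks the new format, and it assumes that {input_file_prefix} is the input file name without its extension.

```python
from pathlib import Path
from typing import Any, Dict, Tuple


def unwrap_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return the result payload for either the old or the new format."""
    inner = result.get("res")
    # Assumption: a dict stored under "res" marks the new format.
    return inner if isinstance(inner, dict) else result


def expected_saved_names(input_path: str) -> Tuple[str, str]:
    """Return the (JSON, image) file names expected in a save_path directory."""
    path = Path(input_path)
    prefix = path.stem  # assumption: prefix is the file name without extension
    extension = path.suffix.lstrip(".")
    return f"{prefix}_res.json", f"{prefix}_res_img.{extension}"


print(unwrap_result({"res": {"key": 1}}))
print(expected_saved_names("imgs/demo.jpg"))
```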

2.1 Pipeline Configuration Files

Configuration File Storage Directory Change: pipeline configuration files have moved from paddlex/pipelines to paddlex/configs/pipelines.
Pipeline Name Changes:
ts_fc updated to ts_forecast
ts_ad updated to ts_anomaly_detection
ts_cls updated to ts_classification
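
Scripts that refer to the time series pipelines by their old names can translate them with a small lookup. This is a sketch with hypothetical helper names; it also assumes that each pipeline configuration file is named after its pipeline with a .yaml extension, which the text above does not state.

```python
# Old pipeline name -> new pipeline name, as listed above.
RENAMED_PIPELINES = {
    "ts_fc": "ts_forecast",
    "ts_ad": "ts_anomaly_detection",
    "ts_cls": "ts_classification",
}


def migrate_pipeline_name(name: str) -> str:
    """Return the new name for a renamed pipeline, or the name unchanged."""
    return RENAMED_PIPELINES.get(name, name)


def new_pipeline_config_path(name: str) -> str:
    """Build the config path in the new directory (file naming is an assumption)."""
    return f"paddlex/configs/pipelines/{migrate_pipeline_name(name)}.yaml"


print(new_pipeline_config_path("ts_fc"))
```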

2.2 Pipeline Inference

1. CLI Inference for Pipelines

New Support:
Inference hyperparameters can be passed on the command line. The available parameters depend on the pipeline and are detailed in the pipeline tutorial documentation. For example, the image classification pipeline supports the --topk parameter, which specifies how many top results to return.
Removed:
--serial_number: High-performance inference no longer requires a serial number.
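
Existing launch scripts that still pass --serial_number can be cleaned up before the command is run. The helper below is a hypothetical sketch, not a PaddleX utility. It assumes that --serial_number always takes a value, given either as the next argument or in --serial_number=VALUE form; demo.jpg and XXXX are placeholders.

```python
from typing import List


def drop_serial_number(argv: List[str]) -> List[str]:
    """Remove --serial_number and its value from a command line."""
    cleaned: List[str] = []
    skip_next = False
    for arg in argv:
        if skip_next:
            skip_next = False  # this argument is the serial number value
            continue
        if arg == "--serial_number":
            skip_next = True
            continue
        if arg.startswith("--serial_number="):
            continue
        cleaned.append(arg)
    return cleaned


old = ["paddlex", "--pipeline", "image_classification",
       "--input", "demo.jpg", "--serial_number", "XXXX"]
# Drop the removed flag and add the newly supported --topk hyperparameter.
new = drop_serial_number(old) + ["--topk", "5"]
print(" ".join(new))
```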

2. `create_pipeline()`

Removed:
The serial_number parameter of the high-performance inference settings hpi_params: high-performance inference no longer requires a serial number.
No Longer Supported:
Setting pipeline inference hyperparameters through create_pipeline(): all related parameters, such as batch_size and thresholds, must now be set in the pipeline configuration file.
Function Updates:
When the save_to_xxx() method is used to save prediction results and save_path is a directory, the names of the stored files have changed.
CV model prediction results have a new page_index field. It gives the page number of the current prediction result and applies only when the input sample is a PDF file.
Pipeline prediction results have new fields for the pipeline inference parameters. For example, the text_det_params field in the OCR pipeline holds the post-processing settings of the text detection model.
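
When a PDF is used as input, the page_index field can be used to collect results per page. The sketch below works on plain dicts and makes several assumptions that the text above does not confirm: that page_index sits inside the "res" payload, that it is missing or None for non-PDF inputs, and that results carry an input_path field (used here only as sample data).

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional


def group_by_page(
    results: Iterable[Dict[str, Any]],
) -> Dict[Optional[int], List[Dict[str, Any]]]:
    """Group result payloads by page_index; non-PDF results land under None."""
    pages: Dict[Optional[int], List[Dict[str, Any]]] = defaultdict(list)
    for result in results:
        payload = result.get("res", result)  # accept wrapped or bare payloads
        pages[payload.get("page_index")].append(payload)
    return dict(pages)


# Hypothetical sample data for illustration.
sample = [
    {"res": {"input_path": "report.pdf", "page_index": 0}},
    {"res": {"input_path": "report.pdf", "page_index": 1}},
    {"res": {"input_path": "photo.jpg", "page_index": None}},
]
grouped = group_by_page(sample)
print(sorted(k for k in grouped if k is not None))
```
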
Configuration File Format Update:

The updated pipeline configuration file is divided into three parts: the pipeline name, the pipeline-related parameter settings, and the composition of sub-pipelines and sub-modules. For example:

pipeline_name: pipeline # Pipeline Name
threshold: 0.5 # Pipeline Inference Related Parameters
SubPipelines: # Sub-pipelines
  DocPreprocessor:
    pipeline_name: doc_preprocessor
    use_doc_unwarping: True # Settings related to the sub-pipeline DocPreprocessor
SubModules: # Sub-modules
  TextDetection:
    module_name: text_detection
    model_name: PP-OCRv4_mobile_det
    model_dir: null
    limit_side_len: 960 # Settings related to the sub-module TextDetection
    limit_type: max
    thresh: 0.3
    box_thresh: 0.6
    unclip_ratio: 2.0
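
The three-part structure can be made explicit in code. In the sketch below, CONFIG mirrors the example above as the Python mapping a YAML parser would typically produce; split_pipeline_config is a hypothetical helper, and treating every other top-level key as a pipeline parameter is an assumption.

```python
from typing import Any, Dict, Tuple

# The example configuration above, as a parsed mapping.
CONFIG: Dict[str, Any] = {
    "pipeline_name": "pipeline",
    "threshold": 0.5,
    "SubPipelines": {
        "DocPreprocessor": {
            "pipeline_name": "doc_preprocessor",
            "use_doc_unwarping": True,
        },
    },
    "SubModules": {
        "TextDetection": {
            "module_name": "text_detection",
            "model_name": "PP-OCRv4_mobile_det",
            "model_dir": None,
            "limit_side_len": 960,
            "limit_type": "max",
            "thresh": 0.3,
            "box_thresh": 0.6,
            "unclip_ratio": 2.0,
        },
    },
}

STRUCTURAL_KEYS = ("pipeline_name", "SubPipelines", "SubModules")


def split_pipeline_config(
    config: Dict[str, Any],
) -> Tuple[str, Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    """Split a config into name, pipeline parameters, sub-pipelines, sub-modules."""
    name = config["pipeline_name"]
    # Assumption: every remaining top-level key is a pipeline-level parameter.
    params = {k: v for k, v in config.items() if k not in STRUCTURAL_KEYS}
    sub_pipelines = config.get("SubPipelines", {})
    sub_modules = config.get("SubModules", {})
    return name, params, sub_pipelines, sub_modules


name, params, sub_pipelines, sub_modules = split_pipeline_config(CONFIG)
print(name, params, list(sub_pipelines), list(sub_modules))
```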

3. Pipeline Feature Changes

3.1 OCR Pipeline

New Features:
Document Preprocessing: Supports whole-image orientation classification and correction, controlled by the relevant parameters in the OCR.yaml configuration file.
Text Line Orientation Classification: Controlled by the relevant parameters in the configuration file.
Model Inference Hyperparameters: Supports modifying model inference hyperparameters, such as the post-processing parameters of the text detection model, through the relevant parameters in the configuration file.
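
Because hyperparameters are now set in the configuration file, adjusting them amounts to editing a mapping and writing it back. The sketch below reuses the SubModules and TextDetection keys from the example configuration in section 2.2; the actual keys in OCR.yaml may differ, and override_submodule_params is a hypothetical helper.

```python
import copy
from typing import Any, Dict


def override_submodule_params(
    config: Dict[str, Any], submodule: str, **overrides: Any
) -> Dict[str, Any]:
    """Return a copy of config with updated settings for one sub-module."""
    updated = copy.deepcopy(config)  # leave the caller's mapping untouched
    sub_modules = updated.get("SubModules", {})
    if submodule not in sub_modules:
        raise KeyError(f"unknown sub-module: {submodule}")
    sub_modules[submodule].update(overrides)
    return updated


# Keys taken from the example configuration in section 2.2.
base = {
    "pipeline_name": "pipeline",
    "SubModules": {
        "TextDetection": {"thresh": 0.3, "box_thresh": 0.6, "unclip_ratio": 2.0},
    },
}
tuned = override_submodule_params(base, "TextDetection", thresh=0.4, box_thresh=0.7)
print(tuned["SubModules"]["TextDetection"])
```

The returned mapping would then be written back to the pipeline configuration file with a YAML library.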

3.2 Seal Recognition and Formula Recognition Pipelines

New Features:
Document Preprocessing: Supports whole-image orientation classification and correction, controlled by the relevant parameters in the configuration file.
Option to use the layout detection model: Controlled by relevant parameters in the configuration file.

3.3 Table Recognition Pipeline

New Features:
Document Preprocessing: Supports whole-image orientation classification and correction, controlled by the relevant parameters in the configuration file.
Option to use the OCR pipeline for text detection and recognition: Controlled by relevant parameters in the configuration file.

3.4 Layout Analysis Pipeline

Updated Features:
Supports more inference hyperparameter settings, such as document preprocessing, text recognition, and model post-processing parameter settings, all of which can be configured in the pipeline configuration file.

3.5 PP-ChatOCRv3-doc Pipeline

New Features:
Supports standard OpenAI API calls, which can be controlled through relevant parameters in the configuration file.
Allows switching large language models during Chat API calls by passing the relevant configuration through the API call parameters.
Updated Features:
Inference Module Initialization: An inference module can now be initialized on its first invocation, so full initialization at pipeline startup is no longer required.
Vector Library: Supports setting the block size for long text, and removes the setting that controlled the interval between vector library calls.
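
Initialization on first invocation is a general lazy-initialization pattern. The sketch below illustrates the idea only and is not PaddleX's implementation; the class LazyModule and the stand-in factory are hypothetical.

```python
from typing import Any, Callable, List, Optional


class LazyModule:
    """Wrap a factory so the module is built on its first call, not at startup."""

    def __init__(self, factory: Callable[[], Callable[[Any], Any]]) -> None:
        self._factory = factory
        self._module: Optional[Callable[[Any], Any]] = None

    def __call__(self, sample: Any) -> Any:
        if self._module is None:
            self._module = self._factory()  # the costly step runs only once
        return self._module(sample)


init_log: List[str] = []


def make_module() -> Callable[[str], str]:
    """Stand-in for an expensive model load."""
    init_log.append("initialized")
    return lambda text: text.upper()


module = LazyModule(make_module)
startup_log = list(init_log)         # nothing has been initialized yet
first = module("first call")         # triggers initialization
second = module("second call")       # reuses the initialized module
print(startup_log, init_log, first, second)
```

With this pattern, modules that a given request never uses are never loaded, which shortens pipeline startup.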