Interface Change Documentation¶
1. Model and Module Related¶
1.1 Model Configuration Files¶
- Storage Directory Change: 
paddlex/configshas been updated topaddlex/configs/modules. - Module Name Changes, and related configuration file paths have also been updated:
 anomaly_detectionupdated toimage_anomaly_detectionface_recognitionupdated toface_featuregeneral_recognitionupdated toimage_featuremultilabel_classificationupdated toimage_multilabel_classificationpedestrian_attributeupdated topedestrian_attribute_recognitionstructure_analysisupdated tolayout_detectiontable_recognitionupdated totable_structure_recognitiontext_detection_sealupdated toseal_text_detectionvehicle_attributeupdated tovehicle_attribute_recognition
1.2 Module Inference¶
1. create_model()¶
- Parameter Change:
 model_name: Only accepts model name.- 
New Parameters:
model_dir: Specifies the local directory for inference model files, defaults toNone, which means automatically downloading and using the official model.batch_size: Specifies the batch size during inference, defaults to1.- Supports specifying common model inference hyperparameters, with specific parameters related to the module, as detailed in the module tutorial documentation. For example, image classification module support 
topk. use_hpipandhpi_params: For supporting high-performance inference, not enabled by default.
 - 
Function Updates:
 - Supports using PDF files as input samples for CV modules.
 - Prediction results remain of 
dicttype, but the format has changed: from{'key1': val}to{"res": {'key': val}}, using"res"as the key with the original result data as the value. - When using the 
save_to_xxx()method to save prediction results, ifsave_pathis a directory, the name for stored files has changed. For example, saving in JSON format is{input_file_prefix}_res.json; saving in image format is{input_file_prefix}_res_img.{input_file_extension}. 
2. Pipeline Related¶
2.1 Pipeline Configuration Files¶
- Configuration File Storage Directory Change: 
paddlex/pipelinesupdated topaddlex/configs/pipelines. - Pipeline Name Changes:
 ts_fcupdated tots_forecastts_adupdated tots_anomaly_detectionts_clsupdated tots_classification
2.2 Pipeline Inference¶
1. CLI Inference for Pipelines¶
- New Support:
 - Inference hyperparameters, specific parameters related to the pipeline, detailed in the pipeline tutorial documentation. For example, image classification pipeline supports the 
--topkparameter to specify thetopkresults to return. - Removed:
 --serial_number, high-performance inference no longer requires the serial number.
2. create_pipeline()¶
- Removed:
 - The 
serial_numberparameter in high-performance inferencehpi_params, high-performance inference no longer requires the serial number. - No Longer Supported:
 - Setting pipeline inference hyperparameters, all related parameters must be set through the pipeline configuration file, such as 
batch_size, thresholds, etc. - Function Updates:
 - When using the 
save_to_xxx()method to save prediction results, ifsave_pathis a directory, the name for stored files has updated. - CV model prediction results have a new 
page_indexfield, which indicates the page number of the current prediction result only when the input sample is a PDF file. - Model pipeline prediction results have new pipeline inference parameter fields, such as the 
text_det_paramsfield in the OCR pipeline, with values for the post-processing settings of the text detection model. - Configuration File Format Update:
 - 
After updating the content of the pipeline configuration file, it is divided into three parts: pipeline name, pipeline-related parameter settings, and sub-pipelines and sub-modules composition. For example:
pipeline_name: pipeline # Pipeline Name threshold: 0.5 # Pipeline Inference Related Parameters SubPipelines: # Sub-pipelines DocPreprocessor: pipeline_name: doc_preprocessor use_doc_unwarping: True # Settings related to the sub-pipeline DocPreprocessor SubModules: # Sub-modules TextDetection: module_name: text_detection model_name: PP-OCRv4_mobile_det model_dir: null limit_side_len: 960 # Settings related to the sub-module TextDetection limit_type: max max_side_limit: 4000 thresh: 0.3 box_thresh: 0.6 unclip_ratio: 1.5 
3. Pipeline Features Changes¶
3.1 OCR Pipeline¶
- New Features:
 - Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the 
OCR.yamlconfiguration file. - Text Line Direction Classification: Controlled by relevant parameters in the configuration file.
 - Support for modifying model inference hyperparameters, such as post-processing parameters of the text detection model, controlled by relevant parameters in the configuration file.
 
3.2 Seal Recognition and Formula Recognition Pipeline¶
- New Features:
 - Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the configuration file.
 - Option to use the layout detection model: Controlled by relevant parameters in the configuration file.
 
3.3 Table Recognition Pipeline¶
- New Features:
 - Document Preprocessing: Supports whole image direction classification and correction, controlled by relevant parameters in the configuration file.
 - Option to use the OCR pipeline for text detection and recognition: Controlled by relevant parameters in the configuration file.
 
3.4 Layout Analysis Pipeline¶
- Updated Features:
 - Supports more inference hyperparameter settings, such as document preprocessing, text recognition, and model post-processing parameter settings, all of which can be configured in the pipeline configuration file.
 
3.5 PP-ChatOCRv3-doc Pipeline¶
- New Features:
 - Supports standard OpenAI API calls, which can be controlled through relevant parameters in the configuration file.
 - 
Allows switching large language models during Chat API calls by passing the relevant configuration through the API call parameters.
 - 
Updated Features:
 - Inference Module Initialization: Supports initialization of the inference module upon its first invocation, eliminating the need for full initialization at pipeline startup.
 - Vector Library: Enables setting block size for long text and removes the control of interval duration between vector library calls.