Text Image Unwarping Module Development Tutorial¶
I. Overview¶
The primary purpose of Text Image Unwarping is to perform geometric transformations on images in order to correct issues such as document distortion, tilt, perspective deformation, etc., enabling more accurate recognition by subsequent text recognition modules.
II. Supported Model List¶
Model Name | Model Download Link | MS-SSIM (%) | Model Size (M) | Description
---|---|---|---|---
UVDoc | Inference Model/Trained Model | 54.40 | 30.3 | High-precision text image unwarping model
Test Environment Description:
- Performance Test Environment
  - Test Dataset: DocUNet benchmark dataset.
  - Hardware Configuration:
    - GPU: NVIDIA Tesla T4
    - CPU: Intel Xeon Gold 6271C @ 2.60GHz
    - Other Environments: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
- Inference Mode Description
Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
---|---|---|---|
Normal Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
III. Quick Integration¶
❗ Before quick integration, please install the PaddleX wheel package. For detailed instructions, refer to the PaddleX Local Installation Guide
With just a few lines of code, you can run inference for the Text Image Unwarping module and easily switch between the models it provides. You can also integrate the module's model inference into your own project.
Before running the following code, please download the demo image to your local machine.
```python
from paddlex import create_model

model = create_model(model_name="UVDoc")
output = model.predict("doc_test.jpg", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")
```
After running, the result obtained is:
The meanings of the running result parameters are as follows:

- `input_path`: Indicates the path of the input image to be corrected.
- `doctr_img`: Indicates the corrected image result. Since there is too much data to print directly, `...` is used here as a placeholder. The prediction result can be saved as an image through `res.save_to_img()` and as a JSON file through `res.save_to_json()`.
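To see why the corrected-image array is elided when printed, consider a plain NumPy sketch (no PaddleX required; the array below is zero-filled stand-in data, not a real model output): the text representation of a large pixel array is automatically truncated with `...`, which is why saving via `res.save_to_img()` is the practical way to inspect the result.

```python
import numpy as np

# Stand-in for a corrected-image array: H x W x 3 uint8 pixels.
# (Zero-filled data; a real doctr_img would come from the model output.)
fake_img = np.zeros((1024, 768, 3), dtype=np.uint8)

# NumPy summarizes large arrays when converting them to text,
# inserting "..." in place of the omitted values.
text = np.array2string(fake_img)
print("..." in text)  # True: the printed representation is elided
```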
The visualization image is as follows:
Relevant methods, parameters, and explanations are as follows:
`create_model` instantiates an image correction model (here using `UVDoc` as an example). The specific explanation is as follows:
Parameter | Parameter Description | Parameter Type | Options | Default Value
---|---|---|---|---
`model_name` | Name of the model | `str` | All model names supported by PaddleX | None
`model_dir` | Path to store the model | `str` | None | None
- The `model_name` must be specified. After specifying `model_name`, the default model parameters built into PaddleX will be used. If `model_dir` is specified, the user-defined model will be used.
- The `predict()` method of the image correction model is called for inference prediction. The parameters of the `predict()` method are `input` and `batch_size`, with specific explanations as follows:
Parameter | Parameter Description | Parameter Type | Options | Default Value
---|---|---|---|---
`input` | Data to be predicted, supporting multiple input types | `Python Var`/`str`/`dict`/`list` | | None
`batch_size` | Batch size | `int` | Any integer | 1
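The `batch_size` parameter controls how many inputs are grouped into a single forward pass. PaddleX performs this grouping internally; the pure-Python sketch below is only an illustration of the semantics, using hypothetical file names:

```python
def batched(inputs, batch_size):
    """Yield successive batches of at most batch_size items.

    Illustrative only: PaddleX's predict() groups a list of
    inputs into batches like this internally before inference.
    """
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size]

# Four hypothetical image paths grouped into batches of 2:
paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
print(list(batched(paths, 2)))  # [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg']]
```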
- The prediction results are processed, with each sample's prediction result being of type `dict`, and supporting operations such as printing, saving as an image, and saving as a `json` file:
Method | Method Description | Parameter | Parameter Type | Parameter Description | Default Value
---|---|---|---|---|---
`print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content using `JSON` indentation | `True`
 | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. This is only effective when `format_json` is `True` | 4
 | | `ensure_ascii` | `bool` | Control whether non-`ASCII` characters are escaped to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. This is only effective when `format_json` is `True` | `False`
`save_to_json()` | Save the result as a JSON file | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will match the input file name | None
 | | `indent` | `int` | Specify the indentation level to beautify the output `JSON` data, making it more readable. This is only effective when `format_json` is `True` | 4
 | | `ensure_ascii` | `bool` | Control whether non-`ASCII` characters are escaped to `Unicode`. When set to `True`, all non-`ASCII` characters will be escaped; `False` retains the original characters. This is only effective when `format_json` is `True` | `False`
`save_to_img()` | Save the result as an image file | `save_path` | `str` | The file path for saving. When it is a directory, the saved file name will match the input file name | None
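The `indent` and `ensure_ascii` parameters mirror the behavior of Python's standard `json` module. The standalone sketch below (using `json.dumps` on made-up sample data, not an actual prediction result) shows what each setting does:

```python
import json

# Sample data standing in for a prediction result dict.
sample = {"input_path": "doc_test.jpg", "label": "文档"}

# indent=4 pretty-prints with 4-space indentation,
# analogous to format_json=True with indent=4.
pretty = json.dumps(sample, indent=4, ensure_ascii=False)
print(pretty)

# ensure_ascii=True escapes non-ASCII characters to \uXXXX sequences;
# ensure_ascii=False keeps the original characters.
escaped = json.dumps(sample, ensure_ascii=True)
print("文档" in escaped)     # False: characters were escaped
print("\\u6587" in escaped)  # True
```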
- Additionally, the visualized image with results and the prediction result can also be obtained through the following attributes:
Attribute | Attribute Description
---|---
`json` | Get the prediction result in `json` format
`img` | Get the visualized image in `dict` format
For more information on using PaddleX's single-model inference API, refer to the PaddleX Single Model Python Script Usage Instructions.
IV. Custom Development¶
The current module does not yet support fine-tuning and only supports inference integration. Support for fine-tuning this module is planned for the future.
You can also use the PaddleX high-performance inference plugin to optimize the inference process of your model and further improve efficiency. For detailed procedures, please refer to the PaddleX High-Performance Inference Guide.