Add New Algorithm
PaddleOCR decomposes an algorithm into the following parts, and modularizes each part to make it more convenient to develop new algorithms.
- Data loading and processing
- Network
- Post-processing
- Loss
- Metric
- Optimizer
The sections below describe each part in turn and explain how to add the modules a new algorithm requires.
Data loading and processing
Data loading and processing consist of several modules that handle image reading, data augmentation and label generation. This part is under ppocr/data. The explanation of each file and folder is as follows:
PaddleOCR has a large number of built-in image operation modules. To add a module that is not built in, follow these steps:
- Create a new file under the ppocr/data/imaug folder, such as my_module.py.
- Add code in the my_module.py file. The sample code is as follows:
- Import the added module in the ppocr/data/imaug/__init__.py file.
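The second step could look like the minimal sketch below. It assumes that each sample travels through the pipeline as a dict with 'image' and 'label' keys and that a module is a callable which takes and returns that dict. The class name MyModule and its scale argument are hypothetical placeholders, and the image is a plain nested list so that the sketch runs without any imaging library.

```python
class MyModule:
    """Hypothetical augmentation step: scales every pixel value by a constant."""

    def __init__(self, scale=1.0, **kwargs):
        # Arguments are assumed to come from the module's entry in the config file.
        self.scale = scale

    def __call__(self, data):
        img = data["image"]
        label = data["label"]

        # Custom processing goes here; this placeholder scales each pixel.
        img = [[pixel * self.scale for pixel in row] for row in img]

        # Write the results back so the next module in the list can read them.
        data["image"] = img
        data["label"] = label
        return data


sample = {"image": [[0, 128], [255, 64]], "label": "hello"}
result = MyModule(scale=0.5)(sample)
```

Keeping the dict-in, dict-out contract means a module can add or change keys without the other modules in the list needing to know about it.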
The data processing modules are listed in the config file and executed in the order in which they appear, for example:
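The sketch below illustrates the idea of a config-driven pipeline. The config is written as a Python list of single-key dicts, mirroring how a list of transforms might appear in the YAML config file. The operator names (AddOne, Scale), the REGISTRY lookup and the build_ops and run_ops helpers are hypothetical stand-ins, not PaddleOCR's actual implementation.

```python
class AddOne:
    """Hypothetical operator: adds 1 to every value of the 'image' entry."""

    def __init__(self, **kwargs):
        pass

    def __call__(self, data):
        data["image"] = [v + 1 for v in data["image"]]
        return data


class Scale:
    """Hypothetical operator: multiplies every value by a configurable factor."""

    def __init__(self, factor=2, **kwargs):
        self.factor = factor

    def __call__(self, data):
        data["image"] = [v * self.factor for v in data["image"]]
        return data


# Maps the names used in the config to the classes that implement them.
REGISTRY = {"AddOne": AddOne, "Scale": Scale}

# Python mirror of a config list: one entry per operator, in execution order.
transforms_config = [
    {"AddOne": None},
    {"Scale": {"factor": 3}},
]


def build_ops(op_configs):
    """Instantiate one operator per config entry, keeping the list order."""
    ops = []
    for entry in op_configs:
        name, params = next(iter(entry.items()))
        ops.append(REGISTRY[name](**(params or {})))
    return ops


def run_ops(data, ops):
    """Apply the operators one after another to a single sample."""
    for op in ops:
        data = op(data)
    return data


ops = build_ops(transforms_config)
output = run_ops({"image": [1, 2, 3], "label": "abc"}, ops)
```

Because the operators run in list order, swapping the two config entries would change the result, so the order in the config file matters.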
Network
The network part builds the model. PaddleOCR divides the network into four parts, all of which are under ppocr/modeling. Data entering the network passes through these four parts in sequence (transforms -> backbones -> necks -> heads).
PaddleOCR has built-in modules for common algorithms such as DB, EAST, SAST, CRNN and Attention. To add a module that is not built in, follow the steps below. The steps are the same for all four parts; backbones are used as the example:
- Create a new file under the ppocr/modeling/backbones folder, such as my_backbone.py.
- Add code in the my_backbone.py file. The sample code is as follows:
- Import the added module in the ppocr/modeling/backbones/__init__.py file.
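The second step might take the shape sketched below. This is a framework-free illustration: in PaddleOCR the class would presumably build on the deep learning framework's layer base class and use real layers, whereas here a local stand-in named Layer is defined so that the code runs with the standard library only. MyBackbone, its pool_size argument and the average pooling it performs are hypothetical.

```python
class Layer:
    """Local stand-in for a framework layer base class (assumption)."""

    def __call__(self, *args, **kwargs):
        # Calling the layer forwards to its forward() method.
        return self.forward(*args, **kwargs)


class MyBackbone(Layer):
    """Hypothetical backbone: average-pools a 2D feature map."""

    def __init__(self, pool_size=2, **kwargs):
        super().__init__()
        # Build the layers of the network here.
        self.pool_size = pool_size

    def forward(self, inputs):
        # The forward pass: non-overlapping k x k average pooling.
        k = self.pool_size
        rows = len(inputs) // k
        cols = len(inputs[0]) // k
        pooled = []
        for r in range(rows):
            out_row = []
            for c in range(cols):
                window = [
                    inputs[r * k + i][c * k + j]
                    for i in range(k)
                    for j in range(k)
                ]
                out_row.append(sum(window) / len(window))
            pooled.append(out_row)
        return pooled


feature_map = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 4, 4],
    [2, 2, 4, 4],
]
features = MyBackbone(pool_size=2)(feature_map)
```

The essential pattern is the split between construction in `__init__` and computation in `forward`; the pooling itself is only a placeholder for real network layers.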
After adding modules for the four parts of the network, you only need to reference them in the configuration file to use them, for example:
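The sketch below shows how such a configuration could select one module per part and chain them in the order transforms -> backbones -> necks -> heads. The config is written as a Python dict standing in for the YAML file. The section keys (Transform, Backbone, Neck, Head), the `name` field, the MODULES lookup and the build_model helper are assumptions made for illustration, and each part simply records its name instead of doing real computation.

```python
class MyTransform:
    def __call__(self, x):
        return x + ["transform"]


class MyBackbone:
    def __call__(self, x):
        return x + ["backbone"]


class MyNeck:
    def __call__(self, x):
        return x + ["neck"]


class MyHead:
    def __call__(self, x):
        return x + ["head"]


# Maps the names used in the config to the classes that implement them.
MODULES = {cls.__name__: cls for cls in (MyTransform, MyBackbone, MyNeck, MyHead)}

# Python mirror of a hypothetical network section of the config file.
architecture_config = {
    "Transform": {"name": "MyTransform"},
    "Backbone": {"name": "MyBackbone"},
    "Neck": {"name": "MyNeck"},
    "Head": {"name": "MyHead"},
}


def build_model(config):
    """Instantiate the four parts in the order the data passes through them."""
    parts = []
    for key in ("Transform", "Backbone", "Neck", "Head"):
        params = dict(config[key])          # copy so the config is not mutated
        cls = MODULES[params.pop("name")]   # 'name' selects the class
        parts.append(cls(**params))         # remaining keys become arguments
    return parts


def forward(parts, x):
    """Pass the input through every part in sequence."""
    for part in parts:
        x = part(x)
    return x


trace = forward(build_model(architecture_config), [])
```

Replacing a single `name` value in the dict swaps out that part of the network while the other three stay untouched.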
Post-processing
Post-processing decodes the network output to obtain text boxes or recognized text. This part is under ppocr/postprocess. PaddleOCR has built-in post-processing modules for algorithms such as DB, EAST, SAST, CRNN and Attention. To add a component that is not built in, follow these steps:
- Create a new file under the ppocr/postprocess folder, such as my_postprocess.py.
- Add code in the my_postprocess.py file. The sample code is as follows:
- Import the added module in the ppocr/postprocess/__init__.py file.
After the post-processing module is added, you only need to reference it in the configuration file to use it, for example:
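The following sketch covers both the module from the second step and a matching configuration entry. It assumes that the post-processor is a callable which receives the predictions and, optionally, the labels. MyPostProcess, its characters argument and the simple per-step argmax decoding are hypothetical, and the config is a Python dict standing in for the YAML entry.

```python
class MyPostProcess:
    """Hypothetical decoder: picks the highest-scoring character at each step."""

    def __init__(self, characters="abc", **kwargs):
        self.characters = characters

    def decode_preds(self, preds):
        # preds: one sequence per sample, one list of character scores per step.
        texts = []
        for sequence in preds:
            indices = [
                max(range(len(step)), key=step.__getitem__) for step in sequence
            ]
            texts.append("".join(self.characters[i] for i in indices))
        return texts

    def decode_label(self, label):
        # label: one list of character indices per sample.
        return ["".join(self.characters[i] for i in seq) for seq in label]

    def __call__(self, preds, label=None, *args, **kwargs):
        texts = self.decode_preds(preds)
        if label is None:
            return texts
        return texts, self.decode_label(label)


# Python mirror of a hypothetical config entry: 'name' selects the class,
# the remaining keys are passed to its constructor.
postprocess_config = {"name": "MyPostProcess", "characters": "abc"}
params = dict(postprocess_config)
post_process = {"MyPostProcess": MyPostProcess}[params.pop("name")](**params)

preds = [[[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.2, 0.2, 0.6]]]
decoded = post_process(preds)
decoded_with_label = post_process(preds, label=[[1, 0, 2]])
```

Returning only the decoded predictions when no label is given lets the same class serve both inference and evaluation.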
Loss
The loss function calculates the distance between the network output and the label. This part is under ppocr/losses. PaddleOCR has built-in loss function modules for algorithms such as DB, EAST, SAST, CRNN and Attention. To add a module that is not built in, follow these steps:
- Create a new file in the ppocr/losses folder, such as my_loss.py.
- Add code in the my_loss.py file. The sample code is as follows:
- Import the added module in the ppocr/losses/__init__.py file.
After the loss function module is added, you only need to reference it in the configuration file to use it, for example:
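A framework-free sketch of a loss module and its configuration entry is given below. It assumes that the loss receives the network output together with the batch, that the label is the second element of the batch, and that the result is returned as a dict with a 'loss' key. These conventions, the class name MyLoss and its weight argument are illustrative assumptions; a real implementation would operate on tensors rather than Python lists.

```python
class MyLoss:
    """Hypothetical loss: weighted mean absolute error."""

    def __init__(self, weight=1.0, **kwargs):
        self.weight = weight

    def __call__(self, predicts, batch):
        # Assumption: batch[0] holds the images and batch[1] the labels.
        label = batch[1]
        total = sum(abs(p - t) for p, t in zip(predicts, label))
        loss = self.weight * total / len(label)
        return {"loss": loss}


# Python mirror of a hypothetical config entry: 'name' selects the class,
# the remaining keys are passed to its constructor.
loss_config = {"name": "MyLoss", "weight": 2.0}
params = dict(loss_config)
loss_fn = {"MyLoss": MyLoss}[params.pop("name")](**params)

batch = (["image placeholder"], [1.0, 2.0, 3.0])
out = loss_fn([2.0, 2.0, 5.0], batch)
```

Returning a dict rather than a bare number leaves room for reporting several loss terms under separate keys.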
Metric
The metric module evaluates the performance of the network on the current batch. This part is under ppocr/metrics. PaddleOCR has built-in evaluation modules for detection, classification and recognition. To add a module that is not built in, follow these steps:
- Create a new file under the ppocr/metrics folder, such as my_metric.py.
- Add code in the my_metric.py file. The sample code is as follows:
- Import the added module in the ppocr/metrics/__init__.py file.
After the metric module is added, you only need to reference it in the configuration file to use it, for example:
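The sketch below shows one possible shape for a metric module and its configuration entry. It assumes that the metric is called once per batch, accumulates counts internally, and reports the aggregate on request. The class name MyMetric, the main_indicator argument, and the get_metric and reset methods are assumptions made for illustration, and exact-match accuracy is used as a placeholder measure.

```python
class MyMetric:
    """Hypothetical metric: exact-match accuracy accumulated over batches."""

    def __init__(self, main_indicator="acc", **kwargs):
        self.main_indicator = main_indicator
        self.reset()

    def __call__(self, preds, labels, *args, **kwargs):
        # Score the current batch and add it to the running totals.
        correct = sum(1 for p, t in zip(preds, labels) if p == t)
        self.correct_num += correct
        self.all_num += len(labels)
        return {"acc": correct / len(labels)}

    def get_metric(self):
        # Report the aggregate over all batches seen so far, then start over.
        acc = self.correct_num / self.all_num if self.all_num else 0.0
        self.reset()
        return {"acc": acc}

    def reset(self):
        self.correct_num = 0
        self.all_num = 0


# Python mirror of a hypothetical config entry: 'name' selects the class,
# the remaining keys are passed to its constructor.
metric_config = {"name": "MyMetric", "main_indicator": "acc"}
params = dict(metric_config)
metric = {"MyMetric": MyMetric}[params.pop("name")](**params)

first_batch = metric(["ab", "cd"], ["ab", "ce"])
second_batch = metric(["x", "y"], ["x", "y"])
overall = metric.get_metric()
```

Separating the per-batch call from get_metric makes it possible to log batch-level values during evaluation and still obtain one aggregate figure at the end.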
Optimizer
The optimizer is used to train the network. It also contains the network regularization and learning rate decay modules. This part is under ppocr/optimizer. PaddleOCR has built-in optimizer modules such as Momentum, Adam and RMSProp, learning rate decay modules such as Linear, Cosine, Step and Piecewise, and regularization modules such as L1Decay and L2Decay.
To add a module that is not built in, follow the steps below, using optimizer as the example:
- Create your own optimizer in the ppocr/optimizer/optimizer.py file. The sample code is as follows:
After the optimizer module is added, you only need to reference it in the configuration file to use it, for example:
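A framework-free sketch of an optimizer module and its configuration entry follows. It assumes a two-stage design: the class is constructed from the config values and is later called with the model parameters to produce the actual optimizer. MyOptim, the flat learning_rate key and the PlainSGD stand-in (plain gradient descent on a list of floats) are hypothetical; in PaddleOCR the call would presumably return one of the framework's own optimizer objects.

```python
class PlainSGD:
    """Stand-in for a framework optimizer: vanilla gradient descent."""

    def __init__(self, learning_rate, parameters):
        self.learning_rate = learning_rate
        self.parameters = parameters

    def step(self, grads):
        # Update each parameter in place: w <- w - lr * grad.
        for i, grad in enumerate(grads):
            self.parameters[i] -= self.learning_rate * grad


class MyOptim:
    """Hypothetical optimizer builder configured from the config file."""

    def __init__(self, learning_rate=0.001, **kwargs):
        self.learning_rate = learning_rate

    def __call__(self, parameters):
        # Bind the configured hyperparameters to the model parameters.
        return PlainSGD(learning_rate=self.learning_rate, parameters=parameters)


# Python mirror of a hypothetical config entry: 'name' selects the class,
# the remaining keys are passed to its constructor.
optimizer_config = {"name": "MyOptim", "learning_rate": 0.5}
params = dict(optimizer_config)
builder = {"MyOptim": MyOptim}[params.pop("name")](**params)

weights = [1.0, -2.0]
opt = builder(weights)
opt.step(grads=[2.0, -4.0])
```

Splitting configuration from parameter binding means the optimizer can be described in the config file before the network, and therefore its parameters, exist.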