Text Line Orientation Classification Module Tutorial

1. Overview

The text line orientation classification module identifies the orientation of text lines and corrects them through post-processing. During processes like document scanning or ID photo capture, users may rotate the shooting device for better clarity, resulting in text lines with varying orientations. Standard OCR workflows often struggle with such data. By employing image classification technology, this module pre-determines text line orientation and adjusts it, thereby enhancing OCR accuracy.
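
The correction step can be sketched as follows. This is a minimal illustration and not the module's actual post-processing: it assumes that class ID 0 means 0° and class ID 1 means 180°, and it uses NumPy's `rot90` for the rotation. `correct_orientation` is a hypothetical helper name.

```python
import numpy as np

def correct_orientation(image: np.ndarray, class_id: int) -> np.ndarray:
    """Return the image upright, given the predicted orientation class.

    Assumption: class 0 means 0 degrees (already upright) and
    class 1 means 180 degrees (upside down).
    """
    if class_id == 1:
        # Two 90-degree rotations equal one 180-degree rotation.
        return np.rot90(image, 2)
    return image

# Tiny 2x3 array standing in for a cropped text line image.
line = np.array([[1, 2, 3],
                 [4, 5, 6]])
fixed = correct_orientation(line, class_id=1)
print(fixed)
```

The same call works for H×W×C color images, because `np.rot90` rotates in the plane of the first two axes by default.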

2. Supported Models

Model	Download Links	Top-1 Acc (%)	GPU Inference Time (ms)	CPU Inference Time (ms)	Model Size (M)	Description
PP-LCNet_x0_25_textline_ori	Inference Model/Training Model	95.54	-	-	0.32	A text line classification model based on PP-LCNet_x0_25, with two classes: 0° and 180°.

Performance Testing Environment:

- Test Dataset: PaddleX's proprietary dataset, covering scenarios like IDs and documents, with 1,000 images.
- Hardware:
  - GPU: NVIDIA Tesla T4
  - CPU: Intel Xeon Gold 6271C @ 2.60GHz
  - Other: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2

Inference Mode Description:

Mode	GPU Configuration	CPU Configuration	Acceleration Techniques
Standard Mode	FP32 precision / No TRT acceleration	FP32 precision / 8 threads	PaddleInference
High-Performance Mode	Optimal combination of precision and acceleration strategies	FP32 precision / 8 threads	Optimal backend selection (Paddle/OpenVINO/TRT, etc.)

3. Quick Start

❗ Before starting, ensure you have installed the PaddleOCR wheel package. Refer to the Installation Guide for details.

Run the following command for a quick demo:

paddleocr text_line_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg

Alternatively, integrate the module into your project. Download the sample image locally before running the code below.

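from paddleocr import TextLineOrientationClassification

# Load the text line orientation classifier by model name.
model = TextLineOrientationClassification(model_name="PP-LCNet_x0_25_textline_ori")
# Run inference on the downloaded sample image.
output = model.predict("textline_rot180_demo.jpg", batch_size=1)
for res in output:
    res.print(json_format=False)              # print the prediction
    res.save_to_img("./output/demo.png")      # save the visualized result
    res.save_to_json("./output/res.json")     # save the prediction as JSON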

The output will be:

{'res': {'input_path': 'textline_rot180_demo.jpg', 'page_index': None, 'class_ids': array([1], dtype=int32), 'scores': array([1.], dtype=float32), 'label_names': ['180_degree']}}

Key output fields:
- input_path: Path of the input image.
- page_index: For PDF inputs, indicates the page number; otherwise, None.
- class_ids: Predicted class IDs. In the sample output, ID 1 corresponds to 180°; ID 0 corresponds to 0°.
- scores: Confidence scores.
- label_names: Predicted class labels.
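
As a sketch of how these fields might be consumed, the snippet below reads the prediction from a plain dictionary laid out like the printed output above. The dictionary is built by hand here so the example runs without PaddleOCR; `top_prediction` is a hypothetical helper, and treating the label `180_degree` as "needs rotation" is an assumption based on the sample output.

```python
import numpy as np

# A result dictionary shaped like the sample output above.
result = {
    "res": {
        "input_path": "textline_rot180_demo.jpg",
        "page_index": None,
        "class_ids": np.array([1], dtype=np.int32),
        "scores": np.array([1.0], dtype=np.float32),
        "label_names": ["180_degree"],
    }
}

def top_prediction(result: dict) -> tuple[int, float, str]:
    """Return (class_id, score, label) of the highest-scoring entry."""
    res = result["res"]
    best = int(np.argmax(res["scores"]))
    return (
        int(res["class_ids"][best]),
        float(res["scores"][best]),
        res["label_names"][best],
    )

class_id, score, label = top_prediction(result)
needs_rotation = label == "180_degree"
print(class_id, score, label, needs_rotation)
```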

Visualization: [image omitted]

Method and Parameter Details

TextLineOrientationClassification Initialization (using PP-LCNet_x0_25_textline_ori as an example):

Parameter	Description	Type	Options	Default
`model_name`	Model name	`str`	N/A	`None`
`model_dir`	Custom model path	`str`	N/A	`None`
`device`	Inference device	`str`	E.g., "gpu:0", "npu:0", "cpu"	`gpu:0`
`use_hpip`	Enable high-performance inference	`bool`	N/A	`False`
`hpi_config`	HPI configuration	`dict` | `None`	N/A	`None`

predict() Method:
- input: Supports various input types (numpy array, file path, URL, directory, or list).
- batch_size: Batch size (default: 1).

Result Handling:
Each prediction result is a Result object with methods like print(), save_to_img(), and save_to_json().
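
To illustrate what `batch_size` controls when a list of inputs is passed, the sketch below groups inputs into consecutive batches. It shows batching in general, under the assumption that inputs are processed in groups of at most `batch_size` items; `make_batches` is a hypothetical helper and not part of PaddleOCR.

```python
from typing import Iterator, List

def make_batches(inputs: List[str], batch_size: int = 1) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size inputs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size]

# Hypothetical file names standing in for cropped text line images.
paths = ["line_0.jpg", "line_1.jpg", "line_2.jpg"]
batches = list(make_batches(paths, batch_size=2))
print(batches)
```

The last batch may be smaller than `batch_size` when the number of inputs is not a multiple of it.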

Method	Description	Parameters	Type	Details	Default
`print()`	Print results	`format_json`, `indent`, `ensure_ascii`	`bool`, `int`, `bool`	Control JSON formatting and ASCII escaping	`True`, 4, `False`
`save_to_json()`	Save results as JSON	`save_path`, `indent`, `ensure_ascii`	`str`, `int`, `bool`	Same as `print()`	N/A, 4, `False`
`save_to_img()`	Save visualized results	`save_path`	`str`	Output path	N/A
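
The `indent` and `ensure_ascii` parameters share their names with arguments of Python's `json.dumps`. Assuming they behave the same way (the source does not state how they are implemented), their effect can be seen on a plain dictionary. The file name below is made up for illustration.

```python
import json

# A hand-written record with a non-ASCII file name (hypothetical).
record = {"input_path": "文本行.jpg", "label_names": ["180_degree"]}

# ensure_ascii=True escapes non-ASCII characters; indent=None keeps one line.
compact = json.dumps(record, indent=None, ensure_ascii=True)

# ensure_ascii=False keeps the characters as-is; indent=4 pretty-prints.
pretty = json.dumps(record, indent=4, ensure_ascii=False)

print(compact)
print(pretty)
```

Both strings decode to the same data; only the on-disk or on-screen representation differs.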

Attributes:
- json: Get results in JSON format.
- img: Get visualized images as a dictionary.

4. Custom Development

PaddleOCR does not itself provide training for text line orientation classification. To train a model, follow PaddleX's Custom Development Guide. A trained model can then be used for inference through PaddleOCR's API, for example by passing its directory as `model_dir`.
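
A minimal sketch of loading such a model, assuming the trained inference model has been exported to a local directory. `load_custom_classifier` is a hypothetical helper. Passing `model_name` together with `model_dir` is an assumption, as is the expectation that `model_name` matches the architecture that was trained.

```python
from pathlib import Path

def load_custom_classifier(model_dir: str,
                           model_name: str = "PP-LCNet_x0_25_textline_ori"):
    """Load a custom-trained orientation classifier from a local directory."""
    if not Path(model_dir).is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    # Imported lazily so the path check works even without PaddleOCR installed.
    from paddleocr import TextLineOrientationClassification
    # Assumption: model_name matches the architecture of the trained model.
    return TextLineOrientationClassification(model_name=model_name,
                                             model_dir=model_dir)
```

The returned object would then be used like the one in the Quick Start section, for example `load_custom_classifier("./my_model").predict("textline_rot180_demo.jpg")`, where `./my_model` is a placeholder path.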