
Text Line Orientation Classification Module Tutorial

1. Overview

The text line orientation classification module identifies the orientation of each text line and corrects it through post-processing. During document scanning or ID photo capture, users may rotate the camera to get a clearer shot, which produces text lines in different orientations. Standard OCR workflows often handle such input poorly. This module uses image classification to determine the text line orientation in advance and correct it, which improves OCR accuracy.
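Once the class is known, the correction step can be pictured as a plain rotation. The sketch below is an illustration only, not the module's actual post-processing: it assumes the text line image is a NumPy array, and that class ID 0 means 0° and class ID 1 means 180° (the mapping suggested by the sample output in the Quick Start section). The function name correct_textline is hypothetical.

```python
import numpy as np


def correct_textline(image: np.ndarray, class_id: int) -> np.ndarray:
    """Return the text line image turned upright.

    Assumption: class_id 0 means 0 degrees (already upright) and
    class_id 1 means 180 degrees (upside down).
    """
    if class_id == 0:
        return image
    if class_id == 1:
        # Two 90-degree turns in the image plane equal a 180-degree rotation.
        return np.rot90(image, k=2, axes=(0, 1))
    raise ValueError(f"unexpected class_id: {class_id}")


# Tiny 2x3 single-channel "image" to make the effect visible.
demo = np.array([[1, 2, 3],
                 [4, 5, 6]], dtype=np.uint8)
upright = correct_textline(demo, class_id=1)
print(upright.tolist())  # [[6, 5, 4], [3, 2, 1]]
```

Because the model only distinguishes 0° from 180°, a single rotation by 180° is enough to undo the detected flip.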

2. Supported Models

Model | Download Links | Top-1 Acc (%) | GPU Inference Time (ms) | CPU Inference Time (ms) | Model Size (M) | Description
PP-LCNet_x0_25_textline_ori | Inference Model/Training Model | 95.54 | - | - | 0.32 | A text line classification model based on PP-LCNet_x0_25, with two classes: 0° and 180°.

Testing Environment:

  • Performance Testing Environment
    • Test Dataset: PaddleX's proprietary dataset, covering scenarios like IDs and documents, with 1,000 images.
    • Hardware:
      • GPU: NVIDIA Tesla T4
      • CPU: Intel Xeon Gold 6271C @ 2.60GHz
      • Other: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
  • Inference Mode Description
Mode | GPU Configuration | CPU Configuration | Acceleration Techniques
Standard Mode | FP32 precision / No TRT acceleration | FP32 precision / 8 threads | PaddleInference
High-Performance Mode | Optimal combination of precision and acceleration strategies | FP32 precision / 8 threads | Optimal backend selection (Paddle/OpenVINO/TRT, etc.)

3. Quick Start

❗ Before starting, ensure you have installed the PaddleOCR wheel package. Refer to the Installation Guide for details.

Run the following command for a quick demo:

paddleocr text_line_orientation_classification -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/textline_rot180_demo.jpg

Alternatively, integrate the module into your project. Download the sample image locally before running the code below.

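from paddleocr import TextLineOrientationClassification

# Load the text line orientation classifier by name.
model = TextLineOrientationClassification(model_name="PP-LCNet_x0_25_textline_ori")
# Run inference on the local sample image.
output = model.predict("textline_rot180_demo.jpg", batch_size=1)
for res in output:
    res.print(json_format=False)  # print the result without JSON formatting
    res.save_to_img("./output/demo.png")  # save the visualized result
    res.save_to_json("./output/res.json")  # save the result as JSON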

The output will be:

{'res': {'input_path': 'textline_rot180_demo.jpg', 'page_index': None, 'class_ids': array([1], dtype=int32), 'scores': array([1.], dtype=float32), 'label_names': ['180_degree']}}

Key output fields:
- input_path: Path of the input image.
- page_index: For PDF inputs, indicates the page number; otherwise, None.
- class_ids: Predicted class IDs. In the output above, ID 1 corresponds to 180°; the other class is 0°.
- scores: Confidence scores.
- label_names: Predicted class labels.
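The fields above can be read programmatically. The following is a minimal sketch that works on a dictionary with the same layout as the printed output; the dictionary is copied from that output, and the helper name top_prediction is hypothetical rather than part of PaddleOCR.

```python
import numpy as np

# Result dictionary copied from the sample output above.
sample = {
    "res": {
        "input_path": "textline_rot180_demo.jpg",
        "page_index": None,
        "class_ids": np.array([1], dtype=np.int32),
        "scores": np.array([1.0], dtype=np.float32),
        "label_names": ["180_degree"],
    }
}


def top_prediction(result: dict) -> tuple[int, str, float]:
    """Return (class_id, label, score) of the highest-scoring class."""
    res = result["res"]
    scores = np.asarray(res["scores"], dtype=float)
    best = int(np.argmax(scores))  # index of the highest confidence score
    return int(res["class_ids"][best]), res["label_names"][best], float(scores[best])


class_id, label, score = top_prediction(sample)
print(class_id, label, score)  # 1 180_degree 1.0
```

Converting the NumPy scalars to built-in int and float keeps the returned tuple easy to log or serialize.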

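Visualization: the image written by save_to_img() (./output/demo.png in the code above) contains the visualized result.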

Method and Parameter Details

  • TextLineOrientationClassification Initialization (using PP-LCNet_x0_25_textline_ori as an example):
Parameter | Description | Type | Options | Default
model_name | Model name | str | N/A | None
model_dir | Custom model path | str | N/A | None
device | Inference device | str | E.g., "gpu:0", "npu:0", "cpu" | gpu:0
use_hpip | Enable high-performance inference | bool | N/A | False
hpi_config | HPI configuration | dict | None | N/A | None
  • predict() Method:
    • input: Supports various input types (numpy array, file path, URL, directory, or list).
    • batch_size: Batch size (default: 1).

  • Result Handling:
    Each prediction result is a Result object with methods like print(), save_to_img(), and save_to_json().

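Method | Description | Parameter | Type | Details | Default
print() | Print results | format_json | bool | Controls JSON formatting of the output | True
print() | | indent | int | Indentation used for JSON formatting | 4
print() | | ensure_ascii | bool | Controls ASCII escaping | False
save_to_json() | Save results as JSON | save_path | str | Output path | N/A
save_to_json() | | indent | int | Same as print() | 4
save_to_json() | | ensure_ascii | bool | Same as print() | False
save_to_img() | Save visualized results | save_path | str | Output path | N/A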
  • Attributes:
    • json: Get results in JSON format.
    • img: Get visualized images as a dictionary.
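The initialization parameters, predict(), and the json attribute described above can be combined as in the sketch below. It is a hedged illustration: the helper names build_init_kwargs and run_and_collect are hypothetical, the parameter names (model_name, device, use_hpip) are taken from the initialization table and should be checked against the installed PaddleOCR version, and the model is passed in as an argument so the helpers themselves do not depend on PaddleOCR.

```python
from typing import Any


def build_init_kwargs(device: str = "cpu", high_performance: bool = False) -> dict[str, Any]:
    """Assemble constructor arguments using the parameter names from the table."""
    kwargs: dict[str, Any] = {
        "model_name": "PP-LCNet_x0_25_textline_ori",
        "device": device,
    }
    if high_performance:
        # Only pass use_hpip when high-performance inference is requested.
        kwargs["use_hpip"] = True
    return kwargs


def run_and_collect(model: Any, inputs: list[str], batch_size: int = 1) -> list[dict]:
    """Call predict() on a list of inputs and gather each result's json attribute."""
    return [res.json for res in model.predict(inputs, batch_size=batch_size)]


# Intended use with the real class (requires PaddleOCR to be installed):
#   from paddleocr import TextLineOrientationClassification
#   model = TextLineOrientationClassification(**build_init_kwargs(device="cpu"))
#   records = run_and_collect(model, ["a.jpg", "b.jpg"], batch_size=2)
```

Keeping the constructor arguments in a dictionary makes it easy to switch between CPU and GPU without touching the prediction code.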

4. Custom Development

Since PaddleOCR does not natively support training for text line orientation classification, refer to PaddleX's Custom Development Guide for training. Trained models can seamlessly integrate into PaddleOCR's API for inference.
