🔍 Introduction
PaddleX 3.0 is a low-code development tool for AI models built on the PaddlePaddle framework. It integrates numerous ready-to-use pre-trained models, enables full-process development from model training to inference, supports a variety of mainstream domestic and international hardware, and helps AI developers put models into industrial practice.
🛠️ Installation
Warning
Before installing PaddleX, please ensure that you have a basic Python runtime environment (note: Python 3.8 through Python 3.12 is currently supported). PaddleX 3.0-rc0 depends on PaddlePaddle 3.0.0rc0.
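You can quickly confirm the interpreter version before proceeding:

```bash
python --version  # should report 3.8.x through 3.12.x
```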
Installing PaddlePaddle
Tip
There is no need to pay attention to the CUDA version on the physical machine; just focus on the GPU driver version. For more information on PaddlePaddle Wheel versions, please refer to the PaddlePaddle Official Website.
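As an illustration, the CPU-only build of the pinned dependency can typically be installed straight from PyPI (a minimal sketch; for GPU builds, use the exact command the official website generates for your driver):

```bash
# CPU-only PaddlePaddle matching the version PaddleX 3.0-rc0 depends on;
# assumes the 3.0.0rc0 CPU wheel is available on your package index.
pip install paddlepaddle==3.0.0rc0
```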
Installing PaddleX
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/whl/paddlex-3.0.0rc0-py3-none-any.whl
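After installation, you can confirm that the expected version is present by querying pip's package metadata:

```bash
pip show paddlex  # the Version field should read 3.0.0rc0
```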
❗ For more installation methods, please refer to the PaddleX Installation Guide.
💻 Command Line Usage
A single command is enough to quickly try out a pipeline; the unified command-line format is as follows:
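```bash
# Unified format; angle-bracket tokens are placeholders, not literal values
paddlex --pipeline <pipeline_name_or_config_path> --input <path_dir_or_url> --device <device>
```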
Each pipeline in PaddleX has its own set of parameters; detailed descriptions can be found in the documentation for the respective pipeline. Every pipeline requires three essential parameters:

- `pipeline`: the name of the pipeline, or the path to a pipeline configuration file.
- `input`: the local path, directory, or URL of the input file to be processed (e.g., an image).
- `device`: the hardware device and its index to use (e.g., `gpu:0` for the first GPU). You can also choose NPU (`npu:0`), XPU (`xpu:0`), CPU (`cpu`), etc.
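For instance, a hypothetical invocation that points `pipeline` at a local configuration file and runs on the CPU (the file names here are illustrative only):

```bash
paddlex --pipeline ./my_ocr_pipeline.yaml --input ./demo.png --device cpu
```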
OCR-related Pipelines CLI
paddlex --pipeline OCR \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'general_ocr_002.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'doc_preprocessor_res': {'input_path': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': False}, 'angle': 0},'dt_polys': [array([[ 3, 10],[82, 10],[82, 33],[ 3, 33]], dtype=int16), ...], 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': [-1, ...], 'text_rec_score_thresh': 0.0, 'rec_texts': ['www.99*', ...], 'rec_scores': [0.8980069160461426, ...], 'rec_polys': [array([[ 3, 10],[82, 10],[82, 33],[ 3, 33]], dtype=int16), ...], 'rec_boxes': array([[ 3, 10, 82, 33], ...], dtype=int16)}}
paddlex --pipeline table_recognition \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'table_recognition.jpg', 'model_settings': {'use_doc_preprocessor': True, 'use_layout_detection': True, 'use_ocr_model': True}, 'doc_preprocessor_res': {'input_path': '0.jpg', 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 0}, 'layout_det_res': {'input_path': None, 'boxes': [{'cls_id': 0, 'label': 'Table', 'score': 0.9196816086769104, 'coordinate': [0, 8.614925, 550.9877, 132]}]}, 'overall_ocr_res': {'input_path': '0.jpg', 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': [array([[232, 0],
[318, 1],
[318, 24],
[232, 21]], dtype=int16), array([[32, 38],
[67, 38],
[67, 55],
[32, 55]], dtype=int16), array([[119, 34],
[196, 34],
[196, 57],
[119, 57]], dtype=int16), array([[222, 29],
[396, 31],
[396, 60],
[222, 58]], dtype=int16), array([[420, 30],
[542, 32],
[542, 61],
[419, 59]], dtype=int16), array([[29, 71],
[72, 71],
[72, 92],
[29, 92]], dtype=int16), array([[287, 72],
[329, 72],
[329, 93],
[287, 93]], dtype=int16), array([[458, 68],
[501, 71],
[499, 94],
[456, 91]], dtype=int16), array([[ 9, 101],
[ 89, 103],
[ 89, 130],
[ 8, 128]], dtype=int16), array([[139, 105],
[172, 105],
[172, 126],
[139, 126]], dtype=int16), array([[274, 103],
[339, 101],
[340, 128],
[275, 130]], dtype=int16), array([[451, 103],
[508, 103],
[508, 126],
[451, 126]], dtype=int16)], 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], 'text_rec_score_thresh': 0, 'rec_texts': ['CRuncover', 'Dres', 'čŋįģåˇĨäŊ3', 'ååēæĨæžå¨įŊä¸īŧæ˛Ąæŗ', 'æąãæąįå
Ģ大', 'Abstr', 'rSrivi', '$709.', 'cludingGiv', '2.72', 'Ingcubic', '$744.78'], 'rec_scores': [0.9943075180053711, 0.9951075315475464, 0.9907732009887695, 0.9975494146347046, 0.9974043369293213, 0.9983242750167847, 0.991967499256134, 0.9898287653923035, 0.9961177110671997, 0.9975040555000305, 0.9986456632614136, 0.9987970590591431], 'rec_polys': [array([[232, 0],
[318, 1],
[318, 24],
[232, 21]], dtype=int16), array([[32, 38],
[67, 38],
[67, 55],
[32, 55]], dtype=int16), array([[119, 34],
[196, 34],
[196, 57],
[119, 57]], dtype=int16), array([[222, 29],
[396, 31],
[396, 60],
[222, 58]], dtype=int16), array([[420, 30],
[542, 32],
[542, 61],
[419, 59]], dtype=int16), array([[29, 71],
[72, 71],
[72, 92],
[29, 92]], dtype=int16), array([[287, 72],
[329, 72],
[329, 93],
[287, 93]], dtype=int16), array([[458, 68],
[501, 71],
[499, 94],
[456, 91]], dtype=int16), array([[ 9, 101],
[ 89, 103],
[ 89, 130],
[ 8, 128]], dtype=int16), array([[139, 105],
[172, 105],
[172, 126],
[139, 126]], dtype=int16), array([[274, 103],
[339, 101],
[340, 128],
[275, 130]], dtype=int16), array([[451, 103],
[508, 103],
[508, 126],
[451, 126]], dtype=int16)], 'rec_boxes': array([[232, 0, 318, 24],
[ 32, 38, 67, 55],
[119, 34, 196, 57],
[222, 29, 396, 60],
[419, 30, 542, 61],
[ 29, 71, 72, 92],
[287, 72, 329, 93],
[456, 68, 501, 94],
[ 8, 101, 89, 130],
[139, 105, 172, 126],
[274, 101, 340, 130],
[451, 103, 508, 126]], dtype=int16)}, 'table_res_list': [{'cell_box_list': array([[ 8. , 9.61492538, 532. , 26.61492538],
[ 3. , 27.61492538, 104. , 65.61492538],
[109. , 28.61492538, 215. , 66.61492538],
[219. , 28.61492538, 396. , 64.61492538],
[396. , 29.61492538, 546. , 66.61492538],
[ 1. , 65.61492538, 110. , 93.61492538],
[111. , 65.61492538, 215. , 94.61492538],
[220. , 66.61492538, 397. , 94.61492538],
[398. , 67.61492538, 544. , 94.61492538],
[ 2. , 98.61492538, 111. , 131.61492538],
[113. , 98.61492538, 216. , 131.61492538],
[219. , 98.61492538, 400. , 131.61492538],
[403. , 99.61492538, 545. , 130.61492538]]), 'pred_html': '<html><body><table><tr><td colspan="4">CRuncover</td></tr><tr><td>Dres</td><td>čŋįģåˇĨäŊ3</td><td>ååēæĨæžå¨įŊä¸īŧæ˛Ąæŗ</td><td>æąãæąįå
Ģ大</td></tr><tr><td>Abstr</td><td></td><td>rSrivi</td><td>$709.</td></tr><tr><td>cludingGiv</td><td>2.72</td><td>Ingcubic</td><td>$744.78</td></tr></table></body></html>', 'table_ocr_pred': {'rec_polys': [array([[232, 0],
[318, 1],
[318, 24],
[232, 21]], dtype=int16), array([[32, 38],
[67, 38],
[67, 55],
[32, 55]], dtype=int16), array([[119, 34],
[196, 34],
[196, 57],
[119, 57]], dtype=int16), array([[222, 29],
[396, 31],
[396, 60],
[222, 58]], dtype=int16), array([[420, 30],
[542, 32],
[542, 61],
[419, 59]], dtype=int16), array([[29, 71],
[72, 71],
[72, 92],
[29, 92]], dtype=int16), array([[287, 72],
[329, 72],
[329, 93],
[287, 93]], dtype=int16), array([[458, 68],
[501, 71],
[499, 94],
[456, 91]], dtype=int16), array([[ 9, 101],
[ 89, 103],
[ 89, 130],
[ 8, 128]], dtype=int16), array([[139, 105],
[172, 105],
[172, 126],
[139, 126]], dtype=int16), array([[274, 103],
[339, 101],
[340, 128],
[275, 130]], dtype=int16), array([[451, 103],
[508, 103],
[508, 126],
[451, 126]], dtype=int16)], 'rec_texts': ['CRuncover', 'Dres', 'čŋįģåˇĨäŊ3', 'ååēæĨæžå¨įŊä¸īŧæ˛Ąæŗ', 'æąãæąįå
Ģ大', 'Abstr', 'rSrivi', '$709.', 'cludingGiv', '2.72', 'Ingcubic', '$744.78'], 'rec_scores': [0.9943075180053711, 0.9951075315475464, 0.9907732009887695, 0.9975494146347046, 0.9974043369293213, 0.9983242750167847, 0.991967499256134, 0.9898287653923035, 0.9961177110671997, 0.9975040555000305, 0.9986456632614136, 0.9987970590591431], 'rec_boxes': [array([232, 0, 318, 24], dtype=int16), array([32, 38, 67, 55], dtype=int16), array([119, 34, 196, 57], dtype=int16), array([222, 29, 396, 60], dtype=int16), array([419, 30, 542, 61], dtype=int16), array([29, 71, 72, 92], dtype=int16), array([287, 72, 329, 93], dtype=int16), array([456, 68, 501, 94], dtype=int16), array([ 8, 101, 89, 130], dtype=int16), array([139, 105, 172, 126], dtype=int16), array([274, 101, 340, 130], dtype=int16), array([451, 103, 508, 126], dtype=int16)]}}]}}
paddlex --pipeline table_recognition_v2 \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/table_recognition.jpg \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'table_recognition.jpg', 'model_settings': {'use_doc_preprocessor': False, 'use_layout_detection': True, 'use_ocr_model': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 0, 'label': 'Table', 'score': 0.9922188520431519, 'coordinate': [3.0127392, 0.14648987, 547.5102, 127.72023]}]}, 'overall_ocr_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': [array([[234, 6],
[316, 6],
[316, 25],
[234, 25]], dtype=int16), array([[38, 39],
[73, 39],
[73, 57],
[38, 57]], dtype=int16), array([[122, 32],
[201, 32],
[201, 58],
[122, 58]], dtype=int16), array([[227, 34],
[346, 34],
[346, 57],
[227, 57]], dtype=int16), array([[351, 34],
[391, 34],
[391, 58],
[351, 58]], dtype=int16), array([[417, 35],
[534, 35],
[534, 58],
[417, 58]], dtype=int16), array([[34, 70],
[78, 70],
[78, 90],
[34, 90]], dtype=int16), array([[287, 70],
[328, 70],
[328, 90],
[287, 90]], dtype=int16), array([[454, 69],
[496, 69],
[496, 90],
[454, 90]], dtype=int16), array([[ 17, 101],
[ 95, 101],
[ 95, 124],
[ 17, 124]], dtype=int16), array([[144, 101],
[178, 101],
[178, 122],
[144, 122]], dtype=int16), array([[278, 101],
[338, 101],
[338, 124],
[278, 124]], dtype=int16), array([[448, 101],
[503, 101],
[503, 121],
[448, 121]], dtype=int16)], 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], 'text_rec_score_thresh': 0, 'rec_texts': ['CRuncover', 'Dres', 'čŋįģåˇĨäŊ3', 'ååēæĨæžå¨įŊä¸', 'æ˛Ąæŗ', 'æąãæ´æąįå
Ģ大', 'Abstr', 'rSrivi', '$709.', 'cludingGiv', '2.72', 'Ingcubic', '$744.78'], 'rec_scores': [0.9951260685920715, 0.9943379759788513, 0.9968608021736145, 0.9978817105293274, 0.9985721111297607, 0.9616036415100098, 0.9977153539657593, 0.987593948841095, 0.9906861186027527, 0.9959743618965149, 0.9970152378082275, 0.9977849721908569, 0.9984450936317444], 'rec_polys': [array([[234, 6],
[316, 6],
[316, 25],
[234, 25]], dtype=int16), array([[38, 39],
[73, 39],
[73, 57],
[38, 57]], dtype=int16), array([[122, 32],
[201, 32],
[201, 58],
[122, 58]], dtype=int16), array([[227, 34],
[346, 34],
[346, 57],
[227, 57]], dtype=int16), array([[351, 34],
[391, 34],
[391, 58],
[351, 58]], dtype=int16), array([[417, 35],
[534, 35],
[534, 58],
[417, 58]], dtype=int16), array([[34, 70],
[78, 70],
[78, 90],
[34, 90]], dtype=int16), array([[287, 70],
[328, 70],
[328, 90],
[287, 90]], dtype=int16), array([[454, 69],
[496, 69],
[496, 90],
[454, 90]], dtype=int16), array([[ 17, 101],
[ 95, 101],
[ 95, 124],
[ 17, 124]], dtype=int16), array([[144, 101],
[178, 101],
[178, 122],
[144, 122]], dtype=int16), array([[278, 101],
[338, 101],
[338, 124],
[278, 124]], dtype=int16), array([[448, 101],
[503, 101],
[503, 121],
[448, 121]], dtype=int16)], 'rec_boxes': array([[234, 6, 316, 25],
[ 38, 39, 73, 57],
[122, 32, 201, 58],
[227, 34, 346, 57],
[351, 34, 391, 58],
[417, 35, 534, 58],
[ 34, 70, 78, 90],
[287, 70, 328, 90],
[454, 69, 496, 90],
[ 17, 101, 95, 124],
[144, 101, 178, 122],
[278, 101, 338, 124],
[448, 101, 503, 121]], dtype=int16)}, 'table_res_list': [{'cell_box_list': [array([3.18822289e+00, 1.46489874e-01, 5.46996138e+02, 3.08782365e+01]), array([ 3.21032453, 31.1510637 , 110.20750237, 65.14108063]), array([110.18174553, 31.13076188, 213.00813103, 65.02860047]), array([212.96108818, 31.09959008, 404.19618034, 64.99535157]), array([404.08112907, 31.18304802, 547.00864983, 65.0847223 ]), array([ 3.21772957, 65.0738733 , 110.33685875, 96.07921387]), array([110.23703575, 65.02486207, 213.08839226, 96.01378419]), array([213.06095695, 64.96230103, 404.28425407, 95.97141816]), array([404.23704338, 65.04879548, 547.01273918, 96.03654267]), array([ 3.22793937, 96.08334137, 110.38572502, 127.08698823]), array([110.40586662, 96.10539795, 213.19943047, 127.07002045]), array([213.12627983, 96.0539148 , 404.42686272, 127.02842499]), array([404.33042717, 96.07251526, 547.01273918, 126.45088746])], 'pred_html': '<html><body><table><tr><td colspan="4">CRuncover</td></tr><tr><td>Dres</td><td>čŋįģåˇĨäŊ3</td><td>ååēæĨæžå¨įŊä¸ æ˛Ąæŗ</td><td>æąãæ´æąįå
Ģ大</td></tr><tr><td>Abstr</td><td></td><td>rSrivi</td><td>$709.</td></tr><tr><td>cludingGiv</td><td>2.72</td><td>Ingcubic</td><td>$744.78</td></tr></table></body></html>', 'table_ocr_pred': {'rec_polys': [array([[234, 6],
[316, 6],
[316, 25],
[234, 25]], dtype=int16), array([[38, 39],
[73, 39],
[73, 57],
[38, 57]], dtype=int16), array([[122, 32],
[201, 32],
[201, 58],
[122, 58]], dtype=int16), array([[227, 34],
[346, 34],
[346, 57],
[227, 57]], dtype=int16), array([[351, 34],
[391, 34],
[391, 58],
[351, 58]], dtype=int16), array([[417, 35],
[534, 35],
[534, 58],
[417, 58]], dtype=int16), array([[34, 70],
[78, 70],
[78, 90],
[34, 90]], dtype=int16), array([[287, 70],
[328, 70],
[328, 90],
[287, 90]], dtype=int16), array([[454, 69],
[496, 69],
[496, 90],
[454, 90]], dtype=int16), array([[ 17, 101],
[ 95, 101],
[ 95, 124],
[ 17, 124]], dtype=int16), array([[144, 101],
[178, 101],
[178, 122],
[144, 122]], dtype=int16), array([[278, 101],
[338, 101],
[338, 124],
[278, 124]], dtype=int16), array([[448, 101],
[503, 101],
[503, 121],
[448, 121]], dtype=int16)], 'rec_texts': ['CRuncover', 'Dres', 'čŋįģåˇĨäŊ3', 'ååēæĨæžå¨įŊä¸', 'æ˛Ąæŗ', 'æąãæ´æąįå
Ģ大', 'Abstr', 'rSrivi', '$709.', 'cludingGiv', '2.72', 'Ingcubic', '$744.78'], 'rec_scores': [0.9951260685920715, 0.9943379759788513, 0.9968608021736145, 0.9978817105293274, 0.9985721111297607, 0.9616036415100098, 0.9977153539657593, 0.987593948841095, 0.9906861186027527, 0.9959743618965149, 0.9970152378082275, 0.9977849721908569, 0.9984450936317444], 'rec_boxes': [array([234, 6, 316, 25], dtype=int16), array([38, 39, 73, 57], dtype=int16), array([122, 32, 201, 58], dtype=int16), array([227, 34, 346, 57], dtype=int16), array([351, 34, 391, 58], dtype=int16), array([417, 35, 534, 58], dtype=int16), array([34, 70, 78, 90], dtype=int16), array([287, 70, 328, 90], dtype=int16), array([454, 69, 496, 90], dtype=int16), array([ 17, 101, 95, 124], dtype=int16), array([144, 101, 178, 122], dtype=int16), array([278, 101, 338, 124], dtype=int16), array([448, 101, 503, 121], dtype=int16)]}}]}}
paddlex --pipeline layout_parsing \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/demo_paper.png \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device gpu:0
What's the result
```bash
{'res': {'input_path': 'demo_paper.png', 'model_settings': {'use_doc_preprocessor': True, 'use_general_ocr': True, 'use_seal_recognition': True, 'use_table_recognition': True, 'use_formula_recognition': True}, 'parsing_res_list': [{'layout_bbox': [46.905365, 44.05746, 565.6911, 217.74211], 'image': array([[[255, ..., 255], ..., [188, ..., 175]],
...,
[[255, ..., 255],
...,
[255, ..., 255]]], dtype=uint8), 'image_text': 'Efficient Hybrid Encoder\nCaevIst s1\nCaaveJs2\nCCFF\n\nFusion\n00âĄ0âĄ000000', 'layout': 'single'}, {'layout_bbox': [42.178703, 227.34215, 570.1248, 284.377], 'text': 'Figure 4, Overview of RT-DETR. We feed the features from the last three stages of the backbone into the encoder. The eficient hybrid\nencoder transforms multi-scale features into a sequence of image features through the Attention-based Intra-scale Feature Interaction (AIFI)\nfeatures to serve as initial object queries for the decoder, Finaly, the decoder with auxiliary prediction heads iteratively optimizes object\nand the CNN-based Cross-scale Feature Fusion (CCFF), Then, the uncertainty-minimal query selection selects a fixed number of encoder\nqueries to generate categories and boxes.', 'layout': 'single'}, {'layout_bbox': [53.227856, 294.16644, 283.854, 396.24164], 'image': array([[[255, ..., 255],
...,
[255, ..., 255]],
...,
[[255, ..., 255],
...,
[255, ..., 255]]], dtype=uint8), 'image_text': '1x1Conv\nFusion\nC\nNX\n1x1Conv\nRepBlock\nCConcatenate\nElement-wise add\nFlatten', 'layout': 'double'}, {'layout_bbox': [99.52045, 401.63477, 240.27536, 411.01334], 'text': 'Figure 5. The fusion block in CCFF', 'layout': 'double'}, {'layout_bbox': [41.776196, 418.61166, 296.82672, 554.4149], 'text': 'D, Ds, not only significantly reduces latency (35% faster),\nbut also improves accuracy (0.4% AP higher). CCFF is opti-\nmized based on the cross-scale fusion module, which inserts\nseveral fusion blocks consisting of convolutional layers into\nthe fusion path. The role of the fusion block is to fuse two\nadjacent scale features into a new feature, and its structure is\nillustrated in Figure 5. The fusion block contains two 1 Ã 1\nconvolutions to adjust the number of channels, N RepBlocks\ncomposed of RepConv [8] are used for feature fusion, and\nthe two-path outputs are fused by element-wise add. We\nformulate the calculation of the hybrid encoder as:', 'layout': 'double'}, {'layout_bbox': [92.092064, 563.1221, 297.57217, 607.2598], 'formula': '\\begin{array}{r l}{\\mathcal{Q}}&{=\\mathcal{K}=\\mathcal{V}=\\mathtt{F l a t t e r n}(\\mathcal{S}_{5}),}\\\\ {\\mathcal{F}_{5}}&{=\\mathtt{R e s h a p e}(\\mathtt{A I F I}(\\mathcal{Q},\\mathcal{K},\\mathcal{V})),\\qquad\\quad(1)}\\\\ {\\mathcal{O}}&{=\\mathtt{C C F F}(\\{\\mathcal{S}_{3},\\mathcal{S}_{4},\\mathcal{F}_{5}\\}),}\\end{array}', 'layout': 'double'}, {'layout_bbox': [41.275124, 613.64154, 298.0696, 636.9947], 'text': 'where Reshape represents restoring the shape of the flat-\ntened feature to the same shape as S5', 'layout': 'double'}, {'layout_bbox': [41.01993, 645.3085, 253.87302, 656.61505], 'text': '4.3.Uncertainty-minimal Ouery Selection', 'layout': 'double'}, {'layout_bbox': [39.780045, 664.1547, 296.5302, 724.974], 'text': 'To reduce the difficulty of optimizing object queries in\nDETR, several subsequent works [42, 44, 45] propose query\nselection schemes, which have in common that they use the\nconfidence score to select the top K features from the en-\ncoder to initialize object queries (or just position queries).', 'layout': 'double'}, {'layout_bbox': [316.3008, 289.54156, 573.4635, 415.4659], 'text': 'The confidence score represents the likelihood that the fea-\nture includes foreground objects. Nevertheless, the detector\nare required to simultaneously model the category and loca-\ntion of objects, both of which determine the quality of the\nfeatures. Hence, the performance score of the feature is a la-\ntent variable that is jointly correlated with both classification\nand localization. Based on the analysis, the current query\nselection lead to a considerable level of uncertainty in the\nselected features, resulting in sub-optimal initialization for\nthe decoder and hindering the performance of the detector.', 'layout': 'double'}, {'layout_bbox': [316.1587, 417.67807, 575.0031, 541.93054], 'text': 'To address this problem, we propose the uncertainty mini-\nmal query selection scheme, which explicitly constructs and\noptimizes the epistemic uncertainty to model the joint latent\nvariable of encoder features, thereby providing high-quality\nqueries for the decoder. Specifically, the feature uncertainty\nL/ is defined as the discrepancy between the predicted dis-\ntributions of localization P and classification C in Eq. (2).\nTo minimize the uncertainty of the queries, we integrate\nthe uncertainty into the loss function for the gradient-based\noptimization in Eq. 
(3).', 'layout': 'double'}, {'layout_bbox': [343.82712, 551.06995, 573.45465, 589.9438], 'formula': '\\begin{array}{r l r}{\\mathcal{U}(\\hat{\\mathcal{X}})=\\|\\mathcal{P}(\\hat{\\mathcal{X}})-\\mathcal{C}(\\hat{\\mathcal{X}})\\|,\\hat{\\mathcal{X}}\\in\\mathbb{R}^{D}}&{{}(2)}&{}\\\\ {\\mathcal{L}(\\hat{\\mathcal{X}},\\hat{\\mathcal{Y}},\\mathcal{Y})=\\mathcal{L}_{t o x}(\\hat{\\mathbf{b}},\\mathbf{b})+\\mathcal{L}_{c l s}(\\mathcal{U}(\\hat{\\mathcal{X}}),\\hat{\\mathbf{c}},\\mathbf{c})}&{{}(3)}\\end{array}', 'layout': 'double'}, {'layout_bbox': [316.74704, 598.45776, 573.39526, 636.35236], 'text': 'where and y denote the prediction and ground truth.\n= (e, b), C and b represent the category and bounding\nbox respectively, X represent the encoder feature.', 'layout': 'double'}, {'layout_bbox': [315.35437, 638.09393, 572.0008, 724.53687], 'text': 'Effectiveness analysis. To analyze the effectiveness of thc\nuncertainty-minimal query selection, we visualize the clas-\nsification scores and IoU scores of the selected features on\nCOCO va1.2017, Figure 6. We draw the scatterplot with\nclassification scores greater than 0.5. The purple and green\nwith uncertainty-minimal query selection and vanilla query\ndots represent the selected features from the model trained', 'layout': 'double'}], 'doc_preprocessor_res': {'input_path': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 0}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 1, 'label': 'image', 'score': 0.9874590635299683, 'coordinate': [46.905365, 44.05746, 565.6911, 217.74211]}, {'cls_id': 2, 'label': 'text', 'score': 0.9869957566261292, 'coordinate': [41.776196, 418.61166, 296.82672, 554.4149]}, {'cls_id': 2, 'label': 'text', 'score': 0.9792540073394775, 'coordinate': [39.780045, 664.1547, 296.5302, 724.974]}, {'cls_id': 2, 'label': 'text', 'score': 0.9792136549949646, 'coordinate': [316.3008, 289.54156, 573.4635, 415.4659]}, {'cls_id': 2, 'label': 'text', 'score': 0.9789648652076721, 'coordinate': [316.1587, 417.67807, 575.0031, 541.93054]}, {'cls_id': 1, 'label': 'image', 'score': 0.9786934852600098, 'coordinate': [53.227856, 294.16644, 283.854, 396.24164]}, {'cls_id': 2, 'label': 'text', 'score': 0.9765349626541138, 'coordinate': [315.35437, 638.09393, 572.0008, 724.53687]}, {'cls_id': 6, 'label': 'figure_title', 'score': 0.9575827717781067, 'coordinate': [42.178703, 227.34215, 570.1248, 284.377]}, {'cls_id': 2, 'label': 'text', 'score': 0.9554654359817505, 'coordinate': [41.275124, 613.64154, 298.0696, 636.9947]}, {'cls_id': 7, 'label': 'formula', 'score': 0.951255738735199, 'coordinate': [92.092064, 563.1221, 297.57217, 607.2598]}, {'cls_id': 7, 'label': 'formula', 'score': 0.9501133561134338, 'coordinate': [343.82712, 551.06995, 573.45465, 589.9438]}, {'cls_id': 6, 'label': 'figure_title', 'score': 0.9381633400917053, 'coordinate': [99.52045, 401.63477, 240.27536, 411.01334]}, {'cls_id': 2, 'label': 'text', 'score': 0.9283379316329956, 'coordinate': [316.74704, 598.45776, 573.39526, 636.35236]}, {'cls_id': 0, 'label': 'paragraph_title', 'score': 0.9257320761680603, 'coordinate': [41.01993, 645.3085, 253.87302, 656.61505]}]}, 'overall_ocr_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': array([[[169, 50],
...,
[169, 62]],
...,
[[ 39, 711],
...,
[ 39, 726]]], dtype=int16), 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0, 'rec_texts': ['Efficient Hybrid Encoder', 'CaevIst s1', 'CaaveJs2', 'CCFF', '', 'Fusion', '00âĄ0âĄ000000', 'Figure 4, Overview of RT-DETR. We feed the features from the last three stages of the backbone into the encoder. The eficient hybrid', 'encoder transforms multi-scale features into a sequence of image features through the Attention-based Intra-scale Feature Interaction (AIFI)', 'features to serve as initial object queries for the decoder, Finaly, the decoder with auxiliary prediction heads iteratively optimizes object', 'and the CNN-based Cross-scale Feature Fusion (CCFF), Then, the uncertainty-minimal query selection selects a fixed number of encoder', 'queries to generate categories and boxes.', '1x1Conv', 'Fusion', 'The confidence score represents the likelihood that the fea-', 'C', 'ture includes foreground objects. Nevertheless, the detector', 'are required to simultaneously model the category and loca-', 'NX', 'tion of objects, both of which determine the quality of the', '1x1Conv', 'features. Hence, the performance score of the feature is a la-', 'RepBlock', 'tent variable that is jointly correlated with both classification', 'and localization. Based on the analysis, the current query', 'CConcatenate', 'Element-wise add', 'Flatten', 'selection lead to a considerable level of uncertainty in the', 'selected features, resulting in sub-optimal initialization for', 'Figure 5. The fusion block in CCFF', 'the decoder and hindering the performance of the detector.', 'D, Ds, not only significantly reduces latency (35% faster),', 'To address this problem, we propose the uncertainty mini-', 'but also improves accuracy (0.4% AP higher). CCFF is opti-', 'mal query selection scheme, which explicitly constructs and', 'mized based on the cross-scale fusion module, which inserts', 'optimizes the epistemic uncertainty to model the joint latent', 'several fusion blocks consisting of convolutional layers into', 'variable of encoder features, thereby providing high-quality', 'the fusion path. The role of the fusion block is to fuse two', 'queries for the decoder. Specifically, the feature uncertainty', 'adjacent scale features into a new feature, and its structure is', 'L/ is defined as the discrepancy between the predicted dis-', 'illustrated in Figure 5. The fusion block contains two 1 à 1', 'tributions of localization P and classification C in Eq. (2).', 'convolutions to adjust the number of channels, N RepBlocks', 'To minimize the uncertainty of the queries, we integrate', 'composed of RepConv [8] are used for feature fusion, and', 'the uncertainty into the loss function for the gradient-based', 'the two-path outputs are fused by element-wise add. We', 'optimization in Eq. (3).', 'formulate the calculation of the hybrid encoder as:', 'u(x)=P(xīŧ-C(xīŧ,xâRD', '(2)', '=K =V =F1atten(Ss),', 'F = Reshape(AIFI(Q,K,V)),', '(1)', 'C(x.y)= Loīŧb,bīŧ+Cc(u(xīŧ,e,c)īŧ3)', 'O=CCFFīŧ{S,S,F}īŧ', 'where and y denote the prediction and ground truth.', 'where Reshape represents restoring the shape of the flat-', '= (e, b), C and b represent the category and bounding', 'tened feature to the same shape as S5', 'box respectively, X represent the encoder feature.', '4.3.Uncertainty-minimal Ouery Selection', 'Effectiveness analysis. 
To analyze the effectiveness of thc', 'uncertainty-minimal query selection, we visualize the clas-', 'To reduce the difficulty of optimizing object queries in', 'sification scores and IoU scores of the selected features on', 'DETR, several subsequent works [42, 44, 45] propose query', 'COCO va1.2017, Figure 6. We draw the scatterplot with', 'selection schemes, which have in common that they use the', 'classification scores greater than 0.5. The purple and green', 'confidence score to select the top K features from the en-', 'with uncertainty-minimal query selection and vanilla query', 'dots represent the selected features from the model trained', 'coder to initialize object queries (or just position queries).'], 'rec_scores': array([0.95921248, ..., 0.99757016]), 'rec_polys': array([[[169, 50],
...,
[169, 62]],
...,
[[ 39, 711],
...,
[ 39, 726]]], dtype=int16), 'rec_boxes': array([[169, ..., 62],
...,
[ 39, ..., 726]], dtype=int16)}, 'text_paragraphs_ocr_res': {'rec_polys': array([[[169, 50],
...,
[169, 62]],
...,
[[ 39, 711],
...,
[ 39, 726]]], dtype=int16), 'rec_texts': ['Efficient Hybrid Encoder', 'CaevIst s1', 'CaaveJs2', 'CCFF', '', 'Fusion', '00âĄ0âĄ000000', 'Figure 4, Overview of RT-DETR. We feed the features from the last three stages of the backbone into the encoder. The eficient hybrid', 'encoder transforms multi-scale features into a sequence of image features through the Attention-based Intra-scale Feature Interaction (AIFI)', 'features to serve as initial object queries for the decoder, Finaly, the decoder with auxiliary prediction heads iteratively optimizes object', 'and the CNN-based Cross-scale Feature Fusion (CCFF), Then, the uncertainty-minimal query selection selects a fixed number of encoder', 'queries to generate categories and boxes.', '1x1Conv', 'Fusion', 'The confidence score represents the likelihood that the fea-', 'C', 'ture includes foreground objects. Nevertheless, the detector', 'are required to simultaneously model the category and loca-', 'NX', 'tion of objects, both of which determine the quality of the', '1x1Conv', 'features. Hence, the performance score of the feature is a la-', 'RepBlock', 'tent variable that is jointly correlated with both classification', 'and localization. Based on the analysis, the current query', 'CConcatenate', 'Element-wise add', 'Flatten', 'selection lead to a considerable level of uncertainty in the', 'selected features, resulting in sub-optimal initialization for', 'Figure 5. The fusion block in CCFF', 'the decoder and hindering the performance of the detector.', 'D, Ds, not only significantly reduces latency (35% faster),', 'To address this problem, we propose the uncertainty mini-', 'but also improves accuracy (0.4% AP higher). CCFF is opti-', 'mal query selection scheme, which explicitly constructs and', 'mized based on the cross-scale fusion module, which inserts', 'optimizes the epistemic uncertainty to model the joint latent', 'several fusion blocks consisting of convolutional layers into', 'variable of encoder features, thereby providing high-quality', 'the fusion path. The role of the fusion block is to fuse two', 'queries for the decoder. Specifically, the feature uncertainty', 'adjacent scale features into a new feature, and its structure is', 'L/ is defined as the discrepancy between the predicted dis-', 'illustrated in Figure 5. The fusion block contains two 1 à 1', 'tributions of localization P and classification C in Eq. (2).', 'convolutions to adjust the number of channels, N RepBlocks', 'To minimize the uncertainty of the queries, we integrate', 'composed of RepConv [8] are used for feature fusion, and', 'the uncertainty into the loss function for the gradient-based', 'the two-path outputs are fused by element-wise add. We', 'optimization in Eq. (3).', 'formulate the calculation of the hybrid encoder as:', 'where and y denote the prediction and ground truth.', 'where Reshape represents restoring the shape of the flat-', '= (e, b), C and b represent the category and bounding', 'tened feature to the same shape as S5', 'box respectively, X represent the encoder feature.', '4.3.Uncertainty-minimal Ouery Selection', 'Effectiveness analysis. To analyze the effectiveness of thc', 'uncertainty-minimal query selection, we visualize the clas-', 'To reduce the difficulty of optimizing object queries in', 'sification scores and IoU scores of the selected features on', 'DETR, several subsequent works [42, 44, 45] propose query', 'COCO va1.2017, Figure 6. 
We draw the scatterplot with', 'selection schemes, which have in common that they use the', 'classification scores greater than 0.5. The purple and green', 'confidence score to select the top K features from the en-', 'with uncertainty-minimal query selection and vanilla query', 'dots represent the selected features from the model trained', 'coder to initialize object queries (or just position queries).'], 'rec_scores': array([0.95921248, ..., 0.99757016]), 'rec_boxes': array([[169, ..., 62],
...,
[ 39, ..., 726]], dtype=int16)}, 'formula_res_list': [{'input_path': None, 'page_index': None, 'rec_formula': '\\begin{array}{r l}{\\mathcal{Q}}&{=\\mathcal{K}=\\mathcal{V}=\\mathtt{F l a t t e r n}(\\mathcal{S}_{5}),}\\\\ {\\mathcal{F}_{5}}&{=\\mathtt{R e s h a p e}(\\mathtt{A I F I}(\\mathcal{Q},\\mathcal{K},\\mathcal{V})),\\qquad\\quad(1)}\\\\ {\\mathcal{O}}&{=\\mathtt{C C F F}(\\{\\mathcal{S}_{3},\\mathcal{S}_{4},\\mathcal{F}_{5}\\}),}\\end{array}', 'formula_region_id': 1, 'dt_polys': [92.092064, 563.1221, 297.57217, 607.2598]}, {'input_path': None, 'page_index': None, 'rec_formula': '\\begin{array}{r l r}{\\mathcal{U}(\\hat{\\mathcal{X}})=\\|\\mathcal{P}(\\hat{\\mathcal{X}})-\\mathcal{C}(\\hat{\\mathcal{X}})\\|,\\hat{\\mathcal{X}}\\in\\mathbb{R}^{D}}&{{}(2)}&{}\\\\ {\\mathcal{L}(\\hat{\\mathcal{X}},\\hat{\\mathcal{Y}},\\mathcal{Y})=\\mathcal{L}_{t o x}(\\hat{\\mathbf{b}},\\mathbf{b})+\\mathcal{L}_{c l s}(\\mathcal{U}(\\hat{\\mathcal{X}}),\\hat{\\mathbf{c}},\\mathbf{c})}&{{}(3)}\\end{array}', 'formula_region_id': 2, 'dt_polys': [343.82712, 551.06995, 573.45465, 589.9438]}]}}
```
paddlex --pipeline layout_parsing_v2 \
--input layout_parsing_v2_demo.png \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--use_textline_orientation False \
--save_path ./output \
--device gpu:0
What's the result
```bash {'res': {'input_path': 'layout_parsing_v2_demo.png', 'model_settings': {'use_doc_preprocessor': False, 'use_general_ocr': True, 'use_seal_recognition': True, 'use_table_recognition': True, 'use_formula_recognition': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 2, 'label': 'text', 'score': 0.9853514432907104, 'coordinate': [770.9531, 776.6814, 1122.6057, 1058.7322]}, {'cls_id': 1, 'label': 'image', 'score': 0.9848673939704895, 'coordinate': [775.7434, 202.27979, 1502.8113, 686.02136]}, {'cls_id': 2, 'label': 'text', 'score': 0.983731746673584, 'coordinate': [1152.3197, 1113.3275, 1503.3029, 1346.586]}, {'cls_id': 2, 'label': 'text', 'score': 0.9832221865653992, 'coordinate': [1152.5602, 801.431, 1503.8436, 986.3563]}, {'cls_id': 2, 'label': 'text', 'score': 0.9829439520835876, 'coordinate': [9.549545, 849.5713, 359.1173, 1058.7488]}, {'cls_id': 2, 'label': 'text', 'score': 0.9811657667160034, 'coordinate': [389.58298, 1137.2659, 740.66235, 1346.7488]}, {'cls_id': 2, 'label': 'text', 'score': 0.9775941371917725, 'coordinate': [9.1302185, 201.85, 359.0409, 339.05692]}, {'cls_id': 2, 'label': 'text', 'score': 0.9750366806983948, 'coordinate': [389.71454, 752.96924, 740.544, 889.92456]}, {'cls_id': 2, 'label': 'text', 'score': 0.9738152027130127, 'coordinate': [389.94565, 298.55988, 740.5585, 435.5124]}, {'cls_id': 2, 'label': 'text', 'score': 0.9737328290939331, 'coordinate': [771.50256, 1065.4697, 1122.2582, 1178.7324]}, {'cls_id': 2, 'label': 'text', 'score': 0.9728517532348633, 'coordinate': [1152.5154, 993.3312, 1503.2349, 1106.327]}, {'cls_id': 2, 'label': 'text', 'score': 0.9725610017776489, 'coordinate': [9.372787, 1185.823, 359.31738, 1298.7227]}, {'cls_id': 2, 'label': 'text', 'score': 0.9724331498146057, 'coordinate': [389.62848, 610.7389, 740.83234, 746.2377]}, {'cls_id': 2, 'label': 'text', 'score': 0.9720287322998047, 'coordinate': [389.29898, 897.0936, 741.41516, 1034.6616]}, {'cls_id': 2, 'label': 'text', 'score': 0.9713053703308105, 'coordinate': [10.323685, 1065.4663, 359.6786, 1178.8872]}, {'cls_id': 2, 'label': 'text', 'score': 0.9689728021621704, 'coordinate': [9.336395, 537.6609, 359.2901, 652.1881]}, {'cls_id': 2, 'label': 'text', 'score': 0.9684857130050659, 'coordinate': [10.7608185, 345.95068, 358.93616, 434.64087]}, {'cls_id': 2, 'label': 'text', 'score': 0.9681928753852844, 'coordinate': [9.674866, 658.89075, 359.56528, 770.4319]}, {'cls_id': 2, 'label': 'text', 'score': 0.9634978175163269, 'coordinate': [770.9464, 1281.1785, 1122.6522, 1346.7156]}, {'cls_id': 2, 'label': 'text', 'score': 0.96304851770401, 'coordinate': [390.0113, 201.28055, 740.1684, 291.53073]}, {'cls_id': 2, 'label': 'text', 'score': 0.962053120136261, 'coordinate': [391.21393, 1040.952, 740.5046, 1130.32]}, {'cls_id': 2, 'label': 'text', 'score': 0.9565253853797913, 'coordinate': [10.113251, 777.1482, 359.439, 842.437]}, {'cls_id': 2, 'label': 'text', 'score': 0.9497362375259399, 'coordinate': [390.31357, 537.86285, 740.47595, 603.9285]}, {'cls_id': 2, 'label': 'text', 'score': 0.9371236562728882, 'coordinate': [10.2034, 1305.9753, 359.5958, 1346.7295]}, {'cls_id': 0, 'label': 'paragraph_title', 'score': 0.9338151216506958, 'coordinate': [791.6062, 1200.8479, 1103.3257, 1259.9324]}, {'cls_id': 0, 'label': 'paragraph_title', 'score': 0.9326773285865784, 'coordinate': [408.0737, 457.37024, 718.9509, 516.63464]}, {'cls_id': 0, 'label': 'paragraph_title', 'score': 0.9274250864982605, 'coordinate': [29.448685, 456.6762, 340.99194, 515.6999]}, 
{'cls_id': 2, 'label': 'text', 'score': 0.8742568492889404, 'coordinate': [1154.7095, 777.3624, 1330.3086, 794.5853]}, {'cls_id': 2, 'label': 'text', 'score': 0.8442489504814148, 'coordinate': [586.49316, 160.15454, 927.468, 179.64203]}, {'cls_id': 11, 'label': 'doc_title', 'score': 0.8332607746124268, 'coordinate': [133.80017, 37.41908, 1380.8601, 124.1429]}, {'cls_id': 6, 'label': 'figure_title', 'score': 0.6770150661468506, 'coordinate': [812.1718, 705.1199, 1484.6973, 747.1692]}]}, 'overall_ocr_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': array([[[ 133, 35], ..., [ 133, 131]],
...,
[[1154, 1323],
...,
[1152, 1355]]], dtype=int16), 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0.0, 'rec_texts': ['åŠååæšäē¤åž', 'æåģēåč°æĄĨæĸ', 'æŦæĨ莰č
', 'æ˛å°æ', 'äģģ', 'åŊĻ', 'éģåšæ', 'čēĢįä¸åŊäŧ įģæ°ææčŖ
įåįĢįšéäēé', 'åįĢįšéäēéĢįæč˛ä¸į įŠļéĸåäŊåģēįĢīŧåŧ', 'åš´äžæŦĄįģå°čĄ¨æŧä¸åŊæ°æčãį°äģŖčãæåč', '莞äēä¸åŊč¯č¨č¯žį¨åä¸åŊæåč¯žį¨īŧæŗ¨ååĻ', 'į,æŧåĻįčå§ŋčĩĸåžį°åēč§äŧéĩéĩæåŖ°ãčŋ', 'į2ä¸äŊäēēæŦĄã10äŊåš´æĨīŧåįšåéĸåˇ˛æä¸ē', 'æ¯æĨååįĢįšéäēéĢįæč˛ä¸į įŠļéĸåååĻ', 'åŊå°æ°äŧäēč§Ŗä¸åŊįä¸æįĒåŖã', 'éĸ(äģĨä¸įŽį§°"åįšåéĸ")丞å"åčŋæ°åš´"ä¸åŊ', 'éģé¸ŖéŖ襨į¤ēīŧéįæĨåĻäš ä¸æįäēēæĨį', 'æčæ¯čĩįåēæ¯ã', 'åĸå¤īŧéŋæ¯éŠŦæ大åĻæåĻįšåˇ˛éžäģĨæģĄčļŗæåĻ', 'ä¸åŊååįĢįšéäēäŧ įģåč°æˇąåãčŋåš´', 'éčĻã2024åš´4æīŧįąä¸äŧčééåĸæåąå', 'æĨ,å¨éĢč´¨éå
ąåģē"ä¸å¸Ļä¸čˇ¯"æĄæļä¸īŧä¸å两', 'åˇčˇ¯æĄĨæŋåģēįåéĸæåĻæĨŧ饚įŽå¨éŋæ¯éŠŦæåŧ', 'åŊäēēæäē¤æĩä¸ææˇąåīŧäēåŠåäŊįæ°æåēįĄ', 'åˇĨåģē莞īŧéĸ莥äģåš´ä¸ååš´åŗģåˇĨīŧåģēæåå°ä¸ēå', 'æĨįæˇąåã', 'įšåéĸæäžå
¨æ°įååĻåēå°ã', 'âåĻåĨŊä¸æīŧæäģŦį', 'âå¨ä¸åŊåĻäš įįģå', 'æĒæĨä¸æ¯æĸĻâ', '莊æįå°æ´åšŋéįä¸įâ', 'âé˛čąæžåč¯æäŊ ææ ˇčĩ°čŋīŧ大å°įĨéäŊ ', 'å¤åš´æĨīŧåįĢįšéäēåšŋ大čĩ´åįåĻįå', 'åŋä¸įæ¯ä¸ä¸Ēč§čŊâĻ"åįĢįšéäēéŋæ¯éŠŦæ', 'åščŽäēēåį§¯ææčēĢåŊåŽļåģē莞īŧæä¸ēåŠåč¯ĨåŊ', '大åĻįģŧåæĨŧäēåąīŧä¸éĩäŧįžįæåŖ°å¨čĩ°åģéå', 'ååąįäēēæååä¸ååĨŊįč§č¯č
åæ¨å¨č
ã', 'åãåžĒįįæįæåžčŊģčŊģæ¨åŧä¸é´æ厤įé¨īŧ', 'å¨åįĢįšéäēå
¨åŊåĻåĨŗčįåˇĨäŊįįēĻįŋ°', 'åĻįäģŦæŖčˇįčå¸åĻåąä¸æææ˛ãåä¸éĻæãã', 'å¨ÂˇįšéĻå°åžˇÂˇå¯čąåĄå°ąæ¯å
ļä¸ä¸äŊãåĨšæžå¨', 'čŋæ¯åįšåéĸéŋæ¯éŠŦæ大åĻæåĻįšįä¸', 'ä¸ååĨŗååĻéĸæģč¯ģįĄåŖĢåĻäŊīŧį įŠļæšåæ¯åĨŗ', 'čä¸æææ˛č¯žãä¸ēäē莊åĻįäģŦæ´åĨŊå°įč§Ŗæ', 'æ§éĸå¯ŧåä¸į¤žäŧååąãå
ļé´īŧåĨšåŽå°čĩ°čŽŋä¸åŊ', 'č¯å¤§æīŧčå¸å°¤æ¯æ¡įŠįŊéģåžˇč¨å°Âˇäž¯čĩå é', 'å¤ä¸Ēå°åēīŧčˇåžäēč§å¯ä¸åŊį¤žäŧååąįįŦŦä¸', 'å¨åįĢįšéäēä¸äš
å丞åįįŦŦå
åąä¸åŊéŖįæåčä¸īŧåŊå°å°åĻįäŊéĒéŖįåļäŊã', 'åįŋģč¯åč§Ŗéæč¯ãéįäŧ´åĨåŖ°åčĩˇīŧåĻįäģŦ', 'æčĩæã', 'ä¸åŊéŠģåįĢįšéäē大äŊŋéĻäžåž', 'čžšåąčžšéįčææå¨čēĢäŊīŧį°åēæ°æ°įįã', 'č°čĩˇå¨ä¸åŊæąåĻįįģåīŧįēĻįŋ°å¨čŽ°åŋįš', 'âčŋæ¯ä¸æææ˛åįē§įīŧå
ąæ32äēēãåĻ', 'æ°īŧ"ä¸åŊįååąå¨åŊäģä¸įæ¯įŦä¸æ äēįã', 'âä¸įŽĄčŋčŋéŊæ¯åŽĸäēēīŧč¯ˇä¸į¨åŽĸæ°īŧį¸įēĻ', 'įĻįåįēĸæĩˇįåįŠéĻã', 'į大é¨åæĨčĒéĻéŊéŋæ¯éŠŦæįä¸å°åĻīŧåš´éž', 'æ˛ŋįä¸åŊįšč˛į¤žäŧä¸ģäšéčˇ¯ååŽåčĄīŧä¸åŊ', 'åĨŊäēå¨ä¸čĩˇæäģŦæŦĸčŋäŊ "å¨ä¸åēä¸åé', 'åįŠéĻäēåąéåįä¸ä¸ĒåæčĒéŋæåŠ', 'æå°įäģ
æ6å˛ãâå°¤æ¯æåč¯čŽ°č
ã', 'åé äēååąåĨčŋšīŧčŋä¸åéŊįĻģä¸åŧä¸åŊå
ąäē§å
', 'åš´čč°æ´ģå¨ä¸īŧååˇčˇ¯æĄĨä¸æšååˇĨååŊå°å¤§', 'æ¯å¤åįä¸åŊå¤äģŖéļåļé
å¨īŧįŊčēĢä¸åį', 'å°¤æ¯æäģåš´23å˛īŧæ¯åįĢįšéäēä¸æå
ŦįĢ', 'įéĸå¯ŧãä¸åŊįååąįģéĒåŧåžčŽ¸å¤åŊåŽļåĻäš ', 'åĻįååąãåäēŦæŦĸčŋäŊ ããåįĢįšéäēææ¯åĻ', 'âä¸""å""įĻ
"âåąą"įæąåãâčŋäģļæįŠč¯', 'åĻæ Ąįčēæ¯čå¸ãåĨš12å˛åŧå§å¨åįšåéĸåĻ', 'åé´ãâ', 'éĸ莥įŽæēį§åĻä¸åˇĨį¨ä¸ä¸åĻįé˛å¤ĢåĄÂˇč°ĸæ', 'æ,åžæŠäģĨåæäģŦå°ąéčŋæĩˇä¸ä¸įģ¸äščˇ¯čŋčĄ', 'äš ä¸æ,å¨2017åš´įŦŦååą"æąč¯æĄĨ"ä¸įä¸åĻį', 'æŖå¨čĨŋå大åĻåĻäš įåįĢįšéäēååŖĢį', 'æ¯å
ļä¸ä¸åæŧåąč
īŧåĨšåžæŠäžŋå¨åéĸåĻäš ä¸', 'č´¸æåžæĨä¸æåäē¤æĩãčŋäšæ¯åįĢįšéäē', 'ä¸ææ¯čĩä¸čˇåžåįĢįšéäēčĩåēįŦŦä¸åīŧåšļå', 'įŠåĸįåĄÂˇæŗŊįŠäŧ寚ä¸åŊæææˇąåææ
ã8', 'æīŧä¸į´å¨ä¸ēåģä¸åŊįåĻäŊåå¤ã"čŋåĨæč¯', 'ä¸ä¸åŊååĨŊäē¤åžåå˛įæåč¯æã"åįēĸæĩˇ', 'åäŧ´äģŖ襨åįĢįšéäēååžä¸åŊåå åŗčĩīŧčˇåž', 'åš´åīŧå¨åäēŦå¸č大åĻčˇåžįĄåŖĢåĻäŊåīŧįŠåĸ', 'æ¯æäģŦ两åŊäēēæ°åč°įįå¨åį
§ãæ čŽēæ¯æ', 'įåįŠéĻį įŠļä¸æįŽé¨č´č´Ŗäēēäŧč¨äēæ¯Âˇįš', 'åĸäŊäŧčåĨã2022åš´čĩˇīŧå°¤æ¯æåŧå§å¨åįšå', 'įåĄå¨į¤žäē¤åĒäŊä¸åä¸čŋæ ˇä¸æŽĩč¯īŧ"čŋæ¯æ', 'čēĢäēåįĢįšéäēåēįĄčŽžæŊåģē莞įä¸äŧååˇĨīŧ', 'æ¯æŗå
šåč¯´ã', 'éĸå
ŧčææä¸æææ˛īŧæ¯å¨æĢ两ä¸Ēč¯žæļãâä¸åŊ', 'äēēįįéčĻä¸æĨīŧčĒæ¤ææĨæäēä¸åååēį', 'čŋæ¯å¨ä¸åŊįåĻįåįĢįšéäēåĻåīŧ两åŊäēē', 'åįĢįšéäēåŊåŽļåįŠéĻčå¤åĻåäēēįąģåĻ', 'æåå大į˛žæˇąīŧæå¸ææįåĻįäģŦčŊå¤éčŋä¸', 'éåīŧčĩäēæįŠŋčļčæŖįåéãâ', 'æ°æēæåĒåīŧåŋ
å°æ¨å¨ä¸¤åŊå
ŗįŗģä¸æååå', 'į įŠļåč˛å°č¡įšéĻå°åžˇåååįąä¸åŊæ', 'æææ˛æ´åĨŊå°įč§Ŗä¸åŊæåã"åĨšč¯´ã', 'įŠåĸįåĄå¯åå
ŗæŗ¨ä¸åŊå¨įģæĩãį§æãæ', 'åąã"é˛å¤ĢåĄč¯´ã', 'åãäģ襨į¤ēīŧâåĻäš åŊŧæ¤įč¯č¨åæåīŧå°å¸Ž', 'âå§å§īŧäŊ æŗåģä¸åŊåīŧ""é常æŗīŧææŗ', 'č˛įéĸåįååąīŧä¸åŊå¨į§į įæšéĸįåŽå', 'åįĢįšéäēéĢįæč˛å§åäŧä¸ģäģģåŠįč¨', 'åŠåä¸ä¸¤åŊäēēæ°æ´åĨŊå°įč§ŖåŊŧæ¤īŧåŠååæš', 'åģįæ
åŽĢãįŦéŋåã"å°¤æ¯æįåĻįä¸æä¸å¯š', 'ä¸æĨäŋąåĸãå¨ä¸åŊåĻäš įįģå莊æįå°æ´åšŋ', 'éŠŦį襨į¤ēīŧ"æ¯åš´æäģŦéŊäŧįģįģåĻįå°ä¸åŊčŽŋ', 'äē¤åžīŧæåģēåč°æĄĨæĸã"', 'čŊæåčįå§åĻšīŧå§å§é˛å¨
äģåš´15å˛īŧåĻšåĻš', 'éįä¸įīŧäģä¸åįåĒæĩ
ã', 'éŽåĻäš īŧįŽåæčļ
čŋ5000ååįĢįšéäēåĻį', 'åįĢįšéäēåŊåŽļåįŠéĻéĻéŋåĄåä¸ÂˇåĒ', 'čå¨
14å˛īŧ两äēēéŊåˇ˛å¨åįšåéĸåĻäš å¤åš´īŧ', '23å˛įččŋĒäē¡åæ¯čæŗč¯ēæ¯åˇ˛å¨åįš', 'å¨ä¸åŊįåĻãåĻäš ä¸åŊįæč˛įģéĒīŧæåŠäē', 'éčžžå§Âˇäŧį´ įĻæžå¤æŦĄčŽŋéŽä¸åŊīŧ寚ä¸åææ', 'ä¸æč¯´åžæ ŧå¤æĩåŠã', 'åéĸåĻäš 3åš´īŧå¨ä¸åŊäšĻæŗãä¸åŊįģįæšéĸ襨', 'æååįĢįšéäēįæč˛æ°´åšŗãâ', 'įäŧ æŋä¸åæ°ãį°äģŖååįŠéĻįåģē莞ä¸ååą', 'é˛å¨
å¯ščŽ°č
č¯´īŧ"čŋäēåš´æĨīŧæį寚ä¸æ', 'į°ååäŧį§īŧå¨2024åš´åįĢįšéäēčĩåēį', 'âå
ąååä¸įåąį¤ēé', 'å°čąĄæˇąåģãâä¸åŊåįŠéĻä¸äģ
æ莸å¤äŋååŽåĨŊ', 'åä¸åŊæåįįįąīŧæäģŦå§åĻšäŋŠå§įģį¸äēéŧ', 'âæąč¯æĄĨ"æ¯čĩä¸čˇåžä¸įåĨãččŋĒäēč¯´īŧ"åĻ', 'įæįŠīŧčŋå
åčŋį¨å
čŋį§æææŽĩčŋčĄåąį¤ēīŧ', 'åąīŧä¸čĩˇåĻäš ãæäģŦįä¸æä¸å¤Šæ¯ä¸å¤ŠåĨŊīŧčŋ', 'äš ä¸åŊäšĻæŗ莊æįå
åŋååžåŽåŽåįē¯į˛šãæ', 'æ´˛åäēæ´˛įįŋįææâ', '帎åŠäēēäģŦæ´åĨŊįč§Ŗä¸åææã"åĄåä¸č¯´īŧ"å', 'åĻäŧäēä¸ææåä¸åŊčãæäģŦä¸åŽčĻå°ä¸åŊ', 'äšåæŦĸä¸åŊįæéĨ°,å¸ææĒæĨčŊåģä¸åŊåĻäš īŧ', 'įĢįšéäēä¸ä¸åŊéŊæĨææ äš
įææīŧå§įģį¸', 'åģãåĻåĨŊä¸æīŧæäģŦįæĒæĨä¸æ¯æĸĻīŧ"', 'æä¸åŊä¸åæ°æå
į´ čå
ĨæčŖ
莞莥ä¸īŧåäŊ', 'äģéŋæ¯éŠŦæåēåīŧæ˛ŋįčŋčæ˛æįįåąą', 'äēįč§Ŗãį¸äēå°éãæå¸ææĒæĨä¸ä¸åŊåčĄ', 'æŽåįšåéĸä¸æšéĸéŋéģé¸ŖéŖäģįģīŧčŋæ', 'åēæ´å¤į˛žįžäŊåīŧäšæåįšæååäēĢįģæ´å¤', 'å
Ŧčˇ¯ä¸čˇ¯åä¸å¯ģæžä¸čˇ¯å°čŋšã銹čŊĻ两ä¸Ēå°', 'å åŧēåäŊīŧå
ąååä¸įåąį¤ēéæ´˛åäēæ´˛įįŋ', 'åéĸæįĢäē2013åš´3æīŧįąč´ĩåˇč´ĸįģ大åĻå', 'įä¸åŊæåãâ', 'æļīŧ莰č
æĨå°äŊäēåįĢįšéäē港åŖåå¸éŠŦč¨', 'įææãâ'], 'rec_scores': array([0.99943757, ..., 0.98181838]), 'rec_polys': array([[[ 133, 35],
...,
[ 133, 131]],
...,
[[1154, 1323],
...,
[1152, 1355]]], dtype=int16), 'rec_boxes': array([[ 133, ..., 131],
...,
[1152, ..., 1359]], dtype=int16)}, 'text_paragraphs_ocr_res': {'rec_polys': array([[[ 133, 35],
...,
[ 133, 131]],
...,
[[1154, 1323],
...,
[1152, 1355]]], dtype=int16), 'rec_texts': ['åŠååæšäē¤åž', 'æåģēåč°æĄĨæĸ', 'æŦæĨ莰č
', 'æ˛å°æ', 'äģģ', 'åŊĻ', 'éģåšæ', 'čēĢįä¸åŊäŧ įģæ°ææčŖ
įåįĢįšéäēé', 'åįĢįšéäēéĢįæč˛ä¸į įŠļéĸåäŊåģēįĢīŧåŧ', 'åš´äžæŦĄįģå°čĄ¨æŧä¸åŊæ°æčãį°äģŖčãæåč', '莞äēä¸åŊč¯č¨č¯žį¨åä¸åŊæåč¯žį¨īŧæŗ¨ååĻ', 'į,æŧåĻįčå§ŋčĩĸåžį°åēč§äŧéĩéĩæåŖ°ãčŋ', 'į2ä¸äŊäēēæŦĄã10äŊåš´æĨīŧåįšåéĸåˇ˛æä¸ē', 'æ¯æĨååįĢįšéäēéĢįæč˛ä¸į įŠļéĸåååĻ', 'åŊå°æ°äŧäēč§Ŗä¸åŊįä¸æįĒåŖã', 'éĸ(äģĨä¸įŽį§°"åįšåéĸ")丞å"åčŋæ°åš´"ä¸åŊ', 'éģé¸ŖéŖ襨į¤ēīŧéįæĨåĻäš ä¸æįäēēæĨį', 'æčæ¯čĩįåēæ¯ã', 'åĸå¤īŧéŋæ¯éŠŦæ大åĻæåĻįšåˇ˛éžäģĨæģĄčļŗæåĻ', 'ä¸åŊååįĢįšéäēäŧ įģåč°æˇąåãčŋåš´', 'éčĻã2024åš´4æīŧįąä¸äŧčééåĸæåąå', 'æĨ,å¨éĢč´¨éå
ąåģē"ä¸å¸Ļä¸čˇ¯"æĄæļä¸īŧä¸å两', 'åˇčˇ¯æĄĨæŋåģēįåéĸæåĻæĨŧ饚įŽå¨éŋæ¯éŠŦæåŧ', 'åŊäēēæäē¤æĩä¸ææˇąåīŧäēåŠåäŊįæ°æåēįĄ', 'åˇĨåģē莞īŧéĸ莥äģåš´ä¸ååš´åŗģåˇĨīŧåģēæåå°ä¸ēå', 'æĨįæˇąåã', 'įšåéĸæäžå
¨æ°įååĻåēå°ã', 'âåĻåĨŊä¸æīŧæäģŦį', 'âå¨ä¸åŊåĻäš įįģå', 'æĒæĨä¸æ¯æĸĻâ', '莊æįå°æ´åšŋéįä¸įâ', 'âé˛čąæžåč¯æäŊ ææ ˇčĩ°čŋīŧ大å°įĨéäŊ ', 'å¤åš´æĨīŧåįĢįšéäēåšŋ大čĩ´åįåĻįå', 'åŋä¸įæ¯ä¸ä¸Ēč§čŊâĻ"åįĢįšéäēéŋæ¯éŠŦæ', 'åščŽäēēåį§¯ææčēĢåŊåŽļåģē莞īŧæä¸ēåŠåč¯ĨåŊ', '大åĻįģŧåæĨŧäēåąīŧä¸éĩäŧįžįæåŖ°å¨čĩ°åģéå', 'ååąįäēēæååä¸ååĨŊįč§č¯č
åæ¨å¨č
ã', 'åãåžĒįįæįæåžčŊģčŊģæ¨åŧä¸é´æ厤įé¨īŧ', 'å¨åįĢįšéäēå
¨åŊåĻåĨŗčįåˇĨäŊįįēĻįŋ°', 'åĻįäģŦæŖčˇįčå¸åĻåąä¸æææ˛ãåä¸éĻæãã', 'å¨ÂˇįšéĻå°åžˇÂˇå¯čąåĄå°ąæ¯å
ļä¸ä¸äŊãåĨšæžå¨', 'čŋæ¯åįšåéĸéŋæ¯éŠŦæ大åĻæåĻįšįä¸', 'ä¸ååĨŗååĻéĸæģč¯ģįĄåŖĢåĻäŊīŧį įŠļæšåæ¯åĨŗ', 'čä¸æææ˛č¯žãä¸ēäē莊åĻįäģŦæ´åĨŊå°įč§Ŗæ', 'æ§éĸå¯ŧåä¸į¤žäŧååąãå
ļé´īŧåĨšåŽå°čĩ°čŽŋä¸åŊ', 'č¯å¤§æīŧčå¸å°¤æ¯æ¡įŠįŊéģåžˇč¨å°Âˇäž¯čĩå é', 'å¤ä¸Ēå°åēīŧčˇåžäēč§å¯ä¸åŊį¤žäŧååąįįŦŦä¸', 'å¨åįĢįšéäēä¸äš
å丞åįįŦŦå
åąä¸åŊéŖįæåčä¸īŧåŊå°å°åĻįäŊéĒéŖįåļäŊã', 'åįŋģč¯åč§Ŗéæč¯ãéįäŧ´åĨåŖ°åčĩˇīŧåĻįäģŦ', 'æčĩæã', 'ä¸åŊéŠģåįĢįšéäē大äŊŋéĻäžåž', 'čžšåąčžšéįčææå¨čēĢäŊīŧį°åēæ°æ°įįã', 'č°čĩˇå¨ä¸åŊæąåĻįįģåīŧįēĻįŋ°å¨čŽ°åŋįš', 'âčŋæ¯ä¸æææ˛åįē§įīŧå
ąæ32äēēãåĻ', 'æ°īŧ"ä¸åŊįååąå¨åŊäģä¸įæ¯įŦä¸æ äēįã', 'âä¸įŽĄčŋčŋéŊæ¯åŽĸäēēīŧč¯ˇä¸į¨åŽĸæ°īŧį¸įēĻ', 'įĻįåįēĸæĩˇįåįŠéĻã', 'į大é¨åæĨčĒéĻéŊéŋæ¯éŠŦæįä¸å°åĻīŧåš´éž', 'æ˛ŋįä¸åŊįšč˛į¤žäŧä¸ģäšéčˇ¯ååŽåčĄīŧä¸åŊ', 'åĨŊäēå¨ä¸čĩˇæäģŦæŦĸčŋäŊ "å¨ä¸åēä¸åé', 'åįŠéĻäēåąéåįä¸ä¸ĒåæčĒéŋæåŠ', 'æå°įäģ
æ6å˛ãâå°¤æ¯æåč¯čŽ°č
ã', 'åé äēååąåĨčŋšīŧčŋä¸åéŊįĻģä¸åŧä¸åŊå
ąäē§å
', 'åš´čč°æ´ģå¨ä¸īŧååˇčˇ¯æĄĨä¸æšååˇĨååŊå°å¤§', 'æ¯å¤åįä¸åŊå¤äģŖéļåļé
å¨īŧįŊčēĢä¸åį', 'å°¤æ¯æäģåš´23å˛īŧæ¯åįĢįšéäēä¸æå
ŦįĢ', 'įéĸå¯ŧãä¸åŊįååąįģéĒåŧåžčŽ¸å¤åŊåŽļåĻäš ', 'åĻįååąãåäēŦæŦĸčŋäŊ ããåįĢįšéäēææ¯åĻ', 'âä¸""å""įĻ
"âåąą"įæąåãâčŋäģļæįŠč¯', 'åĻæ Ąįčēæ¯čå¸ãåĨš12å˛åŧå§å¨åįšåéĸåĻ', 'åé´ãâ', 'éĸ莥įŽæēį§åĻä¸åˇĨį¨ä¸ä¸åĻįé˛å¤ĢåĄÂˇč°ĸæ', 'æ,åžæŠäģĨåæäģŦå°ąéčŋæĩˇä¸ä¸įģ¸äščˇ¯čŋčĄ', 'äš ä¸æ,å¨2017åš´įŦŦååą"æąč¯æĄĨ"ä¸įä¸åĻį', 'æŖå¨čĨŋå大åĻåĻäš įåįĢįšéäēååŖĢį', 'æ¯å
ļä¸ä¸åæŧåąč
īŧåĨšåžæŠäžŋå¨åéĸåĻäš ä¸', 'č´¸æåžæĨä¸æåäē¤æĩãčŋäšæ¯åįĢįšéäē', 'ä¸ææ¯čĩä¸čˇåžåįĢįšéäēčĩåēįŦŦä¸åīŧåšļå', 'įŠåĸįåĄÂˇæŗŊįŠäŧ寚ä¸åŊæææˇąåææ
ã8', 'æīŧä¸į´å¨ä¸ēåģä¸åŊįåĻäŊåå¤ã"čŋåĨæč¯', 'ä¸ä¸åŊååĨŊäē¤åžåå˛įæåč¯æã"åįēĸæĩˇ', 'åäŧ´äģŖ襨åįĢįšéäēååžä¸åŊåå åŗčĩīŧčˇåž', 'åš´åīŧå¨åäēŦå¸č大åĻčˇåžįĄåŖĢåĻäŊåīŧįŠåĸ', 'æ¯æäģŦ两åŊäēēæ°åč°įįå¨åį
§ãæ čŽēæ¯æ', 'įåįŠéĻį įŠļä¸æįŽé¨č´č´Ŗäēēäŧč¨äēæ¯Âˇįš', 'åĸäŊäŧčåĨã2022åš´čĩˇīŧå°¤æ¯æåŧå§å¨åįšå', 'įåĄå¨į¤žäē¤åĒäŊä¸åä¸čŋæ ˇä¸æŽĩč¯īŧ"čŋæ¯æ', 'čēĢäēåįĢįšéäēåēįĄčŽžæŊåģē莞įä¸äŧååˇĨīŧ', 'æ¯æŗå
šåč¯´ã', 'éĸå
ŧčææä¸æææ˛īŧæ¯å¨æĢ两ä¸Ēč¯žæļãâä¸åŊ', 'äēēįįéčĻä¸æĨīŧčĒæ¤ææĨæäēä¸åååēį', 'čŋæ¯å¨ä¸åŊįåĻįåįĢįšéäēåĻåīŧ两åŊäēē', 'åįĢįšéäēåŊåŽļåįŠéĻčå¤åĻåäēēįąģåĻ', 'æåå大į˛žæˇąīŧæå¸ææįåĻįäģŦčŊå¤éčŋä¸', 'éåīŧčĩäēæįŠŋčļčæŖįåéãâ', 'æ°æēæåĒåīŧåŋ
å°æ¨å¨ä¸¤åŊå
ŗįŗģä¸æååå', 'į įŠļåč˛å°č¡įšéĻå°åžˇåååįąä¸åŊæ', 'æææ˛æ´åĨŊå°įč§Ŗä¸åŊæåã"åĨšč¯´ã', 'įŠåĸįåĄå¯åå
ŗæŗ¨ä¸åŊå¨įģæĩãį§æãæ', 'åąã"é˛å¤ĢåĄč¯´ã', 'åãäģ襨į¤ēīŧâåĻäš åŊŧæ¤įč¯č¨åæåīŧå°å¸Ž', 'âå§å§īŧäŊ æŗåģä¸åŊåīŧ""é常æŗīŧææŗ', 'č˛įéĸåįååąīŧä¸åŊå¨į§į įæšéĸįåŽå', 'åįĢįšéäēéĢįæč˛å§åäŧä¸ģäģģåŠįč¨', 'åŠåä¸ä¸¤åŊäēēæ°æ´åĨŊå°įč§ŖåŊŧæ¤īŧåŠååæš', 'åģįæ
åŽĢãįŦéŋåã"å°¤æ¯æįåĻįä¸æä¸å¯š', 'ä¸æĨäŋąåĸãå¨ä¸åŊåĻäš įįģå莊æįå°æ´åšŋ', 'éŠŦį襨į¤ēīŧ"æ¯åš´æäģŦéŊäŧįģįģåĻįå°ä¸åŊčŽŋ', 'äē¤åžīŧæåģēåč°æĄĨæĸã"', 'čŊæåčįå§åĻšīŧå§å§é˛å¨
äģåš´15å˛īŧåĻšåĻš', 'éįä¸įīŧäģä¸åįåĒæĩ
ã', 'éŽåĻäš īŧįŽåæčļ
čŋ5000ååįĢįšéäēåĻį', 'åįĢįšéäēåŊåŽļåįŠéĻéĻéŋåĄåä¸ÂˇåĒ', 'čå¨
14å˛īŧ两äēēéŊåˇ˛å¨åįšåéĸåĻäš å¤åš´īŧ', '23å˛įččŋĒäē¡åæ¯čæŗč¯ēæ¯åˇ˛å¨åįš', 'å¨ä¸åŊįåĻãåĻäš ä¸åŊįæč˛įģéĒīŧæåŠäē', 'éčžžå§Âˇäŧį´ įĻæžå¤æŦĄčŽŋéŽä¸åŊīŧ寚ä¸åææ', 'ä¸æč¯´åžæ ŧå¤æĩåŠã', 'åéĸåĻäš 3åš´īŧå¨ä¸åŊäšĻæŗãä¸åŊįģįæšéĸ襨', 'æååįĢįšéäēįæč˛æ°´åšŗãâ', 'įäŧ æŋä¸åæ°ãį°äģŖååįŠéĻįåģē莞ä¸ååą', 'é˛å¨
å¯ščŽ°č
č¯´īŧ"čŋäēåš´æĨīŧæį寚ä¸æ', 'į°ååäŧį§īŧå¨2024åš´åįĢįšéäēčĩåēį', 'âå
ąååä¸įåąį¤ēé', 'å°čąĄæˇąåģãâä¸åŊåįŠéĻä¸äģ
æ莸å¤äŋååŽåĨŊ', 'åä¸åŊæåįįįąīŧæäģŦå§åĻšäŋŠå§įģį¸äēéŧ', 'âæąč¯æĄĨ"æ¯čĩä¸čˇåžä¸įåĨãččŋĒäēč¯´īŧ"åĻ', 'įæįŠīŧčŋå
åčŋį¨å
čŋį§æææŽĩčŋčĄåąį¤ēīŧ', 'åąīŧä¸čĩˇåĻäš ãæäģŦįä¸æä¸å¤Šæ¯ä¸å¤ŠåĨŊīŧčŋ', 'äš ä¸åŊäšĻæŗ莊æįå
åŋååžåŽåŽåįē¯į˛šãæ', 'æ´˛åäēæ´˛įįŋįææâ', '帎åŠäēēäģŦæ´åĨŊįč§Ŗä¸åææã"åĄåä¸č¯´īŧ"å', 'åĻäŧäēä¸ææåä¸åŊčãæäģŦä¸åŽčĻå°ä¸åŊ', 'äšåæŦĸä¸åŊįæéĨ°,å¸ææĒæĨčŊåģä¸åŊåĻäš īŧ', 'įĢįšéäēä¸ä¸åŊéŊæĨææ äš
įææīŧå§įģį¸', 'åģãåĻåĨŊä¸æīŧæäģŦįæĒæĨä¸æ¯æĸĻīŧ"', 'æä¸åŊä¸åæ°æå
į´ čå
ĨæčŖ
莞莥ä¸īŧåäŊ', 'äģéŋæ¯éŠŦæåēåīŧæ˛ŋįčŋčæ˛æįįåąą', 'äēįč§Ŗãį¸äēå°éãæå¸ææĒæĨä¸ä¸åŊåčĄ', 'æŽåįšåéĸä¸æšéĸéŋéģé¸ŖéŖäģįģīŧčŋæ', 'åēæ´å¤į˛žįžäŊåīŧäšæåįšæååäēĢįģæ´å¤', 'å
Ŧčˇ¯ä¸čˇ¯åä¸å¯ģæžä¸čˇ¯å°čŋšã銹čŊĻ两ä¸Ēå°', 'å åŧēåäŊīŧå
ąååä¸įåąį¤ēéæ´˛åäēæ´˛įįŋ', 'åéĸæįĢäē2013åš´3æīŧįąč´ĩåˇč´ĸįģ大åĻå', 'įä¸åŊæåãâ', 'æļīŧ莰č
æĨå°äŊäēåįĢįšéäē港åŖåå¸éŠŦč¨', 'įææãâ'], 'rec_scores': array([0.99943757, ..., 0.98181838]), 'rec_boxes': array([[ 133, ..., 131],
...,
[1152, ..., 1359]], dtype=int16)}}}
```
paddlex --pipeline formula_recognition \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/demo_image/pipelines/general_formula_recognition_001.png \
--use_layout_detection True \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--layout_threshold 0.5 \
--layout_nms True \
--layout_unclip_ratio 1.0 \
--layout_merge_bboxes_mode large \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'general_formula_recognition.png', 'model_settings': {'use_doc_preprocessor': False,'use_layout_detection': True}, 'layout_det_res': {'input_path': None, 'boxes': [{'cls_id': 2, 'label': 'text', 'score': 0.9778407216072083, 'coordinate': [271.257, 648.50824, 1040.2291, 774.8482]}, ...]}, 'formula_res_list': [{'rec_formula': '\\small\\begin{aligned}{p(\\mathbf{x})=c(\\mathbf{u})\\prod_{i}p(x_{i}).}\\\\ \\end{aligned}', 'formula_region_id': 1, 'dt_polys': ([553.0718, 802.0996, 758.75635, 853.093],)}, ...]}}
paddlex --pipeline seal_recognition \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/seal_text_det.png \
--use_doc_orientation_classify False \
--use_doc_unwarping False \
--device gpu:0 \
--save_path ./output
What's the result
{'res': {'input_path': 'seal_text_det.png', 'model_settings': {'use_doc_preprocessor': False, 'use_layout_detection': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 16, 'label': 'seal', 'score': 0.975529670715332, 'coordinate': [6.191284, 0.16680908, 634.39325, 628.85345]}]}, 'seal_res_list': [{'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': [array([[320, 38],
[479, 92],
[483, 94],
[486, 97],
[579, 226],
[582, 230],
[582, 235],
[584, 383],
[584, 388],
[582, 392],
[578, 396],
[573, 398],
[566, 398],
[502, 380],
[497, 377],
[494, 374],
[491, 369],
[491, 366],
[488, 259],
[424, 172],
[318, 136],
[251, 154],
[200, 174],
[137, 260],
[133, 366],
[132, 370],
[130, 375],
[126, 378],
[123, 380],
[ 60, 398],
[ 55, 398],
[ 49, 397],
[ 45, 394],
[ 43, 390],
[ 41, 383],
[ 43, 236],
[ 44, 230],
[ 45, 227],
[141, 96],
[144, 93],
[148, 90],
[311, 38],
[315, 38]]), array([[461, 347],
[465, 350],
[468, 354],
[470, 360],
[470, 425],
[469, 429],
[467, 433],
[462, 437],
[456, 439],
[169, 439],
[165, 439],
[160, 436],
[157, 432],
[155, 426],
[154, 360],
[155, 356],
[158, 352],
[161, 348],
[168, 346],
[456, 346]]), array([[439, 445],
[441, 447],
[443, 451],
[444, 453],
[444, 497],
[443, 502],
[440, 504],
[437, 506],
[434, 507],
[189, 505],
[184, 504],
[182, 502],
[180, 498],
[179, 496],
[181, 453],
[182, 449],
[184, 446],
[188, 444],
[434, 444]]), array([[158, 468],
[199, 502],
[242, 522],
[299, 534],
[339, 532],
[373, 526],
[417, 508],
[459, 475],
[462, 474],
[467, 474],
[472, 476],
[502, 507],
[503, 510],
[504, 515],
[503, 518],
[501, 521],
[452, 559],
[450, 560],
[391, 584],
[390, 584],
[372, 590],
[370, 590],
[305, 596],
[302, 596],
[224, 581],
[221, 580],
[164, 553],
[162, 551],
[114, 509],
[112, 507],
[111, 503],
[112, 498],
[114, 496],
[146, 468],
[149, 466],
[154, 466]])], 'text_det_params': {'limit_side_len': 736, 'limit_type': 'min', 'thresh': 0.2, 'box_thresh': 0.6, 'unclip_ratio': 0.5}, 'text_type': 'seal', 'textline_orientation_angles': [-1, -1, -1, -1], 'text_rec_score_thresh': 0, 'rec_texts': ['夊æ´Ĩååįŧåč´¸æéå
Ŧå¸', 'åįĨ¨ä¸į¨įĢ ', 'åįšįŠ', '5263647368706'], 'rec_scores': [0.9934046268463135, 0.9999403953552246, 0.998250424861908, 0.9913849234580994], 'rec_polys': [array([[320, 38],
[479, 92],
[483, 94],
[486, 97],
[579, 226],
[582, 230],
[582, 235],
[584, 383],
[584, 388],
[582, 392],
[578, 396],
[573, 398],
[566, 398],
[502, 380],
[497, 377],
[494, 374],
[491, 369],
[491, 366],
[488, 259],
[424, 172],
[318, 136],
[251, 154],
[200, 174],
[137, 260],
[133, 366],
[132, 370],
[130, 375],
[126, 378],
[123, 380],
[ 60, 398],
[ 55, 398],
[ 49, 397],
[ 45, 394],
[ 43, 390],
[ 41, 383],
[ 43, 236],
[ 44, 230],
[ 45, 227],
[141, 96],
[144, 93],
[148, 90],
[311, 38],
[315, 38]]), array([[461, 347],
[465, 350],
[468, 354],
[470, 360],
[470, 425],
[469, 429],
[467, 433],
[462, 437],
[456, 439],
[169, 439],
[165, 439],
[160, 436],
[157, 432],
[155, 426],
[154, 360],
[155, 356],
[158, 352],
[161, 348],
[168, 346],
[456, 346]]), array([[439, 445],
[441, 447],
[443, 451],
[444, 453],
[444, 497],
[443, 502],
[440, 504],
[437, 506],
[434, 507],
[189, 505],
[184, 504],
[182, 502],
[180, 498],
[179, 496],
[181, 453],
[182, 449],
[184, 446],
[188, 444],
[434, 444]]), array([[158, 468],
[199, 502],
[242, 522],
[299, 534],
[339, 532],
[373, 526],
[417, 508],
[459, 475],
[462, 474],
[467, 474],
[472, 476],
[502, 507],
[503, 510],
[504, 515],
[503, 518],
[501, 521],
[452, 559],
[450, 560],
[391, 584],
[390, 584],
[372, 590],
[370, 590],
[305, 596],
[302, 596],
[224, 581],
[221, 580],
[164, 553],
[162, 551],
[114, 509],
[112, 507],
[111, 503],
[112, 498],
[114, 496],
[146, 468],
[149, 466],
[154, 466]])], 'rec_boxes': array([], dtype=float64)}]}}
Computer Vision Pipelines CLI
paddlex --pipeline image_classification --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_image_classification_001.jpg --device gpu:0
What's the result
{'res': {'input_path': 'general_image_classification_001.jpg', 'page_index': None, 'class_ids': array([296, 170, 356, 258, 248], dtype=int32), 'scores': array([0.62736, 0.03752, 0.03256, 0.0323 , 0.03194], dtype=float32), 'label_names': ['ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus', 'Irish wolfhound', 'weasel', 'Samoyed, Samoyede', 'Eskimo dog, husky']}}
paddlex --pipeline object_detection \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_object_detection_002.png \
--threshold 0.5 \
--save_path ./output/ \
--device gpu:0
What's the result
{'res': {'input_path': 'general_object_detection_002.png', 'page_index': None, 'boxes': [{'cls_id': 49, 'label': 'orange', 'score': 0.8188614249229431, 'coordinate': [661.3518, 93.05823, 870.75903, 305.93713]}, {'cls_id': 47, 'label': 'apple', 'score': 0.7745078206062317, 'coordinate': [76.80911, 274.74905, 330.5422, 520.0428]}, {'cls_id': 47, 'label': 'apple', 'score': 0.7271787524223328, 'coordinate': [285.32645, 94.3175, 469.73645, 297.40344]}, {'cls_id': 46, 'label': 'banana', 'score': 0.5576589703559875, 'coordinate': [310.8041, 361.43625, 685.1869, 712.59155]}, {'cls_id': 47, 'label': 'apple', 'score': 0.5490103363990784, 'coordinate': [764.6252, 285.76096, 924.8153, 440.92892]}, {'cls_id': 47, 'label': 'apple', 'score': 0.515821635723114, 'coordinate': [853.9831, 169.41423, 987.803, 303.58615]}, {'cls_id': 60, 'label': 'dining table', 'score': 0.514293372631073, 'coordinate': [0.53089714, 0.32445717, 1072.9534, 720]}, {'cls_id': 47, 'label': 'apple', 'score': 0.510750949382782, 'coordinate': [57.368027, 23.455347, 213.39601, 176.45612]}]}}
paddlex --pipeline instance_segmentation \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_instance_segmentation_004.png \
--threshold 0.5 \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'general_instance_segmentation_004.png', 'page_index': None, 'boxes': [{'cls_id': 0, 'label': 'person', 'score': 0.8695873022079468, 'coordinate': [339.83426, 0, 639.8651, 575.22003]}, {'cls_id': 0, 'label': 'person', 'score': 0.8572642803192139, 'coordinate': [0.09976959, 0, 195.07274, 575.358]}, {'cls_id': 0, 'label': 'person', 'score': 0.8201770186424255, 'coordinate': [88.24664, 113.422424, 401.23077, 574.70197]}, {'cls_id': 0, 'label': 'person', 'score': 0.7110118269920349, 'coordinate': [522.54065, 21.457964, 767.5007, 574.2464]}, {'cls_id': 27, 'label': 'tie', 'score': 0.5543721914291382, 'coordinate': [247.38776, 312.4094, 355.2685, 574.1264]}], 'masks': '...'}}
paddlex --pipeline small_object_detection \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/small_object_detection.jpg \
--threshold 0.5 \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'small_object_detection.jpg', 'page_index': None, 'boxes': [{'cls_id': 0, 'label': 'pedestrian', 'score': 0.8182944655418396, 'coordinate': [203.60147, 701.3809, 224.2007, 743.8429]}, {'cls_id': 0, 'label': 'pedestrian', 'score': 0.8150849342346191, 'coordinate': [185.01398, 710.8665, 201.76335, 744.9308]}, {'cls_id': 0, 'label': 'pedestrian', 'score': 0.7748839259147644, 'coordinate': [295.1978, 500.2161, 309.33438, 532.0253]}, {'cls_id': 0, 'label': 'pedestrian', 'score': 0.7688254714012146, 'coordinate': [851.5233, 436.13293, 863.2146, 466.8981]}, {'cls_id': 0, 'label': 'pedestrian', 'score': 0.689735472202301, 'coordinate': [802.1584, 460.10693, 815.6586, 488.85086]}, {'cls_id': 0, 'label': 'pedestrian', 'score': 0.6697502136230469, 'coordinate': [479.947, 309.43323, 489.1534, 332.5485]}, ...]}}
paddlex --pipeline human_keypoint_detection \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/keypoint_detection_001.jpg \
--det_threshold 0.5 \
--save_path ./output/ \
--device gpu:0
What's the result
{'res': {'input_path': 'keypoint_detection_001.jpg', 'boxes': [{'coordinate': [325.65088, 74.46718, 391.5512, 209.46529], 'det_score': 0.9316536784172058, 'keypoints': array([[351.6419 , 84.80058 , 0.79337054],
[353.9377 , 82.47209 , 0.7778817 ],
[349.12946 , 83.09801 , 0.7885327 ],
[359.24466 , 83.369225 , 0.80503 ],
[347.46167 , 84.1535 , 0.8710606 ],
[368.82172 , 101.33514 , 0.88625187],
[339.8064 , 99.65537 , 0.8432633 ],
[371.2092 , 123.35563 , 0.7728337 ],
[337.78214 , 121.36371 , 0.9310819 ],
[368.81366 , 142.71593 , 0.79723483],
[337.53455 , 139.85892 , 0.877297 ],
[363.0265 , 141.82988 , 0.7964988 ],
[345.3075 , 141.98972 , 0.7532031 ],
[374.60806 , 171.42578 , 0.7530604 ],
[339.11694 , 167.98814 , 0.7255032 ],
[382.67047 , 197.82553 , 0.73685765],
[336.79745 , 196.5194 , 0.626142 ]], dtype=float32), 'kpt_score': 0.7961825}, {'coordinate': [271.96713, 69.02892, 336.77832, 217.54662], 'det_score': 0.9304604530334473, 'keypoints': array([[294.48553 , 84.144104 , 0.74851245],
[297.09854 , 80.97825 , 0.7341483 ],
[292.39313 , 81.7721 , 0.74603605],
[302.3231 , 81.528275 , 0.7586238 ],
[290.6292 , 83.26544 , 0.7514231 ],
[313.32928 , 98.40588 , 0.83778954],
[286.23532 , 101.702194 , 0.91927457],
[321.99515 , 120.05991 , 0.90197486],
[282.39294 , 122.16547 , 0.74502975],
[327.164 , 141.25995 , 0.8172762 ],
[279.1632 , 133.16023 , 0.59161717],
[311.02557 , 142.31526 , 0.82111686],
[294.72357 , 143.42067 , 0.71559554],
[313.98828 , 174.17151 , 0.7495116 ],
[291.76605 , 174.39961 , 0.7645517 ],
[321.4924 , 202.4499 , 0.7817023 ],
[293.70663 , 204.9227 , 0.72405976]], dtype=float32), 'kpt_score': 0.77107316}, {'coordinate': [293.55933, 188.65804, 419.47382, 305.4712], 'det_score': 0.9179267883300781, 'keypoints': array([[3.3565637e+02, 2.0941801e+02, 8.1438643e-01],
[3.3636591e+02, 2.0724442e+02, 7.7529407e-01],
[3.3486487e+02, 2.0653752e+02, 8.3719862e-01],
[3.4387805e+02, 2.0405179e+02, 7.9793924e-01],
[3.4104437e+02, 2.0354083e+02, 6.7090714e-01],
[3.5167136e+02, 2.1253050e+02, 5.9533423e-01],
[3.5493774e+02, 2.1316977e+02, 5.1632988e-01],
[3.2814764e+02, 2.1943013e+02, 5.3697169e-01],
[3.2577945e+02, 2.2027420e+02, 1.6555195e-01],
[3.1541614e+02, 2.2199020e+02, 5.2568728e-01],
[3.1139435e+02, 2.2925937e+02, 2.2075935e-01],
[3.8441351e+02, 2.4341478e+02, 6.4083064e-01],
[3.8714008e+02, 2.4532764e+02, 6.4894527e-01],
[3.5143246e+02, 2.5615021e+02, 7.7424920e-01],
[3.7133820e+02, 2.7552402e+02, 5.8704698e-01],
[3.6274625e+02, 2.8303183e+02, 6.1670756e-01],
[4.0358893e+02, 2.9351334e+02, 4.2383862e-01]], dtype=float32), 'kpt_score': 0.5969399}, {'coordinate': [238.98825, 182.67476, 372.81628, 307.61395], 'det_score': 0.914400041103363, 'keypoints': array([[282.9012 , 208.31485 , 0.6685285 ],
[282.95908 , 204.36131 , 0.66104335],
[280.90683 , 204.54018 , 0.7281709 ],
[274.7831 , 204.04141 , 0.54541856],
[270.97324 , 203.04889 , 0.73486483],
[269.43472 , 217.63014 , 0.6707946 ],
[256.871 , 216.546 , 0.89603853],
[277.03226 , 238.2196 , 0.4412233 ],
[262.29578 , 241.33434 , 0.791063 ],
[292.90753 , 251.69914 , 0.4993091 ],
[285.6907 , 252.71925 , 0.7215052 ],
[279.36578 , 261.8949 , 0.6626504 ],
[270.43402 , 268.07068 , 0.80625033],
[311.96924 , 261.36716 , 0.67315185],
[309.32407 , 262.97354 , 0.72746485],
[345.22446 , 285.02255 , 0.60142016],
[334.69235 , 291.57108 , 0.7674925 ]], dtype=float32), 'kpt_score': 0.6821406}, {'coordinate': [66.23172, 93.531204, 124.48463, 217.99655], 'det_score': 0.9086756110191345, 'keypoints': array([[ 91.04524 , 108.79487 , 0.8234256 ],
[ 92.67917 , 106.63517 , 0.79848343],
[ 88.41122 , 106.8017 , 0.8122996 ],
[ 95.353096 , 106.96488 , 0.85210425],
[ 84.35098 , 107.85205 , 0.971826 ],
[ 99.92103 , 119.87272 , 0.853371 ],
[ 79.69138 , 121.08684 , 0.8854925 ],
[103.019554 , 135.00996 , 0.73513967],
[ 72.38997 , 136.8782 , 0.7727014 ],
[104.561935 , 146.01869 , 0.8377464 ],
[ 72.70636 , 151.44576 , 0.67577386],
[ 98.69484 , 151.30742 , 0.8381225 ],
[ 85.946 , 152.07056 , 0.7904873 ],
[106.64397 , 175.77159 , 0.8179414 ],
[ 84.6963 , 178.4353 , 0.8094256 ],
[111.30463 , 201.2306 , 0.74394226],
[ 80.08708 , 204.05814 , 0.8457697 ]], dtype=float32), 'kpt_score': 0.8155325}, {'coordinate': [160.1294, 78.35935, 212.01868, 153.2241], 'det_score': 0.8295672535896301, 'keypoints': array([[1.89240387e+02, 9.08055573e+01, 7.36447990e-01],
[1.91318649e+02, 8.84640198e+01, 7.86390483e-01],
[1.87943207e+02, 8.88532104e+01, 8.23230743e-01],
[1.95832245e+02, 8.76751480e+01, 6.76276207e-01],
[1.86741409e+02, 8.96744080e+01, 7.87400603e-01],
[2.04019852e+02, 9.83068924e+01, 7.34004617e-01],
[1.85355087e+02, 9.81262970e+01, 6.23330474e-01],
[2.01501678e+02, 1.12709480e+02, 2.93740422e-01],
[1.80446320e+02, 1.11967369e+02, 5.50001860e-01],
[1.95137482e+02, 9.73322601e+01, 4.24658984e-01],
[1.74287552e+02, 1.21760696e+02, 3.51236403e-01],
[1.97997589e+02, 1.24219963e+02, 3.45360219e-01],
[1.83250824e+02, 1.22610085e+02, 4.38733459e-01],
[1.96233871e+02, 1.22864418e+02, 5.36903977e-01],
[1.66795364e+02, 1.25634903e+02, 3.78726840e-01],
[1.80727753e+02, 1.42604034e+02, 2.78717279e-01],
[1.75880920e+02, 1.41181213e+02, 1.70833692e-01]], dtype=float32), 'kpt_score': 0.5256467}, {'coordinate': [52.482475, 59.36664, 96.47121, 135.45993], 'det_score': 0.7726763486862183, 'keypoints': array([[ 73.98227 , 74.01257 , 0.71940714],
[ 75.44208 , 71.73432 , 0.6955297 ],
[ 72.20365 , 71.9637 , 0.6138198 ],
[ 77.7856 , 71.665825 , 0.73568064],
[ 69.342285 , 72.25549 , 0.6311799 ],
[ 83.1019 , 77.65522 , 0.7037722 ],
[ 64.89729 , 78.846565 , 0.56623787],
[ 85.16928 , 88.88764 , 0.5665537 ],
[ 61.65655 , 89.35312 , 0.4463089 ],
[ 80.01986 , 91.51777 , 0.30305162],
[ 70.90767 , 89.90153 , 0.48063472],
[ 78.70658 , 97.33488 , 0.39359188],
[ 68.3219 , 97.67902 , 0.41903985],
[ 80.69448 , 109.193985 , 0.14496553],
[ 65.57641 , 105.08109 , 0.27744702],
[ 79.44859 , 122.69015 , 0.17710638],
[ 64.03736 , 120.170425 , 0.46565098]], dtype=float32), 'kpt_score': 0.4905869}, {'coordinate': [7.081953, 80.3705, 46.81927, 161.72012], 'det_score': 0.6587498784065247, 'keypoints': array([[ 29.51531 , 91.49908 , 0.75517464],
[ 31.225754 , 89.82169 , 0.7765606 ],
[ 27.376017 , 89.71614 , 0.80448 ],
[ 33.515877 , 90.82257 , 0.7093001 ],
[ 23.521307 , 90.84212 , 0.777707 ],
[ 37.539314 , 101.381516 , 0.6913692 ],
[ 18.340288 , 102.41546 , 0.7203535 ],
[ 39.826218 , 113.37301 , 0.5913918 ],
[ 16.857304 , 115.10882 , 0.5492331 ],
[ 28.826103 , 121.861855 , 0.39205936],
[ 22.47133 , 120.69003 , 0.6120081 ],
[ 34.177963 , 126.15756 , 0.5601723 ],
[ 21.39047 , 125.30078 , 0.5064371 ],
[ 27.961575 , 133.33154 , 0.54826814],
[ 22.303364 , 129.8608 , 0.2293001 ],
[ 31.242027 , 153.047 , 0.36292207],
[ 21.80127 , 153.78947 , 0.30531448]], dtype=float32), 'kpt_score': 0.58188534}, {'coordinate': [126.131096, 30.263107, 168.5759, 134.09885], 'det_score': 0.6441988348960876, 'keypoints': array([[149.89236 , 43.87846 , 0.75441885],
[151.99484 , 41.95912 , 0.82070917],
[148.18002 , 41.775055 , 0.8453321 ],
[155.37967 , 42.06968 , 0.83349544],
[145.38167 , 41.69159 , 0.8233239 ],
[159.26329 , 53.284737 , 0.86246717],
[142.35178 , 51.206886 , 0.6940705 ],
[157.3975 , 71.31917 , 0.7624757 ],
[136.59795 , 66.40522 , 0.55612797],
[142.90988 , 78.28269 , 0.779243 ],
[135.43607 , 73.9765 , 0.5737738 ],
[155.7851 , 82.44225 , 0.6966109 ],
[143.4588 , 80.91763 , 0.60589534],
[153.45274 , 102.84818 , 0.62720954],
[131.59738 , 87.54947 , 0.4976839 ],
[155.56401 , 125.58888 , 0.5414401 ],
[139.57607 , 122.08866 , 0.26570275]], dtype=float32), 'kpt_score': 0.67882234}, {'coordinate': [112.50212, 64.127, 150.35353, 125.85529], 'det_score': 0.5013833045959473, 'keypoints': array([[1.35197662e+02, 7.29378281e+01, 5.58694899e-01],
[1.36285202e+02, 7.16439133e+01, 6.38598502e-01],
[1.33776855e+02, 7.16437454e+01, 6.36756659e-01],
[1.37833389e+02, 7.24015121e+01, 4.13749218e-01],
[1.31340057e+02, 7.30362549e+01, 5.70683837e-01],
[1.42542435e+02, 8.28875885e+01, 2.30803847e-01],
[1.29773300e+02, 8.52729874e+01, 4.94463116e-01],
[1.41332916e+02, 9.43963928e+01, 9.36751068e-02],
[1.28858521e+02, 9.95147858e+01, 2.72373617e-01],
[1.44981277e+02, 7.83604965e+01, 8.68032947e-02],
[1.34379593e+02, 8.23366165e+01, 1.67876005e-01],
[1.37895874e+02, 1.08476562e+02, 1.58305198e-01],
[1.30837265e+02, 1.07525513e+02, 1.45044222e-01],
[1.31290604e+02, 1.02961494e+02, 7.68775940e-02],
[1.17951675e+02, 1.07433502e+02, 2.09531561e-01],
[1.29175934e+02, 1.14402641e+02, 1.46551579e-01],
[1.27901909e+02, 1.16773926e+02, 2.08665460e-01]], dtype=float32), 'kpt_score': 0.3005561}]}}
paddlex --pipeline open_vocabulary_segmentation \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/open_vocabulary_segmentation.jpg \
--prompt_type box \
--prompt "[[112.9,118.4,513.8,382.1],[4.6,263.6,92.2,336.6],[592.4,260.9,607.2,294.2]]" \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'open_vocabulary_segmentation.jpg', 'prompts': {'box_prompt': [[112.9, 118.4, 513.8, 382.1], [4.6, 263.6, 92.2, 336.6], [592.4, 260.9, 607.2, 294.2]]}, 'masks': '...', 'mask_infos': [{'label': 'box_prompt', 'prompt': [112.9, 118.4, 513.8, 382.1]}, {'label': 'box_prompt', 'prompt': [4.6, 263.6, 92.2, 336.6]}, {'label': 'box_prompt', 'prompt': [592.4, 260.9, 607.2, 294.2]}]}}
paddlex --pipeline open_vocabulary_detection \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/open_vocabulary_detection.jpg \
--prompt "bus . walking man . rearview mirror ." \
--thresholds "{'text_threshold': 0.25, 'box_threshold': 0.3}" \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'open_vocabulary_detection.jpg', 'page_index': None, 'boxes': [{'coordinate': [112.10542297363281, 117.93667602539062, 514.35693359375, 382.10150146484375], 'label': 'bus', 'score': 0.9348853230476379}, {'coordinate': [264.1828918457031, 162.6674346923828, 286.8844909667969, 201.86187744140625], 'label': 'rearview mirror', 'score': 0.6022508144378662}, {'coordinate': [606.1133422851562, 254.4973907470703, 622.56982421875, 293.7867126464844], 'label': 'walking man', 'score': 0.4384709894657135}, {'coordinate': [591.8192138671875, 260.2451171875, 607.3953247070312, 294.2210388183594], 'label': 'man', 'score': 0.3573091924190521}]}}
paddlex --pipeline pedestrian_attribute_recognition --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pedestrian_attribute_002.jpg --device gpu:0
What's the result
{'res': {'input_path': 'pedestrian_attribute_002.jpg', 'boxes': [{'labels': ['Trousers(éŋčŖ¤)', 'Age18-60(åš´éžå¨18-60å˛äšé´)', 'LongCoat(éŋå¤åĨ)', 'Side(䞧éĸ)'], 'cls_scores': array([0.99965, 0.99963, 0.98866, 0.9624 ]), 'det_score': 0.9795178771018982, 'coordinate': [87.24581, 322.5872, 546.2697, 1039.9852]}, {'labels': ['Trousers(éŋčŖ¤)', 'LongCoat(éŋå¤åĨ)', 'Front(éĸæå)', 'Age18-60(åš´éžå¨18-60å˛äšé´)'], 'cls_scores': array([0.99996, 0.99872, 0.93379, 0.71614]), 'det_score': 0.967143177986145, 'coordinate': [737.91626, 306.287, 1150.5961, 1034.2979]}, {'labels': ['Trousers(éŋčŖ¤)', 'LongCoat(éŋå¤åĨ)', 'Age18-60(åš´éžå¨18-60å˛äšé´)', 'Side(䞧éĸ)'], 'cls_scores': array([0.99996, 0.99514, 0.98726, 0.96224]), 'det_score': 0.9645745754241943, 'coordinate': [399.45944, 281.9107, 869.5312, 1038.9962]}]}}
paddlex --pipeline vehicle_attribute_recognition --input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/vehicle_attribute_002.jpg --device gpu:0
What's the result
{'res': {'input_path': 'vehicle_attribute_002.jpg', 'boxes': [{'labels': ['red(įēĸč˛)', 'sedan(čŊŋčŊĻ)'], 'cls_scores': array([0.96375, 0.94025]), 'det_score': 0.9774094820022583, 'coordinate': [196.32553, 302.3847, 639.3131, 655.57904]}, {'labels': ['suv(SUV)', 'brown(æŖč˛)'], 'cls_scores': array([0.99968, 0.99317]), 'det_score': 0.9705657958984375, 'coordinate': [769.4419, 278.8417, 1401.0217, 641.3569]}]}}
paddlex --pipeline rotated_object_detection \
--input https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/rotated_object_detection_001.png \
--threshold 0.5 \
--save_path ./output \
--device gpu:0
What's the result
{'res': {'input_path': 'rotated_object_detection_001.png', 'page_index': None, 'boxes': [{'cls_id': 4, 'label': 'small-vehicle', 'score': 0.7409099340438843, 'coordinate': [92.88687, 763.1569, 85.163124, 749.5868, 116.07975, 731.99414, 123.8035, 745.5643]}, {'cls_id': 4, 'label': 'small-vehicle', 'score': 0.7393015623092651, 'coordinate': [348.2332, 177.55974, 332.77704, 150.24973, 345.2183, 143.21028, 360.67444, 170.5203]}, {'cls_id': 11, 'label': 'roundabout', 'score': 0.8101699948310852, 'coordinate': [537.1732, 695.5475, 204.4297, 612.9735, 286.71338, 281.48022, 619.4569, 364.05426]}]}}
Time Series-related CLI
paddlex --pipeline ts_forecast --input https://paddle-model-ecology.bj.bcebos.com/paddlex/ts/demo_ts/ts_fc.csv --save_path ./output --device gpu:0
What's the result
{'input_path': 'ts_fc.csv', 'forecast': OT
date
2018-06-26 20:00:00 9.586131
2018-06-26 21:00:00 9.379762
2018-06-26 22:00:00 9.252275
2018-06-26 23:00:00 9.249993
2018-06-27 00:00:00 9.164998
... ...
2018-06-30 15:00:00 8.830340
2018-06-30 16:00:00 9.291553
2018-06-30 17:00:00 9.097666
2018-06-30 18:00:00 8.905430
2018-06-30 19:00:00 8.993793
[96 rows x 1 columns]}
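The other time series pipelines can be invoked the same way from the command line. A minimal sketch for anomaly detection, assuming the demo file `ts_ad.csv` (used in the Python examples below) is hosted under the same `demo_ts` path as `ts_fc.csv`:
paddlex --pipeline ts_anomaly_detection --input https://paddle-model-ecology.bj.bcebos.com/paddlex/ts/demo_ts/ts_ad.csv --save_path ./output --device gpu:0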
Speech-related Pipelines CLI
paddlex --pipeline multilingual_speech_recognition \
--input https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav \
--save_path ./output \
--device gpu:0
What's the result
{'input_path': 'zh.wav', 'result': {'text': 'æ莤ä¸ēčˇæĨæéčĻįå°ąæ¯įģæå¸ĻæĨäēčēĢäŊåĨåēˇ', 'segments': [{'id': 0, 'seek': 0, 'start': 0.0, 'end': 2.0, 'text': 'æ莤ä¸ēčˇæĨæéčĻįå°ąæ¯', 'tokens': [50364, 1654, 7422, 97, 13992, 32585, 31429, 8661, 24928, 1546, 5620, 50464, 50464, 49076, 4845, 99, 34912, 19847, 29485, 44201, 6346, 115, 50564], 'temperature': 0, 'avg_logprob': -0.22779104113578796, 'compression_ratio': 0.28169014084507044, 'no_speech_prob': 0.026114309206604958}, {'id': 1, 'seek': 200, 'start': 2.0, 'end': 31.0, 'text': 'įģæå¸ĻæĨäēčēĢäŊåĨåēˇ', 'tokens': [50364, 49076, 4845, 99, 34912, 19847, 29485, 44201, 6346, 115, 51814], 'temperature': 0, 'avg_logprob': -0.21976988017559052, 'compression_ratio': 0.23684210526315788, 'no_speech_prob': 0.009023111313581467}], 'language': 'zh'}}
Video-related Pipelines CLI
paddlex --pipeline video_detection --input https://paddle-model-ecology.bj.bcebos.com/paddlex/videos/demo_video/HorseRiding.avi --device gpu:0 --save_path output
What's the result
{'input_path': 'HorseRiding.avi', 'result': [[[[110, 40, 170, 171], 0.8385784886274905, 'HorseRiding']], [[[112, 31, 168, 167], 0.8587647461352432, 'HorseRiding']], [[[106, 28, 164, 165], 0.8579590929730969, 'HorseRiding']], [[[106, 24, 165, 171], 0.8743957465404151, 'HorseRiding']], [[[107, 22, 165, 172], 0.8488322619908999, 'HorseRiding']], [[[112, 22, 173, 171], 0.8446755521458691, 'HorseRiding']], [[[115, 23, 177, 176], 0.8454028365262367, 'HorseRiding']], [[[117, 22, 178, 179], 0.8484261880748285, 'HorseRiding']], [[[117, 22, 181, 181], 0.8319480115446183, 'HorseRiding']], [[[117, 39, 182, 183], 0.820551099084625, 'HorseRiding']], [[[117, 41, 183, 185], 0.8202395831914338, 'HorseRiding']], [[[121, 47, 185, 190], 0.8261058921745246, 'HorseRiding']], [[[123, 46, 188, 196], 0.8307278306829033, 'HorseRiding']], [[[125, 44, 189, 197], 0.8259781361122833, 'HorseRiding']], [[[128, 47, 191, 195], 0.8227593229866699, 'HorseRiding']], [[[127, 44, 192, 193], 0.8205373129456817, 'HorseRiding']], [[[129, 39, 192, 185], 0.8223318812628619, 'HorseRiding']], [[[127, 31, 196, 179], 0.8501208612019866, 'HorseRiding']], [[[128, 22, 193, 171], 0.8315708410681566, 'HorseRiding']], [[[130, 22, 192, 169], 0.8318588228062005, 'HorseRiding']], [[[132, 18, 193, 170], 0.8310494469100611, 'HorseRiding']], [[[132, 18, 194, 172], 0.8302132445350239, 'HorseRiding']], [[[133, 18, 194, 176], 0.8339063714162727, 'HorseRiding']], [[[134, 26, 200, 183], 0.8365876380675275, 'HorseRiding']], [[[133, 16, 198, 182], 0.8395230321418268, 'HorseRiding']], [[[133, 17, 199, 184], 0.8198139782724922, 'HorseRiding']], [[[140, 28, 204, 189], 0.8344166596681291, 'HorseRiding']], [[[139, 27, 204, 187], 0.8412694521771158, 'HorseRiding']], [[[139, 28, 204, 185], 0.8500098862888805, 'HorseRiding']], [[[135, 19, 199, 179], 0.8506627974981384, 'HorseRiding']], [[[132, 15, 201, 178], 0.8495054272547193, 'HorseRiding']], [[[136, 14, 199, 173], 0.8451630721500223, 'HorseRiding']], [[[136, 12, 200, 167], 0.8366456814214907, 'HorseRiding']], [[[133, 8, 200, 168], 0.8457252233401213, 'HorseRiding']], [[[131, 7, 197, 162], 0.8400586356358062, 'HorseRiding']], [[[131, 8, 195, 163], 0.8320492682901985, 'HorseRiding']], [[[129, 4, 194, 159], 0.8298043752822792, 'HorseRiding']], [[[127, 5, 194, 162], 0.8348390851948722, 'HorseRiding']], [[[125, 7, 190, 164], 0.8299688814865505, 'HorseRiding']], [[[125, 6, 191, 164], 0.8303107088154711, 'HorseRiding']], [[[123, 8, 190, 168], 0.8348342187965798, 'HorseRiding']], [[[125, 14, 189, 170], 0.8356523950497134, 'HorseRiding']], [[[127, 18, 191, 171], 0.8392671764931521, 'HorseRiding']], [[[127, 30, 193, 178], 0.8441704160826191, 'HorseRiding']], [[[128, 18, 190, 181], 0.8438125326146775, 'HorseRiding']], [[[128, 12, 189, 186], 0.8390128962093542, 'HorseRiding']], [[[129, 15, 190, 185], 0.8471056476788448, 'HorseRiding']], [[[129, 16, 191, 184], 0.8536121834731034, 'HorseRiding']], [[[129, 16, 192, 185], 0.8488154629800881, 'HorseRiding']], [[[128, 15, 194, 184], 0.8417711698421471, 'HorseRiding']], [[[129, 13, 195, 187], 0.8412510238991473, 'HorseRiding']], [[[129, 14, 191, 187], 0.8404350980083457, 'HorseRiding']], [[[129, 13, 190, 189], 0.8382891279858882, 'HorseRiding']], [[[129, 11, 187, 191], 0.8318282305903217, 'HorseRiding']], [[[128, 8, 188, 195], 0.8043430817880264, 'HorseRiding']], [[[131, 25, 193, 199], 0.826184954516826, 'HorseRiding']], [[[124, 35, 191, 203], 0.8270462809459467, 'HorseRiding']], [[[121, 38, 191, 206], 0.8350931715324705, 'HorseRiding']], [[[124, 41, 195, 212], 
0.8331239341053625, 'HorseRiding']], [[[128, 42, 194, 211], 0.8343046153103657, 'HorseRiding']], [[[131, 40, 192, 203], 0.8309784496027532, 'HorseRiding']], [[[130, 32, 195, 202], 0.8316640083647542, 'HorseRiding']], [[[135, 30, 196, 197], 0.8272172409555161, 'HorseRiding']], [[[131, 16, 197, 186], 0.8388410406147955, 'HorseRiding']], [[[134, 15, 202, 184], 0.8485738297037244, 'HorseRiding']], [[[136, 15, 209, 182], 0.8529430205135213, 'HorseRiding']], [[[134, 13, 218, 182], 0.8601191479922718, 'HorseRiding']], [[[144, 10, 213, 183], 0.8591963099263467, 'HorseRiding']], [[[151, 12, 219, 184], 0.8617965108346937, 'HorseRiding']], [[[151, 10, 220, 186], 0.8631923599955371, 'HorseRiding']], [[[145, 10, 216, 186], 0.8800860885204287, 'HorseRiding']], [[[144, 10, 216, 186], 0.8858840451538228, 'HorseRiding']], [[[146, 11, 214, 190], 0.8773644144886106, 'HorseRiding']], [[[145, 24, 214, 193], 0.8605544385867248, 'HorseRiding']], [[[146, 23, 214, 193], 0.8727294882672254, 'HorseRiding']], [[[148, 22, 212, 198], 0.8713131467067079, 'HorseRiding']], [[[146, 29, 213, 197], 0.8579099324651196, 'HorseRiding']], [[[154, 29, 217, 199], 0.8547794072847914, 'HorseRiding']], [[[151, 26, 217, 203], 0.8641733722316758, 'HorseRiding']], [[[146, 24, 212, 199], 0.8613466257602624, 'HorseRiding']], [[[142, 25, 210, 194], 0.8492670944810214, 'HorseRiding']], [[[134, 24, 204, 192], 0.8428117300203049, 'HorseRiding']], [[[136, 25, 204, 189], 0.8486779356971397, 'HorseRiding']], [[[127, 21, 199, 179], 0.8513896296400709, 'HorseRiding']], [[[125, 10, 192, 192], 0.8510201771386576, 'HorseRiding']], [[[124, 8, 191, 192], 0.8493999019508465, 'HorseRiding']], [[[121, 8, 192, 193], 0.8487097098892171, 'HorseRiding']], [[[119, 6, 187, 193], 0.847543279648022, 'HorseRiding']], [[[118, 12, 190, 190], 0.8503535936620565, 'HorseRiding']], [[[122, 22, 189, 194], 0.8427901493276977, 'HorseRiding']], [[[118, 24, 188, 195], 0.8418835400352087, 'HorseRiding']], [[[120, 25, 188, 205], 0.847192725785284, 'HorseRiding']], [[[122, 25, 189, 207], 0.8444105420674433, 'HorseRiding']], [[[120, 23, 189, 208], 0.8470784016639392, 'HorseRiding']], [[[121, 23, 188, 205], 0.843428111269418, 'HorseRiding']], [[[117, 23, 186, 206], 0.8420809714166708, 'HorseRiding']], [[[119, 5, 199, 197], 0.8288265053231356, 'HorseRiding']], [[[121, 8, 192, 195], 0.8197548738023599, 'HorseRiding']]]}
đ Python Usage¶
A few lines of code are enough to run quick inference on a production line. The unified Python script format is as follows:
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline=[production line name])
output = pipeline.predict([input image name])
for res in output:
    res.print()
    res.save_to_img("./output/")
    res.save_to_json("./output/")
- `create_pipeline()` instantiates the production line object
- Pass in the image and call the `predict` method of the production line object to run inference
- Process the prediction results
OCR-related Python
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="OCR")
output = pipeline.predict(
    input="./general_ocr_002.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="table_recognition")
output = pipeline.predict(
    input="table_recognition.jpg",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
)
for res in output:
    res.print()
    res.save_to_img("./output/")
    res.save_to_xlsx("./output/")
    res.save_to_html("./output/")
    res.save_to_json("./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="table_recognition_v2")
output = pipeline.predict(
    input="table_recognition.jpg",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
)
for res in output:
    res.print()
    res.save_to_img("./output/")
    res.save_to_xlsx("./output/")
    res.save_to_html("./output/")
    res.save_to_json("./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="layout_parsing")
output = pipeline.predict(
    input="./demo_paper.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img(save_path="./output/") ## Save the result in img format
    res.save_to_json(save_path="./output/") ## Save the result in json format
    res.save_to_xlsx(save_path="./output/") ## Save the result in table format
    res.save_to_html(save_path="./output/") ## Save the result in html format
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="layout_parsing_v2")
output = pipeline.predict(
    input="./layout_parsing_v2_demo.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="formula_recognition")
output = pipeline.predict(
    input="./general_formula_recognition_001.png",
    use_layout_detection=True,
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    layout_threshold=0.5,
    layout_nms=True,
    layout_unclip_ratio=1.0,
    layout_merge_bboxes_mode="large",
)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="seal_recognition")
output = pipeline.predict(
    "seal_text_det.png",
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
)
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img("./output/") ## Save the visualized result image
    res.save_to_json("./output/") ## Save the structured output of the prediction
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="doc_preprocessor")
output = pipeline.predict(
    input="doc_test_rotated.jpg",
    use_doc_orientation_classify=True,
    use_doc_unwarping=True,
)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
Computer Vision Pipelines Python
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="image_classification")
output = pipeline.predict("general_image_classification_001.jpg")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img(save_path="./output/") ## Save the visualized result image
    res.save_to_json(save_path="./output/") ## Save the structured output of the prediction
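The object detection pipeline, shown earlier via the CLI, follows the same Python pattern. A minimal sketch, assuming the `threshold` keyword mirrors the CLI `--threshold` flag and the demo image has been downloaded locally:
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="object_detection")
output = pipeline.predict("general_object_detection_002.png", threshold=0.5) ## threshold assumed to mirror --threshold
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img(save_path="./output/") ## Save the visualized result image
    res.save_to_json(save_path="./output/") ## Save the structured output of the prediction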
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="semantic_segmentation")
output = pipeline.predict(input="general_semantic_segmentation_002.png", target_size=-1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="image_multilabel_classification")
output = pipeline.predict("general_image_classification_001.jpg")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img("./output/") ## Save the visualized result image
    res.save_to_json("./output/") ## Save the structured output of the prediction
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="anomaly_detection")
output = pipeline.predict(input="uad_grid.png")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img(save_path="./output/") ## Save the visualized result image
    res.save_to_json(save_path="./output/") ## Save the structured output of the prediction
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="3d_bev_detection")
output = pipeline.predict("./data/nuscenes_demo/nuscenes_infos_val.pkl")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_json("./output/") ## Save the result to a JSON file
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="open_vocabulary_segmentation")
output = pipeline.predict(
    input="open_vocabulary_segmentation.jpg",
    prompt_type="box",
    prompt=[[112.9, 118.4, 513.8, 382.1], [4.6, 263.6, 92.2, 336.6], [592.4, 260.9, 607.2, 294.2]],
)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="open_vocabulary_detection")
output = pipeline.predict(input="open_vocabulary_detection.jpg", prompt="bus . walking man . rearview mirror .")
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/")
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="pedestrian_attribute_recognition")
output = pipeline.predict("pedestrian_attribute_002.jpg")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img("./output/") ## Save the visualized result image
    res.save_to_json("./output/") ## Save the structured output of the prediction
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="vehicle_attribute_recognition")
output = pipeline.predict("vehicle_attribute_002.jpg")
for res in output:
    res.print() ## Print the structured output of the prediction
    res.save_to_img("./output/") ## Save the visualized result image
    res.save_to_json("./output/") ## Save the structured output of the prediction
Time Series-related Python
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="ts_forecast")
output = pipeline.predict(input="ts_fc.csv")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_csv(save_path="./output/") ## Save results in CSV format
    res.save_to_json(save_path="./output/") ## Save results in JSON format
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="ts_anomaly_detection")
output = pipeline.predict("ts_ad.csv")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_csv(save_path="./output/") ## Save results in CSV format
    res.save_to_json(save_path="./output/") ## Save results in JSON format
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="ts_classification")
output = pipeline.predict("ts_cls.csv")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_csv(save_path="./output/") ## Save results in CSV format
    res.save_to_json(save_path="./output/") ## Save results in JSON format
Speech-related Pipelines Python
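The multilingual speech recognition pipeline can also be called from Python. A minimal sketch mirroring the CLI example above, assuming `zh.wav` has been downloaded locally and that the result object exposes the same `save_to_json` method as the other pipelines:
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="multilingual_speech_recognition")
output = pipeline.predict(input="zh.wav")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_json(save_path="./output/") ## Save the structured prediction output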
Video-related Pipelines Python
from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="video_detection")
output = pipeline.predict(input="HorseRiding.avi")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_video(save_path="./output/") ## Save the visualized video results
    res.save_to_json(save_path="./output/") ## Save the structured prediction output
đ Detailed Tutorials¶
- PP-ChatOCRv4: Document Scene Information Extraction v4 (PP-ChatOCRv4) is a PaddlePaddle-based intelligent document and image analysis solution that integrates LLM (Large Language Model), MLLM (Multimodal Large Language Model), and OCR (Optical Character Recognition) technologies. It provides a one-stop solution for common challenges in complex document information extraction, such as layout analysis, rare character recognition, multi-page PDFs, table extraction, and seal detection.
- OCR: The general OCR production line solves text recognition tasks by extracting text information from images and outputting it as text. Built on an end-to-end OCR system, it delivers millisecond-level, precise text prediction on CPUs and reaches open-source SOTA in general scenarios.
- General Layout Parsing v2: The General Layout Parsing v2 pipeline enhances the layout area detection, table recognition, and formula recognition capabilities of the v1 pipeline, and adds the ability to restore multi-column reading order and convert results to Markdown files. It performs well on a variety of document datasets and can handle more complex document data.
- General Table Recognition Pipeline v2: General Table Recognition Pipeline v2 solves table recognition tasks by identifying tables in images and outputting them in HTML format. It enables precise table prediction across fields such as general, manufacturing, finance, and transportation.
- Small Object Detection: Small object detection is a technique specialized in recognizing small objects in images, widely used in surveillance, autonomous driving, and satellite image analysis. It can accurately locate and classify small-sized objects such as pedestrians, traffic signs, or small animals in complex scenes.
- Time Series Forecasting: Time series forecasting uses historical data to predict future trends by analyzing patterns of change in time series data. It is widely used in financial markets, weather forecasting, and sales forecasting.
đŦ Discussion¶
We warmly welcome and encourage community members to ask questions, share ideas, and provide feedback in the Discussions section. Whether you want to report a bug, discuss a feature request, seek help, or simply keep up with the latest project news, this is a great platform.