
Time Series Classification Pipeline Tutorial

1. Introduction to General Time Series Classification Pipeline

Time series classification is a technique that categorizes time-series data into predefined classes, widely applied in fields such as behavior recognition and financial trend analysis. By analyzing features that vary over time, it identifies different patterns or events, for example, classifying a speech signal as "greeting" or "request," or categorizing stock price movements as "rising" or "falling." Time series classification typically employs machine learning and deep learning models, effectively capturing temporal dependencies and variation patterns to provide accurate classification labels for data. This technology plays a pivotal role in applications such as intelligent monitoring and market forecasting.

The General Time Series Classification Pipeline includes a Time Series Classification module.

Model Name | Model Download Link | Acc (%) | Model Size
TimesNet_cls | Inference Model/Trained Model | 87.5 | 792K

Test Environment Description:

  • Performance Test Environment
  • Test Dataset: UWaveGestureLibrary dataset.
  • Hardware Configuration:

    • GPU: NVIDIA Tesla T4
    • CPU: Intel Xeon Gold 6271C @ 2.60GHz
    • Other Environments: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
  • Inference Mode Description

Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination
Regular Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference
High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.)

2. Quick Start

PaddleX provides pre-trained pipelines that you can try out quickly. You can try the General Time Series Classification Pipeline online, or locally from the command line or Python.

2.1 Online Experience

You can try the General Time Series Classification Pipeline online using the official demo.

If you are satisfied with the pipeline's performance, you can directly integrate and deploy it. If not, you can also use your private data to fine-tune the model in the pipeline online.

Note: Time series data is closely tied to its scenario, so the official built-in model for the online experience is a solution for one specific scenario, not a general-purpose one. The online experience therefore does not support running the official model on arbitrary files. After training a model on data from your own scenario, you can select that trained model and use data from the corresponding scenario for the online experience.

2.2 Local Experience

Before using the general time-series classification pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the PaddleX Local Installation Guide.

2.2.1 Command Line Experience

You can quickly try the time-series classification pipeline with a single command. Use the test file, replacing the value of --input with its local path for prediction.

paddlex --pipeline ts_classification --input ts_cls.csv --device gpu:0 --save_path ./output

For an explanation of the parameters, refer to the parameter description in Section 2.2.2 Python Script Integration.

After running the command, the results will be printed to the terminal, as follows:

{'input_path': 'ts_cls.csv', 'classification':         classid     score
sample
0             0  0.617688}

The explanation of the result parameters can be found in the result interpretation section of 2.2.2 Python Script Integration.

The time-series file results are saved under save_path.

2.2.2 Python Script Integration

The command line above is meant for quickly trying the pipeline and viewing the results. In a project, you usually need to integrate it through code. A few lines of code are enough to run fast inference with the pipeline:

from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_classification", device="gpu:0")
output = pipeline.predict("ts_cls.csv")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_csv(save_path="./output/")  # Save the result in CSV format
    res.save_to_json(save_path="./output/")  # Save the result in JSON format

In the above Python script, the following steps are executed:

(1) Instantiate the pipeline object using create_pipeline(): The parameters are explained as follows:

Parameter | Description | Type | Default Value
pipeline | The name of the pipeline or the path to the pipeline configuration file. If it is a pipeline name, it must be a pipeline supported by PaddleX. | str | None
config | Specific configuration information for the pipeline. If set together with pipeline, it takes precedence over pipeline, and the pipeline name in it must be consistent with pipeline. | dict[str, Any] | None
device | The device used for pipeline inference. It supports specifying specific GPU card numbers, such as "gpu:0", specific card numbers for other hardware, such as "npu:0", and CPU, such as "cpu". | str | gpu:0
use_hpip | Whether to enable high-performance inference. This is only available if the pipeline supports high-performance inference. | bool | False

(2) Call the predict() method of the ts_classification pipeline object for inference. This method returns a generator. The parameters and their descriptions for the predict() method are as follows:

Parameter Description Type Options Default Value
input The data to be predicted. It supports multiple input types and is required. Python Var|str|list
  • Python Var: Time-series data represented by pandas.DataFrame.
  • str: Local path of the time-series file, such as /root/data/ts.csv; URL link, such as the network URL of the time-series file: Example; Local directory, which should contain the time-series data to be predicted, such as /root/data/.
  • List: The elements of the list must be of the above types, such as [pandas.DataFrame, pandas.DataFrame], ["/root/data/ts1.csv", "/root/data/ts2.csv"], ["/root/data1", "/root/data2"].
None
device The device used for pipeline inference. str|None
  • CPU: Use CPU for inference, such as cpu.
  • GPU: Use the first GPU for inference, such as gpu:0.
  • NPU: Use the first NPU for inference, such as npu:0.
  • XPU: Use the first XPU for inference, such as xpu:0.
  • MLU: Use the first MLU for inference, such as mlu:0.
  • DCU: Use the first DCU for inference, such as dcu:0.
  • None: If set to None, the default value used during pipeline initialization will be applied. During initialization, it will prioritize using the local GPU device 0. If unavailable, it will fall back to the CPU.
None
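
Because input accepts a list of file paths, a small helper can gather the files to predict before calling predict(). The following is a minimal sketch using only the standard library. The helper name collect_csv_inputs and its non-recursive, sorted behavior are assumptions made for illustration and are not part of PaddleX.

```python
from pathlib import Path


def collect_csv_inputs(root):
    """Return CSV file paths under `root` as a sorted list of strings.

    `root` may be a single CSV file or a directory. The helper name and the
    non-recursive, sorted behavior are illustrative choices, not PaddleX API.
    """
    path = Path(root)
    if path.is_file():
        return [str(path)]
    # Only direct children are collected; sorting keeps the order reproducible.
    return sorted(str(p) for p in path.glob("*.csv"))
```

The returned list could then be passed as the input argument, for example pipeline.predict(collect_csv_inputs("/root/data/")).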

(3) Process the prediction results. The prediction result for each sample is of type dict, and supports operations such as printing, saving as a csv file, and saving as a json file:

Method | Description | Parameter | Type | Parameter Description | Default Value
print() | Print the result to the terminal | format_json | bool | Whether to format the output content using JSON indentation | True
 | | indent | int | Specify the indentation level to beautify the output JSON data, making it more readable. Only effective when format_json is True | 4
 | | ensure_ascii | bool | Control whether to escape non-ASCII characters to Unicode. When set to True, all non-ASCII characters will be escaped; False retains the original characters. Only effective when format_json is True | False
save_to_json() | Save the result as a JSON file | save_path | str | The file path for saving. When a directory is provided, the saved file name will match the input file name | None
 | | indent | int | Specify the indentation level to beautify the output JSON data, making it more readable. Only effective when format_json is True | 4
 | | ensure_ascii | bool | Control whether to escape non-ASCII characters to Unicode. When set to True, all non-ASCII characters will be escaped; False retains the original characters. Only effective when format_json is True | False
save_to_csv() | Save the result as a CSV file | save_path | str | The file path for saving, supporting both directory and file paths | None
  • Calling the print() method will print the result to the terminal, with the following explanations for the printed content:

    • input_path: (str) The input path of the time-series file to be predicted
    • classification: (pandas.DataFrame) The time-series classification result, including sample IDs, corresponding classification categories, and confidence scores.
  • Calling the save_to_json() method will save the above content to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.json. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving NumPy arrays, numpy.array types will be converted to list format.

  • Calling the save_to_csv() method will save the classification results to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.csv. If a file is specified, it will be saved directly to that file.

  • Additionally, it also supports obtaining prediction results in different formats through attributes, as follows:

Attribute Description
json Get the prediction result in json format
csv Get the result in csv format
  • The prediction result obtained through the json attribute is of type dict, with content consistent with what is saved by calling the save_to_json() method.
  • The csv attribute returns a pandas.DataFrame that contains the time-series classification results.
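
The naming rule described for save_to_json() and save_to_csv() can be illustrated with a short sketch. This is not PaddleX's actual implementation. The helper name resolve_result_path is hypothetical, and the sketch assumes that save_path counts as a file only when it already ends with the expected extension.

```python
from pathlib import Path


def resolve_result_path(save_path, input_path, ext):
    """Sketch of the naming rule described above (not PaddleX's actual code).

    Assumption: `save_path` is treated as a file when it already ends with
    `ext` (for example ".json"); anything else is treated as a directory.
    """
    target = Path(save_path)
    if target.suffix.lower() == ext:
        return target
    # Directory case: <save_path>/<input basename>_res<ext>
    return target / f"{Path(input_path).stem}_res{ext}"
```

Under these assumptions, resolve_result_path("./output/", "ts_cls.csv", ".json") points at ts_cls_res.json inside the output directory, while a save_path that already ends in .json is returned unchanged.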

Additionally, you can obtain the ts_classification pipeline configuration file and load it for prediction. Run the following command to save the configuration file to my_path:

paddlex --get_pipeline_config ts_classification --save_path ./my_path

If you have obtained the configuration file, you can customize the settings for the time-series classification pipeline by simply modifying the pipeline parameter value in the create_pipeline method to the path of the pipeline configuration file.

For example, if your configuration file is saved at ./my_path/ts_classification.yaml, you just need to execute:

from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="./my_path/ts_classification.yaml")
output = pipeline.predict("ts_cls.csv")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_csv("./output/")  # Save the result in CSV format
    res.save_to_json("./output/")  # Save the result in JSON format

3. Development Integration/Deployment

If the pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.

If you need to integrate the pipeline directly into your Python project, you can refer to the example code in Section 2.2.2 Python Script Integration.

Additionally, PaddleX offers three other deployment methods, detailed as follows:

🚀 High-Performance Inference: In practical production environments, many applications have stringent performance requirements for deployment strategies, especially in terms of response speed, to ensure efficient system operation and a smooth user experience. To address this, PaddleX provides a high-performance inference plugin aimed at deeply optimizing the performance of model inference and pre/post-processing, significantly speeding up the end-to-end process. For detailed instructions, please refer to the PaddleX High-Performance Inference Guide.

☁️ Service-Oriented Deployment: Service-oriented deployment is a common form of deployment in practical production environments. By encapsulating inference functionality into a service, clients can access these services via network requests to obtain inference results. PaddleX supports various pipeline service-oriented deployment solutions. For detailed instructions, please refer to the PaddleX Service-Oriented Deployment Guide.

Below are the API references for basic service-oriented deployment and multi-language service call examples:

API Reference

For the main operations provided by the service:

  • The HTTP request method is POST.
  • Both the request body and response body are JSON data (JSON objects).
  • When the request is processed successfully, the response status code is 200, and the attributes of the response body are as follows:
Name Type Meaning
logId string The UUID of the request.
errorCode integer Error code. Fixed as 0.
errorMsg string Error message. Fixed as "Success".
result object The result of the operation.
  • When the request is not processed successfully, the attributes of the response body are as follows:
Name Type Meaning
logId string The UUID of the request.
errorCode integer Error code. Same as the response status code.
errorMsg string Error message.

The main operations provided by the service are as follows:

  • infer

Classify time-series data.

POST /time-series-classification

  • The attributes of the request body are as follows:
Name Type Meaning Required
csv string The URL of a CSV file accessible by the server or the Base64-encoded content of a CSV file. The CSV file must be encoded in UTF-8. Yes
  • When the request is processed successfully, the result in the response body has the following attributes:
Name Type Meaning
label string The class label.
score number The class score.

An example of result is as follows:

{
"label": "running",
"score": 0.97
}
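
Since the csv attribute accepts either a URL or Base64-encoded file content, the request body can be built by one helper that handles both cases. The following is a minimal sketch. The helper name build_csv_payload is hypothetical, and the rule for telling a URL from a local path (an http:// or https:// prefix) is an assumption made for illustration.

```python
import base64
from pathlib import Path


def build_csv_payload(source):
    """Build the request body for POST /time-series-classification.

    Assumption: strings starting with http:// or https:// are passed through
    as URLs; anything else is read as a local file and Base64-encoded.
    """
    if source.startswith(("http://", "https://")):
        return {"csv": source}
    raw = Path(source).read_bytes()
    return {"csv": base64.b64encode(raw).decode("ascii")}
```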
Multi-language Service Invocation Examples
Python
import base64
import requests

API_URL = "http://localhost:8080/time-series-classification" # Service URL
csv_path = "./test.csv"

# Encode the local CSV file in Base64
with open(csv_path, "rb") as file:
    csv_bytes = file.read()
    csv_data = base64.b64encode(csv_bytes).decode("ascii")

payload = {"csv": csv_data}

# Call the API
response = requests.post(API_URL, json=payload)

# Process the response data
assert response.status_code == 200
result = response.json()["result"]
print(f"label: {result['label']}, score: {result['score']}")
C++
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost:8080");
    const std::string csvPath = "./test.csv";

    httplib::Headers headers = {
        {"Content-Type", "application/json"}
    };

    // Encode in Base64
    std::ifstream file(csvPath, std::ios::binary | std::ios::ate);
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }
    std::string bufferStr(reinterpret_cast<const char*>(buffer.data()), buffer.size());
    std::string encodedCsv = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["csv"] = encodedCsv;
    std::string body = jsonObj.dump();

    // Call the API
    auto response = client.Post("/time-series-classification", headers, body, "application/json");
    // Process the response data
    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];
        std::cout << "label: " << result["label"] << ", score: " << result["score"] << std::endl;
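    } else {
        std::cout << "Failed to send HTTP request." << std::endl;
        // response is null when the connection itself failed, so check before dereferencing
        if (response) {
            std::cout << response->body << std::endl;
        }
        return 1;
    }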

    return 0;
}
Java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/time-series-classification";
        String csvPath = "./test.csv";

        // Encode the local CSV file using Base64
        File file = new File(csvPath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String csvData = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode params = objectMapper.createObjectNode();
        params.put("csv", csvData);

        // Create an OkHttpClient instance
        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.Companion.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.Companion.create(params.toString(), JSON);
        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        // Call the API and process the returned data
        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode resultNode = objectMapper.readTree(responseBody);
                JsonNode result = resultNode.get("result");
                System.out.println("label: " + result.get("label").asText() + ", score: " + result.get("score").asText());
            } else {
                System.err.println("Request failed with code: " + response.code());
            }
        }
    }
}
Go
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    API_URL := "http://localhost:8080/time-series-classification"
    csvPath := "./test.csv"

    // Read the CSV file and encode it with Base64
    csvBytes, err := ioutil.ReadFile(csvPath)
    if err != nil {
        fmt.Println("Error reading csv file:", err)
        return
    }
    csvData := base64.StdEncoding.EncodeToString(csvBytes)

    payload := map[string]string{"csv": csvData} // Base64-encoded file content
    payloadBytes, err := json.Marshal(payload)
    if err != nil {
        fmt.Println("Error marshaling payload:", err)
        return
    }

    // Call the API
    client := &http.Client{}
    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
    if err != nil {
        fmt.Println("Error creating request:", err)
        return
    }
    req.Header.Set("Content-Type", "application/json") // The request body is JSON

    res, err := client.Do(req)
    if err != nil {
        fmt.Println("Error sending request:", err)
        return
    }
    defer res.Body.Close()

    // Process the response data
    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        fmt.Println("Error reading response body:", err)
        return
    }
    type Response struct {
        Result struct {
            Label string  `json:"label"`
            Score float64 `json:"score"` // The API returns score as a JSON number
        } `json:"result"`
    }
    var respData Response
    err = json.Unmarshal(body, &respData)
    if err != nil {
        fmt.Println("Error unmarshaling response body:", err)
        return
    }

    fmt.Printf("label: %s, score: %v\n", respData.Result.Label, respData.Result.Score)
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/time-series-classification";
    static readonly string csvPath = "./test.csv";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        // Encode the local CSV file in Base64
        byte[] csvBytes = File.ReadAllBytes(csvPath);
        string csvData = Convert.ToBase64String(csvBytes);

        var payload = new JObject{ { "csv", csvData } }; // Base64-encoded file content
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        // Call the API
        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        // Process the response data
        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        string label = jsonResponse["result"]["label"].ToString();
        string score = jsonResponse["result"]["score"].ToString();
        Console.WriteLine($"label: {label}, score: {score}");
    }
}
Node.js
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/time-series-classification';
const csvPath = './test.csv';

let config = {
   method: 'POST',
   maxBodyLength: Infinity,
   url: API_URL,
   headers: { 'Content-Type': 'application/json' },  // The body is a JSON string
   data: JSON.stringify({
    'csv': encodeFileToBase64(csvPath)  // Base64-encoded file content
  })
};

// Read the CSV file and convert it to Base64
function encodeFileToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

axios.request(config)
.then((response) => {
    const result = response.data['result'];
    console.log(`label: ${result['label']}, score: ${result['score']}`);
})
.catch((error) => {
  console.log(error);
});
PHP
<?php

$API_URL = "http://localhost:8080/time-series-classification"; // Service URL
$csv_path = "./test.csv";

// Encode the local CSV file using Base64
$csv_data = base64_encode(file_get_contents($csv_path));
$payload = array("csv" => $csv_data); // Base64-encoded file content

// Call the API
$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

// Process the response data
$result = json_decode($response, true)["result"];
echo "label: " . $result["label"] . ", score: " . $result["score"];

?>


📱 Edge Deployment: Edge deployment is a method of placing computing and data processing capabilities directly on user devices, allowing them to process data locally without relying on remote servers. PaddleX supports deploying models on edge devices such as Android. For detailed procedures, please refer to the PaddleX Edge Deployment Guide. You can choose the appropriate deployment method based on your needs to integrate the model pipeline into subsequent AI applications.

4. Secondary Development

If the default model weights provided by the time-series classification pipeline do not meet your requirements in terms of accuracy or speed, you can try to fine-tune the existing model using your own domain-specific or application data to improve the performance of the time-series classification pipeline in your scenario.

4.1 Model Fine-Tuning

Since the time-series classification pipeline includes a time-series classification module, if the pipeline's performance is not satisfactory, you need to refer to the Secondary Development section in the Time-Series Classification Module Development Tutorial and fine-tune the time-series classification model using your private dataset.

4.2 Model Application

After you have completed fine-tuning training with your private dataset, you will obtain a local model weight file.

If you need to use the fine-tuned model weights, simply modify the pipeline configuration file and fill in the local path of the fine-tuned model weights to the model_dir in the pipeline configuration file:

pipeline_name: ts_classification

SubModules:
  TSClassification:
    module_name: ts_classification
    model_name: TimesNet_cls
    model_dir: null  # Can be modified to the local path of the fine-tuned model
    batch_size: 1

Subsequently, you can load the modified pipeline configuration file using the command line or Python script methods described in the local experience section.
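
The model_dir edit can also be scripted. The following is a minimal sketch that rewrites the first model_dir entry with plain text substitution, so it needs no YAML library. The helper name set_model_dir and the example path in the usage note are hypothetical, and the substitution drops any trailing comment on that line.

```python
import re


def set_model_dir(yaml_text, model_dir):
    """Replace the value of the first `model_dir:` key in a pipeline config.

    Plain text substitution keeps the sketch dependency-free; a YAML library
    would be more robust for configs with several sub-modules.
    """
    pattern = re.compile(r"^([ \t]*model_dir:[ \t]*).*$", flags=re.MULTILINE)
    # Single quotes keep backslashes literal in YAML; a function replacement
    # avoids regex escape handling of backslashes in the path itself.
    return pattern.sub(lambda m: f"{m.group(1)}'{model_dir}'", yaml_text, count=1)
```

For example, reading ./my_path/ts_classification.yaml, calling set_model_dir(text, "./output/best_model/inference") with your own weights directory, and writing the result back yields a configuration file that can be passed to create_pipeline.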

5. Multi-Hardware Support

PaddleX supports a variety of mainstream hardware devices, including NVIDIA GPU, Kunlunxin XPU, Ascend NPU, and Cambricon MLU. Simply modify the --device parameter to seamlessly switch between different hardware.

For example, to run inference with the time-series classification pipeline on an Ascend NPU, the command is as follows:

paddlex --pipeline ts_classification --input ts_cls.csv --device npu:0

If you want to use the general time-series classification pipeline on a wider range of hardware devices, please refer to the PaddleX Multi-Hardware Usage Guide.
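
When the device string comes from user input or a configuration file, it can help to validate it before passing it to --device or create_pipeline. The following is a minimal sketch. The helper name parse_device is hypothetical, and the accepted device types are taken from the device options listed for predict() in this tutorial, not from a PaddleX API.

```python
# Device types listed in this tutorial; treat the set as an assumption.
SUPPORTED_DEVICE_TYPES = {"cpu", "gpu", "npu", "xpu", "mlu", "dcu"}


def parse_device(device):
    """Split a device string such as "npu:0" into (type, index).

    The index is None when no card number is given (for example "cpu").
    """
    kind, _, index = device.partition(":")
    kind = kind.strip().lower()
    if kind not in SUPPORTED_DEVICE_TYPES:
        raise ValueError(f"unsupported device type: {device!r}")
    if not index:
        return kind, None
    if not index.isdigit():
        raise ValueError(f"invalid card number in {device!r}")
    return kind, int(index)
```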
