Time Series Classification Pipeline Tutorial
1. Introduction to General Time Series Classification Pipeline
Time series classification is a technique that categorizes time-series data into predefined classes, widely applied in fields such as behavior recognition and financial trend analysis. By analyzing features that vary over time, it identifies different patterns or events, for example, classifying a speech signal as "greeting" or "request," or categorizing stock price movements as "rising" or "falling." Time series classification typically employs machine learning and deep learning models, effectively capturing temporal dependencies and variation patterns to provide accurate classification labels for data. This technology plays a pivotal role in applications such as intelligent monitoring and market forecasting.
The General Time Series Classification Pipeline includes a Time Series Classification module.
Model Name | Model Download Link | Accuracy (%) | Model Size |
---|---|---|---|
TimesNet_cls | Inference Model/Trained Model | 87.5 | 792K |
Test Environment Description:
- Performance Test Environment
  - Test Dataset: UWaveGestureLibrary dataset.
  - Hardware Configuration:
    - GPU: NVIDIA Tesla T4
    - CPU: Intel Xeon Gold 6271C @ 2.60GHz
    - Other Environments: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
- Inference Mode Description
Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
---|---|---|---|
Regular Mode | FP32 Precision / No TRT Acceleration | FP32 Precision / 8 Threads | PaddleInference |
High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 Precision / 8 Threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
2. Quick Start
PaddleX provides pre-trained model pipelines that can be quickly experienced. You can experience the effects of the General Time Series Classification Pipeline online or locally using command line or Python.
2.1 Online Experience
You can try the General Time Series Classification Pipeline online using the official demo.
If you are satisfied with the pipeline's performance, you can directly integrate and deploy it. If not, you can also use your private data to fine-tune the model in the pipeline online.
Note: Because time series data is closely tied to its application scenario, the official built-in model for the online experience is a solution for one specific scenario, not a general-purpose one. The online experience therefore does not support running the official model on arbitrary files. After you train a model on data from your own scenario, you can select that model and use data from the same scenario in the online experience.
2.2 Local Experience
Before using the general time-series classification pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the PaddleX Local Installation Guide.
2.2.1 Command Line Experience
You can quickly experience the time-series classification pipeline with a single command. Use the test file and replace --input with your local path for prediction.
For the explanation of the parameters, you can refer to the parameter description in Section 2.2.2 Python Script Integration.
After running the command, the results will be printed to the terminal, as follows:
The explanation of the result parameters can be found in the result interpretation part of Section 2.2.2 Python Script Integration.
The time-series file results are saved under save_path.
2.2.2 Python Script Integration
The command line above is for quickly trying the pipeline and viewing the results. In a project, you usually need to integrate the pipeline through code. A few lines of code are enough to run fast inference with the pipeline. The inference code is as follows:
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_classification", device="gpu:0")
output = pipeline.predict("ts_cls.csv")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_csv(save_path="./output/")  # Save the result in CSV format
    res.save_to_json(save_path="./output/")  # Save the result in JSON format
In the above Python script, the following steps are executed:
(1) Instantiate the pipeline object using create_pipeline(). The parameters are explained as follows:
Parameter | Description | Type | Default Value |
---|---|---|---|
pipeline | The name of the pipeline or the path to the pipeline configuration file. If it is a pipeline name, it must be a pipeline supported by PaddleX. | str | None |
config | Specific configuration information for the pipeline (if set simultaneously with pipeline, it takes precedence over pipeline, and the pipeline name must be consistent with pipeline). | dict[str, Any] | None |
device | The device used for pipeline inference. It supports specifying specific GPU card numbers, such as "gpu:0", specific card numbers for other hardware, such as "npu:0", and CPU, such as "cpu". | str | gpu:0 |
use_hpip | Whether to enable high-performance inference. This is only available if the pipeline supports high-performance inference. | bool | False |
(2) Call the predict() method of the ts_classification pipeline object for inference. This method returns a generator. The parameters and their descriptions for the predict() method are as follows:
Parameter | Description | Type | Options | Default Value |
---|---|---|---|---|
input | The data to be predicted. It supports multiple input types and is required. | Python Var, str, or list | | None |
device | The device used for pipeline inference. | str or None | | None |
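Because predict() returns a generator, results are produced one at a time as you iterate, and the generator can be consumed only once. The sketch below illustrates this behavior with a hypothetical stand-in function, fake_predict, so that it runs without PaddleX installed. It does not reproduce the real result objects.

```python
from typing import Iterable, Iterator


def fake_predict(paths: Iterable[str]) -> Iterator[dict]:
    """Hypothetical stand-in for pipeline.predict(); yields one result per input."""
    for path in paths:
        # A real pipeline would run model inference here.
        yield {"input_path": path}


results = fake_predict(["a.csv", "b.csv"])  # nothing has been processed yet
first = next(results)                       # processes only "a.csv"
remaining = list(results)                   # drains the rest of the generator
exhausted = list(results)                   # a generator cannot be replayed
```

If you need to go over the results more than once, collect them into a list first, at the cost of holding all of them in memory.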
(3) Process the prediction results. The prediction result for each sample is of type dict, and supports operations such as printing, saving as a csv file, and saving as a json file:
Method | Method Description | Parameter | Parameter Type | Parameter Description | Default Value |
---|---|---|---|---|---|
print() | Print the result to the terminal | format_json | bool | Whether to format the output content using JSON indentation | True |
print() | | indent | int | Specify the indentation level to beautify the output JSON data, making it more readable. Only effective when format_json is True | 4 |
print() | | ensure_ascii | bool | Control whether to escape non-ASCII characters to Unicode. When set to True, all non-ASCII characters will be escaped; False retains the original characters. Only effective when format_json is True | False |
save_to_json() | Save the result as a JSON file | save_path | str | The file path for saving. When a directory is provided, the saved file name will match the input file name | None |
save_to_json() | | indent | int | Specify the indentation level to beautify the output JSON data, making it more readable. Only effective when format_json is True | 4 |
save_to_json() | | ensure_ascii | bool | Control whether to escape non-ASCII characters to Unicode. When set to True, all non-ASCII characters will be escaped; False retains the original characters. Only effective when format_json is True | False |
save_to_csv() | Save the result as a CSV file | save_path | str | The file path for saving, supporting both directory and file paths | None |
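The indent and ensure_ascii parameters share their names with arguments of json.dumps in Python's standard library. Assuming they behave the same way (this is an assumption, and the result dictionary below is made up for illustration), their effect can be sketched as:

```python
import json

# Made-up result content with a non-ASCII label, for illustration only.
result = {"input_path": "ts_cls.csv", "label": "跑步"}

# ensure_ascii=True escapes non-ASCII characters as \uXXXX sequences.
escaped = json.dumps(result, indent=4, ensure_ascii=True)

# ensure_ascii=False keeps the original characters.
readable = json.dumps(result, indent=4, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode back to the same dictionary. Only the on-disk or on-screen representation differs.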
- Calling the print() method will print the result to the terminal, with the following explanations for the printed content:
  - input_path: (str) The input path of the time-series file to be predicted.
  - classification: (pandas.DataFrame) The time-series classification result, including sample IDs, corresponding classification categories, and confidence scores.
- Calling the save_to_json() method will save the above content to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.json. If a file is specified, it will be saved directly to that file. Since JSON files do not support saving NumPy arrays, numpy.array types will be converted to list format.
- Calling the save_to_csv() method will save the classification results to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.csv. If a file is specified, it will be saved directly to that file.
- Additionally, prediction results can be obtained in different formats through attributes, as follows:
Attribute | Description |
---|---|
json | Get the prediction result in json format |
csv | Get the result in csv format |
- The prediction result obtained through the json attribute is of type dict, with content consistent with what is saved by calling the save_to_json() method.
- The csv attribute returns a pandas.DataFrame, which contains the time-series classification results.
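The save-path rule described above for save_to_json() and save_to_csv() can be sketched as a small helper. This is an illustration of the naming rule only, not PaddleX's implementation. The function name resolve_save_path is hypothetical, and the way it tells a file path from a directory (by checking the file extension) is an assumption.

```python
from pathlib import Path


def resolve_save_path(save_path: str, input_path: str, ext: str) -> Path:
    """Mirror the naming rule described above (illustrative, not PaddleX code)."""
    target = Path(save_path)
    # Assumption: a path that already ends with the output extension is a
    # file path; anything else is treated as a directory.
    if target.suffix.lower() == ext:
        return target
    # Directory case: <save_path>/<input basename>_res<ext>
    return target / f"{Path(input_path).stem}_res{ext}"


print(resolve_save_path("./output/", "data/ts_cls.csv", ".json"))
print(resolve_save_path("out/result.csv", "data/ts_cls.csv", ".csv"))
```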
Additionally, you can obtain the ts_classification pipeline configuration file and load it for prediction. You can run the following command to save the configuration file in my_path:
If you have obtained the configuration file, you can customize the settings for the time-series classification pipeline. Simply set the pipeline parameter of the create_pipeline method to the path of the pipeline configuration file.
For example, if your configuration file is saved at ./my_path/ts_classification.yaml, you just need to execute:
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="./my_path/ts_classification.yaml")
output = pipeline.predict("ts_cls.csv")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_csv("./output/")  # Save the result in CSV format
    res.save_to_json("./output/")  # Save the result in JSON format
3. Development Integration/Deployment
If the pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.
If you need to integrate the pipeline directly into your Python project, you can refer to the example code in Section 2.2.2 Python Script Integration.
Additionally, PaddleX offers three other deployment methods, detailed as follows:
🚀 High-Performance Inference: In practical production environments, many applications have stringent performance requirements for deployment strategies, especially in terms of response speed, to ensure efficient system operation and a smooth user experience. To address this, PaddleX provides a high-performance inference plugin aimed at deeply optimizing the performance of model inference and pre/post-processing, significantly speeding up the end-to-end process. For detailed instructions, please refer to the PaddleX High-Performance Inference Guide.
☁️ Service-Oriented Deployment: Service-oriented deployment is a common form of deployment in practical production environments. By encapsulating inference functionality into a service, clients can access these services via network requests to obtain inference results. PaddleX supports various pipeline service-oriented deployment solutions. For detailed instructions, please refer to the PaddleX Service-Oriented Deployment Guide.
Below are the API references for basic service-oriented deployment and multi-language service call examples:
API Reference
For the main operations provided by the service:
- The HTTP request method is POST.
- Both the request body and response body are JSON data (JSON objects).
- When the request is processed successfully, the response status code is 200, and the attributes of the response body are as follows:
Name | Type | Meaning |
---|---|---|
logId | string | The UUID of the request. |
errorCode | integer | Error code. Fixed as 0. |
errorMsg | string | Error message. Fixed as "Success". |
result | object | The result of the operation. |
- When the request is not processed successfully, the attributes of the response body are as follows:
Name | Type | Meaning |
---|---|---|
logId | string | The UUID of the request. |
errorCode | integer | Error code. Same as the response status code. |
errorMsg | string | Error message. |
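On the client side, the two response shapes above can be handled with one small helper that returns result on success and raises otherwise. The sketch below works on an already-decoded JSON body. The names ServiceError and unwrap_response are hypothetical and are not part of PaddleX.

```python
from typing import Any, Dict


class ServiceError(RuntimeError):
    """Raised when the service reports a non-zero errorCode (hypothetical helper)."""


def unwrap_response(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return body["result"] on success, otherwise raise ServiceError."""
    if body.get("errorCode") != 0:
        # logId is included so the failing request can be traced on the server.
        raise ServiceError(
            f"[{body.get('logId')}] {body.get('errorCode')}: {body.get('errorMsg')}"
        )
    return body["result"]
```

With the requests library, this could be called as unwrap_response(response.json()) after a POST to the service.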
The main operations provided by the service are as follows:
infer
Classify time-series data.
POST /time-series-classification
- The attributes of the request body are as follows:
Name | Type | Meaning | Required |
---|---|---|---|
csv | string | The URL of a CSV file accessible by the server or the Base64-encoded content of a CSV file. The CSV file must be encoded in UTF-8. | Yes |
- When the request is processed successfully, the result in the response body has the following attributes:
Name | Type | Meaning |
---|---|---|
label | string | The class label. |
score | number | The class score. |
An example of result is as follows:
{
    "label": "running",
    "score": 0.97
}
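Since the csv attribute accepts either a URL or Base64-encoded file content, a client can build the request body for both cases with one function. The following is a minimal sketch. The helper names build_payload and describe_result are hypothetical, and the example.com URL is a placeholder.

```python
import base64
from typing import Union


def build_payload(source: Union[bytes, str]) -> dict:
    """Build the infer request body from raw CSV bytes or from a URL string."""
    if isinstance(source, bytes):
        # Local file content must be sent as Base64 text.
        return {"csv": base64.b64encode(source).decode("ascii")}
    # A str is passed through unchanged and treated as a URL.
    return {"csv": source}


def describe_result(result: dict) -> str:
    """Format the label and score fields of a successful result."""
    return f"{result['label']} ({result['score']:.2f})"


payload_from_bytes = build_payload(b"a,b\n1,2\n")
payload_from_url = build_payload("https://example.com/ts.csv")  # placeholder URL
print(describe_result({"label": "running", "score": 0.97}))
```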
Multi-language Service Invocation Examples
Python
import base64
import requests
API_URL = "http://localhost:8080/time-series-classification" # Service URL
csv_path = "./test.csv"
# Encode the local CSV file in Base64
with open(csv_path, "rb") as file:
    csv_bytes = file.read()
    csv_data = base64.b64encode(csv_bytes).decode("ascii")
payload = {"csv": csv_data}
# Call the API
response = requests.post(API_URL, json=payload)
# Process the response data
assert response.status_code == 200
result = response.json()["result"]
print(f"label: {result['label']}, score: {result['score']}")
C++
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost:8080");
    const std::string csvPath = "./test.csv";
    httplib::Headers headers = {
        {"Content-Type", "application/json"}
    };

    // Read the file and encode it in Base64
    std::ifstream file(csvPath, std::ios::binary | std::ios::ate);
    if (!file) {
        std::cerr << "Error opening file." << std::endl;
        return 1;
    }
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);
    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }
    std::string bufferStr(buffer.data(), buffer.size());
    std::string encodedCsv = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["csv"] = encodedCsv;
    std::string body = jsonObj.dump();

    // Call the API
    auto response = client.Post("/time-series-classification", headers, body, "application/json");

    // Process the response data
    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];
        std::cout << "label: " << result["label"] << ", score: " << result["score"] << std::endl;
    } else {
        std::cout << "Failed to send HTTP request." << std::endl;
        // response is null when the connection itself failed
        if (response) {
            std::cout << response->body << std::endl;
        }
        return 1;
    }
    return 0;
}
Java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.io.File;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/time-series-classification";
        String csvPath = "./test.csv";

        // Encode the local CSV file using Base64
        File file = new File(csvPath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String csvData = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode params = objectMapper.createObjectNode();
        params.put("csv", csvData);

        // Create an OkHttpClient instance
        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.Companion.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.Companion.create(params.toString(), JSON);
        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        // Call the API and process the returned data
        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode resultNode = objectMapper.readTree(responseBody);
                JsonNode result = resultNode.get("result");
                System.out.println("label: " + result.get("label").asText() + ", score: " + result.get("score").asText());
            } else {
                System.err.println("Request failed with code: " + response.code());
            }
        }
    }
}
Go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	API_URL := "http://localhost:8080/time-series-classification"
	csvPath := "./test.csv"

	// Read the CSV file and encode it with Base64
	csvBytes, err := os.ReadFile(csvPath)
	if err != nil {
		fmt.Println("Error reading csv file:", err)
		return
	}
	csvData := base64.StdEncoding.EncodeToString(csvBytes)

	payload := map[string]string{"csv": csvData} // Base64-encoded file content
	payloadBytes, err := json.Marshal(payload)
	if err != nil {
		fmt.Println("Error marshaling payload:", err)
		return
	}

	// Call the API
	client := &http.Client{}
	req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
	if err != nil {
		fmt.Println("Error creating request:", err)
		return
	}
	req.Header.Set("Content-Type", "application/json")
	res, err := client.Do(req)
	if err != nil {
		fmt.Println("Error sending request:", err)
		return
	}
	defer res.Body.Close()

	// Process the response data
	body, err := io.ReadAll(res.Body)
	if err != nil {
		fmt.Println("Error reading response body:", err)
		return
	}
	type Response struct {
		Result struct {
			Label string  `json:"label"`
			Score float64 `json:"score"` // score is a JSON number, not a string
		} `json:"result"`
	}
	var respData Response
	err = json.Unmarshal(body, &respData)
	if err != nil {
		fmt.Println("Error unmarshaling response body:", err)
		return
	}
	fmt.Printf("label: %s, score: %v\n", respData.Result.Label, respData.Result.Score)
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/time-series-classification";
    static readonly string csvPath = "./test.csv";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        // Encode the local CSV file in Base64
        byte[] csvBytes = File.ReadAllBytes(csvPath);
        string csvData = Convert.ToBase64String(csvBytes);

        var payload = new JObject{ { "csv", csvData } }; // Base64-encoded file content
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        // Call the API
        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        // Process the response data
        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);
        string label = jsonResponse["result"]["label"].ToString();
        string score = jsonResponse["result"]["score"].ToString();
        Console.WriteLine($"label: {label}, score: {score}");
    }
}
Node.js
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/time-series-classification';
const csvPath = './test.csv';

// Read the CSV file and convert it to Base64
function encodeFileToBase64(filePath) {
    const bitmap = fs.readFileSync(filePath);
    return Buffer.from(bitmap).toString('base64');
}

let config = {
    method: 'POST',
    maxBodyLength: Infinity,
    url: API_URL,
    // The body is a pre-serialized string, so the JSON content type is set explicitly
    headers: { 'Content-Type': 'application/json' },
    data: JSON.stringify({
        'csv': encodeFileToBase64(csvPath) // Base64-encoded file content
    })
};

axios.request(config)
    .then((response) => {
        const result = response.data['result'];
        console.log(`label: ${result['label']}, score: ${result['score']}`);
    })
    .catch((error) => {
        console.log(error);
    });
PHP
<?php
$API_URL = "http://localhost:8080/time-series-classification"; // Service URL
$csv_path = "./test.csv";
// Encode the local CSV file using Base64
$csv_data = base64_encode(file_get_contents($csv_path));
$payload = array("csv" => $csv_data); // Base64-encoded file content
// Call the API
$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
// Process the response data
$result = json_decode($response, true)["result"];
echo "label: " . $result["label"] . ", score: " . $result["score"];
?>
📱 Edge Deployment: Edge deployment is a method of placing computing and data processing capabilities directly on user devices, allowing them to process data locally without relying on remote servers. PaddleX supports deploying models on edge devices such as Android. For detailed procedures, please refer to the PaddleX Edge Deployment Guide.
You can choose the appropriate deployment method based on your needs to integrate the model pipeline into subsequent AI applications.
4. Secondary Development
If the default model weights provided by the time-series classification pipeline do not meet your requirements in terms of accuracy or speed, you can try to fine-tune the existing model using your own domain-specific or application data to improve the performance of the time-series classification pipeline in your scenario.
4.1 Model Fine-Tuning
The time-series classification pipeline includes a single time-series classification module. If the pipeline's performance is not satisfactory, refer to the Secondary Development section in the Time-Series Classification Module Development Tutorial and fine-tune the time-series classification model using your private dataset.
4.2 Model Application
After you have completed fine-tuning training with your private dataset, you will obtain a local model weight file.
If you need to use the fine-tuned model weights, simply modify the pipeline configuration file and set model_dir to the local path of the fine-tuned model weights:
pipeline_name: ts_classification
SubModules:
  TSClassification:
    module_name: ts_classification
    model_name: TimesNet_cls
    model_dir: null # Can be modified to the local path of the fine-tuned model
    batch_size: 1
Subsequently, you can load the modified pipeline configuration file using the command line or Python script methods described in the local experience section.
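If you prefer to set model_dir from Python instead of editing the YAML file by hand, the same configuration can be expressed as a dictionary. The sketch below mirrors the YAML above. The helper with_model_dir and the weights path are hypothetical, and passing the dictionary through the config parameter of create_pipeline is an assumption based on the parameter table in Section 2.2.2 that is not exercised here.

```python
import copy

# Dictionary form of the pipeline configuration shown above.
BASE_CONFIG = {
    "pipeline_name": "ts_classification",
    "SubModules": {
        "TSClassification": {
            "module_name": "ts_classification",
            "model_name": "TimesNet_cls",
            "model_dir": None,
            "batch_size": 1,
        }
    },
}


def with_model_dir(config: dict, model_dir: str) -> dict:
    """Return a copy of the config whose model_dir points at fine-tuned weights."""
    updated = copy.deepcopy(config)  # leave the base configuration untouched
    updated["SubModules"]["TSClassification"]["model_dir"] = model_dir
    return updated


# Hypothetical path to the fine-tuned weights.
config = with_model_dir(BASE_CONFIG, "./output/best_model/inference")

# Assumed usage (requires PaddleX; not run in this sketch):
# from paddlex import create_pipeline
# pipeline = create_pipeline(config=config)
```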
5. Multi-Hardware Support
PaddleX supports a variety of mainstream hardware devices, including NVIDIA GPU, Kunlunxin XPU, Ascend NPU, and Cambricon MLU. Simply modify the --device parameter to seamlessly switch between different hardware.
For example, if you are using Ascend NPU for inference in the time-series classification pipeline, the Python command is as follows:
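A minimal Python sketch of switching devices, reusing the script from Section 2.2.2 with device="npu:0" (the device string format given in the create_pipeline parameter table). The helpers device_string and run_on are hypothetical names introduced for illustration. Running run_on requires PaddleX and the corresponding hardware.

```python
def device_string(kind: str, index: int = 0) -> str:
    """Build a device string such as "gpu:0" or "npu:0"; CPU takes no index."""
    kind = kind.lower()
    if kind == "cpu":
        return "cpu"
    return f"{kind}:{index}"


def run_on(device: str, input_path: str = "ts_cls.csv") -> None:
    """Run the pipeline on the given device (requires PaddleX and the hardware)."""
    # Imported lazily so that device_string stays usable without PaddleX.
    from paddlex import create_pipeline

    pipeline = create_pipeline(pipeline="ts_classification", device=device)
    for res in pipeline.predict(input_path):
        res.print()
        res.save_to_json(save_path="./output/")


# Example: run_on(device_string("npu"))  # equivalent to device="npu:0"
```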
If you want to use the general time-series classification pipeline on a wider range of hardware devices, please refer to the PaddleX Multi-Hardware Usage Guide.