Time Series Forecasting Pipeline Tutorial¶
1. Introduction to the General Time Series Forecasting Pipeline¶
Time series forecasting is a technique that uses historical data to predict future trends by analyzing the patterns of change in time series data. It is widely applied in fields such as financial markets, weather forecasting, and sales prediction. Time series forecasting typically employs statistical methods (e.g., ARIMA) or deep learning models (e.g., LSTM) that capture temporal dependencies in the data to provide accurate predictions, helping decision-makers plan and respond more effectively. This technology plays a crucial role in various industries, including energy management, supply chain optimization, and market analysis.
The General Time Series Forecasting Pipeline includes a time series forecasting module. Choose a model according to your priorities: if you prioritize accuracy, pick a model with lower error; if you prioritize inference speed, pick a faster model; if you prioritize storage, pick a model with a smaller storage footprint.
| Model Name | Model Download Link | MSE | MAE | Model Storage Size |
|---|---|---|---|---|
| DLinear | Inference Model/Trained Model | 0.382 | 0.394 | 72K |
| NLinear | Inference Model/Trained Model | 0.386 | 0.392 | 40K |
| Nonstationary | Inference Model/Trained Model | 0.600 | 0.515 | 55.5M |
| PatchTST | Inference Model/Trained Model | 0.385 | 0.397 | 2.0M |
| RLinear | Inference Model/Trained Model | 0.384 | 0.392 | 40K |
| TiDE | Inference Model/Trained Model | 0.405 | 0.412 | 31.7M |
| TimesNet | Inference Model/Trained Model | 0.417 | 0.431 | 4.9M |
Test Environment Description:

- Performance Test Environment:
  - Test Dataset: ETTh1.
  - Hardware Configuration:
    - GPU: NVIDIA Tesla T4
    - CPU: Intel Xeon Gold 6271C @ 2.60GHz
    - Other Environments: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
- Inference Mode Description:

| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 precision / no TRT acceleration | FP32 precision / 8 threads | PaddleInference |
| High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 precision / 8 threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |
2. Quick Start¶
The pre-trained model pipelines provided by PaddleX let you experience their effects quickly. You can experience the General Time Series Forecasting Pipeline online, or locally using the command line or Python.
2.1 Online Experience¶
You can experience the General Time Series Forecasting Pipeline online using the demo provided by the official team.
If you are satisfied with the pipeline's performance, you can directly integrate and deploy it. If not, you can also use your private data to fine-tune the model within the pipeline online.
Note: Due to the close relationship between time series data and scenarios, the official built-in models for online time series tasks are scenario-specific and not universal. Therefore, the experience mode does not support using arbitrary files to experience the effects of the official model solutions. However, after training a model with your own scenario data, you can select your trained model solution and use data from the corresponding scenario for online experience.
2.2 Local Experience¶
Before using the general time-series forecasting pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the PaddleX Local Installation Guide.
2.2.1 Command Line Experience¶
You can quickly experience the time-series forecasting pipeline with a single command. Use the test file, and replace `--input` with your local path for prediction, as shown below.
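A representative invocation, assuming the standard PaddleX CLI form and the `ts_fc.csv` demo file that appears in the output below (flags may vary slightly across PaddleX versions):

```bash
paddlex --pipeline ts_forecast --input ts_fc.csv --device gpu:0 --save_path ./output
```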
The relevant parameter descriptions can be found in the parameter explanation section of 2.2.2 Python Script Integration.
After running, the result will be printed to the terminal as follows:
```
{'input_path': 'ts_fc.csv', 'forecast':                            OT
date
2018-06-26 20:00:00  9.586131
2018-06-26 21:00:00  9.379762
2018-06-26 22:00:00  9.252275
2018-06-26 23:00:00  9.249993
2018-06-27 00:00:00  9.164998
...                       ...
2018-06-30 15:00:00  8.830340
2018-06-30 16:00:00  9.291553
2018-06-30 17:00:00  9.097666
2018-06-30 18:00:00  8.905430
2018-06-30 19:00:00  8.993793

[96 rows x 1 columns]}
```
For the explanation of the running result parameters, you can refer to the result interpretation in 2.2.2 Python Script Integration.
The time-series results are saved under `save_path`.
2.2.2 Python Script Integration¶
The above command line is for quickly experiencing and viewing the results. Generally, in a project, integration through code is often required. You can complete fast inference with the pipeline in just a few lines of code, as follows:
```python
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_forecast")
output = pipeline.predict(input="ts_fc.csv")
for res in output:
    res.print()  ## Print the structured prediction output
    res.save_to_csv(save_path="./output/")  ## Save the result in CSV format
    res.save_to_json(save_path="./output/")  ## Save the result in JSON format
```
In the above Python script, the following steps are executed:
(1) Instantiate the pipeline object using `create_pipeline()`. The specific parameters are described as follows:
| Parameter | Description | Type | Default |
|---|---|---|---|
| `pipeline` | The pipeline name or the path to a pipeline config file. If it is a pipeline name, it must be a pipeline supported by PaddleX. | `str` | `None` |
| `config` | Specific configuration information for the pipeline (if set together with `pipeline`, it takes precedence over `pipeline`, and the pipeline name must match `pipeline`). | `dict[str, Any]` | `None` |
| `device` | The pipeline inference device. Supports specifying a specific GPU card number, such as "gpu:0", a specific card number for other hardware, such as "npu:0", or "cpu" for CPU. | `str` | `None` |
| `use_hpip` | Whether to enable high-performance inference. Only available when the pipeline supports high-performance inference. | `bool` | `False` |
(2) Call the `predict()` method of the `ts_forecast` pipeline object to perform inference. This method returns a `generator`. The parameters of the `predict()` method and their descriptions are as follows:
| Parameter | Description | Type | Options | Default Value |
|---|---|---|---|---|
| `input` | The data to be predicted. Multiple input types are supported; this parameter is required. | `Python Var\|str\|list` | | `None` |
| `device` | The device for pipeline inference. | `str\|None` | | `None` |
(3) Process the prediction results. The prediction result of each sample is of type `dict` and supports operations such as printing, saving to a `csv` file, and saving to a `json` file:
| Method | Description | Parameter | Type | Explanation | Default Value |
|---|---|---|---|---|---|
| `print()` | Print the result to the terminal | `format_json` | `bool` | Whether to format the output content with JSON indentation | `True` |
| | | `indent` | `int` | Specifies the indentation level to beautify the output JSON data and make it more readable. Only effective when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Controls whether non-ASCII characters are escaped to Unicode. When set to `True`, all non-ASCII characters are escaped; `False` keeps the original characters. Only effective when `format_json` is `True` | `False` |
| `save_to_json()` | Save the result as a JSON file | `save_path` | `str` | The file path to save the result. When a directory is specified, the saved file name matches the input file name | `None` |
| | | `indent` | `int` | Specifies the indentation level to beautify the output JSON data and make it more readable. Only effective when `format_json` is `True` | `4` |
| | | `ensure_ascii` | `bool` | Controls whether non-ASCII characters are escaped to Unicode. When set to `True`, all non-ASCII characters are escaped; `False` keeps the original characters. Only effective when `format_json` is `True` | `False` |
| `save_to_csv()` | Save the result as a CSV file | `save_path` | `str` | The file path to save the result. Supports both directory and file paths | `None` |
- Calling the `print()` method will print the result to the terminal. The printed content is explained as follows:
  - `input_path`: `(str)` The input path of the time-series file to be predicted.
  - `forecast`: `(Pandas.DataFrame)` The time-series prediction result, including future time points and the corresponding predicted values.
- Calling the `save_to_json()` method will save the above content to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_ts_basename}_res.json`; if a file is specified, the result will be saved directly to that file. Since JSON files do not support saving NumPy arrays, `numpy.array` types will be converted to lists.
- Calling the `save_to_csv()` method will save the result to the specified `save_path`. If a directory is specified, the saved path will be `save_path/{your_ts_basename}_res.csv`; if a file is specified, the result will be saved directly to that file.
- In addition, prediction results can be obtained in different formats through attributes, as follows:

| Attribute | Description |
|---|---|
| `json` | Get the prediction result in JSON format |
| `csv` | Get the prediction result in CSV format |

- The prediction result obtained through the `json` attribute is of type `dict`, and its content is consistent with the result saved by the `save_to_json()` method.
- The `csv` attribute returns a `Pandas.DataFrame` containing the time-series prediction results.
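As a short usage sketch based on the attribute table above (assuming the same `ts_fc.csv` demo input as earlier):

```python
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_forecast")
for res in pipeline.predict(input="ts_fc.csv"):
    df = res.csv    # Pandas.DataFrame with the forecast, per the table above
    data = res.json # dict, consistent with what save_to_json() writes
    print(df.head())
```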
In addition, you can obtain the ts_forecast pipeline configuration file and load it for prediction. You can execute the following command to save the configuration file in `my_path`:
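A representative command, assuming the standard PaddleX CLI flag for exporting pipeline configs:

```bash
paddlex --get_pipeline_config ts_forecast --save_path ./my_path
```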
If you have obtained the configuration file, you can customize the settings of the time-series forecasting pipeline by simply modifying the `pipeline` parameter value in the `create_pipeline` method to the path of the pipeline configuration file. For example, if your configuration file is saved at `./my_path/ts_forecast.yaml`, you only need to execute:
```python
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="./my_path/ts_forecast.yaml")
output = pipeline.predict("ts_fc.csv")
for res in output:
    res.print()  ## Print the structured prediction output
    res.save_to_csv("./output/")  ## Save the result in CSV format
    res.save_to_json("./output/")  ## Save the result in JSON format
```
Note: The parameters in the configuration file are the pipeline initialization parameters. If you wish to change the initialization parameters of the ts_forecast pipeline, you can directly modify the parameters in the configuration file and load it for prediction. In addition, CLI prediction also supports passing in a configuration file; simply specify the path to the configuration file with `--pipeline`.
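For instance, a sketch of the CLI form under the same assumptions as the command-line example in section 2.2.1:

```bash
paddlex --pipeline ./my_path/ts_forecast.yaml --input ts_fc.csv --device gpu:0
```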
3. Development Integration/Deployment¶
If the pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.
If you need to integrate the pipeline directly into your Python project, you can refer to the example code in 2.2.2 Python Script Integration.
In addition, PaddleX also provides three other deployment methods, which are detailed as follows:
🚀 High-Performance Inference: In practical production environments, many applications have strict performance requirements for deployment strategies, especially in terms of response speed, to ensure the efficient operation of the system and a smooth user experience. To this end, PaddleX provides a high-performance inference plugin, which aims to deeply optimize the performance of model inference and pre/post-processing to significantly speed up the end-to-end process. For detailed information on high-performance inference, please refer to the PaddleX High-Performance Inference Guide.
☁️ Service-Oriented Deployment: Service-oriented deployment is a common form of deployment in practical production environments. By encapsulating the inference functionality into a service, clients can access these services via network requests to obtain inference results. PaddleX supports multiple service-oriented deployment solutions for production lines. For detailed information on service-oriented deployment, please refer to the PaddleX Service-Oriented Deployment Guide.
Below are the API references for basic service-oriented deployment and examples of multi-language service calls:
API Reference
For the main operations provided by the service:
- The HTTP request method is POST.
- Both the request body and the response body are JSON data (JSON objects).
- When the request is processed successfully, the response status code is `200`, and the attributes of the response body are as follows:
| Name | Type | Description |
|---|---|---|
| `logId` | `string` | The UUID of the request. |
| `errorCode` | `integer` | Error code. Fixed at `0`. |
| `errorMsg` | `string` | Error message. Fixed at `"Success"`. |
| `result` | `object` | The result of the operation. |
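For illustration, a successful response body therefore has the following shape (the `logId` below is a placeholder, and the content of `result` depends on the operation):

```json
{
  "logId": "123e4567-e89b-12d3-a456-426614174000",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {}
}
```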
- When the request is not processed successfully, the attributes of the response body are as follows:

| Name | Type | Description |
|---|---|---|
| `logId` | `string` | The UUID of the request. |
| `errorCode` | `integer` | Error code. Same as the response status code. |
| `errorMsg` | `string` | Error message. |
The main operations provided by the service are as follows:
infer
Perform time-series forecasting.
POST /time-series-forecasting
- The attributes of the request body are as follows:

| Name | Type | Description | Required |
|---|---|---|---|
| `csv` | `string` | The URL of a CSV file accessible to the server, or the Base64-encoded content of a CSV file. The CSV file must be encoded in UTF-8. | Yes |
- When the request is processed successfully, the `result` in the response body has the following attributes:

| Name | Type | Description |
|---|---|---|
| `csv` | `string` | The time-series forecasting result in CSV format, encoded in UTF-8 and then Base64. |
An example of `result` is as follows:

```json
{
  "csv": "xxxxxx"
}
```
Multi-Language Service Call Examples
Python
```python
import base64
import requests

API_URL = "http://localhost:8080/time-series-forecasting"  # Service URL
csv_path = "./test.csv"
output_csv_path = "./out.csv"

# Encode the local CSV file using Base64
with open(csv_path, "rb") as file:
    csv_bytes = file.read()
    csv_data = base64.b64encode(csv_bytes).decode("ascii")

payload = {"csv": csv_data}

# Call the API
response = requests.post(API_URL, json=payload)

# Process the returned data
assert response.status_code == 200
result = response.json()["result"]
with open(output_csv_path, "wb") as f:
    f.write(base64.b64decode(result["csv"]))
print(f"Output time-series data saved at {output_csv_path}")
```
C++
```cpp
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost:8080");
    const std::string csvPath = "./test.csv";
    const std::string outputCsvPath = "./out.csv";

    httplib::Headers headers = {
        {"Content-Type", "application/json"}
    };

    // Encode the CSV file using Base64
    std::ifstream file(csvPath, std::ios::binary | std::ios::ate);
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }
    std::string bufferStr(buffer.data(), buffer.size());
    std::string encodedCsv = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["csv"] = encodedCsv;
    std::string body = jsonObj.dump();

    // Call the API
    auto response = client.Post("/time-series-forecasting", headers, body, "application/json");

    // Process the returned data
    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];

        // Decode and save the data
        std::string decodedString = base64::from_base64(result["csv"].get<std::string>());
        std::ofstream outputCsv(outputCsvPath, std::ios::binary | std::ios::out);
        if (outputCsv.is_open()) {
            outputCsv.write(decodedString.data(), decodedString.size());
            outputCsv.close();
            std::cout << "Output time-series data saved at " << outputCsvPath << std::endl;
        } else {
            std::cerr << "Unable to open file for writing: " << outputCsvPath << std::endl;
        }
    } else {
        std::cout << "Failed to send HTTP request." << std::endl;
        if (response) {
            std::cout << response->body << std::endl;
        }
        return 1;
    }
    return 0;
}
```
Java
```java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/time-series-forecasting";
        String csvPath = "./test.csv";
        String outputCsvPath = "./out.csv";

        // Encode the local CSV file using Base64
        File file = new File(csvPath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String csvData = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode params = objectMapper.createObjectNode();
        params.put("csv", csvData);

        // Create an OkHttpClient instance
        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.Companion.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.Companion.create(params.toString(), JSON);
        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        // Call the API and process the response data
        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode resultNode = objectMapper.readTree(responseBody);
                JsonNode result = resultNode.get("result");

                // Save the returned data
                String base64Csv = result.get("csv").asText();
                byte[] csvBytes = Base64.getDecoder().decode(base64Csv);
                try (FileOutputStream fos = new FileOutputStream(outputCsvPath)) {
                    fos.write(csvBytes);
                }
                System.out.println("Output time-series data saved at " + outputCsvPath);
            } else {
                System.err.println("Request failed with code: " + response.code());
            }
        }
    }
}
```
Go
```go
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    API_URL := "http://localhost:8080/time-series-forecasting"
    csvPath := "./test.csv"
    outputCsvPath := "./out.csv"

    // Read the CSV file and encode it in Base64
    csvBytes, err := ioutil.ReadFile(csvPath)
    if err != nil {
        fmt.Println("Error reading csv file:", err)
        return
    }
    csvData := base64.StdEncoding.EncodeToString(csvBytes)

    payload := map[string]string{"csv": csvData} // Base64-encoded file content
    payloadBytes, err := json.Marshal(payload)
    if err != nil {
        fmt.Println("Error marshaling payload:", err)
        return
    }

    // Call the API
    client := &http.Client{}
    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
    if err != nil {
        fmt.Println("Error creating request:", err)
        return
    }
    req.Header.Set("Content-Type", "application/json")

    res, err := client.Do(req)
    if err != nil {
        fmt.Println("Error sending request:", err)
        return
    }
    defer res.Body.Close()

    // Process the response data
    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        fmt.Println("Error reading response body:", err)
        return
    }
    type Response struct {
        Result struct {
            Csv string `json:"csv"`
        } `json:"result"`
    }
    var respData Response
    err = json.Unmarshal(body, &respData)
    if err != nil {
        fmt.Println("Error unmarshaling response body:", err)
        return
    }

    // Decode the Base64-encoded CSV data and save it as a file
    outputCsvData, err := base64.StdEncoding.DecodeString(respData.Result.Csv)
    if err != nil {
        fmt.Println("Error decoding base64 csv data:", err)
        return
    }
    err = ioutil.WriteFile(outputCsvPath, outputCsvData, 0644)
    if err != nil {
        fmt.Println("Error writing csv to file:", err)
        return
    }
    fmt.Printf("Output time-series data saved at %s\n", outputCsvPath)
}
```
C#
```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/time-series-forecasting";
    static readonly string csvPath = "./test.csv";
    static readonly string outputCsvPath = "./out.csv";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        // Encode the local CSV file using Base64
        byte[] csvBytes = File.ReadAllBytes(csvPath);
        string csvData = Convert.ToBase64String(csvBytes);

        var payload = new JObject{ { "csv", csvData } }; // Base64-encoded file content
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        // Call the API
        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        // Process the returned data
        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        // Save the CSV file
        string base64Csv = jsonResponse["result"]["csv"].ToString();
        byte[] outputCsvBytes = Convert.FromBase64String(base64Csv);
        File.WriteAllBytes(outputCsvPath, outputCsvBytes);
        Console.WriteLine($"Output time-series data saved at {outputCsvPath}");
    }
}
```
Node.js
```javascript
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/time-series-forecasting';
const csvPath = "./test.csv";
const outputCsvPath = "./out.csv";

// Read the CSV file and convert it to Base64
function encodeFileToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

let config = {
  method: 'POST',
  maxBodyLength: Infinity,
  url: API_URL,
  data: JSON.stringify({
    'csv': encodeFileToBase64(csvPath) // Base64-encoded file content
  })
};

axios.request(config)
  .then((response) => {
    const result = response.data["result"];

    // Save the CSV file
    const csvBuffer = Buffer.from(result["csv"], 'base64');
    fs.writeFile(outputCsvPath, csvBuffer, (err) => {
      if (err) throw err;
      console.log(`Output time-series data saved at ${outputCsvPath}`);
    });
  })
  .catch((error) => {
    console.log(error);
  });
```
PHP
```php
<?php
$API_URL = "http://localhost:8080/time-series-forecasting"; // Service URL
$csv_path = "./test.csv";
$output_csv_path = "./out.csv";

// Encode the local CSV file in Base64
$csv_data = base64_encode(file_get_contents($csv_path));
$payload = array("csv" => $csv_data); // Base64-encoded file content

// Call the API
$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

// Process the response data
$result = json_decode($response, true)["result"];
file_put_contents($output_csv_path, base64_decode($result["csv"]));
echo "Output time-series data saved at " . $output_csv_path . "\n";
?>
```
📱 Edge Deployment: Edge deployment is a method of placing computing and data processing capabilities directly on the user's device, allowing the device to process data without relying on remote servers. PaddleX supports deploying models on edge devices such as Android. For detailed instructions, please refer to the PaddleX Edge Deployment Guide. You can choose the appropriate deployment method based on your needs to integrate the model pipeline into subsequent AI applications.
4. Custom Development¶
If the default model weights provided by the time-series forecasting pipeline are not satisfactory in terms of accuracy or speed for your specific scenario, you can attempt to further fine-tune the existing models using your own domain-specific or application data to improve the performance of the time-series forecasting pipeline in your scenario.
4.1 Model Fine-Tuning¶
Since the general time-series forecasting pipeline includes a time-series forecasting module, if the pipeline's performance does not meet expectations, you need to refer to the Custom Development section in the Time-Series Forecasting Module Development Tutorial to fine-tune the time-series forecasting model using your private dataset.
4.2 Model Application¶
After completing fine-tuning with your private dataset, you will obtain the local model weight file.
If you need to use the fine-tuned model weights, simply modify the pipeline configuration file by filling in the local path of the fine-tuned model weights for `model_dir`:
```yaml
pipeline_name: ts_forecast

SubModules:
  TSForecast:
    module_name: ts_forecast
    model_name: DLinear
    model_dir: null # Can be modified to the local path of the fine-tuned model
    batch_size: 1
```
Subsequently, refer to the command line method or the Python script method in the local experience section to load the modified pipeline configuration file.
5. Multi-Hardware Support¶
PaddleX supports a variety of mainstream hardware devices, including NVIDIA GPU, Kunlunxin XPU, Ascend NPU, and Cambricon MLU. Simply modify the `--device` parameter to switch seamlessly between hardware devices.
For example, if you use an Ascend NPU for inference in the time-series forecasting pipeline, you only need to change the device setting, as shown below.
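A minimal sketch, reusing the `device` parameter of `create_pipeline()` documented in section 2.2.2 (the demo file name is the same as in the examples above):

```python
from paddlex import create_pipeline

# Only the device string changes; "npu:0" selects the first Ascend NPU card.
pipeline = create_pipeline(pipeline="ts_forecast", device="npu:0")
output = pipeline.predict(input="ts_fc.csv")
for res in output:
    res.print()
    res.save_to_csv(save_path="./output/")
```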
If you want to use the General Time-Series Forecasting Pipeline on a wider range of hardware, please refer to the PaddleX Multi-Device Usage Guide.