Time Series Forecasting Pipeline Tutorial

1. Introduction to the General Time Series Forecasting Pipeline

Time series forecasting is a technique that utilizes historical data to predict future trends by analyzing the patterns of change in time series data. It is widely applied in fields such as financial markets, weather forecasting, and sales prediction. Time series forecasting often employs statistical methods or deep learning models (e.g., LSTM, ARIMA), capable of handling temporal dependencies in data to provide accurate predictions, assisting decision-makers in better planning and response. This technology plays a crucial role in various industries, including energy management, supply chain optimization, and market analysis.

The General Time Series Forecasting Pipeline includes a single time series forecasting module. Choose a model from the table below according to what you prioritize: prediction accuracy, inference speed, or model storage size.

| Model Name | Model Download Link | MSE | MAE | Model Storage Size |
|---|---|---|---|---|
| DLinear | Inference Model / Trained Model | 0.382 | 0.394 | 72 K |
| NLinear | Inference Model / Trained Model | 0.386 | 0.392 | 40 K |
| Nonstationary | Inference Model / Trained Model | 0.600 | 0.515 | 55.5 M |
| PatchTST | Inference Model / Trained Model | 0.385 | 0.397 | 2.0 M |
| RLinear | Inference Model / Trained Model | 0.384 | 0.392 | 40 K |
| TiDE | Inference Model / Trained Model | 0.405 | 0.412 | 31.7 M |
| TimesNet | Inference Model / Trained Model | 0.417 | 0.431 | 4.9 M |

Test Environment Description:

  • Performance Test Environment:
    • Test Dataset: ETTH1.
    • Hardware Configuration:
      • GPU: NVIDIA Tesla T4
      • CPU: Intel Xeon Gold 6271C @ 2.60GHz
      • Other Environments: Ubuntu 20.04 / cuDNN 8.6 / TensorRT 8.5.2.2
  • Inference Mode Description:

| Mode | GPU Configuration | CPU Configuration | Acceleration Technology Combination |
|---|---|---|---|
| Normal Mode | FP32 precision / no TRT acceleration | FP32 precision / 8 threads | PaddleInference |
| High-Performance Mode | Optimal combination of pre-selected precision types and acceleration strategies | FP32 precision / 8 threads | Pre-selected optimal backend (Paddle/OpenVINO/TRT, etc.) |

2. Quick Start

The pre-trained model pipelines provided by PaddleX let you quickly try out their effects. You can experience the General Time Series Forecasting Pipeline online, or locally using the command line or Python.

2.1 Online Experience

You can experience the General Time Series Forecasting Pipeline online using the demo provided by the official team.

If you are satisfied with the pipeline's performance, you can directly integrate and deploy it. If not, you can also use your private data to fine-tune the model within the pipeline online.

Note: Due to the close relationship between time series data and scenarios, the official built-in models for online time series tasks are scenario-specific and not universal. Therefore, the experience mode does not support using arbitrary files to experience the effects of the official model solutions. However, after training a model with your own scenario data, you can select your trained model solution and use data from the corresponding scenario for online experience.

2.2 Local Experience

Before using the general time-series forecasting pipeline locally, please ensure that you have completed the installation of the PaddleX wheel package according to the PaddleX Local Installation Guide.

2.2.1 Command Line Experience

You can quickly experience the time-series forecasting pipeline with a single command. Use the test file, replacing --input with your own local path if desired.

paddlex --pipeline ts_forecast --input ts_fc.csv --device gpu:0 --save_path ./output

The relevant parameter descriptions can be found in the parameter explanation section of 2.2.2 Python Script Integration.

After running, the result will be printed to the terminal as follows:

{'input_path': 'ts_fc.csv', 'forecast':                            OT
date
2018-06-26 20:00:00  9.586131
2018-06-26 21:00:00  9.379762
2018-06-26 22:00:00  9.252275
2018-06-26 23:00:00  9.249993
2018-06-27 00:00:00  9.164998
...                       ...
2018-06-30 15:00:00  8.830340
2018-06-30 16:00:00  9.291553
2018-06-30 17:00:00  9.097666
2018-06-30 18:00:00  8.905430
2018-06-30 19:00:00  8.993793

[96 rows x 1 columns]}

For the explanation of the running result parameters, you can refer to the result interpretation in 2.2.2 Python Script Integration.

The time-series file results are saved under save_path.

2.2.2 Python Script Integration

The above command line is for quickly experiencing and viewing the results. Generally, in a project, it is often necessary to integrate through code. You can complete fast inference of the pipeline with just a few lines of code. The inference code is as follows:

from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_forecast")

output = pipeline.predict(input="ts_fc.csv")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_csv(save_path="./output/") ## Save results in CSV format
    res.save_to_json(save_path="./output/") ## Save results in JSON format

In the above Python script, the following steps are executed:

(1) Instantiate the pipeline object using create_pipeline(). The specific parameters are described as follows:

| Parameter | Description | Type | Default |
|---|---|---|---|
| pipeline | Pipeline name or path to a pipeline config file. If set to a name, it must be a pipeline supported by PaddleX. | str | None |
| config | Specific configuration information for the pipeline. If set together with pipeline, it takes precedence over pipeline, and the pipeline name in it must match pipeline. | dict[str, Any] | None |
| device | Pipeline inference device. Supports specifying a specific card number, e.g., "gpu:0" for a GPU, "npu:0" for other hardware, or "cpu" for CPU. | str | None |
| use_hpip | Whether to enable high-performance inference. Only available when the pipeline supports it. | bool | False |
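
For instance, a minimal sketch combining these parameters (the device value is illustrative; substitute "cpu" or another device you actually have):

from paddlex import create_pipeline

# A minimal sketch: pin the pipeline to a specific device at creation time.
# "gpu:0" is illustrative; use "cpu" if no GPU is available.
pipeline = create_pipeline(
    pipeline="ts_forecast",
    device="gpu:0",
)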

(2) Call the predict() method of the ts_forecast pipeline object to perform inference and prediction. This method returns a generator. The parameters and their descriptions for the predict() method are as follows:

| Parameter | Description | Type | Options | Default |
|---|---|---|---|---|
| input | The data to be predicted. Supports multiple input types; required. | Python Var, str, or list | Python Var: time-series data as a pandas.DataFrame. str: local path to a time-series file (e.g., /root/data/ts.csv), a URL to a time-series file (Example), or a local directory containing the files to predict (e.g., /root/data/). list: elements of the above types, e.g., [pandas.DataFrame, pandas.DataFrame], ["/root/data/ts1.csv", "/root/data/ts2.csv"], ["/root/data1", "/root/data2"]. | None |
| device | The device for pipeline inference. | str or None | cpu: CPU inference. gpu:0: the first GPU. npu:0: the first NPU. xpu:0: the first XPU. mlu:0: the first MLU. dcu:0: the first DCU. None: use the default set during pipeline initialization, which prioritizes local GPU 0 and falls back to CPU if unavailable. | None |
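
The following sketch illustrates the documented input types (it assumes ts_fc.csv follows the column schema the model expects):

import pandas as pd
from paddlex import create_pipeline

pipeline = create_pipeline(pipeline="ts_forecast")

# str: a local CSV file path
output = pipeline.predict(input="ts_fc.csv")

# Python Var: time-series data as a pandas.DataFrame
df = pd.read_csv("ts_fc.csv")
output = pipeline.predict(input=df)

# list: a batch of inputs of the above types
output = pipeline.predict(input=[df, df])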

(3) Process the prediction results. Each sample's prediction result is of type dict and supports operations such as printing, saving to a csv file, and saving to a json file:

| Method | Description | Parameter | Type | Explanation | Default |
|---|---|---|---|---|---|
| print() | Print the result to the terminal | format_json | bool | Whether to format the output with JSON indentation | True |
| | | indent | int | Indentation level used to beautify the JSON output, making it more readable; effective only when format_json is True | 4 |
| | | ensure_ascii | bool | Whether to escape non-ASCII characters to Unicode (True escapes all non-ASCII characters; False retains the original characters); effective only when format_json is True | False |
| save_to_json() | Save the result as a JSON file | save_path | str | The file path to save the result. When a directory is specified, the saved file name matches the input file name | None |
| | | indent | int | Indentation level used to beautify the JSON output; effective only when format_json is True | 4 |
| | | ensure_ascii | bool | Whether to escape non-ASCII characters to Unicode; effective only when format_json is True | False |
| save_to_csv() | Save the result as a CSV file | save_path | str | The file path to save the result; supports both directory and file paths | None |
  • Calling the print() method will print the result to the terminal. The printed content is explained as follows:

    • input_path: (str) The input path of the time-series file to be predicted.

    • forecast: (Pandas.DataFrame) The time-series prediction result, including future time points and corresponding predicted values.

  • Calling the save_to_json() method will save the above content to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.json. If a file is specified, the result will be saved directly to that file. Since JSON files do not support saving NumPy arrays, numpy.array types will be converted to lists.

  • Calling the save_to_csv() method will save the result to the specified save_path. If a directory is specified, the saved path will be save_path/{your_ts_basename}_res.csv. If a file is specified, the result will be saved directly to that file.

  • In addition, prediction results can be obtained in different formats through attributes, as follows:

| Attribute | Description |
|---|---|
| json | Get the prediction result in JSON format |
| csv | Get the prediction result in CSV format |
  • The prediction result obtained through the json attribute is of type dict, and its content is consistent with the result saved by the save_to_json() method.
  • The csv attribute returns a Pandas.DataFrame type data, which contains the time-series prediction results.
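
A short sketch of reading both attributes from a prediction result:

for res in output:
    data = res.json  # dict, same content as save_to_json()
    df = res.csv     # pandas.DataFrame containing the forecast
    print(df.head())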

In addition, you can obtain the ts_forecast pipeline configuration file and load it for prediction. You can execute the following command to save the result in my_path:

paddlex --get_pipeline_config ts_forecast --save_path ./my_path

If you have obtained the configuration file, you can customize the settings for the time-series forecasting pipeline by simply modifying the pipeline parameter value in the create_pipeline method to the path of the pipeline configuration file.

For example, if your configuration file is saved at ./my_path/ts_forecast.yaml, you only need to execute:

from paddlex import create_pipeline
pipeline = create_pipeline(pipeline="./my_path/ts_forecast.yaml")
output = pipeline.predict("ts_fc.csv")
for res in output:
    res.print() ## Print the structured prediction output
    res.save_to_csv("./output/") ## Save results in CSV format
    res.save_to_json("./output/") ## Save results in JSON format

Note: The parameters in the configuration file are the pipeline initialization parameters. If you wish to change the initialization parameters of the ts_forecast pipeline, you can directly modify the parameters in the configuration file and load it for prediction. Additionally, CLI prediction also supports passing in a configuration file; simply specify its path with --pipeline, as shown in the sketch below.
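
For example, a sketch of CLI prediction with a configuration file (assuming the config was saved to ./my_path as above):

paddlex --pipeline ./my_path/ts_forecast.yaml --input ts_fc.csv --device gpu:0 --save_path ./output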

3. Development Integration/Deployment

If the pipeline meets your requirements for inference speed and accuracy, you can proceed directly with development integration/deployment.

If you need to integrate the pipeline directly into your Python project, you can refer to the example code in 2.2.2 Python Script Integration.

In addition, PaddleX also provides three other deployment methods, which are detailed as follows:

🚀 High-Performance Inference: In practical production environments, many applications have strict performance requirements for deployment strategies, especially in terms of response speed, to ensure the efficient operation of the system and a smooth user experience. To this end, PaddleX provides a high-performance inference plugin, which aims to deeply optimize the performance of model inference and pre/post-processing to significantly speed up the end-to-end process. For detailed information on high-performance inference, please refer to the PaddleX High-Performance Inference Guide.

☁️ Service-Oriented Deployment: Service-oriented deployment is a common form of deployment in practical production environments. By encapsulating the inference functionality into a service, clients can access these services via network requests to obtain inference results. PaddleX supports multiple service-oriented deployment solutions for production lines. For detailed information on service-oriented deployment, please refer to the PaddleX Service-Oriented Deployment Guide.

Below are the API references for basic service-oriented deployment and examples of multi-language service calls:

API Reference

For the main operations provided by the service:

  • The HTTP request method is POST.
  • Both the request body and the response body are JSON data (JSON objects).
  • When the request is processed successfully, the response status code is 200, and the attributes of the response body are as follows:

| Name | Type | Description |
|---|---|---|
| logId | string | The UUID of the request. |
| errorCode | integer | Error code. Fixed at 0. |
| errorMsg | string | Error message. Fixed as "Success". |
| result | object | The result of the operation. |

  • When the request is not processed successfully, the attributes of the response body are as follows:

| Name | Type | Description |
|---|---|---|
| logId | string | The UUID of the request. |
| errorCode | integer | Error code. Same as the response status code. |
| errorMsg | string | Error message. |
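
Given this schema, a client can branch on errorCode in addition to the HTTP status. A minimal sketch in Python (the payload value is a placeholder; see the full example below for building it):

import requests

payload = {"csv": "..."}  # placeholder: Base64-encoded CSV content goes here
resp = requests.post("http://localhost:8080/time-series-forecasting", json=payload)
data = resp.json()
if resp.status_code == 200 and data["errorCode"] == 0:
    result = data["result"]  # the operation result
else:
    print("Request failed:", data["errorMsg"])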

The main operations provided by the service are as follows:

  • infer

Perform time-series forecasting.

POST /time-series-forecasting

  • The attributes of the request body are as follows:

| Name | Type | Description | Required |
|---|---|---|---|
| csv | string | The URL of a CSV file accessible by the server, or the Base64-encoded content of a CSV file. The CSV file must be encoded in UTF-8. | Yes |

  • When the request is processed successfully, the result in the response body has the following attributes:

| Name | Type | Description |
|---|---|---|
| csv | string | The time-series forecasting result in CSV format, UTF-8 encoded and then Base64 encoded. |

An example of result is as follows:

{
"csv": "xxxxxx"
}
Multi-Language Service Call Examples
Python
import base64
import requests

API_URL = "http://localhost:8080/time-series-forecasting"  # Service URL
csv_path = "./test.csv"
output_csv_path = "./out.csv"

# Encode the local CSV file using Base64
with open(csv_path, "rb") as file:
    csv_bytes = file.read()
    csv_data = base64.b64encode(csv_bytes).decode("ascii")

payload = {"csv": csv_data}

# Call the API
response = requests.post(API_URL, json=payload)

# Process the returned data
assert response.status_code == 200
result = response.json()["result"]
with open(output_csv_path, "wb") as f:
    f.write(base64.b64decode(result["csv"]))
print(f"Output time-series data saved at {output_csv_path}")
C++
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include "cpp-httplib/httplib.h" // https://github.com/Huiyicc/cpp-httplib
#include "nlohmann/json.hpp" // https://github.com/nlohmann/json
#include "base64.hpp" // https://github.com/tobiaslocker/base64

int main() {
    httplib::Client client("localhost:8080");
    const std::string csvPath = "./test.csv";
    const std::string outputCsvPath = "./out.csv";

    httplib::Headers headers = {
        {"Content-Type", "application/json"}
    };

    // Encode the CSV file using Base64
    std::ifstream file(csvPath, std::ios::binary | std::ios::ate);
    std::streamsize size = file.tellg();
    file.seekg(0, std::ios::beg);

    std::vector<char> buffer(size);
    if (!file.read(buffer.data(), size)) {
        std::cerr << "Error reading file." << std::endl;
        return 1;
    }
    std::string bufferStr(buffer.data(), buffer.size());
    std::string encodedCsv = base64::to_base64(bufferStr);

    nlohmann::json jsonObj;
    jsonObj["csv"] = encodedCsv;
    std::string body = jsonObj.dump();

    // Call the API
    auto response = client.Post("/time-series-forecasting", headers, body, "application/json");
    // Process the returned data
    if (response && response->status == 200) {
        nlohmann::json jsonResponse = nlohmann::json::parse(response->body);
        auto result = jsonResponse["result"];

        // Save the data
        std::string decodedString;
        encodedCsv = result["csv"];
        decodedString = base64::from_base64(encodedCsv);
        std::vector<unsigned char> decodedCsv(decodedString.begin(), decodedString.end());
        std::ofstream outputCsv(outputCsvPath, std::ios::binary | std::ios::out);
        if (outputCsv.is_open()) {
            outputCsv.write(reinterpret_cast<char*>(decodedCsv.data()), decodedCsv.size());
            outputCsv.close();
            std::cout << "Output time-series data saved at " << outputCsvPath << std::endl;
        } else {
            std::cerr << "Unable to open file for writing: " << outputCsvPath << std::endl;
        }
    } else {
        std::cout << "Failed to send HTTP request." << std::endl;
        if (response) {
            std::cout << response->body << std::endl;
        }
        return 1;
    }

    return 0;
}
Java
import okhttp3.*;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Base64;

public class Main {
    public static void main(String[] args) throws IOException {
        String API_URL = "http://localhost:8080/time-series-forecasting";
        String csvPath = "./test.csv";
        String outputCsvPath = "./out.csv";

        // Encode the local CSV file using Base64
        File file = new File(csvPath);
        byte[] fileContent = java.nio.file.Files.readAllBytes(file.toPath());
        String csvData = Base64.getEncoder().encodeToString(fileContent);

        ObjectMapper objectMapper = new ObjectMapper();
        ObjectNode params = objectMapper.createObjectNode();
        params.put("csv", csvData);

        // Create an OkHttpClient instance
        OkHttpClient client = new OkHttpClient();
        MediaType JSON = MediaType.Companion.get("application/json; charset=utf-8");
        RequestBody body = RequestBody.Companion.create(params.toString(), JSON);
        Request request = new Request.Builder()
                .url(API_URL)
                .post(body)
                .build();

        // Call the API and process the response data
        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful()) {
                String responseBody = response.body().string();
                JsonNode resultNode = objectMapper.readTree(responseBody);
                JsonNode result = resultNode.get("result");

                // Save the returned data
                String base64Csv = result.get("csv").asText();
                byte[] csvBytes = Base64.getDecoder().decode(base64Csv);
                try (FileOutputStream fos = new FileOutputStream(outputCsvPath)) {
                    fos.write(csvBytes);
                }
                System.out.println("Output time-series data saved at " + outputCsvPath);
            } else {
                System.err.println("Request failed with code: " + response.code());
            }
        }
    }
}
Go
package main

import (
    "bytes"
    "encoding/base64"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    API_URL := "http://localhost:8080/time-series-forecasting"
    csvPath := "./test.csv"
    outputCsvPath := "./out.csv"

    // Read the csv file and encode it in Base64
    csvBytes, err := ioutil.ReadFile(csvPath)
    if err != nil {
        fmt.Println("Error reading csv file:", err)
        return
    }
    csvData := base64.StdEncoding.EncodeToString(csvBytes)

    payload := map[string]string{"csv": csvData} // Base64-encoded file content
    payloadBytes, err := json.Marshal(payload)
    if err != nil {
        fmt.Println("Error marshaling payload:", err)
        return
    }

    // Call the API
    client := &http.Client{}
    req, err := http.NewRequest("POST", API_URL, bytes.NewBuffer(payloadBytes))
    if err != nil {
        fmt.Println("Error creating request:", err)
        return
    }
    req.Header.Set("Content-Type", "application/json")

    res, err := client.Do(req)
    if err != nil {
        fmt.Println("Error sending request:", err)
        return
    }
    defer res.Body.Close()

    // Process the response data
    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        fmt.Println("Error reading response body:", err)
        return
    }
    type Response struct {
        Result struct {
            Csv string `json:"csv"`
        } `json:"result"`
    }
    var respData Response
    err = json.Unmarshal(body, &respData)
    if err != nil {
        fmt.Println("Error unmarshaling response body:", err)
        return
    }

    // Decode the Base64-encoded csv data and save it as a file
    outputCsvData, err := base64.StdEncoding.DecodeString(respData.Result.Csv)
    if err != nil {
        fmt.Println("Error decoding base64 csv data:", err)
        return
    }
    err = ioutil.WriteFile(outputCsvPath, outputCsvData, 0644)
    if err != nil {
        fmt.Println("Error writing csv to file:", err)
        return
    }
    fmt.Printf("Output time-series data saved at %s\n", outputCsvPath)
}
C#
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

class Program
{
    static readonly string API_URL = "http://localhost:8080/time-series-forecasting";
    static readonly string csvPath = "./test.csv";
    static readonly string outputCsvPath = "./out.csv";

    static async Task Main(string[] args)
    {
        var httpClient = new HttpClient();

        // Encode the local CSV file using Base64
        byte[] csvBytes = File.ReadAllBytes(csvPath);
        string csvData = Convert.ToBase64String(csvBytes);

        var payload = new JObject{ { "csv", csvData } }; // Base64-encoded file content
        var content = new StringContent(payload.ToString(), Encoding.UTF8, "application/json");

        // Call the API
        HttpResponseMessage response = await httpClient.PostAsync(API_URL, content);
        response.EnsureSuccessStatusCode();

        // Process the returned data
        string responseBody = await response.Content.ReadAsStringAsync();
        JObject jsonResponse = JObject.Parse(responseBody);

        // Save the CSV file
        string base64Csv = jsonResponse["result"]["csv"].ToString();
        byte[] outputCsvBytes = Convert.FromBase64String(base64Csv);
        File.WriteAllBytes(outputCsvPath, outputCsvBytes);
        Console.WriteLine($"Output time-series data saved at {outputCsvPath}");
    }
}
Node.js
const axios = require('axios');
const fs = require('fs');

const API_URL = 'http://localhost:8080/time-series-forecasting';
const csvPath = "./test.csv";
const outputCsvPath = "./out.csv";

let config = {
   method: 'POST',
   maxBodyLength: Infinity,
   url: API_URL,
   data: JSON.stringify({
    'csv': encodeFileToBase64(csvPath)  // Base64-encoded file content
  })
};

// Read the CSV file and convert it to Base64
function encodeFileToBase64(filePath) {
  const bitmap = fs.readFileSync(filePath);
  return Buffer.from(bitmap).toString('base64');
}

axios.request(config)
.then((response) => {
    const result = response.data["result"];

    // Save the CSV file
    const csvBuffer = Buffer.from(result["csv"], 'base64');
    fs.writeFile(outputCsvPath, csvBuffer, (err) => {
      if (err) throw err;
      console.log(`Output time-series data saved at ${outputCsvPath}`);
    });
})
.catch((error) => {
  console.log(error);
});
PHP
<?php

$API_URL = "http://localhost:8080/time-series-forecasting"; // Service URL
$csv_path = "./test.csv";
$output_csv_path = "./out.csv";

// Encode the local CSV file in Base64
$csv_data = base64_encode(file_get_contents($csv_path));
$payload = array("csv" => $csv_data); // Base64-encoded file content

// Call the API
$ch = curl_init($API_URL);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($payload));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

// Process the response data
$result = json_decode($response, true)["result"];

file_put_contents($output_csv_path, base64_decode($result["csv"]));
echo "Output time-series data saved at " . $output_csv_path . "\n";

?>


📱 Edge Deployment: Edge deployment is a method of placing computing and data processing capabilities directly on the user's device, allowing the device to process data without relying on remote servers. PaddleX supports deploying models on edge devices such as Android. For detailed instructions, please refer to the PaddleX Edge Deployment Guide. You can choose the appropriate deployment method based on your needs to integrate the model pipeline into subsequent AI applications.

4. Custom Development

If the default model weights provided by the time-series forecasting pipeline are not satisfactory in terms of accuracy or speed for your specific scenario, you can attempt to further fine-tune the existing models using your own domain-specific or application data to improve the performance of the time-series forecasting pipeline in your scenario.

4.1 Model Fine-Tuning

Since the general time-series forecasting pipeline includes a time-series forecasting module, if the pipeline's performance does not meet expectations, you need to refer to the Custom Development section in the Time-Series Forecasting Module Development Tutorial to fine-tune the time-series forecasting model using your private dataset.

4.2 Model Application

After completing fine-tuning with your private dataset, you will obtain the local model weight file.

If you need to use the fine-tuned model weights, simply modify the pipeline configuration file by filling in the local path of the fine-tuned model weights in the model_dir field:

pipeline_name: ts_forecast

SubModules:
  TSForecast:
    module_name: ts_forecast
    model_name: DLinear
    model_dir: null # Can be modified to the local path of the fine-tuned model
    batch_size: 1

Subsequently, refer to the command line method or Python script method in the local experience section to load the modified pipeline configuration file, for example:
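
A sketch using the Python API (the config path is illustrative; use wherever you saved your edited file):

from paddlex import create_pipeline

# Load the edited config whose model_dir points at the fine-tuned weights.
pipeline = create_pipeline(pipeline="./my_path/ts_forecast.yaml")
output = pipeline.predict("ts_fc.csv")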

5. Multi-Hardware Support

PaddleX supports a variety of mainstream hardware devices, including NVIDIA GPU, Kunlunxin XPU, Ascend NPU, and Cambricon MLU. Simply modify the --device parameter to seamlessly switch between different hardware devices.

For example, if you are using an Ascend NPU for inference in the time-series forecasting pipeline, the command you would use is:

paddlex --pipeline ts_forecast --input ts_fc.csv --device npu:0

If you want to use the General Time-Series Forecasting Pipeline on a wider range of hardware, please refer to the PaddleX Multi-Device Usage Guide.
