Data.dataset (Dataset) Module
ppsci.data.dataset
IterableNamedArrayDataset
Bases: IterableDataset
IterableNamedArrayDataset for full-data loading.
Parameters:
Name | Type | Description | Default
---|---|---|---
input | Dict[str, ndarray] | Input dict. | required
label | Optional[Dict[str, ndarray]] | Label dict. Defaults to None. | None
weight | Optional[Dict[str, ndarray]] | Weight dict. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import numpy as np
>>> import ppsci
>>> input = {"x": np.random.randn(100, 1)}
>>> label = {"u": np.random.randn(100, 1)}
>>> weight = {"u": np.random.randn(100, 1)}
>>> dataset = ppsci.data.dataset.IterableNamedArrayDataset(input, label, weight)
Source code in ppsci/data/dataset/array_dataset.py
num_samples (property)
Number of samples within the current dataset.
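For intuition, the full-data loading contract can be sketched without ppsci installed: each iteration yields the entire input/label/weight dicts as one batch. The class below is a hypothetical stand-in (name and details are illustrative, not the actual ppsci implementation):

```python
import numpy as np

class FullBatchIterable:
    """Hypothetical sketch of an iterable dataset that yields all data at once."""

    def __init__(self, input, label=None, weight=None):
        self.input, self.label, self.weight = input, label, weight

    @property
    def num_samples(self):
        # all arrays share the same leading (sample) dimension
        return len(next(iter(self.input.values())))

    def __iter__(self):
        # one iteration yields the whole data as a single batch
        yield self.input, self.label, self.weight

input = {"x": np.random.randn(100, 1)}
label = {"u": np.random.randn(100, 1)}
dataset = FullBatchIterable(input, label)
batch_input, batch_label, _ = next(iter(dataset))
print(dataset.num_samples, batch_input["x"].shape)  # 100 (100, 1)
```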
NamedArrayDataset
Bases: Dataset
Class for Named Array Dataset.
Parameters:
Name | Type | Description | Default
---|---|---|---
input | Dict[str, ndarray] | Input dict. | required
label | Optional[Dict[str, ndarray]] | Label dict. Defaults to None. | None
weight | Optional[Dict[str, ndarray]] | Weight dict. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import numpy as np
>>> import ppsci
>>> input = {"x": np.random.randn(100, 1)}
>>> output = {"u": np.random.randn(100, 1)}
>>> weight = {"u": np.random.randn(100, 1)}
>>> dataset = ppsci.data.dataset.NamedArrayDataset(input, output, weight)
Source code in ppsci/data/dataset/array_dataset.py
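Unlike the iterable variant, a map-style dataset is indexed sample by sample. A minimal sketch of that indexing contract (hypothetical class, NumPy only — not the actual ppsci implementation):

```python
import numpy as np

class NamedArraySketch:
    """Hypothetical map-style dataset over dicts of equally sized arrays."""

    def __init__(self, input, label=None, weight=None):
        self.input, self.label, self.weight = input, label or {}, weight or {}

    def __len__(self):
        return len(next(iter(self.input.values())))

    def __getitem__(self, idx):
        # slice every array in every dict at the same sample index
        pick = lambda d: {k: v[idx] for k, v in d.items()}
        return pick(self.input), pick(self.label), pick(self.weight)

dataset = NamedArraySketch({"x": np.arange(6).reshape(3, 2)}, {"u": np.ones(3)})
inp, lab, _ = dataset[1]
print(inp["x"].tolist(), lab["u"])  # [2, 3] 1.0
```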
ChipHeatDataset
Bases: Dataset
ChipHeatDataset for data loading of the multi-branch DeepONet model.
Parameters:
Name | Type | Description | Default
---|---|---|---
input | Dict[str, ndarray] | Input dict. | required
label | Optional[Dict[str, ndarray]] | Label dict. | required
index | tuple[str, ...] | Keys of the input dict. | required
data_type | str | One of the keys of the input dict. | required
weight | Optional[Dict[str, ndarray]] | Weight dict. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import numpy as np
>>> import ppsci
>>> input = {"x": np.random.randn(100, 1)}
>>> label = {"u": np.random.randn(100, 1)}
>>> index = ('x', 'u', 'bc', 'bc_data')
>>> data_type = 'u'
>>> weight = {"u": np.random.randn(100, 1)}
>>> dataset = ppsci.data.dataset.ChipHeatDataset(input, label, index, data_type, weight)
Source code in ppsci/data/dataset/array_dataset.py
CSVDataset
Bases: Dataset
Dataset class for .csv files.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | CSV file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. | required
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys, i.e. {inner_key: outer_key}. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.CSVDataset(
... "/path/to/file.csv",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/csv_dataset.py
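The column-to-dict mapping, including alias_dict renaming, can be pictured with the standard library alone. A sketch under stated assumptions (the helper name and CSV content are illustrative, not ppsci internals):

```python
import csv
import io
import numpy as np

def load_csv(text, input_keys, label_keys, alias_dict=None):
    """Sketch: read CSV columns into input/label dicts of (N, 1) arrays.
    alias_dict maps inner (returned) key -> outer (CSV column) key."""
    alias_dict = alias_dict or {}
    rows = list(csv.DictReader(io.StringIO(text)))
    col = lambda k: np.array([[float(r[alias_dict.get(k, k)])] for r in rows])
    return ({k: col(k) for k in input_keys},
            {k: col(k) for k in label_keys})

text = "x,u_exact\n0.0,1.0\n0.5,2.0\n"
inp, lab = load_csv(text, ("x",), ("u",), alias_dict={"u": "u_exact"})
print(inp["x"].shape, lab["u"][1, 0])  # (2, 1) 2.0
```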
IterableCSVDataset
Bases: IterableDataset
IterableCSVDataset for full-data loading.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | CSV file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. | required
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.IterableCSVDataset(
... "/path/to/file.csv",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/csv_dataset.py
num_samples (property)
Number of samples within the current dataset.
ContinuousNamedArrayDataset
Bases: IterableDataset
ContinuousNamedArrayDataset for iterable sampling.
Parameters:
Name | Type | Description | Default
---|---|---|---
input | Callable | Function that generates the input dict. | required
label | Callable | Function that generates the label dict. | required
weight | Optional[Callable] | Function that generates the weight dict. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> import numpy as np
>>> input = lambda : {"x": np.random.randn(100, 1)}
>>> label = lambda inp: {"u": np.random.randn(100, 1)}
>>> weight = lambda inp, label: {"u": 1 - (label["u"] ** 2)}
>>> dataset = ppsci.data.dataset.ContinuousNamedArrayDataset(input, label, weight)
>>> input_batch, label_batch, weight_batch = next(iter(dataset))
>>> print(input_batch["x"].shape)
[100, 1]
>>> print(label_batch["u"].shape)
[100, 1]
>>> print(weight_batch["u"].shape)
[100, 1]
Source code in ppsci/data/dataset/array_dataset.py
num_samples (property)
Number of samples within the current dataset.
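The signatures in the example above suggest how the three callables chain: input is generated first, then label from the input, then weight from both. A sketch of that chaining (a plain generator, not the actual ppsci implementation):

```python
import numpy as np

def continuous_sampler(input_fn, label_fn, weight_fn=None):
    """Sketch: chain the three callables the way the dataset does --
    input first, then label(input), then weight(input, label)."""
    while True:
        inp = input_fn()
        lab = label_fn(inp)
        wgt = weight_fn(inp, lab) if weight_fn else None
        yield inp, lab, wgt

input_fn = lambda: {"x": np.random.randn(100, 1)}
label_fn = lambda inp: {"u": np.sin(inp["x"])}
weight_fn = lambda inp, lab: {"u": 1 - lab["u"] ** 2}
inp, lab, wgt = next(continuous_sampler(input_fn, label_fn, weight_fn))
print(inp["x"].shape, lab["u"].shape, wgt["u"].shape)  # (100, 1) (100, 1) (100, 1)
```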
ERA5Dataset
Bases: Dataset
Class for the ERA5 dataset.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Dataset path. | required
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
precip_file_path | Optional[str] | Precipitation dataset path. Defaults to None. | None
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
vars_channel | Optional[Tuple[int, ...]] | Variable channel indices in the ERA5 dataset. Defaults to None. | None
num_label_timestamps | int | Number of label timestamps. Defaults to 1. | 1
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
training | bool | Whether in training mode. Defaults to True. | True
stride | int | Stride for sampling data. Defaults to 1. | 1
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.ERA5Dataset(
...     file_path="/path/to/ERA5Dataset",
...     input_keys=("input",),
...     label_keys=("output",),
... )
Source code in ppsci/data/dataset/era5_dataset.py
ERA5SampledDataset
Bases: Dataset
Class for the ERA5 sampled dataset.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Dataset path. | required
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.ERA5SampledDataset(
...     file_path="/path/to/ERA5SampledDataset",
...     input_keys=("input",),
...     label_keys=("output",),
... )
>>> # get the length of the dataset
>>> dataset_size = len(dataset)
>>> # get the first sample of the data
>>> first_sample = dataset[0]
>>> print("First sample:", first_sample)
Source code in ppsci/data/dataset/era5_dataset.py
IterableMatDataset
Bases: IterableDataset
IterableMatDataset for full-data loading.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Mat file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. Defaults to (). | ()
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys, i.e. {inner_key: outer_key}. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.IterableMatDataset(
... "/path/to/file.mat",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/mat_dataset.py
num_samples (property)
Number of samples within the current dataset.
MatDataset
Bases: Dataset
Dataset class for .mat files.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Mat file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. Defaults to (). | ()
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys, i.e. {inner_key: outer_key}. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.MatDataset(
... "/path/to/file.mat",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/mat_dataset.py
IterableNPZDataset
Bases: IterableDataset
IterableNPZDataset for full-data loading.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Npz file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. Defaults to (). | ()
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys, i.e. {inner_key: outer_key}. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.IterableNPZDataset(
... "/path/to/file.npz",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/npz_dataset.py
num_samples (property)
Number of samples within the current dataset.
NPZDataset
Bases: Dataset
Dataset class for .npz files.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Npz file path. | required
input_keys | Tuple[str, ...] | Tuple of input keys. | required
label_keys | Tuple[str, ...] | Tuple of label keys. Defaults to (). | ()
alias_dict | Optional[Dict[str, str]] | Dict of alias(es) for input and label keys, i.e. {inner_key: outer_key}. Defaults to None. | None
weight_dict | Optional[Dict[str, Union[Callable, float]]] | Define the weight of each constraint variable. Defaults to None. | None
timestamps | Optional[Tuple[float, ...]] | The number of repetitions of the data in the time dimension. Defaults to None. | None
transforms | Optional[Compose] | Compose object containing sample-wise transform(s). Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.NPZDataset(
... "/path/to/file.npz",
... ("x",),
... ("u",),
... )
Source code in ppsci/data/dataset/npz_dataset.py
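The key selection and alias_dict renaming can be reproduced with NumPy alone. A sketch under stated assumptions (file name and array keys are illustrative, not ppsci internals):

```python
import os
import tempfile
import numpy as np

# Write a small .npz file, then read it back the way an npz dataset would:
# pick the requested keys, optionally renaming via alias_dict
# ({inner_key: outer_key}).
path = os.path.join(tempfile.gettempdir(), "npz_demo.npz")
np.savez(path, x=np.zeros((4, 1)), u_ref=np.ones((4, 1)))

data = np.load(path)
alias_dict = {"u": "u_ref"}          # inner name -> name stored in the file
input_keys, label_keys = ("x",), ("u",)
inputs = {k: data[alias_dict.get(k, k)] for k in input_keys}
labels = {k: data[alias_dict.get(k, k)] for k in label_keys}
print(inputs["x"].shape, labels["u"].shape)  # (4, 1) (4, 1)
```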
CylinderDataset
Bases: Dataset
Dataset for training the Cylinder model.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Dataset path. | required
input_keys | Tuple[str, ...] | Input keys, such as ("states", "visc"). | required
label_keys | Tuple[str, ...] | Output keys, such as ("pred_states", "recover_states"). | required
block_size | int | Data block size. | required
stride | int | Data stride. | required
ndata | Optional[int] | Number of data series to use. Defaults to None. | None
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
embedding_model | Optional[Arch] | Embedding model. Defaults to None. | None
embedding_batch_size | int | Batch size of the embedding model. Defaults to 64. | 64
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.CylinderDataset(
...     file_path="/path/to/CylinderDataset",
...     input_keys=("x",),
...     label_keys=("v",),
...     block_size=32,
...     stride=16,
... )
Source code in ppsci/data/dataset/trphysx_dataset.py
LorenzDataset
Bases: Dataset
Dataset for training the Lorenz model.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Dataset path. | required
input_keys | Tuple[str, ...] | Input keys, such as ("states",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("pred_states", "recover_states"). | required
block_size | int | Data block size. | required
stride | int | Data stride. | required
ndata | Optional[int] | Number of data series to use. Defaults to None. | None
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
embedding_model | Optional[Arch] | Embedding model. Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.LorenzDataset(
...     file_path="/path/to/LorenzDataset",
...     input_keys=("x",),
...     label_keys=("v",),
...     block_size=32,
...     stride=16,
... )
Source code in ppsci/data/dataset/trphysx_dataset.py
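The effect of block_size and stride can be shown on a toy series: a trajectory of length T is cut into overlapping blocks of block_size steps, one starting every stride steps. A sketch (the helper name is illustrative, not the ppsci implementation):

```python
import numpy as np

def make_blocks(states, block_size, stride):
    """Sketch: cut a (T, ...) state series into overlapping training blocks
    of length block_size, starting every `stride` steps."""
    return np.stack([states[i:i + block_size]
                     for i in range(0, len(states) - block_size + 1, stride)])

states = np.arange(10)                       # toy 1-D "trajectory", T = 10
blocks = make_blocks(states, block_size=4, stride=2)
print(blocks.shape)        # (4, 4): blocks start at t = 0, 2, 4, 6
print(blocks[1].tolist())  # [2, 3, 4, 5]
```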
RosslerDataset
Bases: LorenzDataset
Dataset for training the Rossler model.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | Dataset path. | required
input_keys | Tuple[str, ...] | Input keys, such as ("states",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("pred_states", "recover_states"). | required
block_size | int | Data block size. | required
stride | int | Data stride. | required
ndata | Optional[int] | Number of data series to use. Defaults to None. | None
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
embedding_model | Optional[Arch] | Embedding model. Defaults to None. | None
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.RosslerDataset(
...     file_path="/path/to/RosslerDataset",
...     input_keys=("x",),
...     label_keys=("v",),
...     block_size=32,
...     stride=16,
... )
Source code in ppsci/data/dataset/trphysx_dataset.py
VtuDataset
Bases: Dataset
Dataset class for .vtu files.
Parameters:
Name | Type | Description | Default
---|---|---|---
file_path | str | *.vtu file path. | required
input_keys | Optional[Tuple[str, ...]] | Tuple of input keys. Defaults to None. | None
label_keys | Optional[Tuple[str, ...]] | Tuple of label keys. Defaults to None. | None
time_step | Optional[int] | Time step, in seconds. Defaults to None. | None
time_index | Optional[Tuple[int, ...]] | Time index tuple, in increasing order. | None
labels | Optional[Dict[str, float]] | Temporary variable for [load_vtk_with_time_file]. | None
transforms | Compose | Compose object containing sample-wise transform(s). | None
Examples:
>>> import ppsci
>>> # construct from a hypothetical .vtu file containing fields "x" and "u"
>>> dataset = ppsci.data.dataset.VtuDataset("/path/to/file.vtu", ("x",), ("u",))
>>> # get the length of the dataset
>>> dataset_size = len(dataset)
>>> # get the first sample of the data
>>> first_sample = dataset[0]
>>> print("First sample:", first_sample)
Source code in ppsci/data/dataset/vtu_dataset.py
MeshAirfoilDataset
Bases: Dataset
Dataset for MeshAirfoil.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Names of input data. | required
label_keys | Tuple[str, ...] | Names of label data. | required
data_dir | str | Directory of MeshAirfoil data. | required
mesh_graph_path | str | Path of the mesh graph. | required
transpose_edges | bool | Whether to transpose the edges array from (2, num_edges) to (num_edges, 2) for convenient slicing. Defaults to False. | False
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.MeshAirfoilDataset(
...     input_keys=("input",),
...     label_keys=("output",),
...     data_dir="/path/to/MeshAirfoilDataset",
...     mesh_graph_path="/path/to/file.su2",
...     transpose_edges=False,
... )
Source code in ppsci/data/dataset/airfoil_dataset.py
MeshCylinderDataset
Bases: Dataset
Dataset for MeshCylinder.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Names of input data. | required
label_keys | Tuple[str, ...] | Names of label data. | required
data_dir | str | Directory of MeshCylinder data. | required
mesh_graph_path | str | Path of the mesh graph. | required
Examples:
>>> import ppsci
>>> dataset = ppsci.data.dataset.MeshCylinderDataset(
...     input_keys=("input",),
...     label_keys=("output",),
...     data_dir="/path/to/MeshCylinderDataset",
...     mesh_graph_path="/path/to/file.su2",
... )
Source code in ppsci/data/dataset/cylinder_dataset.py
RadarDataset
Bases: Dataset
Class for the Radar dataset.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
image_width | int | Image width. | required
image_height | int | Image height. | required
total_length | int | Total length. | required
dataset_path | str | Dataset path. | required
data_type | str | Input and output data type. Defaults to paddle.get_default_dtype(). | get_default_dtype()
weight_dict | Optional[Dict[str, float]] | Weight dictionary. Defaults to None. | None
Examples:
>>> import paddle
>>> import ppsci
>>> dataset = ppsci.data.dataset.RadarDataset(
...     input_keys=("input",),
...     label_keys=("output",),
...     image_width=512,
...     image_height=512,
...     total_length=29,
...     dataset_path="datasets/mrms/figure",
...     data_type=paddle.get_default_dtype(),
... )
Source code in ppsci/data/dataset/radar_dataset.py
DGMRDataset
Bases: Dataset
Dataset class for the DGMR (Deep Generative Model for Radar) model. This open-sourced UK dataset has been mirrored to HuggingFace Datasets at https://huggingface.co/datasets/openclimatefix/nimrod-uk-1km. If the dataset cannot be loaded from Hugging Face, please download it manually and set dataset_path to the local path.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
split | str | The dataset split, "validation" or "train". Defaults to "validation". | 'validation'
num_input_frames | int | Number of input frames. Defaults to 4. | 4
num_target_frames | int | Number of target frames. Defaults to 18. | 18
dataset_path | str | Path to the dataset. Defaults to "openclimatefix/nimrod-uk-1km". | 'openclimatefix/nimrod-uk-1km'
Source code in ppsci/data/dataset/dgmr_dataset.py
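The roles of num_input_frames and num_target_frames can be shown on a toy sequence: each radar sequence is split into a conditioning window and a prediction window. A sketch with synthetic data (the array shape is illustrative, not the actual nimrod-uk-1km layout):

```python
import numpy as np

num_input_frames, num_target_frames = 4, 18

# A toy radar sequence: (time, height, width, channels).
sequence = np.random.rand(num_input_frames + num_target_frames, 128, 128, 1)

# Split one sequence into conditioning frames and prediction targets,
# mirroring the num_input_frames / num_target_frames parameters above.
inputs = sequence[:num_input_frames]
targets = sequence[num_input_frames:num_input_frames + num_target_frames]
print(inputs.shape, targets.shape)  # (4, 128, 128, 1) (18, 128, 128, 1)
```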
DarcyFlowDataset
Bases: Dataset
Loads a small Darcy-Flow dataset.
Training contains 1000 samples at resolution 16x16. Testing contains 100 samples at resolution 16x16 and 50 samples at resolution 32x32.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
data_dir | str | The directory to load data from. | required
weight_dict | Optional[Dict[str, float]] | Define the weight of each constraint variable. Defaults to None. | None
test_resolutions | List[int, ...] | The resolutions at which to test. Defaults to [32]. | [32]
grid_boundaries | List[int, ...] | The boundaries of the grid. Defaults to [[0, 1], [0, 1]]. | [[0, 1], [0, 1]]
positional_encoding | bool | Whether to use positional encoding. Defaults to True. | True
encode_input | bool | Whether to encode the input. Defaults to False. | False
encode_output | bool | Whether to encode the output. Defaults to True. | True
encoding | str | The type of encoding. Defaults to 'channel-wise'. | 'channel-wise'
channel_dim | int | Where to put the channel dimension (the axis to unsqueeze); the default layout is (batch, channel, height, width). Defaults to 1. | 1
data_split | str | Whether to use the training or test dataset. Defaults to 'train'. | 'train'
Source code in ppsci/data/dataset/darcyflow_dataset.py
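One common form of the positional_encoding / grid_boundaries combination is to append normalized coordinate channels to each sample. A sketch of that idea (the helper name and exact encoding are assumptions, not the ppsci implementation):

```python
import numpy as np

def append_positional_encoding(x, grid_boundaries=((0, 1), (0, 1))):
    """Sketch: append normalized (y, x) coordinate channels to a
    (batch, channel, height, width) array, one channel per spatial axis."""
    b, _, h, w = x.shape
    ys = np.linspace(*grid_boundaries[0], h)
    xs = np.linspace(*grid_boundaries[1], w)
    gy, gx = np.meshgrid(ys, xs, indexing="ij")
    grid = np.broadcast_to(np.stack([gy, gx])[None], (b, 2, h, w))
    return np.concatenate([x, grid], axis=1)

x = np.zeros((8, 1, 16, 16))           # toy batch of 16x16 fields
encoded = append_positional_encoding(x)
print(encoded.shape)  # (8, 3, 16, 16): original channel + 2 coordinate channels
```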
SphericalSWEDataset
Bases: Dataset
Loads a Spherical Shallow Water Equations dataset.
Training contains 200 samples at resolution 32x64. Testing contains 50 samples at resolution 32x64 and 50 samples at resolution 64x128.
Parameters:
Name | Type | Description | Default
---|---|---|---
input_keys | Tuple[str, ...] | Input keys, such as ("input",). | required
label_keys | Tuple[str, ...] | Output keys, such as ("output",). | required
data_dir | str | The directory to load data from. | required
weight_dict | Optional[Dict[str, float]] | Define the weight of each constraint variable. Defaults to None. | None
test_resolutions | Tuple[str, ...] | The resolutions at which to test. Defaults to ["34x64", "64x128"]. | ['34x64', '64x128']
train_resolution | str | The resolution at which to train. Defaults to "34x64". | '34x64'
data_split | str | The dataset split, either 'train', 'test_32x64', or 'test_64x128'. Defaults to "train". | 'train'
Source code in ppsci/data/dataset/spherical_swe_dataset.py
build_dataset(cfg)
Build dataset.
Parameters:
Name | Type | Description | Default
---|---|---|---
cfg | List[DictConfig] | Dataset config list. | required
Returns:
Type | Description
---|---
Dataset | Dict[str, io.Dataset]: dataset.
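A factory like build_dataset typically dispatches on a dataset name in the config and forwards the remaining entries as constructor arguments. The sketch below shows that dispatch pattern only; the registry, class body, and config shape are illustrative, not the actual ppsci internals:

```python
# Minimal registry-based dataset factory.
DATASETS = {}

def register(cls):
    """Register a dataset class under its class name."""
    DATASETS[cls.__name__] = cls
    return cls

@register
class NamedArrayDataset:
    """Toy stand-in with the same name as the ppsci class."""
    def __init__(self, input, label=None, weight=None):
        self.input, self.label, self.weight = input, label, weight

def build_dataset(cfg):
    """Pop the dataset name from the config, then construct it
    with the remaining entries as keyword arguments."""
    cfg = dict(cfg)
    name = cfg.pop("name")
    return DATASETS[name](**cfg)

dataset = build_dataset({"name": "NamedArrayDataset", "input": {"x": [1.0]}})
print(type(dataset).__name__)  # NamedArrayDataset
```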