unitorch.cli.writer
GeneralParquetWriter
Tip
core/writer/parquet is the configuration section for GeneralParquetWriter.
Bases: GenericWriter
Initialize GeneralParquetWriter.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output_file | str | The path to the output file. | required |
nrows_per_sample | int | The number of rows per sample. Defaults to None. | None |
columns | List[str] | The list of columns to include in the output file. Defaults to None. | None |
schema | str | The Parquet schema in string format. Defaults to None. | None |
compression | str | The compression algorithm to use. Defaults to "snappy". | 'snappy' |
Source code in src/unitorch/cli/writers/__init__.py
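As a rough illustration of how the core/writer/parquet section might be filled in, the sketch below parses a hypothetical INI-style fragment with Python's standard configparser. It assumes that option names mirror the constructor parameters listed above. The file path and column names are invented for the example, and the decoding step is not unitorch's own configuration parser.

```python
import ast
import configparser

# Hypothetical config text: option names are assumed to mirror the
# constructor parameters; the path and column names are made up.
CONFIG_TEXT = """
[core/writer/parquet]
output_file = ./predictions.parquet
columns = ['id', 'score']
compression = snappy
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
section = parser["core/writer/parquet"]

# Plain configparser returns strings, so the list-valued option is decoded here.
writer_kwargs = {
    "output_file": section["output_file"],
    "columns": ast.literal_eval(section["columns"]),
    "compression": section.get("compression", "snappy"),
}
print(writer_kwargs)
```

Options left out of the section (here nrows_per_sample and schema) would presumably fall back to the defaults in the table above.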
from_core_configure classmethod
from_core_configure(config, **kwargs)
Create an instance of GeneralParquetWriter from a core configuration.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | | The core configuration. | required |
**kwargs | | Additional keyword arguments. | {} |
Returns:
Name | Type | Description |
---|---|---|
GeneralParquetWriter | | An instance of GeneralParquetWriter. |
Source code in src/unitorch/cli/writers/__init__.py
process_chunk
process_chunk(outputs: WriterOutputs)
Process a chunk of data during the writing process.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
outputs | WriterOutputs | The writer outputs. | required |
Source code in src/unitorch/cli/writers/__init__.py
process_end
process_end()
Process the end of the writing process.
Source code in src/unitorch/cli/writers/__init__.py
process_start
process_start(outputs: WriterOutputs)
Process the start of the writing process.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
outputs | WriterOutputs | The writer outputs. | required |
Source code in src/unitorch/cli/writers/__init__.py
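The three hooks are listed alphabetically, so the order in which they are called is not obvious from this page. The sketch below shows one plausible lifecycle using a hypothetical stand-in class, ToyRowWriter, which is not unitorch code. It assumes a driver calls process_start once with the first batch, process_chunk for every batch, and process_end once at the end. It also models WriterOutputs as a plain dict of column lists, which is a simplification made for the example.

```python
from typing import Any, Dict, List, Optional


class ToyRowWriter:
    """Hypothetical stand-in with the same three hooks; not unitorch code."""

    def __init__(self, columns: Optional[List[str]] = None):
        self.columns = columns
        self.rows: List[Dict[str, Any]] = []
        self.closed = False

    def process_start(self, outputs: Dict[str, List[Any]]) -> None:
        # Fix the column set from the first batch when none was given.
        if self.columns is None:
            self.columns = list(outputs.keys())

    def process_chunk(self, outputs: Dict[str, List[Any]]) -> None:
        # Turn column-oriented outputs into one row per sample.
        n = len(outputs[self.columns[0]])
        for i in range(n):
            self.rows.append({c: outputs[c][i] for c in self.columns})

    def process_end(self) -> None:
        # A real writer would flush and close the output file here.
        self.closed = True


# Made-up batches standing in for WriterOutputs.
batches = [
    {"id": [1, 2], "score": [0.9, 0.1]},
    {"id": [3], "score": [0.5]},
]

writer = ToyRowWriter()
writer.process_start(batches[0])
for batch in batches:
    writer.process_chunk(batch)
writer.process_end()
```

Whether the real driver also passes the first batch to process_chunk is not stated on this page, so treat the call order here as an assumption.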