Gravity can write data files to several destinations. The same principles apply to all file destination types.
The job modes available for file destinations are:
- extracts the full source data on each run and replaces the target file
- extracts the source data incrementally and creates a new file with the data from each run
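The difference between the two behaviours above can be illustrated with a minimal sketch. It is not Gravity's implementation: the function names, the `.csv` extension and the `<name>_<run_id>` file naming are hypothetical and only show "replace one target file" versus "create a new file per run".

```python
import tempfile
from pathlib import Path


def write_full(target_dir: Path, name: str, content: str) -> Path:
    """Full extraction: overwrite the single target file on every run."""
    path = target_dir / f"{name}.csv"  # hypothetical naming
    path.write_text(content, encoding="utf-8")
    return path


def write_incremental(target_dir: Path, name: str, run_id: int, content: str) -> Path:
    """Incremental extraction: write a new file for each run."""
    path = target_dir / f"{name}_{run_id}.csv"  # hypothetical naming
    path.write_text(content, encoding="utf-8")
    return path


def demo() -> list:
    """Simulate two runs of each behaviour and list the resulting files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_full(root, "orders", "run 1 data")
        write_full(root, "orders", "run 2 data")  # replaces the run 1 file
        write_incremental(root, "events", 1, "run 1 data")
        write_incremental(root, "events", 2, "run 2 data")  # adds a second file
        return sorted(p.name for p in root.iterdir())
```

After two runs of each, the full extraction leaves a single file while the incremental extraction leaves one file per run.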
The destination file format can be set for each job. Gravity supports the following file formats:
CSV: Gravity generates semicolon (;) delimited files. Values are quoted (") to handle special characters.
JSON: Gravity generates newline-delimited JSON, with one JSON record per line.
Parquet: Gravity generates standard Parquet files. The block size, page size, group size and row count can be customised for each job if required.
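As a rough illustration of consuming the two text formats, the sketch below parses semicolon-delimited, double-quoted CSV and newline-delimited JSON with Python's standard library. The sample strings are hypothetical and are not actual Gravity output. In particular, the header row and the doubled-quote escaping inside values are assumptions.

```python
import csv
import io
import json


def parse_csv(text: str) -> list:
    """Parse semicolon-delimited, double-quoted CSV text into a list of rows."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar='"')
    return list(reader)


def parse_ndjson(text: str) -> list:
    """Parse newline-delimited JSON: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample data (assumed header row and quote escaping).
csv_sample = '"id";"note"\n"1";"contains ; and ""quotes"""\n'
ndjson_sample = '{"id": 1, "note": "a"}\n{"id": 2, "note": "b"}\n'

rows = parse_csv(csv_sample)
records = parse_ndjson(ndjson_sample)
```

Because every value is quoted, a semicolon inside a value does not split the column.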
File partitioning is available for jobs. Gravity supports partitioning files using the following formats:
gravity_id=<runid>
gravity_inserted_year=<yyyy>/gravity_inserted_month=<MM>/gravity_inserted_day=<dd>
gravity_inserted_year=<yyyy>/gravity_inserted_month=<MMMM>/gravity_inserted_day=<dd>
gravity_inserted=<yyyy>-<MM>-<dd>
gravity_inserted=<yyyy>-<MMMM>-<dd>
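To make the placeholders concrete, here is a small sketch that expands a few of the layouts above for a given run and date. It assumes, following common date-pattern conventions, that `<MM>` is a zero-padded month number, `<MMMM>` is the full English month name and `<dd>` is a zero-padded day. The function name, dictionary keys and run id value are hypothetical.

```python
from datetime import date

# Assumption: <MMMM> expands to the full English month name.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def partition_paths(run_id: str, inserted: date) -> dict:
    """Expand several partition layouts for one run id and insert date."""
    yyyy = f"{inserted.year:04d}"
    mm = f"{inserted.month:02d}"            # assumed meaning of <MM>
    mmmm = MONTH_NAMES[inserted.month - 1]  # assumed meaning of <MMMM>
    dd = f"{inserted.day:02d}"              # assumed meaning of <dd>
    return {
        "run": f"gravity_id={run_id}",
        "ymd_numeric": (
            f"gravity_inserted_year={yyyy}/"
            f"gravity_inserted_month={mm}/"
            f"gravity_inserted_day={dd}"
        ),
        "date_numeric": f"gravity_inserted={yyyy}-{mm}-{dd}",
        "date_named": f"gravity_inserted={yyyy}-{mmmm}-{dd}",
    }


paths = partition_paths("12345", date(2024, 3, 7))
```

The `key=value` folder naming is the layout that many query engines recognise as Hive-style partitioning, although whether a given engine picks up these particular keys depends on its configuration.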
The partition formats listed above are not applicable to the job mode that replaces the target file, since the file is replaced on every run.
Destination files can be compressed. The compression type depends on the file format selected. The table below describes which compression types are available for each file format.
| File format | Compression types |
| --- | --- |
| CSV | zip, gzip |
| JSON | zip, gzip |
| Parquet | snappy, gzip |
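A gzip-compressed CSV file can be read without unpacking it to disk first. The sketch below uses only Python's standard library and an in-memory sample. The sample content and the function name are hypothetical, and UTF-8 encoding is an assumption.

```python
import csv
import gzip
import io


def read_gzip_csv(data: bytes) -> list:
    """Decompress gzip bytes and parse them as semicolon-delimited CSV."""
    with gzip.open(io.BytesIO(data), mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.reader(handle, delimiter=";", quotechar='"'))


# Hypothetical sample standing in for a compressed destination file.
sample = gzip.compress('"id";"name"\n"1";"Ada"\n'.encode("utf-8"))
rows = read_gzip_csv(sample)
```

For a file on disk, the path can be passed to `gzip.open` directly in place of the `io.BytesIO` wrapper.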