File Sources
Gravity can read data files from several source types. The same principles apply to all file source types.
File Format
The file format can be set for each job. Gravity supports the following file formats:
CSV: Gravity can read delimited files.
JSON: Gravity can read newline-delimited JSON, which is the same as the JSON Lines format.
Parquet: Gravity can read standard Parquet files.
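To illustrate what newline-delimited JSON looks like in practice, here is a minimal sketch that parses one JSON object per line. The function name `read_ndjson` and the sample records are hypothetical and are not part of Gravity.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Hypothetical sample data: each line is a complete JSON object.
sample = io.StringIO('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
records = list(read_ndjson(sample))
```

Because every line is a standalone document, a file in this format can be read incrementally without loading it all into memory.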
Partitioning
Gravity can read partitioned files. If the data is partitioned, select the directory path that contains the partitions. Gravity will recursively ingest all files under that path that are identified by name.
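The sketch below shows one way recursive, name-based file discovery under a partition directory can work. It is only an illustration, not Gravity's implementation: the helper `find_partition_files`, the glob pattern, and the `date=...` directory layout are all assumptions.

```python
import tempfile
from pathlib import Path


def find_partition_files(root, pattern):
    """Recursively collect files under `root` whose names match `pattern`."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


# Build a small hypothetical partitioned layout in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for part in ("date=2024-01-01", "date=2024-01-02"):
        partition_dir = Path(tmp) / "events" / part
        partition_dir.mkdir(parents=True)
        (partition_dir / "part-0000.parquet").touch()
    # A marker file that does not match the pattern and is skipped.
    (Path(tmp) / "events" / "_SUCCESS").touch()

    found = [
        p.relative_to(tmp).as_posix()
        for p in find_partition_files(Path(tmp) / "events", "*.parquet")
    ]
```

Pointing the search at the top-level directory rather than a single file is what lets every partition be picked up in one pass.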
Compression
Gravity can read compressed files. The compression type depends on the file format selected. The table below describes which compression types are available for each file format.
File Format | Compression Type |
---|---|
CSV | zip, gzip |
JSON | zip, gzip |
Parquet | snappy, gzip |
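The table above can be encoded as a simple lookup, for example to validate a job definition before submitting it. This is a sketch only: the names `SUPPORTED_COMPRESSION` and `is_supported` are hypothetical, and the values are copied from the table.

```python
# Compression types per file format, copied from the table above.
SUPPORTED_COMPRESSION = {
    "csv": {"zip", "gzip"},
    "json": {"zip", "gzip"},
    "parquet": {"snappy", "gzip"},
}


def is_supported(file_format, compression):
    """Return True if `compression` is listed for `file_format` (case-insensitive)."""
    allowed = SUPPORTED_COMPRESSION.get(file_format.lower(), set())
    return compression.lower() in allowed
```

For instance, `is_supported("CSV", "gzip")` is true, while `is_supported("CSV", "snappy")` is false because snappy is listed only for Parquet.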
Nested Source Data
Depending on the destination, Gravity may need to flatten the source data before loading it. The table below describes how Gravity handles nested source data. If required, the objects will be flattened as specified by the **_data model_** configuration.
For Parquet and JSON source files, the following data model configurations are available:
Flattened Documents Model: Implicitly join nested object arrays into a single table.
Document Model: Model a top-level view of a document. Nested object arrays are returned as strings.
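To make the difference between the two data models concrete, here is a rough sketch applied to a single nested record. It is an interpretation of the descriptions above, not Gravity's actual output: the function names, the sample `order` record, and the `items.sku` style column naming are all assumptions.

```python
import json


def document_model(doc):
    """Top-level view: nested object arrays are returned as JSON strings."""
    return {k: json.dumps(v) if isinstance(v, list) else v for k, v in doc.items()}


def flattened_documents_model(doc, array_field):
    """Join the parent's scalar fields onto each element of `array_field`."""
    parent = {k: v for k, v in doc.items() if not isinstance(v, list)}
    return [
        # Assumed naming: child columns are prefixed with the array field name.
        {**parent, **{f"{array_field}.{k}": v for k, v in item.items()}}
        for item in doc.get(array_field, [])
    ]


# Hypothetical source record with one nested object array.
order = {"order_id": 7, "items": [{"sku": "A", "qty": 1}, {"sku": "B", "qty": 2}]}

doc_row = document_model(order)                       # one row, "items" as a string
flat_rows = flattened_documents_model(order, "items")  # one row per item
```

In this sketch the Document Model keeps one row per source document, whereas the Flattened Documents Model produces one row per nested array element with the parent fields repeated.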
Destination | Flatten or Maintain Model |
---|---|
BigQuery | Maintain Model |
Snowflake | Flatten |
Redshift | Flatten |
Azure | Flatten |
SFTP | Maintain Model if the file remains in the same format. Flatten if the file is converted to a different format (e.g. Parquet->CSV) |
S3 | Maintain Model if the file remains in the same format. Flatten if the file is converted to a different format (e.g. Parquet->CSV) |
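The rules in the table can be summarised as a small decision function. This is a sketch under assumptions: the function `needs_flattening` and its parameters are hypothetical, and an unspecified target format is treated as "same format as the source".

```python
# Destination groups, taken from the table above.
ALWAYS_FLATTEN = {"snowflake", "redshift", "azure"}
ALWAYS_MAINTAIN = {"bigquery"}
FILE_DESTINATIONS = {"sftp", "s3"}


def needs_flattening(destination, source_format, target_format=None):
    """Return True if nested source data would be flattened for this destination."""
    dest = destination.lower()
    if dest in ALWAYS_FLATTEN:
        return True
    if dest in ALWAYS_MAINTAIN:
        return False
    if dest in FILE_DESTINATIONS:
        # Assumption: no explicit target format means the format is unchanged.
        if target_format is None:
            return False
        return target_format.lower() != source_format.lower()
    raise ValueError(f"Unknown destination: {destination}")
```

For example, `needs_flattening("S3", "parquet", "csv")` is true because the format changes, while `needs_flattening("S3", "parquet", "parquet")` is false.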