File Sources

Gravity can read data files from several locations. The same principles apply to all file source types.

File Format

The file format can be set for each job. Gravity supports the following file formats:

  • CSV: Gravity can read delimited files.

  • JSON: Gravity can read newline-delimited JSON, also known as the JSON Lines format.

  • Parquet: Gravity can read standard Parquet files.
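
To illustrate what newline-delimited JSON looks like, the following is a minimal Python sketch that parses it one record per line. It is not Gravity's reader; the function name `read_json_lines` and the sample records are hypothetical and exist only for illustration.

```python
import io
import json


def read_json_lines(stream):
    """Yield one parsed record per non-empty line of a newline-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Hypothetical sample: each line is a complete, independent JSON object.
sample = io.StringIO('{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n')
records = list(read_json_lines(sample))
```

Because every line stands on its own, a file in this format can be read as a stream without loading the whole document into memory.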

Partitioning

Gravity can read partitioned files. If the file is partitioned, select the directory path that contains the partitions. Gravity will recursively ingest all files under that path that are identified by name.
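
The recursive lookup described above can be pictured with a short Python sketch. It assumes that "identified by name" means a file-name pattern match, which this page does not confirm; the function `find_partition_files` and the `date=...` directory layout mentioned in the comments are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import Path


def find_partition_files(root, pattern):
    """Return every file under root, at any depth, whose name matches pattern."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and fnmatch(path.name, pattern)
    )


# Example layout (hypothetical):
#   exports/date=2024-01-01/events.csv
#   exports/date=2024-01-02/events.csv
# find_partition_files("exports", "events*.csv") would return both files.
```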

Compression

Gravity can read compressed files. The compression type depends on the file format selected. The table below describes which compression types are available for each file format.

File Format    Compression Type
CSV            zip, gzip
JSON           zip, gzip
Parquet        snappy, gzip
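
As a rough illustration, the combinations in the table can be expressed as a lookup, and a gzip-compressed CSV can be read with the Python standard library alone. This is a sketch, not Gravity's implementation; `SUPPORTED_COMPRESSION`, `is_supported`, and `read_gzip_csv` are hypothetical names.

```python
import csv
import gzip
import io

# Format/compression pairs taken from the table above.
SUPPORTED_COMPRESSION = {
    "csv": {"zip", "gzip"},
    "json": {"zip", "gzip"},
    "parquet": {"snappy", "gzip"},
}


def is_supported(file_format, compression):
    """Return True if the table lists this compression type for the file format."""
    return compression.lower() in SUPPORTED_COMPRESSION.get(file_format.lower(), set())


def read_gzip_csv(data):
    """Decompress gzip bytes and parse the content as CSV rows."""
    with gzip.open(io.BytesIO(data), mode="rt", newline="") as handle:
        return list(csv.reader(handle))


rows = read_gzip_csv(gzip.compress(b"id,name\n1,a\n"))
```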

Nested Source Data

Depending on the destination, Gravity may need to flatten the source data before loading it. The table below describes how Gravity handles nested source data. If required, the objects will be flattened as specified by the data model configuration.

For Parquet and JSON source files, the following data model configurations are available:

Destination    Flatten or Maintain Model
BigQuery       Maintain Model
Snowflake      Flatten
Redshift       Flatten
Azure          Flatten
SFTP           Maintain the model if the file remains in the same format. If the file is converted to a different format (e.g. Parquet->CSV), flatten the data.
S3             Maintain the model if the file remains in the same format. If the file is converted to a different format (e.g. Parquet->CSV), flatten the data.
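
The rules in the table, and what flattening means for a nested record, can be sketched in Python as follows. This is an illustration under stated assumptions: the underscore-joined column names are an assumption, since the actual naming is determined by the data model configuration, and `needs_flattening` and `flatten` are hypothetical helpers. Lists are left untouched in this sketch.

```python
ALWAYS_FLATTEN = {"snowflake", "redshift", "azure"}
FORMAT_DEPENDENT = {"sftp", "s3"}


def needs_flattening(destination, source_format, target_format):
    """Apply the table above: return True if nested data must be flattened."""
    dest = destination.lower()
    if dest == "bigquery":
        return False  # the nested model is maintained
    if dest in ALWAYS_FLATTEN:
        return True
    if dest in FORMAT_DEPENDENT:
        # Maintained only when the file stays in the same format.
        return source_format.lower() != target_format.lower()
    raise ValueError(f"destination not covered by the table: {destination}")


def flatten(record, parent_key="", sep="_"):
    """Collapse nested dicts into one level (assumed naming: parent_child)."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


nested = {"id": 1, "user": {"name": "a", "address": {"city": "Oslo"}}}
flat = flatten(nested)
# flat == {"id": 1, "user_name": "a", "user_address_city": "Oslo"}
```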
