Job Modes

The job mode controls two key components:

  • Whether the job requests all historical data or moves forward incrementally

  • How the records are inserted or merged into the destination

The Gravity app will set a default job mode based for the nature of the job. The user can override this default if there is a more appropriate job mode for their use case. There are three available job modes which are detailed below, including their function and purpose.

Merge by key

  • Available for all source types; and

  • Available for database destinations only.

Merge by key mode extracts the latest data from a source and merges the new records into a database destination.

When to use this mode?

Merge by key mode is recommended for replicating larger data tables to database destinations. It is not efficient to extract and replicate historical data on a schedule if new records can be isolated and merged into existing destination data.

This mode is recommended in the following scenarios:

  1. A source database table is replicated to a database destination. The table contains a logical date or ID column which can be used to incrementally fetch new records and there is a composite key available to use for merging the data into the destination

  2. A source file is replicated to a database destination. The file contains a composite key to use for merging the data

  3. Any API source data table is replicated to a database destination.

Merge by key mode is recommended for these example scenarios:

  • Facebook Ad Insights (API) is replicated to a database destination

  • A PostgreSQL source data table, with an incremental column and a composite key, is replicated to another database destination

  • Daily CSV files, which contain the last 30 days of data and have a composite key, are replicated to a database destination

How is the data fetched?

The job will function incrementally. For database and API sources, the app will track the most recently loaded records by date or ID as relevant. When the job is next run, data will be fetched from this point onwards.

For database sources, users will be asked to configure the date or ID field that should be used to track the incremental loading.

For API sources, the app will detect the logical date or ID field to track the incremental loading. The app will also account for API refresh windows and refresh this historical data as needed.

For flat file sources, the app will read any source files that have been added or updated since the previous run.

How is the data merged?

The app will merge the incremental data into the destination based on the table's composite key. For Database and flat file sources, the user will be asked to configure the composite key for the data. For API sources, the app will detect the logical composite key.

Delete and Replace

  • Available for all source types except for large API data tables; and

  • Available for all destination types

Delete and replace mode completely refreshes the replicated data from source to destination.

When to use this mode?

This mode is recommended in the following scenarios:

  1. A small database or API source table contains all historical data and the entire table can be fetched and loaded quickly

  2. A large database source table contains all historical data but does not have a logical date or ID field to incrementally load data

  3. A large database source table or source file contains all historical data but does not have a logical composite key to incrementally load data to database destination

  4. Historical records are often deleted or updated in the source table

Delete and replace mode is recommended for these example scenarios:

  • Facebook Ad Accounts is replicated to a database destination

  • PostgreSQL source lookup tables are replicated to an SFTP destination

  • A source CSV file, without a composite key, which is updated to include new records periodically and is replicated to a database destination

How is the data fetched?

For database and API sources, the entire historical data is fetched on each run. For flat file sources, the app will read any source files that have been added or updated since the previous run.

How is the data merged?

For database destinations, the table will be truncated and the data will be reloaded. For file destinations, the existing file from a previous run will be overridden by the latest data.

Append

  • Available for file type sources; or

  • Available for file type destinations

Append mode continuously adds new data to the destination without attempting to merge the records by key.

Append mode is available for file type sources because they can be less structured than API and database sources and they often do not contain a composite key. Append mode is available for file type destinations because merging records into files is not a supported operation.

When to use this mode?

Append mode is used in similar circumstances as merge by key mode, but in cases where the data cannot be merged into the destination due to some limitation.

This mode is recommended in the following scenarios:

  1. A database or API source table is incrementally replicated to a file type destination (e.g. SFTP) which does not support record merging.

  2. A source file is replicated to a database destination. The file does not contain all historical data and the file does not have a logical composite key to merge on. The data cannot be deleted and reinserted without losing historical data and, equally, the data cannot be merged by ID.

  3. source data files are replicated to a database destination. The files do not contain any overlapping records and thus do not need to be merged

Append mode is recommended for these example scenarios:

  • A larger PostgreSQL source data table is replicated to an SFTP every day. The records can be incrementally fetched but cannot be merged into the destination files.

  • A source CSV updated periodically with new records is loaded to a destination database. The CSV file does not contain all of the records that have historically been loaded to the destination table. The file also does not have a composite key.

  • A source CSV file which is updated daily to include yesterday's records. There is no historical data in the file and the files have no overlapping records that need to be merged

How is the data fetched?

The job will function incrementally. For database and API sources, the app will track the most recently loaded records by date or ID as relevant. When the job is next run, data will be fetched from this point onwards.

For database sources, users will be asked to configure the date or ID field that should be used to track the incremental loading.

For API sources, the app will detect the logical date or ID field to track the incremental loading. The app will also account API refresh windows and refresh this historical data as needed.

For flat file sources, the app will read any source files that have been added or updated since the previous run.

How is the data merged?

For database destination, the new records are added directly to the target table. There will be no attempt to merge the data.

For flat file destinations, the new records are added to a new file.

Last updated