Metadata

We want you to see what's going on behind the scenes, so we've created what we call metadata columns. They are added to your data so you can identify:

  • Which Job captured and loaded the data

  • When the data was loaded

  • When the data was updated

Of course, there's plenty more to come! This is part of our bigger project around data observation and transparency.

How does it work?

In your Job configuration, there is an option to enable or disable this feature. It's enabled by default, but you can change it at any time.

When enabled, columns are added to the end of the table your Job is pointing to. Their names start with the prefix gravity_ so you can identify them easily.
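As a rough illustration of spotting these columns by their prefix, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for your database. The function name metadata_columns and the table name are hypothetical. On other databases you would more likely query a catalog view such as information_schema.columns, where one is available.

```python
import sqlite3


def metadata_columns(conn, table, prefix="gravity_"):
    """Return the names of the columns in `table` that start with `prefix`.

    `table` is interpolated into the statement, so only pass trusted names.
    """
    # PRAGMA table_info returns one row per column, in table order:
    # (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows if row[1].startswith(prefix)]
```

Calling metadata_columns(conn, "orders") on a target table returns only the gravity_ prefixed column names and leaves out your own columns.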

Each Job creates two tables in your database: a staging table and a target table. You can think of the staging table as a temporary table. Your data is stored there only for the duration of the Job. The Job uses this table to do some "lite" transformation, such as deduplication and identifying new data. The target table is the final destination for your data. You would have given this table a name in your Job configuration.

Next, SQL statements are created to merge the data from the staging table into the target table. These statements also include the code that populates the metadata columns:

  • gravity_id

    • The ID number of the Job run that inserted or updated the row in the target table. Useful for auditing and customer support.

  • gravity_inserted

    • The date and time (UTC) the row was inserted into the target table

  • gravity_updated

    • The date and time (UTC) the row was updated in the target table

gravity_updated is only populated if you set your Job mode to "Append".

It's as simple as that!
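To make the staging-to-target merge more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only, not the SQL that Gravity actually generates. The table layout (id, name), the function names and the run IDs are all assumptions, and the effect of the Job mode on gravity_updated is not modelled.

```python
import sqlite3
from datetime import datetime, timezone


def create_tables(conn):
    """Create a hypothetical staging table and target table."""
    conn.executescript(
        """
        CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE target (
            id INTEGER PRIMARY KEY,
            name TEXT,
            gravity_id INTEGER,      -- Job run that last inserted or updated the row
            gravity_inserted TEXT,   -- UTC timestamp of the insert
            gravity_updated TEXT     -- UTC timestamp of the last update
        );
        """
    )


def merge_staging_into_target(conn, job_run_id):
    """Merge staging rows into target and fill the metadata columns."""
    params = {"run_id": job_run_id, "now": datetime.now(timezone.utc).isoformat()}
    # Rows that already exist in target: refresh the data and stamp the update.
    conn.execute(
        """
        UPDATE target
        SET name = (SELECT s.name FROM staging s WHERE s.id = target.id),
            gravity_id = :run_id,
            gravity_updated = :now
        WHERE id IN (SELECT id FROM staging)
        """,
        params,
    )
    # Rows that are new: insert them and stamp the insert.
    conn.execute(
        """
        INSERT INTO target (id, name, gravity_id, gravity_inserted)
        SELECT s.id, s.name, :run_id, :now
        FROM staging s
        WHERE s.id NOT IN (SELECT id FROM target)
        """,
        params,
    )
    # The staging table only holds data for the duration of the run.
    conn.execute("DELETE FROM staging")
    conn.commit()
```

In this sketch, a row first loaded by one run and later changed by another ends up with the later run's ID in gravity_id, its original gravity_inserted timestamp, and a new gravity_updated timestamp.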

Will it cost me more if it's enabled?

Yes and no.

No, our pricing isn't affected by whether this feature is enabled.

Yes, your database costs are affected. How much depends on how many Jobs and how much data these metadata columns are added to (storage costs) and on whether you use them (compute costs).

We can't give accurate estimates as it varies greatly from customer to customer and from database to database. The costs will be much less than your overall spend on data storage and computation, but it is going to cost you something.

You decide whether this feature is enabled, though, and you can switch it off at any time.

What happens if I disable it?

As mentioned earlier, you can disable it at any time. Disabling it won't delete the gravity_ prefixed columns in the table. It just stops populating them whenever your Job runs.

Feel free to delete the metadata columns once you've disabled the feature, but remember: if you delete them and later decide to enable it again, the metadata they held will be lost. We don't keep copies and can't populate it retroactively.
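If you do decide to remove the columns, the sketch below builds the statements without running them, so you can review them first. The helper name drop_metadata_statements and the table name are hypothetical, and the exact ALTER TABLE syntax and identifier quoting vary between databases, so treat the output as a starting point rather than something to execute as-is.

```python
def drop_metadata_statements(table, columns, prefix="gravity_"):
    """Build one DROP COLUMN statement per metadata column (nothing is executed).

    Columns that do not start with `prefix` are skipped, so your own
    columns are never included by accident.
    """
    return [
        f'ALTER TABLE "{table}" DROP COLUMN "{column}";'
        for column in columns
        if column.startswith(prefix)
    ]
```

For example, drop_metadata_statements("orders", ["id", "gravity_id"]) yields a single statement for gravity_id and ignores id.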

Will my Jobs be slower?

No, they shouldn't be affected unless your database has major performance degradation. That would affect not just your Jobs but also any other systems, processes and users outside Gravity that use your database.
