About Google BigQuery

How to Connect Google BigQuery

These are some key points about Google BigQuery, especially if you're leveraging our Google BigQuery Destination.

Before you continue, it's worth us mentioning that if you have worked with other data warehouses and databases, don't view Google BigQuery as an equivalent. Yes, it is a 'data warehouse' and has similarities to others on the market, but, it has key differences and limitations that may affect you.

Data Schema

Database object (e.g. schemas, tables, views) names are case sensitive, can be up to 1024 characters long and can contain letters (upper or lower case), numbers, and underscores.

Some exceptions are to this are table names which can contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space) and column names which can only be up to 128 characters long and must start with a letter or underscore.

Text data (e.g. campaign names, product descriptions, comments) that contains special characters such as double quote ("), comma (,) and new line (\n) can cause problems during data load. In such cases, you'll commonly see error messages about too many or too few columns. You can try:

  • Changing your Data Set delimiter to something that's less likely to be in your data e.g. # (pound) or ~ (tilde)

  • Removing problematic text data column(s) from your Data Set

  • (Only applies to some Data Sources that export data to file): Changing Data Source configuration to unenclose or enclose column(s) in double quotes ("...") and double-escape special characters (e.g. " becomes "").

Only UTF-8 and ISO-8859-1 data encodings are supported by BigQuery at this time. Also, Gravity (currently) does not handle translation from unsupported data encodings (e.g. UTF-16, UTF-16LE) to one of the supported types. That means you are responsible for doing that before your Data Sets load the data.

The number of data types BigQuery supports is limited, especially complex types. Generally speaking, the types and formats are simpler. Some key points are:

  • For numeric (e.g. FLOAT, DECIMAL) types, you don't need to specify scale and precision.

  • DATE, TIME, DATETIME and TIMESTAMP support ISO formats only. Also, you will experience precision loss if your data has times and timestamps that go beyond 6 fractional digits (millisecond precision).

SQL

BigQuery has two SQL dialects, Standard and Legacy. The preferred is Standard as it also complies with the SQL 2011 standard. The Legacy dialect (previously known as BigQuery SQL) is an earlier, non-standard SQL language. It is still supported, however, Gravity is designed to work with the newer dialect, Standard.

Multi-statement transactions are not supported by BigQuery at this time. See https://cloud.google.com/bigquery/docs/reference/standard-sql/data-manipulation-language#limitations for more details.

Last updated