BigQuery 102-Practical Use Cases

102.1 — I want my table back

To retrieve a table at a timestamp, use SYSTEM_TIME:

102.2 I need to control access to the tables I own.

When you give a service account dataset level, users have access to all

Click Table, Share Table instead of Share dataset
Use SELECT to filter columns and use WHERE to filter rows.
Save result: Set a destination table for query results.
Downside: This derived table is a PoT snapshot table, so there is a lag.
After you created the view, if the user does not have access to the original table, they cannot retrieve data from the view. So you will have to grant the view to run the query on your behalf following the steps below:Go to original table 1. Click “Share” — Authorized Views — Column level permission (E.g. PII)— Click Through Schema, Set, Data Catalog — Policy Tags.2. Create tags like ‘high (e.g. SSN)’, ‘medium(e.g. Customer Name)’3. Go back to table, select schema, add policy tag to the table.Note: IAM like BigQuery Admin will be overriden by Policy Tags.To make that tag’s data avaialble to a user:— Data Catalog — Add members to a tag with Fine-Grained Reader role.

102.3 Querying is slow or generates a lot of costs.

Usually, there is a trade-off between a normalized table and a de-normalized table. Normalized tables use the Snowflake/Star schema, which requires a lot of joins thus increasing the cost of querying. De-normalized tables (Storing as many fields as possible in one table)require fewer joins and usually generate less cost in BigQuery.

To create a column with nested data, set the data type of the column to RECORD in the schema. A RECORD can be accessed as a STRUCT type in standard SQL. A STRUCT is a container of ordered fields.To create a column with repeated data, set the mode of the column to REPEATED in the schema. A repeated field can be accessed as an ARRAY type in standard SQL.A RECORD column can have REPEATED mode, which is represented as an array of STRUCT types. Also, a field within a record can be repeated, which is represented as a STRUCT that contains an ARRAY. An array cannot contain another array directly. For more information, see Declaring an ARRAY type.

References:

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Sean Zhang

Sean Zhang

Data Science | Machine Learning| Data Engineer