8/23/2020 BigQuery audit logs overview | Google Cloud
BigQuery audit logs overview
Overview
Cloud Audit Logs are a collection of logs provided by Google Cloud that provide insight into operational concerns related to your use of Google Cloud services. This page provides details about BigQuery-specific log information, and it demonstrates how to use BigQuery to analyze logged activity.
Versions
The audit log message system relies on structured logs, and the BigQuery service provides three distinct kinds of messages:
AuditData (#auditdata_examples): The old version of logs, which reports API invocations.
BigQueryAuditMetadata (#bigqueryauditmetadata_examples): The new version of logs, which reports resource interactions such as which tables were read from and written to by a given query job and which tables expired due to having an expiration time configured.
In general, users will want to leverage the new BigQueryAuditMetadata logs functionality.
AuditLog (#auditlog_examples): The logs that BigQuery Reservations (/bigquery/docs/reservations-intro) and BigQuery Connections (/bigquery/docs/reference/bigqueryconnection/rest) use when reporting requests.
Message Formats
AuditData format
The AuditData (/bigquery/docs/reference/auditlogs/rest/Shared.Types/AuditData) messages are carried in the protoPayload.serviceData submessage of the Cloud Logging LogEntry (/logging/docs/reference/v2/rest/v2/LogEntry) message.
https://cloud.google.com/bigquery/docs/reference/auditlogs 1/11 8/23/2020 BigQuery audit logs overview | Google Cloud
BigQueryAuditMetadata format
You can find BigQueryAuditMetadata (/bigquery/docs/reference/auditlogs/rest/Shared.Types/BigQueryAuditMetadata) details in the protoPayload.metadata submessage of the Cloud Logging LogEntry (/logging/docs/reference/v2/rest/v2/LogEntry) message.
In these log entries, the protoPayload.serviceData information is not set or used. BigQueryAuditMetadata messages carry more information:
resource.type is set to one of the following values:
bigquery_project for jobs
resource.labels.location contains the location of the job.
bigquery_dataset for storage
resource.labels.dataset_id contains the encapsulating dataset.
protoPayload.methodName is set to one of the following values:
google.cloud.bigquery.v2.TableService.InsertTable
google.cloud.bigquery.v2.TableService.UpdateTable
google.cloud.bigquery.v2.TableService.PatchTable
google.cloud.bigquery.v2.TableService.DeleteTable
google.cloud.bigquery.v2.DatasetService.InsertDataset
google.cloud.bigquery.v2.DatasetService.UpdateDataset
google.cloud.bigquery.v2.DatasetService.PatchDataset
google.cloud.bigquery.v2.DatasetService.DeleteDataset
google.cloud.bigquery.v2.TableDataService.List
google.cloud.bigquery.v2.JobService.InsertJob
google.cloud.bigquery.v2.JobService.Query
google.cloud.bigquery.v2.JobService.GetQueryResults
InternalTableExpired
protoPayload.resourceName now contains the URI for the referenced resource. For example, a table created by using an insert job reports the resource URI of the table. The earlier format reported the API resource (the job identifier).
protoPayload.authorizationInfo only includes information relevant to the specific event. With earlier AuditData messages, you could merge multiple records when source and destination tables were in the same dataset in a query job.
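As a rough illustration of how the formats differ, the following Python sketch labels a log entry by where its payload lives. The function name and the sample entries are hypothetical; only the field paths and the "@type" value (the one used in the export filter on this page) come from the documentation, and the fallback to AuditLog is an assumption.

```python
# Hypothetical helper: label a Cloud Logging entry (already parsed into a dict)
# by the BigQuery audit format it appears to carry.
METADATA_TYPE = "type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"


def audit_format(entry: dict) -> str:
    payload = entry.get("protoPayload", {})
    metadata = payload.get("metadata") or {}
    if metadata.get("@type") == METADATA_TYPE:
        # Newer format: details live in protoPayload.metadata.
        return "BigQueryAuditMetadata"
    if payload.get("serviceData"):
        # Older format: details live in protoPayload.serviceData.
        return "AuditData"
    # Assumption: anything else is treated as a plain AuditLog entry.
    return "AuditLog"


# Made-up sample entries, trimmed to the fields the helper inspects.
new_style = {"protoPayload": {"metadata": {"@type": METADATA_TYPE, "tableDataRead": {}}}}
old_style = {"protoPayload": {"serviceData": {"jobCompletedEvent": {}}}}
```

A check like this can be useful when a single sink receives both older and newer messages and downstream code needs to pick the right parser.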
AuditLog format
BigQuery Reservations (/bigquery/docs/reservations-intro) uses the AuditLog (/logging/docs/reference/audit/auditlog/rest/Shared.Types/AuditLog) format when reporting requests. Logs contain information such as:
resource.type is set to:
bigquery_project for jobs
resource.labels.location contains the location of the reservation-related resource.
protoPayload.methodName is set to one of the following values:
google.cloud.bigquery.reservation.v1beta1.ReservationService.CreateReservation
google.cloud.bigquery.reservation.v1beta1.ReservationService.DeleteReservation
google.cloud.bigquery.reservation.v1beta1.ReservationService.UpdateReservation
google.cloud.bigquery.reservation.v1beta1.ReservationService.CreateCapacityCommitment
google.cloud.bigquery.reservation.v1beta1.ReservationService.DeleteCapacityCommitment
google.cloud.bigquery.reservation.v1beta1.ReservationService.CreateAssignment
google.cloud.bigquery.reservation.v1beta1.ReservationService.DeleteAssignment
google.cloud.bigquery.reservation.v1beta1.ReservationService.MoveAssignment
BigQuery Connections (/bigquery/docs/reference/bigqueryconnection/rest) uses the AuditLog (/logging/docs/reference/audit/auditlog/rest/Shared.Types/AuditLog) format when reporting requests. Logs contain information such as:
resource.type is set to:
audited_resource
resource.labels.method contains the full method name.
resource.labels.project_id contains the project name.
resource.service contains the service name.
protoPayload.methodName is set to one of the following values:
google.cloud.bigquery.connection.v1.ConnectionService.CreateConnection
google.cloud.bigquery.connection.v1.ConnectionService.DeleteConnection
google.cloud.bigquery.connection.v1.ConnectionService.UpdateConnection
google.cloud.bigquery.connection.v1.ConnectionService.SetIamPolicy
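The method names listed in this section are fully qualified, which makes them awkward to group by service. A minimal sketch, assuming a hypothetical helper named split_method, separates the service prefix from the short method name:

```python
from typing import Tuple


def split_method(method_name: str) -> Tuple[str, str]:
    """Split a protoPayload.methodName value into (service, method).

    Names without a dot, such as InternalTableExpired, yield an empty service.
    """
    service, _, method = method_name.rpartition(".")
    return service, method
```

For example, split_method applied to a ConnectionService method name returns the service path and the short name CreateConnection, which is convenient as a grouping key.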
Mapping audit entries to log streams
Audit logs are organized into the following three streams. For more information about the streams, see the Cloud Audit Logs (/logging/docs/audit) documentation.
Data access
System event
Admin activity
Data access (data_access)
The data_access stream contains entries about jobs, reported through the JobInsertion and JobChange events, and about table data modifications, reported through the TableDataChange and TableDataRead events.
For example, when a load job appends data to a table, the data_access stream adds a TableDataChange event. A TableDataRead event indicates when a consumer reads a table.
Note: BigQuery does not emit data access log entries in the following scenarios:
Data appended to a table by using the streaming insert mechanism does not generate TableDataChange log entries.
Recursive dataset deletions, such as removing a dataset and its contents in a single API call, do not yield deletion entries for each resource contained in the dataset. The dataset removal is present in the activity log.
Partitioned tables do not generate TableDataChange entries for partition expirations.
Wildcard table access generates a single TableDataRead entry and doesn't write a separate entry for each queried table.
System event (system_event)
You can set an expiration time on tables to remove them at a specified time. The system_event stream reports a TableDeletion event when the table expires and is removed.
Admin activity (activity)
The main activity stream reports all remaining activities and events such as table and dataset creation.
Creating, deleting, and updating resources related to BigQuery Reservations (/bigquery/docs/reservations-concepts) are reported in the admin activity stream.
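The mapping described in this section can be summarized as a small lookup. The sketch below is a simplification under stated assumptions: the function name and the caused_by_expiration flag are hypothetical, and it ignores the exceptions listed in the note above (for example, streaming inserts that emit no entry at all).

```python
# Events that this page places in the data_access stream.
DATA_ACCESS_EVENTS = {"JobInsertion", "JobChange", "TableDataChange", "TableDataRead"}


def stream_for_event(event: str, caused_by_expiration: bool = False) -> str:
    """Return the stream name an event is expected to land in."""
    if event in DATA_ACCESS_EVENTS:
        return "data_access"
    if event == "TableDeletion" and caused_by_expiration:
        # Expiration-driven deletions are reported as system events.
        return "system_event"
    # Everything else, such as table and dataset creation, goes to activity.
    return "activity"
```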
Visibility and access control
BigQuery audit logs can include information that users might consider sensitive, such as SQL text, schema definitions, and identifiers for resources such as tables and datasets. For information about managing access to this information, see the Cloud Logging access control documentation (/logging/docs/access-control).
Caller identities and resource names
Caller identities and IP addresses are redacted from the audit logs if all of the following conditions are true:
This is a read-only access.
The resource is public.
The identity is not a service account that belongs to the project.
The identity does not belong to the same customer as the project.
For cross-project access, there are additional rules that apply:
The billing project must be the project that sends the request, and the data project must be the project whose resources are also accessed during the job. For example, a query job in a billing project reads some table data from the data project.
The billing project resource ID is redacted from the data project log unless the projects have the same domain associated with them or are in the same organization.
Identities and caller IP addresses are not redacted from the data project log if either one of the preceding conditions applies or the billing project and the data project are in the same organization and the billing project already includes the identity and caller IP address.
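The first rule in this section (redaction only when all four conditions hold) is a plain conjunction. The sketch below encodes just that rule; the function and parameter names are hypothetical, and the cross-project rules are deliberately left out.

```python
def caller_is_redacted(
    read_only: bool,
    resource_is_public: bool,
    is_project_service_account: bool,
    same_customer_as_project: bool,
) -> bool:
    """Return True when caller identity and IP address would be redacted.

    Redaction requires every condition at once: a read-only access, a public
    resource, and a caller that is neither a service account of the project
    nor part of the same customer as the project.
    """
    return bool(
        read_only
        and resource_is_public
        and not is_project_service_account
        and not same_customer_as_project
    )
```

Failing any single condition is enough for the identity to appear in the log.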
Cloud Logging exports
BigQuery automatically sends audit logs to Cloud Logging. Cloud Logging lets users filter and export messages to other services (/logging/docs/export), including Pub/Sub, Cloud Storage, and BigQuery.
With long term log retention and log exports to BigQuery, you can do aggregated analysis on logs data. Cloud Logging documents how messages are transformed (/logging/docs/export/bigquery) when exported to BigQuery.
Filtering exports
To filter relevant BigQuery Audit messages, you can express filters as part of the export.
For example, the following advanced filter represents an export that only includes the newer BigQueryAuditMetadata format:
protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"
You can express additional filters based on the fields within the log messages. For more information about crafting advanced filters, see the advanced log filter documentation (/logging/docs/view/advanced-filters).
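When filters are generated by a script rather than typed by hand, quoting the "@type" field is the part most easily gotten wrong. The following sketch builds the filter string shown above and optionally narrows it by method name. The helper name is hypothetical, and combining clauses with AND on protoPayload.methodName is an illustration rather than a filter taken from this page.

```python
from typing import Optional

METADATA_TYPE = "type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"


def metadata_filter(method_name: Optional[str] = None) -> str:
    """Build a filter that matches BigQueryAuditMetadata entries."""
    # The field name @type must itself be double-quoted inside the filter.
    clauses = ['protoPayload.metadata."@type"="{}"'.format(METADATA_TYPE)]
    if method_name is not None:
        clauses.append('protoPayload.methodName="{}"'.format(method_name))
    return " AND ".join(clauses)
```

Calling metadata_filter("InternalTableExpired") would, for instance, restrict the export to table expiration events.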
Defining a BigQuery log sink using gcloud
The following example command line shows how you can use the gcloud command-line tool to create a logging sink (/sdk/gcloud/reference/logging/sinks/create) in a dataset named auditlog_dataset that only includes BigQueryAuditMetadata messages:
gcloud logging sinks create my-example-sink bigquery.googleapis.com/projects/my-project-i…
  --log-filter='protoPayload.metadata."@type"="type.googleapis.com/google.cloud.audit.BigQueryAuditMetadata"'
Querying exported logs
AuditData examples
The following examples show how you can use AuditData messages to analyze BigQuery usage. AuditData fields are present in the protopayload_auditlog.servicedata_v1_bigquery record in the schema.
Note: Change the FROM clause, MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD, to the dataset and table date you've configured in the Cloud Logging export.
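If the FROM clause is filled in by a script, the table name can be assembled from its parts. This is a minimal sketch that follows the naming pattern in the note above; the helper name and its arguments are hypothetical, and it only covers tables that carry a YYYYMMDD suffix.

```python
from datetime import date


def export_table(project: str, dataset: str, stream: str, day: date) -> str:
    """Return the fully qualified name of a date-suffixed export table."""
    # stream is one of the stream names, for example data_access.
    return "{}.{}.cloudaudit_googleapis_com_{}_{}".format(
        project, dataset, stream, day.strftime("%Y%m%d")
    )
```

The result still needs to be wrapped in backticks when it is placed in a query.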
Example: Query cost breakdown by identity
This query shows estimated query costs by user identity. It estimates costs based on the list price for on-demand queries in the US. This pricing might not be accurate for other locations or for customers who are leveraging flat-rate billing.
#standardSQL
WITH data as
(
  SELECT
    protopayload_auditlog.authenticationInfo.principalEmail as principalEmail,
    protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent AS jobCompletedEvent
  FROM
    `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
)
SELECT
  principalEmail,
  FORMAT('%9.2f',5.0 * (SUM(jobCompletedEvent.job.jobStatistics.totalBilledBytes)/POWER…
FROM
  data
WHERE
  jobCompletedEvent.eventName = 'query_job_completed'
GROUP BY principalEmail
ORDER BY Estimated_USD_Cost DESC
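The arithmetic behind the cost column can be checked outside of SQL. The sketch below mirrors the 5.0 factor and the '%9.2f' format from the query above. The bytes-per-TiB divisor of 2**40 is an assumption, since the full expression is not visible here, and the function name is hypothetical.

```python
USD_PER_TIB = 5.0        # the 5.0 factor used in the query above
BYTES_PER_TIB = 2 ** 40  # assumption: billed bytes are priced per tebibyte


def estimated_usd_cost(total_billed_bytes: int) -> float:
    """Estimate on-demand query cost in USD from total billed bytes."""
    return USD_PER_TIB * total_billed_bytes / BYTES_PER_TIB


def formatted_cost(total_billed_bytes: int) -> str:
    """Format the estimate the way FORMAT('%9.2f', ...) would."""
    return "%9.2f" % estimated_usd_cost(total_billed_bytes)
```

Under these assumptions, one tebibyte of billed bytes corresponds to an estimate of 5.00 USD, and half a tebibyte to 2.50 USD.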
Example: Hourly cost breakdown
This query shows estimated query costs by hour.
#standardSQL
SELECT
  TIMESTAMP_TRUNC(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.j…
  FORMAT('%9.2f',5.0 * (SUM(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedE…
FROM
  `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_YYYYMMDD`
WHERE
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
GROUP BY time_window
ORDER BY time_window DESC
BigQueryAuditMetadata examples
The following examples show how you can use BigQueryAuditMetadata messages to analyze BigQuery usage. Because of the schema conversion done during the export from Cloud Logging into BigQuery, the message bodies are presented in semi-structured form. The protopayload_auditlog.metadataJson is a STRING field, and it contains the JSON representation of the message. You can leverage JSON functions (/bigquery/docs/reference/standard-sql/json_functions) in standard SQL to analyze this content.
Note: Change the FROM clause in each of these examples to the corresponding exported tables in your project.
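Because metadataJson is a string, rows fetched into a client program need the same treatment as in SQL: parse first, then look for the event keys. The following sketch roughly mirrors a JSON_EXTRACT(...) IS NOT NULL test for the two table data events. The function name and the sample string are made up for illustration.

```python
import json
from typing import List

EVENT_KEYS = ("tableDataRead", "tableDataChange")


def table_data_events(metadata_json: str) -> List[str]:
    """Return which table data events are present in a metadataJson string."""
    metadata = json.loads(metadata_json)
    return [key for key in EVENT_KEYS if key in metadata]


# Made-up sample value, trimmed to a single event key.
sample = '{"tableDataRead": {}}'
```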
Example: Report expired tables
BigQueryAuditMetadata messages log when a table is deleted because its expiration time was reached. The following sample query shows when these messages occur and includes a URI that references the table resource that was removed.
#standardSQL
SELECT
  protopayload_auditlog.resourceName AS resourceName,
  receiveTimestamp as logTime
FROM `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_system_event_201901*`
WHERE
  protopayload_auditlog.methodName = 'InternalTableExpired'
ORDER BY resourceName
Example: Most popular datasets
This query shows coarse, per-dataset statistics about table reads and table modifications.
#standardSQL
SELECT
  REGEXP_EXTRACT(protopayload_auditlog.resourceName, '^projects/[^/]+/datasets/([^/]+)/…
  COUNT(DISTINCT REGEXP_EXTRACT(protopayload_auditlog.resourceName, '^projects/[^/]+/da…
  COUNTIF(JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataRead") IS NOT NU…
  COUNTIF(JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataChange") IS NOT …
FROM `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_data_access_2019*`
WHERE
  JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataRead") IS NOT NULL
  OR JSON_EXTRACT(protopayload_auditlog.metadataJson, "$.tableDataChange") IS NOT NULL
GROUP BY datasetRef
ORDER BY datasetRef
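The dataset grouping key in the query above comes from a regular expression over resourceName. The sketch below applies the visible part of that pattern in Python, which can help when testing it against sample values before running the query. The helper name and the sample resource names are hypothetical.

```python
import re
from typing import Optional

# The visible portion of the pattern used in the query above.
DATASET_RE = re.compile(r"^projects/[^/]+/datasets/([^/]+)/")


def dataset_of(resource_name: str) -> Optional[str]:
    """Return the dataset ID embedded in a resource name, or None."""
    match = DATASET_RE.match(resource_name)
    return match.group(1) if match else None
```

Resource names that do not point into a dataset, such as job resources, yield None and would fall outside the per-dataset statistics.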
AuditLog examples
The following examples use AuditLog (/logging/docs/reference/audit/auditlog/rest/Shared.Types/AuditLog) messages to analyze BigQuery Reservations (/bigquery/docs/reservations-intro) usage.
Example: Find users who purchased slots
This query shows the email address of the users who purchased slots.
#standardSQL
SELECT
  protopayload_auditlog.requestMetadata.requestAttributes.time request_time,
  protopayload_auditlog.methodName,
  protopayload_auditlog.authenticationInfo.principalEmail,
  JSON_EXTRACT(protopayload_auditlog.requestJson , "$.capacityCommitment.slotCount") sl…
FROM
  `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_activity`
WHERE
  protopayload_auditlog.methodName like "%CreateCapacityCommitment%"
ORDER by request_time
Example: History of a project assignment
This query shows the history of a project's reservation assignments.
#standardSQL
SELECT
  protopayload_auditlog.requestMetadata.requestAttributes.time request_time,
  protopayload_auditlog.methodName,
  protopayload_auditlog.authenticationInfo.principalEmail,
  JSON_EXTRACT(protopayload_auditlog.requestJson , "$.assignment.assignee") assignee,
  JSON_EXTRACT(protopayload_auditlog.requestJson , "$.assignment.jobType") job_type,
FROM
  `MYPROJECTID.MYDATASETID.cloudaudit_googleapis_com_activity`
WHERE
  protopayload_auditlog.methodName like "%Assignment%"
  -- A SELECT alias cannot be referenced in WHERE, so the expression is repeated.
  AND JSON_EXTRACT(protopayload_auditlog.requestJson , "$.assignment.assignee") like "%OTHERPROJECTID%"
ORDER by request_time
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), and code samples are licensed under the Apache 2.0 License (https://www.apache.org/licenses/LICENSE-2.0). For details, see the Google Developers Site Policies (https://developers.google.com/site-policies). Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2020-06-26 UTC.