Paging Through Table Data | BigQuery | Google Cloud
8/23/2020 Paging through table data | BigQuery | Google Cloud
https://cloud.google.com/bigquery/docs/paging-results/

Paging through table data

This document describes how to page through table data and query results using the BigQuery REST API.

Paging through results using the API

All *collection*.list methods return paginated results under certain circumstances. The number of results per page is controlled by the maxResults property.

| Method | Pagination criteria | Default maxResults value | Maximum maxResults value | Maximum maxFieldValues value |
| --- | --- | --- | --- | --- |
| Tabledata.list | Returns paginated results if the response size is more than 10 MB¹ of data or more than maxResults rows. | 100,000 | Unlimited | Unlimited |
| All other *collection*.list methods | Returns paginated results if the response is more than maxResults rows and also below the maximum limits. | 10,000 | Unlimited | 300,000 |

If the result is larger than the byte or field limit, the result is trimmed to fit the limit. If one row is greater than the byte or field limit, tabledata.list can return up to 100 MB of data¹, which is consistent with the maximum row size limit for query results.

¹ The row size is approximate, as the size is based on the internal representation of row data. The maximum row size limit is enforced during certain stages of query job execution.

jobs.getQueryResults can return 20 MB of data unless a higher limit is explicitly requested through support.

A page is a subset of the total number of rows. If your results are more than one page of data, the result data will have a pageToken property. To retrieve the next page of results, make another list call and include the token value as a URL parameter named pageToken.

The tabledata.list (/bigquery/docs/reference/rest/v2/tabledata/list) method, which is used to page through table data, uses a row offset value or a page token. See Browsing table data (/bigquery/docs/managing-table-data#browse-table) for information.
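The pageToken loop described above can be sketched in Python as follows. This is a minimal illustration rather than BigQuery client code: `make_fake_list` is a hypothetical in-memory stand-in for a *collection*.list call (its token is simply a row offset, which is an assumption of this sketch), and `iterate_rows` keeps requesting pages until a response no longer carries a pageToken.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Page = Dict[str, Any]
ListCall = Callable[[Optional[str]], Page]


def iterate_rows(list_page: ListCall) -> Iterator[Any]:
    """Yield every row by following pageToken until the response omits it."""
    token: Optional[str] = None
    while True:
        page = list_page(token)
        yield from page.get("rows", [])
        token = page.get("pageToken")
        if not token:
            break  # no pageToken means this was the last page


def make_fake_list(rows: List[Any], max_results: int) -> ListCall:
    """Hypothetical stand-in for a list call; the token here is a row offset."""

    def list_page(page_token: Optional[str] = None) -> Page:
        start = int(page_token) if page_token else 0
        end = start + max_results
        page: Page = {"rows": rows[start:end]}
        if end < len(rows):
            # More rows remain, so hand back a token for the next call.
            page["pageToken"] = str(end)
        return page

    return list_page


if __name__ == "__main__":
    fake_list = make_fake_list(list(range(7)), max_results=3)
    for row in iterate_rows(fake_list):
        print(row)
```

In a real client, `list_page` would wrap the HTTP request and pass the token as the pageToken URL parameter; the loop itself would stay the same.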
The following samples demonstrate paging through BigQuery table data.

C#

Before trying this sample, follow the C# setup instructions in the BigQuery Quickstart Using Client Libraries (/bigquery/docs/quickstarts/quickstart-client-libraries). For more information, see the BigQuery C# API reference documentation (https://googleapis.github.io/google-cloud-dotnet/docs/Google.Cloud.BigQuery.V2/).

s-samples/blob/468bcdc02567a8e337bfa43db3fd523298265225/bigquery/api/Snippets/BrowseTable.cs)

```csharp
using Google.Api.Gax;
using Google.Apis.Bigquery.v2.Data;
using Google.Cloud.BigQuery.V2;
using System;
using System.Collections.Generic;
using System.Linq;

public class BigQueryBrowseTable
{
    public void BrowseTable(
        string projectId = "your-project-id"
    )
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        TableReference tableReference = new TableReference()
        {
            TableId = "shakespeare",
            DatasetId = "samples",
            ProjectId = "bigquery-public-data"
        };
        // Load all rows from a table
        PagedEnumerable<TableDataList, BigQueryRow> result = client.ListRows(
            tableReference: tableReference,
            schema: null
        );
        // Print the first 10 rows
        foreach (BigQueryRow row in result.Take(10))
        {
            Console.WriteLine($"{row["corpus"]}: {row["word_count"]}");
        }
    }
}
```

Requesting arbitrary pages and avoiding redundant list calls

When you page backwards or jump to arbitrary pages using cached pageToken values, the data in your pages might have changed since it was last viewed, and nothing clearly indicates that it has. To mitigate this, you can use the ETag property.

Every collection.list method (except for Tabledata) returns an ETag property in the result. This property is a hash of the page results that can be used to verify whether the page has changed since the last request.
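As a rough illustration of how a stored ETag can reveal whether a cached page is stale, the Python sketch below simulates a list endpoint in memory. Everything named here is hypothetical: `FakeListServer`, `CachedList`, and the SHA-256 hash are assumptions made for the example, and how BigQuery actually computes its ETags is not specified in this document. The client sends its last ETag with each call (the role the standard HTTP If-None-Match header plays) and replaces its cached copy only when the server reports a change.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional, Tuple


class FakeListServer:
    """Hypothetical in-memory stand-in for a list endpoint that returns ETags."""

    def __init__(self, items: List[Any]) -> None:
        self.items = items

    def _etag(self) -> str:
        # Assumption: any stable hash of the page contents works for this sketch.
        payload = json.dumps(self.items, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def list(self, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, Any]]:
        etag = self._etag()
        if if_none_match == etag:
            return 304, {}  # unchanged: status only, no values
        return 200, {"etag": etag, "items": list(self.items)}


class CachedList:
    """Keeps the last page and its ETag; refreshes only when the ETag differs."""

    def __init__(self, server: FakeListServer) -> None:
        self.server = server
        self.etag: Optional[str] = None
        self.items: List[Any] = []
        self.refreshes = 0

    def get(self) -> List[Any]:
        status, body = self.server.list(if_none_match=self.etag)
        if status == 200:
            self.etag = body["etag"]
            self.items = body["items"]
            self.refreshes += 1
        return self.items


if __name__ == "__main__":
    server = FakeListServer(["table_a", "table_b"])
    cache = CachedList(server)
    print(cache.get())            # first call always fetches
    print(cache.get())            # served from the cached copy
    server.items.append("table_c")
    print(cache.get())            # ETag changed, so the page is fetched again
    print("refreshes:", cache.refreshes)
```

Against a real API, the same pattern would replace `FakeListServer.list` with an HTTP request that carries the stored ETag in a conditional header.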
When you make a request to BigQuery with an ETag value, BigQuery compares the ETag value to the ETag value returned by the API and responds based on whether the ETag values match. You can use ETags to help avoid redundant list calls in the following ways:

- If you only want to return list values if the values have changed: make a list call with a previously stored ETag using the HTTP "if-none-match" header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26). If the ETag you provide doesn't match the ETag on the server, BigQuery returns a page of new list values. If the ETags do match, BigQuery returns an HTTP 304 "Not Modified" result and no values. An example of this might be a webpage where users periodically fill in information that is stored in BigQuery. By using the if-none-match header with ETags, you can avoid making redundant list calls to BigQuery when your data has not changed.

- If you only want to return list values if the values have not changed: use the HTTP "if-match" header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24). BigQuery matches the ETag values and returns the page of results if the results have not changed, or returns a 412 "Precondition Failed" result if the page has changed.

Although ETags are a great way to avoid making redundant list calls, you can apply the same methods to identify whether any objects have changed. For example, you can perform a Get request for a specific table and use ETags to determine whether the table has changed before returning the full response.

Paging through query results

Each query writes to a destination table.
If no destination table is provided, the BigQuery API automatically populates the destination table property with a reference to a temporary anonymous table (/bigquery/docs/writing-results#temporary_and_permanent_tables).

API

Read the jobs.config.query.destinationTable (/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.destination_table) field to determine the table that query results have been written to. Call the tabledata.list (/bigquery/docs/reference/rest/v2/tabledata/list) method to read the query results.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), and code samples are licensed under the Apache 2.0 License (https://www.apache.org/licenses/LICENSE-2.0). For details, see the Google Developers Site Policies (https://developers.google.com/site-policies). Java is a registered trademark of Oracle and/or its affiliates. Last updated 2020-08-10 UTC.