Design patterns for high volume address validation on Google Cloud Platform
Objective
The High Volume Address Validation tutorial guided you through different scenarios where high volume address validation can be used. In this tutorial, we introduce different design patterns within Google Cloud Platform for running High Volume Address Validation.
We start with an overview of running High Volume Address Validation on Google Cloud Platform with Cloud Run, Compute Engine, or Google Kubernetes Engine for one-time executions. We then see how this capability can be included as part of a data pipeline.
By the end of this article, you should have a good understanding of the different options for running Address Validation at high volume in your Google Cloud environment.
Try now: Visit the Cloud console and enable the Address Validation API.
Reference architecture on Google Cloud Platform

This section dives deeper into different design patterns for High Volume Address Validation using Google Cloud Platform. By running on Google Cloud Platform, you can integrate with your existing processes and data pipelines.
Running High Volume Address Validation one time on Google Cloud Platform

Shown below is a reference architecture of how to build an integration on Google Cloud Platform, which is more suitable for one-off operations or testing.

In this case, we recommend uploading the CSV file to a Cloud Storage bucket. The High Volume Address Validation script can then be run from a Cloud Run environment.
However, you can execute it in any other runtime environment, such as Compute Engine or Google Kubernetes Engine.
The output CSV file can also be uploaded to the Cloud Storage bucket.
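As a rough illustration, here is a minimal sketch of such a script, assuming the input CSV has a single address column and calling the Address Validation API REST endpoint directly. The bucket, file, and helper names are placeholders, not part of the original article; production code would also need batching, retries, and quota handling.

```python
# Minimal sketch: read addresses from Cloud Storage, validate each one with the
# Address Validation API, and write the results back to Cloud Storage.
# Assumptions: input CSV has one address per row; bucket/blob names and the
# API key environment variable are placeholders.
import csv
import io
import os

import requests
from google.cloud import storage

API_KEY = os.environ["MAPS_API_KEY"]  # assumed to be set in the environment
ENDPOINT = f"https://addressvalidation.googleapis.com/v1:validateAddress?key={API_KEY}"

def validate_address(address_line: str, region_code: str = "US") -> str:
    """Calls the Address Validation API and returns the verdict granularity."""
    body = {"address": {"addressLines": [address_line], "regionCode": region_code}}
    response = requests.post(ENDPOINT, json=body, timeout=30)
    response.raise_for_status()
    verdict = response.json().get("result", {}).get("verdict", {})
    return verdict.get("validationGranularity", "UNKNOWN")

def run(bucket_name: str, input_blob: str, output_blob: str) -> None:
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    rows = csv.reader(io.StringIO(bucket.blob(input_blob).download_as_text()))
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["address", "validation_granularity"])
    for (address,) in rows:
        writer.writerow([address, validate_address(address)])
    bucket.blob(output_blob).upload_from_string(out.getvalue())

if __name__ == "__main__":
    run("my-address-bucket", "input/addresses.csv", "output/validated.csv")
```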
Running as a Google Cloud Platform data pipeline

The deployment pattern shown in the previous section is great for quickly testing High Volume Address Validation for one-time usage. However, if you need to use it regularly as part of a data pipeline, you can better leverage Google Cloud Platform native capabilities to make it more robust. Some of the changes you can make include:
- In this case, you can dump CSV files in Cloud Storage buckets.
- A Dataflow job can pick up the addresses to be processed and then cache the results in BigQuery.
- The Dataflow Python library can be extended with logic for High Volume Address Validation to validate the addresses from the Dataflow job, as sketched below.
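For illustration, a minimal Apache Beam sketch of such a Dataflow job is shown below, reusing the hypothetical validate_address helper from the earlier sketch. The project, bucket, and table names are placeholders, and error handling is omitted.

```python
# Minimal Apache Beam sketch of a Dataflow job that validates addresses read
# from Cloud Storage and caches the verdicts in BigQuery.
# Assumptions: validate_address() is the helper sketched earlier; project,
# bucket, and table names are placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

class ValidateAddressFn(beam.DoFn):
    def process(self, address_line: str):
        # One API call per element; a real pipeline would batch and rate-limit.
        yield {
            "address": address_line,
            "validation_granularity": validate_address(address_line),
        }

def run():
    options = PipelineOptions(runner="DataflowRunner", project="my-project",
                              region="us-central1",
                              temp_location="gs://my-address-bucket/tmp")
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadAddresses" >> beam.io.ReadFromText("gs://my-address-bucket/input/addresses.csv")
            | "Validate" >> beam.ParDo(ValidateAddressFn())
            | "CacheInBigQuery" >> beam.io.WriteToBigQuery(
                "my-project:address_dataset.validated_addresses",
                schema="address:STRING,validation_granularity:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```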
Running the script from a data pipeline as a long-lasting recurring process
Another common approach is to validate a batch of addresses as part of a streaming data pipeline, as a recurring process. You may also have the addresses in a BigQuery datastore. In this approach, we will see how to build out a recurring data pipeline, which needs to be triggered daily, weekly, or monthly:
- Upload the initial CSV file to a Cloud Storage bucket.
- Use Memorystore as a persistent datastore to maintain intermediate state for the long-running process (see the sketch after this list).
- Cache the final addresses in a BigQuery datastore.
- Set up Cloud Scheduler to run the script periodically.
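A minimal sketch of the stateful step is shown below, assuming a Memorystore for Redis instance reachable at a placeholder host. It uses a Redis set membership check so that a rerun of the recurring job skips addresses that were already validated; the helper name is the hypothetical one from the earlier sketch.

```python
# Minimal sketch: use Memorystore for Redis as intermediate state so that a
# recurring job only validates addresses it has not seen before.
# Assumptions: REDIS_HOST is a placeholder for the Memorystore instance IP,
# and validate_address() is the helper sketched earlier.
import redis

REDIS_HOST = "10.0.0.3"  # placeholder: Memorystore for Redis instance IP
PROCESSED_SET = "validated_addresses"

def validate_batch(addresses: list[str]) -> list[dict]:
    cache = redis.Redis(host=REDIS_HOST, port=6379)
    results = []
    for address in addresses:
        # Skip addresses already validated in a previous run.
        if cache.sismember(PROCESSED_SET, address):
            continue
        results.append({
            "address": address,
            "validation_granularity": validate_address(address),
        })
        cache.sadd(PROCESSED_SET, address)
    return results
```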
This architecture has the following advantages:
- Using Cloud Scheduler, address validation can be done periodically. You might want to revalidate existing addresses on a monthly basis, or validate any new addresses on a monthly or quarterly basis. This architecture helps solve that use case.
- If customer data is in BigQuery, the validated addresses or the validation flags can be cached directly there (see the sketch after this list).
  Note: What can be cached, and how, is described in detail in the High Volume Address Validation article (https://developers.google.com/maps/architecture/high-volume-address-validation#caching_for_production_use).
- Using Memorystore provides higher resiliency and the ability to process more addresses. This step adds statefulness to the whole processing pipeline, which is needed for handling very large address datasets. Other database technologies, such as Cloud SQL (https://cloud.google.com/sql) or any other flavor of database that Google Cloud Platform offers, can be used here as well. However, we believe Memorystore best balances the scaling and simplicity needs, so it should be the first choice.
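As an illustration of caching the validation flags directly in BigQuery, here is a minimal sketch using the BigQuery Python client. The dataset, table, and column names are placeholders, not part of the original article.

```python
# Minimal sketch: cache validation verdicts next to customer data in BigQuery
# with a MERGE, so that periodic runs update the flags in place.
# Assumptions: dataset, table, and column names are placeholders.
from google.cloud import bigquery

def cache_verdicts(project: str, results: list[dict]) -> None:
    client = bigquery.Client(project=project)
    # Stage the fresh verdicts, then merge them into the customer table.
    staging_table = f"{project}.address_dataset.verdict_staging"
    client.load_table_from_json(results, staging_table).result()
    merge_sql = f"""
        MERGE `{project}.address_dataset.customers` AS c
        USING `{staging_table}` AS v
        ON c.address = v.address
        WHEN MATCHED THEN
          UPDATE SET c.validation_granularity = v.validation_granularity,
                     c.validated_at = CURRENT_TIMESTAMP()
        WHEN NOT MATCHED THEN
          INSERT (address, validation_granularity, validated_at)
          VALUES (v.address, v.validation_granularity, CURRENT_TIMESTAMP())
    """
    client.query(merge_sql).result()
```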
Conclusion
By applying the patterns described in this article, you can use the Address Validation API for different use cases on Google Cloud Platform.
We have written an open-source Python library to help you get started with the use cases described above. It can be invoked from the command line on your computer, or it can be invoked from Google Cloud Platform or other cloud providers.
Learn more about how to use the library in this article (https://developers.google.com/maps/architecture/high-volume-address-validation-library-oss).
Next Steps
Download the "Improve checkout, delivery, and operations with reliable addresses" whitepaper (https://mapsplatform.withgoogle.com/address-validation-whitepaper/home.html) and view the "Improving checkout, delivery, and operations with Address Validation" webinar (https://mapsonair.withgoogle.com/events/improving-checkout-operations-with-address-validation-webinar).
Suggested further reading:
- Address Validation API Documentation (https://developers.google.com/maps/documentation/address-validation)
- Geocoding and Address Validation (https://developers.google.com/maps/architecture/geocoding-address-validation)
- Explore the Address Validation demo (https://developers.google.com/maps/documentation/address-validation/demo)
Contributors
Google maintains this article. The following contributors originally wrote it.
Principal authors:
Henrik Valve | Solutions Engineer
Thomas Anglaret | Solutions Engineer
Sarthak Ganguly | Solutions Engineer
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-08-27 (UTC).