The AdWords API is no longer supported. Use the Google Ads API instead.

Rate Limits

To deliver reliable services to AdWords API users across the globe, we use a Token Bucket algorithm to meter the requests and determine the queries per second (QPS) rate. This is intended to prevent malicious or out-of-control software from overwhelming the AdWords API servers and affecting other users.

For example, if a runaway client accidentally spawns thousands of threads to make concurrent AdWords API calls, the AdWords API servers will notice and return a RateExceededError requesting that the calling software slow down.

It's important to remember that rate limits can fluctuate based on different variables, including server load. Hence, we don't recommend a fixed Queries Per Second (QPS) limit. It's extremely important to understand how to handle RateExceededError, and develop your software with rate limits in mind.

This guide will dive into more details to help you understand RateExceededError, and how to avoid exceeding the rate limit.


In older AdWords API versions, requests made in excess of the limit were queued on the API server until they could run, resulting in what seemed like very long execution times for some requests. The current API, instead of blocking the client for a long time, is fail-fast and will return a RateExceededError. We believe this is an important feedback mechanism that ensures you're aware of the issue and can adjust your applications accordingly.

Types of rate limits

We recognize that your AdWords API client application may sometimes exceed the limit and receive a RateExceededError due to factors not fully in your control. It's important to note that there is no penalty for doing so. RateExceededError is usually transient and automatically resolves after 30 seconds of inactivity.

There are a number of different types of rate limits that may be enforced by the server. A client application may exceed a rate limit within a manager account's developer token scope, or Google Ads account scope. Within each scope, rather than a strict QPS rate limit, rate limits are metered in terms of Requests Per Minute, Operations Per Minute, and/or other types of rate limits. This allows both steady and burst traffic to the AdWords API. Both the scope and the name of the rate limit are returned as part of RateExceededError.

Operational limit based on access level

There is only one type of rate limit that doesn't fluctuate: the operational limit based on the access level of your developer token. There are two access levels: Basic and Standard. A Basic access level account has limits of 10,000 operations per day and 1,000 report downloads per day. A newly approved developer token is assigned the Basic access level by default. If you plan to execute more than 10,000 operations or 1,000 report downloads per day, you can apply for Standard access level by completing the AdWords API Standard Access Application Form. There is no charge for either access level. Please check out the Rate Sheet to determine how operations are counted.

Other than the operational limit, all other rate limits may fluctuate. Thus, it's important to handle RateExceededError in your application.

Elements of RateExceededError

Let's take a look at RateExceededError in more detail--it contains three very important fields:

  • rateScope--The scope of the rate that was exceeded, which can be ACCOUNT or DEVELOPER.
  • rateName--Contains the name of the rate limit that was exceeded. The value can be, for example, RequestsPerMinute.
  • retryAfterSeconds--Contains the minimum number of seconds your application should wait before retrying the request. We recommend applying a random multiplier (e.g., a float value between 1 and 2, inclusive) to retryAfterSeconds when determining how many seconds to wait. If your programs send requests in parallel (e.g., multi-threading), ensure that they don't all send new requests at the same moment after waiting.
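
As a rough illustration of the random multiplier recommended for retryAfterSeconds, the sketch below computes a jittered wait time. RetryDelay and jitteredDelayMillis are hypothetical names, not part of the AdWords API client library, and the multiplier is drawn from [1, 2), which is marginally narrower than the inclusive range described above.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; not part of the AdWords API client library.
final class RetryDelay {
  private RetryDelay() {}

  /**
   * Returns how long to wait, in milliseconds, before retrying.
   * The multiplier is drawn uniformly from [1.0, 2.0), so the wait is
   * never shorter than retryAfterSeconds.
   */
  static long jitteredDelayMillis(int retryAfterSeconds) {
    double multiplier = 1.0 + ThreadLocalRandom.current().nextDouble();
    return (long) (retryAfterSeconds * 1000L * multiplier);
  }
}
```

Because each thread draws its own multiplier, threads that received the same retryAfterSeconds value are unlikely to resume at exactly the same moment.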

If your application continuously exceeds the rate limit, then you'll need to understand rateScope and rateName to implement a more permanent throttling strategy in your application.

Account scope vs. developer scope

The rateScope value can be either ACCOUNT or DEVELOPER. It indicates whether the rate limit was exceeded at the Google Ads account level or at the developer token level.

Developer token rate scope

Every Google Ads manager account signed up to use AdWords API has a single developer token, and every request you make is most likely associated with that developer token. If the combined QPS for all client requests using the same developer token exceeds a certain rate limit, then RateExceededError may be returned indicating Developer Rate Scope.

For example, if a manager account manages 100 Google Ads accounts, and there are multiple client software instances using the same developer token to make a total of hundreds of requests per second across different processes, threads, or machines, then the client software may receive a RateExceededError for the developer token rate scope.

Account rate scope

If the same application makes a high number of requests per second on a single Google Ads account managed by a manager account, the AdWords API server may return a RateExceededError for exceeding the rate limit within the Account scope. This could happen, for example, if your client application spawned multiple threads to perform an excessive number of mutate() operations against a single Google Ads account.

Keep in mind that this rate limit across Account rate scope is metered on all requests to a single Google Ads account--no matter which developer token was used to make the request.

Suppose a single Google Ads account is managed by five different manager accounts. It's then possible that all five manager accounts are making requests against the same Google Ads account at the same time. If the combined QPS across all manager accounts exceeds the limit, then the clients will receive RateExceededError in the Account rate scope.

Rate name and why it matters

In addition to understanding rate scope, it's also important to understand the type of rate limit that was exceeded. The type of the rate limit is returned in the rateName field. The commonly seen rate limit names are:

  • RequestsPerMinute
  • OperationsPerMinute

Difference between a request and an operation

So what's the difference between RequestsPerMinute and OperationsPerMinute? Every SOAP service call is counted as a Request. For example, every time you call CampaignService.mutate(), it's counted as one request. However, within the mutate request, you may have passed 100 CampaignOperations--this would be counted as 100 operations!

In the example above, while you may have avoided a RequestsPerMinute rate limit by combining multiple operations into one request, you may still hit an OperationsPerMinute rate limit.

You can find more examples on how operations are counted on the Rate Sheet page.

Edge Cases

While the above Rate Names are more common, it's important to keep in mind there may be other types of rate limits you can exceed. If you experience these issues, please let us know in the AdWords API forum.

Slow down

If your application receives a RateExceededError, then it's time to slow down! Failing to do so may further delay your application's ability to recover from the error. One of the simplest ways to do this is to honor the RateExceededError.retryAfterSeconds value when you retry the request, and/or continue other requests.

For example, in Java, one of the simplest ways to pause your thread before processing another request is to do a Thread.sleep().

try {
  // Make the AdWords API request here.
} catch (ApiException e) {
  for (ApiError error : e.getErrors()) {
    if (error instanceof RateExceededError) {
      RateExceededError rateExceeded = (RateExceededError) error;
      // Wait at least as long as the server asked before retrying.
      Thread.sleep(rateExceeded.getRetryAfterSeconds() * 1000L);
    }
  }
}

While this approach is simple and straightforward, it may be suboptimal for achieving better overall throughput and should be used as the last line of defense.

There are a number of ways to mitigate the chances of exceeding the rate limit. Becoming familiar with Enterprise Integration Patterns (EIP) concepts such as Messaging, Redelivery, and Throttling can help you build a more robust client application.

We'll look into those recommended practices in the next section. Keep in mind, even when mitigation practices are applied, it's still important to be able to handle RateExceededError.

Take control

You can take control of your application and mitigate RateExceededError as much as possible by both actively reducing the number of requests and throttling QPS from the client side.

The following are some recommended practices ordered by complexity, with simpler strategies at the top and more robust but sophisticated architectures after:

  • Limit Concurrent Threads
  • Batching Requests
  • Caching Responses
  • Use BatchJobService
  • Throttling / Rate Limiter
  • Interleave Requests to Different Accounts
  • Queueing
  • Differentiate New Accounts vs. Established Accounts

Limit concurrent threads

More often than not, the root cause of a RateExceededError is that the client application is spawning an excessive number of threads, and all threads are concurrently invoking the AdWords API. While we don't limit the number of concurrent threads a client application can have, concurrently sending requests via an unlimited number of threads can easily exceed the Requests Per Second limit at the developer token level.

It's recommended to set a reasonable upper bound for the total number of concurrent threads that are going to make requests (across all processes and machines), and adjust upward from there to optimize your throughput without exceeding the rate limit.
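
One simple way to enforce such an upper bound within a single process is a fixed-size thread pool. The following is a minimal sketch using only the Java standard library; BoundedApiCaller and runAll are hypothetical names, and each Callable stands in for whatever API call your application makes. It does not coordinate across processes or machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; each Callable stands in for one API request.
final class BoundedApiCaller {
  private BoundedApiCaller() {}

  /** Runs all tasks on at most maxThreads worker threads; results keep task order. */
  static <T> List<T> runAll(List<Callable<T>> tasks, int maxThreads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
    try {
      List<T> results = new ArrayList<>();
      // invokeAll blocks until every task has finished.
      for (Future<T> future : pool.invokeAll(tasks)) {
        results.add(future.get());
      }
      return results;
    } finally {
      pool.shutdown();
    }
  }
}
```

Starting with a small maxThreads value and raising it gradually matches the "adjust upward from there" advice above.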

Furthermore, you can consider throttling QPS from the client side across all threads (see Throttling / Rate Limiter).

Batching requests

Whenever possible, consider batching multiple requests into a single request. This is most applicable to mutate() calls. For example, if you're updating the status of multiple AdGroupAd instances, instead of calling mutate() once for each AdGroupAd, you can call mutate() once and pass in multiple AdGroupAdOperations. Check our Best Practices for additional examples as well as the best ways to group operations.

When combining multiple operations into a single request, it's important to remember that most requests are atomic. If an operation fails, then the whole request will fail--and nothing will be updated. You may change this behavior by taking advantage of the partial failure feature.

Lastly, while batching requests will reduce the number of total requests and mitigate the Requests Per Minute rate limit, it may trigger the Operations Per Minute rate limit if you perform a large number of operations against a single account.
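
As a sketch of the batching idea, the helper below splits a long list of operations into fixed-size chunks so that each chunk can be sent in one mutate() call. Batcher and partition are hypothetical names, and the batch size is something you would tune yourself; nothing here reflects a documented limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for grouping operations into per-request batches.
final class Batcher {
  private Batcher() {}

  /** Splits operations into consecutive batches of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> operations, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < operations.size(); start += batchSize) {
      int end = Math.min(start + batchSize, operations.size());
      // Copy the view so each batch is independent of the source list.
      batches.add(new ArrayList<>(operations.subList(start, end)));
    }
    return batches;
  }
}
```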

Caching responses

Some services, such as the one for generating targeting ideas, will return the same response when queried with the same parameters in a relatively long time span. For example, if you're querying for keyword ideas related to a set of existing keywords, the response is not going to change over a few hours, or perhaps even days.

Given this characteristic, if you run into a rate limit error for TargetingIdeaService queries, you can cache the request-response pairs locally to postpone running into the rate limit.

If you don't want to implement a caching strategy on your own, there are several caching libraries available for the most common languages, such as Guava Caching for Java.
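
If you do want to roll your own, a time-based cache can be quite small. The sketch below is one possible shape, assuming a simple time-to-live policy; TtlCache is a hypothetical class, the loader function stands in for the actual service call, and the clock is injected so the expiry logic can be tested without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical time-to-live cache for request-response pairs.
final class TtlCache<K, V> {
  private static final class Entry<T> {
    final T value;
    final long expiresAtMillis;

    Entry(T value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;

  TtlCache(long ttlMillis, LongSupplier clock) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Returns the cached value, calling loader only if the entry is missing or expired. */
  synchronized V get(K key, Function<K, V> loader) {
    long now = clock.getAsLong();
    Entry<V> entry = entries.get(key);
    if (entry == null || entry.expiresAtMillis <= now) {
      entry = new Entry<>(loader.apply(key), now + ttlMillis);
      entries.put(key, entry);
    }
    return entry.value;
  }
}
```

In production code the clock would typically be System::currentTimeMillis. The synchronized method keeps the sketch simple at the cost of serializing loads; it also never evicts expired entries for keys that are not requested again, which a library cache would handle for you.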

Use BatchJobService

Consider using the BatchJobService for long running jobs, to process a large number of operations, or if a large number of operations span across multiple services. BatchJobService can schedule and execute thousands of operations for you, asynchronously, within the Google cloud--all you have to do is poll for the result to see if the job has completed.

See the Batch Processing guide for details.

Throttling / rate limiter

In addition to limiting the total number of threads in your client application, you can also implement rate limiters on the client side. Such a strategy can ensure all the threads across your processes and/or clusters are governed by a specific QPS limit from the client side.

You can, for example, check out Guava Rate Limiter, or implement your own Token Bucket based algorithm for a clustered environment. For example, you could generate tokens and store them in a shared transactional storage such as a database, and each client would have to acquire and consume a token before it processes the request. If the tokens were used up, the client would have to wait until the next batch of tokens is generated.

In most cases, throttling will help you avoid exceeding rate limits in the developer token scope.
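
For a single process, a token bucket can be sketched in a few lines. The class below is an illustrative, in-memory version only: TokenBucket is a hypothetical name, the capacity and refill rate are values you would choose yourself, and a clustered setup would need the shared storage described above instead of local fields.

```java
import java.util.function.LongSupplier;

// Hypothetical in-memory token bucket; not suitable as-is for a cluster.
final class TokenBucket {
  private final double capacity;
  private final double tokensPerSecond;
  private final LongSupplier nanoClock;
  private double tokens;
  private long lastRefillNanos;

  TokenBucket(double capacity, double tokensPerSecond, LongSupplier nanoClock) {
    this.capacity = capacity;
    this.tokensPerSecond = tokensPerSecond;
    this.nanoClock = nanoClock;
    this.tokens = capacity; // Start full so an initial burst is allowed.
    this.lastRefillNanos = nanoClock.getAsLong();
  }

  /** Takes one token if available; returns false when the caller should wait. */
  synchronized boolean tryAcquire() {
    long now = nanoClock.getAsLong();
    double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
    // Refill in proportion to elapsed time, never beyond capacity.
    tokens = Math.min(capacity, tokens + elapsedSeconds * tokensPerSecond);
    lastRefillNanos = now;
    if (tokens >= 1.0) {
      tokens -= 1.0;
      return true;
    }
    return false;
  }
}
```

A caller would construct it with System::nanoTime and check tryAcquire() before each request, sleeping briefly and retrying when it returns false. The capacity controls how large a burst is allowed, while tokensPerSecond sets the sustained rate.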

Interleave requests to different accounts

If you exceed the rate limit in the Account scope, you can rate limit QPS for the account on the client side, but this may be non-trivial if you have thousands of accounts to manage. A simpler strategy is to interleave requests based on accounts.

For example, if you're performing 5000 mutate() operations on each of 10 accounts, one way to do this is to send all of the batched operations to Account 1 first, then Account 2, then Account 3:

  1. Send 500 mutate operations for Account 1 (and repeat 10 times for 5000 operations)
  2. Send 500 mutate operations for Account 2 (and also do this 10 times)
  3. … (until you finish all 10 accounts)

While this approach is straightforward, you may exceed the Account scope rate limit for Operations Per Minute.

By interleaving the accounts, the requests would look like this:

  1. Send 500 mutate operations for Account 1
  2. Send 500 mutate operations for Account 2
  3. Send 500 mutate operations for Account 3
  4. … (until you finish all 10 accounts)
  5. Send 500 mutate operations for Account 1
  6. Send 500 mutate operations for Account 2
  7. … (until you finish all 5000 operations for every account)
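
The interleaved ordering above amounts to a round-robin over the accounts. The following sketch shows one way to produce that order, assuming the batches for each account have already been prepared; Interleaver and interleave are hypothetical names, and the batch type B is whatever your application sends in one request.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper that orders per-account batches round-robin.
final class Interleaver {
  private Interleaver() {}

  /**
   * Returns (accountId, batch) pairs so that consecutive requests go to
   * different accounts whenever more than one account has work left.
   */
  static <B> List<Map.Entry<String, B>> interleave(Map<String, List<B>> batchesByAccount) {
    int longest = 0;
    for (List<B> batches : batchesByAccount.values()) {
      longest = Math.max(longest, batches.size());
    }
    List<Map.Entry<String, B>> ordered = new ArrayList<>();
    // Round n takes the n-th batch of every account that still has one.
    for (int round = 0; round < longest; round++) {
      for (Map.Entry<String, List<B>> account : batchesByAccount.entrySet()) {
        if (round < account.getValue().size()) {
          ordered.add(new AbstractMap.SimpleImmutableEntry<>(
              account.getKey(), account.getValue().get(round)));
        }
      }
    }
    return ordered;
  }
}
```

Passing a LinkedHashMap keeps the accounts in a predictable order within each round.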

While this example illustrates how to interleave requests based on accounts, you should also check whether BatchJobService meets your requirements. See the Batch Processing guide for more information.


Queueing

A message queue is the best solution for operation load distribution, while also controlling request and consumer rates. There are a number of message queue options available--some open source, some proprietary--and many of them can work with different languages.

When using message queues, you can have multiple producers pushing messages to the queue, and multiple consumers processing those messages. Throttling can be implemented on the consumer side by limiting the number of concurrent consumers, or by adding rate limiters or throttlers to either the producers or the consumers.

For example, if a message consumer encounters a RateExceededError, that consumer can return the request back to the queue to be retried. At the same time, that consumer can also notify all other consumers to pause processing for a number of seconds to recover from the error.
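
The requeue-and-pause behavior can be sketched as follows. Everything here is illustrative: an in-memory java.util.Queue stands in for a real message queue, RequestSender is a hypothetical interface whose send method returns 0 on success or the retryAfterSeconds value from a rate limit error, and SharedPause is a hypothetical object shared by all consumers in one process.

```java
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for an API call: returns 0 on success, else retryAfterSeconds. */
interface RequestSender<R> {
  int send(R request);
}

/** Shared by all consumers so that one rate limit error pauses everyone. */
final class SharedPause {
  private final AtomicLong resumeAtMillis = new AtomicLong(0L);

  void pauseFor(long nowMillis, int seconds) {
    // Keep the later deadline if another consumer already asked for a pause.
    resumeAtMillis.accumulateAndGet(nowMillis + seconds * 1000L, Math::max);
  }

  long remainingMillis(long nowMillis) {
    return Math.max(0L, resumeAtMillis.get() - nowMillis);
  }
}

final class ThrottledConsumer<R> {
  private final Queue<R> queue;
  private final SharedPause pause;
  private final RequestSender<R> sender;

  ThrottledConsumer(Queue<R> queue, SharedPause pause, RequestSender<R> sender) {
    this.queue = queue;
    this.pause = pause;
    this.sender = sender;
  }

  /** Handles at most one message; returns how long the caller should sleep next. */
  long processOne(long nowMillis) {
    long wait = pause.remainingMillis(nowMillis);
    if (wait > 0) {
      return wait; // Another consumer hit a rate limit; do not send anything.
    }
    R request = queue.poll();
    if (request == null) {
      return 0L;
    }
    int retryAfterSeconds = sender.send(request);
    if (retryAfterSeconds > 0) {
      queue.add(request); // Return the request to the queue for a later retry.
      pause.pauseFor(nowMillis, retryAfterSeconds);
      return pause.remainingMillis(nowMillis);
    }
    return 0L;
  }
}
```

With a real message broker, the requeue step would typically be a negative acknowledgement or a redelivery policy, and the shared pause would need to live somewhere all consumer processes can see it.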

Differentiate queue or rate limiter for new accounts vs. established accounts

When you implement queuing or rate limiter strategies, it's important to keep in mind that a new Google Ads account may have a significantly more restrictive rate limit (i.e., lower QPS) than established accounts. Thus, if your organization constantly creates new accounts and has a large number of older accounts to manage, you should consider using different rate limiters or throttles for the two types of accounts.

This way, you can maximize your throughput for both types of accounts, rather than being limited by the account with the lowest QPS.

Typically, the more restrictive limits on new Google Ads accounts are relaxed once the account has published ads.