To deliver reliable services to AdWords API users across the globe, we use a Token Bucket algorithm to meter the requests and determine the queries per second (QPS) rate. This is intended to prevent malicious or out-of-control software from overwhelming the AdWords API servers and affecting other users.
For example, if a runaway client accidentally spawns thousands of threads to
make concurrent AdWords API calls, the AdWords API servers will notice and
return a RateExceededError, asking the calling software to slow down.
It's important to remember that rate limits can fluctuate based on
different variables, including server load. Hence, we don't recommend a fixed
Queries Per Second (QPS) limit. It's extremely important to understand how to
handle RateExceededError, and develop your software with rate limits in mind.
This guide will dive into more details to help you understand
RateExceededError, and how to avoid exceeding the rate limit.
History
In older AdWords API versions, requests made in excess of the limit
were queued on the API server until they could run, resulting in what seemed
like very long execution times for some requests. The current API, instead of
blocking the client for a long time, is fail-fast and will return a
RateExceededError. We believe this is an important feedback mechanism that
ensures you're aware of the issue and can adjust your applications
accordingly.
Types of rate limits
We recognize that your AdWords API client application may sometimes exceed the
limit and receive a RateExceededError due to factors not fully in your
control. It's important to note that there is no penalty for doing so.
RateExceededError is usually transient and automatically resolves after 30
seconds of inactivity.
There are a number of different types of rate limits that may be enforced by
the server. A client application may exceed a rate limit within a manager account's
developer token scope, or Google Ads account scope. Within each scope, rather
than a strict QPS rate limit, rate limits are metered in terms
of Requests Per Minute, Operations Per Minute, and/or other types of rate
limits. This allows both steady and burst traffic to the AdWords API. Both
the scope and the name of the rate limit are returned as part of
RateExceededError.
Operational limit based on access level
There is only one type of rate limit that doesn't fluctuate: the operational limit based on the access level of your developer token. There are two access levels: Basic and Standard. A Basic access level account has limits of 10,000 operations per day and 1,000 report downloads per day. A newly approved developer token is assigned the Basic access level by default. If you plan to execute more than 10,000 operations or 1,000 report downloads per day, you can apply for Standard access level by completing the AdWords API Standard Access Application Form. There is no charge for either access level. Please check out the Rate Sheet to determine how operations are counted.
Other than the operational limit, all other rate limits may fluctuate. Thus,
it's important to handle RateExceededError in your application.
Elements of RateExceededError
Let's take a look at RateExceededError in more detail--it contains three very
important fields:
- rateScope--The scope of the rate that was exceeded, which can be ACCOUNT or DEVELOPER.
- rateName--Contains the name of the rate limit that was exceeded. The value can be, for example, RequestsPerMinute.
- retryAfterSeconds--Contains the minimum number of seconds your application should wait before retrying the request. We recommend applying a random multiplier (e.g., a float value between 1 and 2, inclusive) to retryAfterSeconds when determining how many seconds to wait. If your programs send requests in parallel (e.g., multi-threading), ensure that they're not all sending new requests at the same time after waiting.
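As a rough illustration of the random multiplier described above, the wait time could be computed as follows. This is a minimal sketch rather than part of the AdWords API client library: the RetryDelay class and the jitteredDelayMillis method are hypothetical names, and the multiplier is drawn from the half-open range [1, 2), which closely approximates the inclusive range mentioned above.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper; not part of the AdWords API client library. */
final class RetryDelay {
  private RetryDelay() {}

  /**
   * Converts retryAfterSeconds into a wait time in milliseconds, scaled by a
   * random multiplier in [1.0, 2.0) so that parallel workers do not all
   * resume at the same instant.
   */
  static long jitteredDelayMillis(int retryAfterSeconds) {
    // ThreadLocalRandom avoids contention when many threads compute delays.
    double multiplier = 1.0 + ThreadLocalRandom.current().nextDouble();
    return Math.round(retryAfterSeconds * 1000L * multiplier);
  }
}
```

Each thread would call RetryDelay.jitteredDelayMillis(...) independently, so threads that received the same retryAfterSeconds value are likely to wake up at different times.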
If your application continuously exceeds the rate limit, then you'll need to
understand rateScope and rateName to implement a more permanent throttling
strategy in your application.
Account scope vs. developer scope
The Rate Scope value can be either ACCOUNT or DEVELOPER. This corresponds
to whether the Rate Limit was exceeded at a Google Ads Account level or at a
developer token level.
Developer token rate scope
Every Google Ads manager account signed up to use AdWords API has a single
developer token, and every request you make is most likely associated with that
developer token. If the combined QPS for all client requests using the same
developer token exceeds a certain rate limit, then RateExceededError may be
returned indicating Developer Rate Scope.
For example, if a manager account manages 100 Google Ads accounts, and there are
multiple client software instances using the same developer token to make a
total of hundreds of requests per second across different processes, threads,
or machines, then the client software may receive a RateExceededError for the
developer token rate scope.
Account rate scope
If the same application makes a high number of requests per second on a single
Google Ads account managed by a manager account, the AdWords API Server may
return a RateExceededError for exceeding the rate limit within the Account
scope. This could happen, for example, if your client application spawned
multiple threads to perform an excessive number of mutate() operations against
a single Google Ads account.
Keep in mind that this rate limit across Account rate scope is metered on all requests to a single Google Ads account--no matter which developer token was used to make the request.
Suppose a single Google Ads account is managed by five different manager
accounts. It's then possible that all five manager accounts are making requests
against the same Google Ads account at the same time. If the combined QPS across
all manager accounts exceeds the limit, then the clients will receive
RateExceededError in the Account rate scope.
Rate name and why it matters
In addition to understanding rate scope, it's also important to understand the
type of rate limit that was exceeded. The type of the rate limit is returned
in the rateName field. The commonly seen rate limit names are:
- RequestsPerMinute
- OperationsPerMinute
Difference between a request and an operation
So what's the difference between RequestsPerMinute and OperationsPerMinute?
Every SOAP service call is counted as a Request. For example, every time you
call CampaignService.mutate(), it's counted as one request. However, within the
mutate request, you may have passed 100 CampaignOperations--this would be
counted as 100 operations!
In the example above, while you may have avoided a RequestsPerMinute rate
limit by combining multiple operations into one request, you may still hit an
OperationsPerMinute rate limit.
You can find more examples on how operations are counted on the Rate Sheet page.
Edge cases
While the above Rate Names are more common, it's important to keep in mind there may be other types of rate limits you can exceed. If you experience these issues, please let us know in the AdWords API forum.
Slow down
If your application receives a RateExceededError, then it's time to slow down!
Failing to do so may further delay your application's ability to recover from
the error. One of the simplest ways to do this is to honor the
RateExceededError.retryAfterSeconds value before you retry the request or send
other requests.
For example, in Java, one of the simplest ways to pause your thread before
processing another request is to do a Thread.sleep().
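try {
  ...
} catch (ApiException e) {
  for (ApiError error : e.getErrors()) {
    if (error instanceof RateExceededError) {
      RateExceededError rateExceeded = (RateExceededError) error;
      // retryAfterSeconds is in seconds; Thread.sleep() expects milliseconds.
      // Thread.sleep() throws InterruptedException, which the enclosing
      // method must handle or declare.
      Thread.sleep(rateExceeded.getRetryAfterSeconds() * 1000L);
    }
  }
  ...
}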
While this approach is simple and straightforward, it may be suboptimal for achieving better overall throughput and should be used as the last line of defense.
There are a number of ways to mitigate the chances of exceeding the rate limit. Becoming familiar with Enterprise Integration Patterns (EIP) concepts such as Messaging, Redelivery, and Throttling can help you build a more robust client application.
We'll look into those recommended practices in the next section. Keep in mind,
even when mitigation practices are applied, it's still important to be able to
handle RateExceededError.
Take control
You can take control of your application and mitigate RateExceededError as
much as possible by both actively reducing the number of requests and
throttling QPS from the client side.
The following are some recommended practices ordered by complexity, with simpler strategies at the top and more robust but sophisticated architectures after:
- Limit Concurrent Threads
- Batching Requests
- Caching Responses
- Use BatchJobService
- Throttling / Rate Limiter
- Interleave Requests to Different Accounts
- Queueing
- Differentiate New Accounts vs. Established Accounts
Limit concurrent threads
More often than not, the root cause of a RateExceededError is that the client
application is spawning an excessive number of threads, and all threads are
concurrently invoking the AdWords API. While we don't limit the number of
concurrent threads a client application can have, concurrently sending requests
via an unlimited number of threads can easily exceed the Requests Per Second
limit at the developer token level.
It's recommended to set a reasonable upper bound for the total number of concurrent threads that are going to make requests (across all processes and machines), and adjust upward from there to optimize your throughput without exceeding the rate limit.
Furthermore, you can consider throttling QPS from the client side across all threads (see Throttling / Rate Limiter).
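One possible way to enforce such an upper bound inside a single JVM is a counting semaphore. The sketch below is illustrative only: ConcurrencyLimiter is a hypothetical class, the Callable stands in for whatever AdWords API call your application makes, and the bound applies to one process only, so multiple processes or machines would each need their own share of the overall limit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Hypothetical helper that caps the number of in-flight API calls in one JVM. */
final class ConcurrencyLimiter {
  private final Semaphore permits;

  ConcurrencyLimiter(int maxConcurrentCalls) {
    this.permits = new Semaphore(maxConcurrentCalls);
  }

  /** Blocks until a permit is free, runs the call, then always returns the permit. */
  <T> T call(Callable<T> apiCall) throws Exception {
    permits.acquire();
    try {
      return apiCall.call();
    } finally {
      permits.release();
    }
  }

  /** Number of calls that could start right now without waiting. */
  int availablePermits() {
    return permits.availablePermits();
  }
}
```

Starting with a small value for maxConcurrentCalls and raising it gradually matches the "adjust upward from there" advice above.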
Batching requests
Whenever possible, consider batching multiple requests into a single request.
This is most applicable to mutate() calls. For example, if you're updating the
status of multiple instances of AdGroupAd, instead of calling mutate() once
for each AdGroupAd, you can call mutate() once and pass in multiple
AdGroupAdOperation objects in one shot. Check our Best Practices for
additional examples, as well as the best ways to group operations.
When combining multiple operations into a single request, it's important to remember that most requests are atomic. If an operation fails, then the whole request will fail--and nothing will be updated. You may change this behavior by taking advantage of the partial failure feature.
Lastly, while batching requests will reduce the number of total requests and mitigate the Requests Per Minute rate limit, it may trigger the Operations Per Minute rate limit if you perform a large number of operations against a single account.
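To make the trade-off concrete, the following sketch splits a list of pending operations into fixed-size batches, each of which would be sent as one mutate() request. OperationBatcher and partition are hypothetical names, the element type is generic rather than a real AdWords operation class, and the batch size is an assumption you would tune for your own workload.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that groups operations so each group becomes one request. */
final class OperationBatcher {
  private OperationBatcher() {}

  /** Splits operations into consecutive batches of at most batchSize elements. */
  static <T> List<List<T>> partition(List<T> operations, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < operations.size(); start += batchSize) {
      int end = Math.min(start + batchSize, operations.size());
      // Copy the sub-list so each batch is independent of the source list.
      batches.add(new ArrayList<>(operations.subList(start, end)));
    }
    return batches;
  }
}
```

With 250 operations and a batch size of 100, this yields 3 requests instead of 250, but the number of operations counted against the account is still 250.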
Caching responses
Some services, such as the one for generating targeting ideas, will return the same response when queried with the same parameters in a relatively long time span. For example, if you're querying for keyword ideas related to a set of existing keywords, the response is not going to change over a few hours, or perhaps even days.
Given this characteristic, if you run into a rate limit error for TargetingIdeaService queries, you can cache the request-response pairs locally to reduce how often you run into the rate limit.
If you don't want to implement a caching strategy on your own, there are several caching libraries available for the most common languages, such as Guava Caching for Java.
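If you do want to roll your own, a time-based cache can be quite small. The sketch below uses only the Java standard library; TtlCache is a hypothetical class, the loader function stands in for the actual service call, and the time-to-live you choose is an assumption about how long responses stay valid for your use case.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory cache whose entries expire after a fixed time-to-live. */
final class TtlCache<K, V> {
  private static final class Entry<E> {
    final E value;
    final long expiresAtMillis;

    Entry(E value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;

  TtlCache(long ttlMillis) {
    this.ttlMillis = ttlMillis;
  }

  /** Returns the cached value for key, invoking loader only if it is missing or expired. */
  V get(K key, Function<K, V> loader) {
    long now = System.currentTimeMillis();
    // compute() runs atomically per key, so concurrent callers for the same
    // key trigger a single load. The loader runs while the map holds a lock
    // for that key's bin, so very slow loaders can delay unrelated keys.
    Entry<V> entry = entries.compute(key, (k, existing) ->
        (existing != null && existing.expiresAtMillis > now)
            ? existing
            : new Entry<>(loader.apply(k), now + ttlMillis));
    return entry.value;
  }
}
```

The key would typically be a stable representation of the request parameters, so that two identical queries map to the same entry.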
Use BatchJobService
Consider using the BatchJobService for long running jobs, to process a large number of operations, or if a large number of operations span across multiple services. BatchJobService can schedule and execute thousands of operations for you, asynchronously, within the Google cloud--all you have to do is poll for the result to see if the job has completed.
See the Batch Processing guide for details.
Throttling / rate limiter
In addition to limiting the total number of threads in your client application, you can also implement rate limiters on the client side. Such a strategy can ensure all the threads across your processes and/or clusters are governed by a specific QPS limit from the client side.
You can, for example, check out Guava Rate Limiter, or implement your own Token Bucket based algorithm for a clustered environment. For example, you could generate tokens and store them in a shared transactional storage such as a database, and each client would have to acquire and consume a token before it processes the request. If the tokens were used up, the client would have to wait until the next batch of tokens is generated.
In most cases, throttling will help you avoid exceeding rate limits in the developer token scope.
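For reference, a Token Bucket can be sketched in a few lines. The version below is a single-process illustration only: TokenBucket is a hypothetical class, the capacity and refill rate are values you would choose yourself, and a clustered setup would need the shared storage described above instead of in-memory fields.

```java
/** Hypothetical single-process token bucket; not the server-side implementation. */
final class TokenBucket {
  private final double capacity;
  private final double tokensPerSecond;
  private double tokens;
  private long lastRefillNanos;

  TokenBucket(int capacity, double tokensPerSecond, long startNanos) {
    this.capacity = capacity;
    this.tokensPerSecond = tokensPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillNanos = startNanos;
  }

  /** Takes one token if available at the given time; returns false otherwise. */
  synchronized boolean tryAcquire(long nowNanos) {
    long elapsedNanos = nowNanos - lastRefillNanos;
    if (elapsedNanos > 0) {
      // Refill in proportion to elapsed time, never exceeding capacity.
      double refill = (elapsedNanos / 1_000_000_000.0) * tokensPerSecond;
      tokens = Math.min(capacity, tokens + refill);
      lastRefillNanos = nowNanos;
    }
    if (tokens < 1.0) {
      return false;
    }
    tokens -= 1.0;
    return true;
  }

  /** Convenience overload that uses the JVM's monotonic clock. */
  boolean tryAcquire() {
    return tryAcquire(System.nanoTime());
  }
}
```

A caller might create the bucket with new TokenBucket(10, 5.0, System.nanoTime()) and send a request only when tryAcquire() returns true, waiting briefly otherwise. The capacity controls how large a burst is tolerated, while tokensPerSecond sets the sustained rate.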
Interleave requests to different accounts
If you exceed the rate limit in the Account scope, you can rate limit QPS for the account on the client side--but this may be non-trivial if you have thousands of accounts to manage. A simpler strategy is to interleave requests based on accounts.
For example, if you're performing 5000 mutate() operations on each of 10
accounts, one way to do this is to sequentially send batched operations to
Account 1 first, then Account 2, then Account 3:
- Send 500 mutate operations for Account 1 (and repeat 10 times for 5000 operations)
- Send 500 mutate operations for Account 2 (and also do this 10 times)
- … (until you finish all 10 accounts)
While this approach is straightforward, you may exceed the Operations Per Minute rate limit in the Account scope.
By interleaving the accounts, the requests would look like this:
- Send 500 mutate operations for Account 1
- Send 500 mutate operations for Account 2
- Send 500 mutate operations for Account 3
- … (until you finish all 10 accounts)
- Send 500 mutate operations for Account 1
- Send 500 mutate operations for Account 2
- … (until you finish all 5000 operations for every account)
While this example illustrates how to interleave requests based on accounts, you should also check whether BatchJobService meets your requirements. See the Batch Processing guide for more information.
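The interleaved ordering shown in the list above can be produced with a simple round-robin pass over the accounts. In this sketch, AccountInterleaver is a hypothetical class, accounts are identified by plain strings, and each batch is an opaque object standing in for a prepared mutate() request. The map is assumed to have a stable iteration order (for example, a LinkedHashMap).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that orders per-account batches round-robin. */
final class AccountInterleaver {
  private AccountInterleaver() {}

  /**
   * Returns (account, batch) pairs in send order: the first batch of every
   * account, then the second batch of every account, and so on.
   */
  static <B> List<Map.Entry<String, B>> interleave(Map<String, List<B>> batchesByAccount) {
    List<Map.Entry<String, B>> sendOrder = new ArrayList<>();
    boolean addedInRound = true;
    for (int round = 0; addedInRound; round++) {
      addedInRound = false;
      for (Map.Entry<String, List<B>> account : batchesByAccount.entrySet()) {
        List<B> batches = account.getValue();
        if (round < batches.size()) {
          sendOrder.add(
              new AbstractMap.SimpleImmutableEntry<>(account.getKey(), batches.get(round)));
          addedInRound = true;
        }
      }
    }
    return sendOrder;
  }
}
```

Accounts with fewer batches simply drop out of later rounds, so consecutive requests to the same account are spaced as far apart as the workload allows.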
Queueing
A message queue is the best solution for operation load distribution, while also controlling request and consumer rates. There are a number of message queue options available--some open source, some proprietary--and many of them can work with different languages.
When using message queues, you can have multiple producers pushing messages to the queue, and multiple consumers processing those messages. Throttles can be implemented on the consumer side by limiting the number of concurrent consumers, or by adding rate limiters or throttlers to either the producers or the consumers.
For example, if a message consumer encounters a RateExceededError, that
consumer can return the request back to the queue to be retried. At the same
time, that consumer can also notify all other consumers to pause processing for
a number of seconds to recover from the error.
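The requeue-and-pause behavior described above might look roughly like the following within a single JVM. Everything here is an assumption made for illustration: PausingConsumer and its Sender callback are hypothetical, an in-memory BlockingQueue stands in for a real message queue, and the shared AtomicLong plays the role of the "notify all other consumers" signal.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical queue consumer that requeues rate-limited requests and pauses its peers. */
final class PausingConsumer<T> {
  /** Hypothetical callback: returns 0 on success, or retryAfterSeconds if rate limited. */
  interface Sender<R> {
    int send(R request);
  }

  private final BlockingQueue<T> queue;
  private final Sender<T> sender;
  private final AtomicLong resumeAtMillis; // shared by every consumer of this queue

  PausingConsumer(BlockingQueue<T> queue, Sender<T> sender, AtomicLong resumeAtMillis) {
    this.queue = queue;
    this.sender = sender;
    this.resumeAtMillis = resumeAtMillis;
  }

  /** Processes one message; returns true if it was sent, false if it was requeued. */
  boolean processOne() throws InterruptedException {
    // Honor a pause requested by any consumer before taking more work.
    long waitMillis = resumeAtMillis.get() - System.currentTimeMillis();
    if (waitMillis > 0) {
      Thread.sleep(waitMillis);
    }
    T request = queue.take();
    int retryAfterSeconds = sender.send(request);
    if (retryAfterSeconds <= 0) {
      return true;
    }
    // Rate limited: put the request back and push the shared resume time out.
    queue.put(request);
    long resumeAt = System.currentTimeMillis() + retryAfterSeconds * 1000L;
    resumeAtMillis.accumulateAndGet(resumeAt, Math::max);
    return false;
  }
}
```

Because all consumers read the same resumeAtMillis value, one rate-limit error delays every consumer, not just the one that received it.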
Differentiate queue or rate limiter for new accounts vs. established accounts
When you implement queuing or rate limiter strategies, it's important to keep in mind that a new Google Ads account may have a significantly more restrictive rate limit (i.e., lower QPS) than established accounts. Thus, if your organization constantly creates new accounts and has a large number of older accounts to manage, you should consider using different rate limiters or throttles for the two types of accounts.
This way, you can maximize your throughput for both types of accounts, rather than being limited by the account with the lowest QPS.
Typically, the more restrictive limits on new Google Ads accounts are relaxed once the account has published ads.