Long-term Google Trends

I took part in Signals for a while and, in the process, designed a system for downloading long-term Google Trends data by pulling overlapping trend periods and rescaling each period to match the one before it.

I’ve turned it into a Python package available for general use: GitHub / PyPI.

pip install longtrends

from longtrends import LongTrend
from datetime import datetime

keyword = 'suncream'

# Create LongTrend object
longtrend = LongTrend(
                      keyword=keyword,                       # pass the search term defined above
                      start_date=datetime(2018, 1, 1),
                      end_date=datetime(2022, 3, 31))        # use verbose=True for print output
# Build long-term trends
lt_built = longtrend.build()

# Plot (matplotlib required)
lt_built.plot(title=f"Google Trends: {longtrend.keyword}", figsize=(15, 3))

More info and images illustrating how the rescaling works here.


It has been a while, but a number of years ago I was doing some work with Google Trends and found it really hard to stitch together different time periods, or to compare two trends from different queries, because the scale of each query was totally arbitrary (I think it was just scaled according to the results returned). There also seemed to be a fair amount of randomness: the sampling wasn’t consistent even when repeating the same query over a past period, and I’d get at least somewhat different results each time. It was good for comparing two or more trends in a relative way within the same query (or the relative ups and downs of a single trend), but it was very challenging to track results over time, and there was no way to pin anything down to an absolute, objective level.

@wigglemuse, you’re right that the scale is different for each period: within a period the highest value is always 100 and the lowest can be as low as 0. You’re also right that overlapping periods aren’t 100% consistent, because Google answers each request from a sample of searches rather than from every single search. This means there is some variation between requests.
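
As a rough illustration of why the same weeks get different values in different requests, here is a minimal sketch. It is not Google’s actual pipeline: `normalise_window` and the raw counts are hypothetical, and it assumes each window is simply divided by its own peak and rounded to whole numbers (sampling noise is ignored).

```python
def normalise_window(raw_counts):
    """Scale a window so its largest value is 100, rounded to whole numbers."""
    peak = max(raw_counts)
    return [round(100 * count / peak) for count in raw_counts]


# Hypothetical raw search volumes for six consecutive weeks
raw = [200, 400, 800, 400, 1600, 800]

first_window = normalise_window(raw[:4])   # weeks 0-3, peak is 800
second_window = normalise_window(raw[2:])  # weeks 2-5, peak is 1600

print(first_window)   # [25, 50, 100, 50]
print(second_window)  # [50, 25, 100, 50]
```

Weeks 2 and 3 appear in both windows, as 100 and 50 in the first and as 50 and 25 in the second. The values differ, but the ratio between the two windows is constant, and that is what makes rescaling possible.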

However, the shapes of overlapping periods are usually similar, which allows them to be rescaled to each other while keeping most of the ‘signal’ intact. This is what my longtrends package does under the hood:

Non scaled:

Scaled to each other:

You can see the shape of the overall trend is quite consistent, even in this example where there was a big spike in interest for the search term.
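
The rescaling idea can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the longtrends source: the function names and the toy windows are hypothetical, and the scale factor is taken as the ratio of mean values on the overlap, which may differ from what the package actually does.

```python
def rescale_to_previous(prev, curr):
    """Rescale `curr` so its mean matches `prev` on the dates they share.

    Both arguments are dicts mapping a date label to a trend value.
    """
    overlap = [date for date in curr if date in prev]
    if not overlap:
        raise ValueError("windows must share at least one date")
    prev_mean = sum(prev[date] for date in overlap) / len(overlap)
    curr_mean = sum(curr[date] for date in overlap) / len(overlap)
    if curr_mean == 0:
        raise ValueError("cannot rescale a window that is all zeros on the overlap")
    factor = prev_mean / curr_mean
    return {date: value * factor for date, value in curr.items()}


def stitch(windows):
    """Chain windows left to right, rescaling each onto the result so far."""
    stitched = dict(windows[0])
    for window in windows[1:]:
        for date, value in rescale_to_previous(stitched, window).items():
            stitched.setdefault(date, value)  # keep earlier values on the overlap
    return stitched


# Two toy windows, each with its own peak of 100, overlapping on Feb and Mar
w1 = {"2021-01": 25.0, "2021-02": 50.0, "2021-03": 100.0}
w2 = {"2021-02": 12.5, "2021-03": 25.0, "2021-04": 50.0, "2021-05": 100.0}

long_trend = stitch([w1, w2])
print(long_trend)
```

In this toy case the second window is multiplied by 4, so the stitched series doubles every month from 25 up to 400. The result is no longer capped at 100; it is only meaningful relative to the first window. With real data the overlap will be noisy, so the match between windows is approximate rather than exact.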


I’ve written a more in-depth article about how it works here:

I have a published paper that uses google search trends as a proxy for retail investor demand: