Tweepy code samples

Common use-cases for the Twitter API and how to solve them in Python 3 using Tweepy

Quick links

Jump to some highlighted sections

Get users — Get tweets — Post tweet — Search API — Streaming

This section aims to make that easier by doing the work for you and suggesting a good path, with recommended code snippets and samples of the data returned. This guide is not meant to be complete, but rather to cover typical situations in a way that is easy for beginners to follow.

This is based on the Tweepy docs, the Tweepy code and the Twitter API docs.

Snippet use:
You may copy and paste the code here into your own project and modify it as you need.

Pasting into a script and running it is straightforward. Pasting most code into the interactive terminal is fine too, but you'll get a syntax error if you paste a function which has empty lines, so use a script for that instead.

Naming conventions

A tweet is called a status in the API and Tweepy.
A profile is called a user or author in the API and Tweepy.
A username is called a screen name in the API and Tweepy.

These terms will be used interchangeably in this guide.

Tweepy API overview

The api object returned in the auth section above will cover most of your needs for requesting the Twitter API, whether fetching or sending data.

The api object is an instance of the tweepy.API class, which is covered in the docs here. The docs are useful for seeing the allowed parameters, how they are used and what is returned.

The methods on tweepy.API also include some useful links in their docstrings, pointing to the Twitter API endpoints docs. These do not appear in the Tweepy docs. Therefore you might want to look at the api.py script in the Tweepy repo to see these links.

Twitter API docs: API reference index - a list of all available endpoints. Tweepy implements most of these I think. For more info on the API, see Resources page.

How do I get a high volume of tweets?

Add a waiting config option as per the auth guide so that Tweepy will automatically wait when it reaches a rate limit.
Use Paging here so that Tweepy will iterate over multiple pages for you.
Pick a token auth approach that gives the most tweets in a window. See the Rate limits section on the Twitter policies page. For example, an App-only Token is more suitable for search than an App Access Token (with user context).

Paging

Follow the Tweepy tutorial to get familiar with how to use a Cursor to do paging - iterate over multiple pages of items of say 100 tweets each.

Tweepy docs: Cursor tutorial. The tutorial also explains truncated and full text.

Setup the cursor

An api method must be passed to the cursor, along with any parameters.

cursor = tweepy.Cursor(
    api.search,
    query,
    count=100
)

Tweepy repo: Cursor class

Pages and items

When iterating over the cursor, you must specify if you want the response to be pages or items.

Pages is how the Twitter API works - you get multiple pages of say 100 tweets each, so you iterate over the pages, each of which holds a list (or iterator) of tweets.

for page in cursor.pages():
    for tweet in page:
        print(tweet.id)

Or you can use items approach, where Tweepy flattens multiple pages into what feels like one long list (or iterator).

for tweet in cursor.items():
    print(tweet.id)

Limit

The cursor will carry on until it gets all available data. You can optionally limit this by passing a limit.

In both examples below, we process 5 pages of 100 tweets each and get a total of 500 tweets.

for tweet in cursor.items(500):
    print(tweet.id)

for page in cursor.pages(5):
    for tweet in page:
        print(tweet.id)

Get users

Various approaches to get profiles of Twitter users

Use api.get_user to get one user by ID or screen name, or use api.lookup_users to get many users. Read on for more details.

Fetch the profile for the authenticated user

api.me()

Get the author of a tweet

Whenever you have a tweet object you can find the profile that authored the tweet, without doing a further API call.

tweet.author

See the models page in this guide for attributes on a User instance.

Fetch profile by ID

Lookup a single profile

By screen name.

screen_name = "foo"
user = api.get_user(screen_name=screen_name)

Or by user ID.

user_id = 1234567
user = api.get_user(user_id=user_id)

Tweepy docs: API.get_user

Then you can inspect the user object or do actions on it. See the User section of the models page.

Example:

user.screen_name
# => "foo"

user.id
# => 1234567

user.followers_count
# => 99

Lookup user ID for a screen name

How to get the profile and user ID for a given screen name.

user = api.get_user(screen_name='foo')

Get the user ID as an int.

user_id = user.id
# 1234567

Get the user ID as a str. You probably don't need this. Use the .id one rather.

user_id = user.id_str
# "1234567"

Lookup many profiles

Lookup one or more users at once using their screen names.

screen_names = ["foo", "bar", "baz"]
users = api.lookup_users(screen_names=screen_names)

Or lookup one or more users by their IDs.

user_ids = [123, 456, 789]
users = api.lookup_users(user_ids=user_ids)

Tweepy docs: API.lookup_users

The endpoint only lets you request up to 100 IDs at once, so you'll never get more than one page of results. Therefore, to get more results, you should batch your IDs into groups of 100 and then look up each group.
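The batching described above can be sketched in plain Python. The helper name `chunked` is hypothetical (it is not part of Tweepy), and the IDs are made up. Each resulting batch could then be passed as `user_ids` to `api.lookup_users`.

```python
def chunked(items, size=100):
    """Yield successive batches of at most `size` items from a list."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative IDs only, not real accounts.
user_ids = list(range(1, 251))
batches = list(chunked(user_ids))

print([len(batch) for batch in batches])
# => [100, 100, 50]
```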

Search for user

users = api.search_users(q, count=20)

The count argument may not be greater than 20 according to Tweepy docs, but you may use paging.

Tweepy docs: API.search_users

Get followers of a user

Followers method

Get the followers of a given user.

api.followers

Returns a user’s followers in the order in which they were added. If no user is specified by id/screen name, it defaults to the authenticated user.
Specify user ID or screen name.
Supports paging.
Returns a list of tweepy.User objects.

for user in api.followers(screen_name="foo"):
    print(user.screen_name)

Follower IDs method

Similar to above, but only returns user IDs and not users.

api.followers_ids

Returns an array containing the IDs of users following the specified user.
Specify user ID or screen name.
Supports paging.
Returns a list of int objects.
This can be useful if you want to map user IDs to user IDs in a graph of followers and maybe combined with tweet IDs, without actually using the profile data like screen name.

for user_id in api.followers_ids(screen_name="foo"):
    print(user_id)

With paging:

cursor = tweepy.Cursor(
    api.followers_ids,
    screen_name="foo",
    count=100
)
user_id_pages = list(cursor.pages())

You can combine this approach with the Lookup users method, to look up a batch of users with known IDs or screen names.

users = []
for user_ids in user_id_pages:
    users.extend(api.lookup_users(user_ids=user_ids))

You will have to split the user IDs into batches of at most 100 items so that the query will work. Here we use the pages from above, so the IDs are already batched.

This uses two steps to get IDs and the users, so consider the rate limit impact for the first and second step.

Rate limits on follower approaches

See Rate Limits on the Twitter Policies page for details.

If you want to see which approach works better for you at scale, see these references from people who have done research:

Tweepy issue 627

API	Max Return/Call Size	Requests / 15-min window	Total Results Per Window
followers/list	200	15	3000
followers/ids	5000	15	75000
users/lookup	100	180	18000

StackOverflow

Twitter provides two ways to fetch the followers

Fetching full followers list (using followers/list in Twitter API or api.followers in tweepy) - Alec and mataxu have provided the approach to fetch using this way in their answers. The rate limit with this is you can get at most 200 * 15 = 3000 followers in every 15 minutes window.
Second approach involves two stages:-
a) Fetching only the followers ids first (using followers/ids in Twitter API or api.followers_ids in tweepy).you can get 5000 * 15 = 75K follower ids in each 15 minutes window.
b) Looking up their usernames or other data (using users/lookup in twitter api or api.lookup_users in tweepy). This has rate limitation of about 100 * 180 = 18K lookups each 15 minute window.
Considering the rate limits, Second approach gives followers data 6 times faster when compared to first approach.
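The arithmetic behind the comparison above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the table; the variable names are illustrative, and the two-step approach is assumed to be limited by its slower step.

```python
# Figures from the table above: (max results per call, calls per 15-min window).
LIMITS = {
    "followers/list": (200, 15),
    "followers/ids": (5000, 15),
    "users/lookup": (100, 180),
}

per_window = {name: size * calls for name, (size, calls) in LIMITS.items()}

# Approach 1: full user objects directly.
approach_one = per_window["followers/list"]

# Approach 2: IDs first, then lookup. The slower step is the bottleneck.
approach_two = min(per_window["followers/ids"], per_window["users/lookup"])

print(approach_one, approach_two, approach_two / approach_one)
# => 3000 18000 6.0
```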

Get tweets

If you want to do a search for tweets based on hashtags or phrases or that are directed at a user, go to the Search API section.

Links:

Twitter API: Timelines overview
Twitter API: Post, retrieve, and engage with Tweets

Get a user's most recent status

This may be truncated since you can't specify tweet mode as extended.

Note that the Twitter API says this is supplied if available - but this is not guaranteed, especially during high activity, so make your application robust enough to handle this.

Get the most recent status on a user object.

user.status

See the Get user section for getting a user.

Get exactly one status for a given user.

api.user_timeline(screen_name, count=1)

Get my timeline

Get tweets from your own user's timeline, as a mix of your own and your friends' tweets.

tweets = api.home_timeline()

Returns the 20 most recent statuses, including retweets, posted by the authenticating user and that user’s friends. This is the equivalent of /timeline/home on the Web.

Tweepy docs: API.home_timeline

Get a user's timeline

Get the most recent tweets by a user. You can specify user_id or screen_name to target a user.

screen_name = "foo"
tweets = api.user_timeline(screen_name=screen_name)

If you don't specify a user, the default behavior is for the authenticated user.

tweets = api.user_timeline()

The API doesn't say what the default is but the max without paging is 200, so you can request 1 to 200 without paging.

tweets = api.user_timeline(count=200)

Tweepy docs: API.user_timeline
Twitter API docs: GET statuses/user_timeline - note daily limit of 100k tweets and getting 3,200 most recent tweets, otherwise there is not really a date restriction on how many days or years you can go back to.

Fuller examples

Get the latest 200 tweets of a user.

See Extended message section regarding the Tweet mode parameter.

screen_name = "foo"
tweets = api.user_timeline(
    screen_name=screen_name,
    count=200,
    tweet_mode="extended",
)

for tweet in tweets:
    try:
        print(tweet.full_text)
    except AttributeError:
        print(tweet.text)

Using paging to get 1000 tweets - 3200 is the max for a timeline.

screen_name = "foo"
cursor = tweepy.Cursor(
    api.user_timeline,
    screen_name=screen_name,
    count=200,
    tweet_mode="extended",
)

for tweet in cursor.items(1000):
    try:
        print(tweet.full_text)
    except AttributeError:
        print(tweet.text)

Get expanded message on a user's retweets

Note that even though we use extended mode to show expanded rather than truncated tweets, the message of a retweet will still be truncated. So you can use this approach to get the full message on the original tweet.

Example from source.

tweets = api.user_timeline(id=2271808427, tweet_mode="extended")

# This is still truncated.
tweets[6].full_text
# => 'RT @blawson_lcsw: So proud of these amazing @HSESchools students who presented their ideas on how to help their peers manage stress in mean…'

# Original expanded text.
tweets[6].retweeted_status.full_text
# => 'So proud of these amazing @HSESchools students who presented their ideas on how to help their peers manage stress in meaningful ways! Thanks @HSEPrincipal for giving us your time!'

Tweepy docs: Handling Retweets in Extended Tweets guide.

Get the latest tweet from users

You can use this approach, which is fine to do for one user.

tweets = api.user_timeline(count=1)

If you need to go through 100 users and get their latest tweet, this would take 100 separate requests.

A more efficient way would be to look up the 100 profiles at once and then get the latest tweet on each user object.

screen_names = ["foo", "bar", "baz"]

users = api.lookup_users(screen_names=screen_names)

Getting the latest tweet on each user is not covered here.

Fetch tweets by ID

If you know the ID of a tweet, you can fetch it. This is useful if you want to find the latest engagements count on a tweet, or if you have a list of just IDs from outside Tweepy and you want to turn them into Tweepy objects so you can get the message, author, date, etc.

Lookup a single tweet

tweet_id = 123
api.get_status(tweet_id)

Tweepy docs: API.get_status

Lookup many tweets

tweet_ids = [123, 456, 789]
api.statuses_lookup(tweet_ids)

Tweepy docs: API.statuses_lookup

Get retweets of a tweet

Get up to 100 retweets on a given tweet.

tweet_id = 123
count = 100
retweets = api.retweets(tweet_id, count)

Tweepy docs: API.retweets

Get the target of a reply

Get the tweet that the current tweet replies to, if there is one.

original_tweet_id = tweet.in_reply_to_status_id

if original_tweet_id is not None:
    original_tweet = api.get_status(original_tweet_id)

Get user who was the target of the reply.

original_user = tweet.in_reply_to_user_id

Get the target of a retweet

If the current tweet is a retweet (i.e. starts with "RT @") then it will have the original tweet as an attribute. Use this code to get the original tweet and default to None if it does not exist.

original_tweet = getattr(tweet, "retweeted_status", None)

You can get the ID or author on that tweet.

Or you can just check if the tweet is a retweet by checking if the value is None.
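As a small sketch of that check, the `getattr` lookup can be wrapped in a helper. The name `is_retweet` is hypothetical, and the objects below are stand-ins built with `SimpleNamespace` rather than real Tweepy objects.

```python
from types import SimpleNamespace


def is_retweet(tweet):
    """Return True if the tweet object carries an original (retweeted) tweet."""
    return getattr(tweet, "retweeted_status", None) is not None


# Stand-in objects for illustration; real ones would be fetched with Tweepy.
original = SimpleNamespace(id=1, text="Hello")
retweet = SimpleNamespace(id=2, text="RT @foo: Hello", retweeted_status=original)

print(is_retweet(original), is_retweet(retweet))
# => False True
```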

Get media on a tweet

A tweet can have up to 4 media items on it and these can be photos, videos or GIFs.

Get the media by reading the entities attribute and getting the media field, which only exists if there are actually media items.

You must use extended mode, otherwise you will not see media.

Get a tweet - the example below uses api.get_status, but this can be applied to other cases.

tweet_id = 1256704946717822977
tweet = api.get_status(tweet_id, tweet_mode="extended")

Get the media list on the tweet.

media = tweet.entities.get("media", [])

Here we default to an empty list in case the key is not set.

Then you can get the HTTPS media URL on each item in the media list.

Example:

for item in media:
    url = item["media_url_https"]
    # => "https://pbs.twimg.com/media/EXC2A8vXgAEM7Nm.jpg"
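The steps above can be combined into one helper. This is a minimal sketch: `get_media_urls` is a hypothetical name, and the tweets here are stand-in objects that only mimic the `entities` attribute described in this section.

```python
from types import SimpleNamespace


def get_media_urls(tweet):
    """Return the HTTPS URLs of media items on a tweet, or an empty list."""
    media = tweet.entities.get("media", [])
    return [item["media_url_https"] for item in media]


# Stand-ins for illustration; real tweets would come from api.get_status.
with_media = SimpleNamespace(
    entities={
        "media": [
            {"media_url_https": "https://pbs.twimg.com/media/EXC2A8vXgAEM7Nm.jpg"}
        ]
    }
)
without_media = SimpleNamespace(entities={"hashtags": []})

print(get_media_urls(with_media))
print(get_media_urls(without_media))
```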

Get tweet engagements

See more on the models page of this guide.

Get favorites

tweet.favorite_count
# => 0

Get the favorites list. Supports paging.

tweet.favorites

Get retweets

tweet.retweet_count
# => 0

Get a list of retweets of the tweet. This has a max of 100 but supports paging.

api.retweets(tweet.id)

# Untested
retweets = tweet.retweets()

Get retweeters

Get the user IDs of the users who retweeted the tweet. This has a max of 100 but supports paging.

# Untested
retweeters = tweet.retweeters

Filter tweets by language

Twitter assigns a tweet a language e.g. en for English or it for Italian. These languages are available to filter by when doing a search or stream and you can also read the attribute on a fetched tweet.

Twitter dev docs: Supported languages

Twitter API docs: Get Supported Languages endpoint. There is some sample output there.

Where does the value come from?

These language labels are inferred from the content of the tweet.

Tweepy docs say "Language detection is best-effort.".

Warning: In my experience this is not reliable. Tweets appear as unknown language, or a user making several tweets which I can see are all in one language gets them labelled as different languages. If you still want to use language, you can continue.

What about the settings of the user?

There is no account setting to change what language you are posting in.

There is a Display Language setting in Twitter account settings, but this only controls how the interface appears to you. The help text for the setting explains that it does not affect the content of Tweets.

See the Search API section on this page for more details on how to do searches.

Show the language

tweets = api.search("python")

for tweet in tweets:
    print(tweet.lang, tweet.text)

Filter on the result

tweets = api.search("python")

for tweet in tweets:
    if tweet.lang == 'en':
        print(tweet.text)

Filter query

Some endpoints let you specify languages so that only matching tweets will be returned.

Search filtered by language

From the api.search docs:

lang – Restricts tweets to the given language, given by an ISO 639-1 code. Language detection is best-effort.

e.g.

tweets = api.search("python", lang="en")

Streaming filtered by language

Note use of languages, not lang.

e.g.

stream.filter(track=["python"], languages=["en"])

Get replies to a tweet

The only way to get replies to a tweet is using the Search API, which means you can only get replies which happened in the past week.

This approach gets all replies to a user with screen name foo. You can replace the handle with your own.

to:foo filter:replies

That can be tested in the browser, using the Twitter search bar.

Here is how to do it with Tweepy.

screen_name = "foo"
query = "to:{} filter:replies".format(screen_name)
tweets = api.search(query)

To get replies to a specific tweet, you'll have to check the tweet.in_reply_to_status_id attribute for a match on the target tweet's ID.

This can be further optimized by specifying a condition in the search which only shows tweets after the target tweet ID, but if you're iterating back from most recent tweets the way Twitter does then it only helps a bit.

You'll also have to apply recursive logic to get replies to replies.
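The matching and the replies-to-replies logic can be sketched as below, using a work list rather than explicit recursion. The function name `collect_thread` is hypothetical, and the tweets are stand-in objects; in practice the list would be the results of the search above.

```python
from types import SimpleNamespace


def collect_thread(tweets, target_id):
    """Return tweets replying to target_id, plus replies to those replies."""
    found = []
    pending = [target_id]
    while pending:
        parent_id = pending.pop()
        for tweet in tweets:
            if tweet.in_reply_to_status_id == parent_id:
                found.append(tweet)
                # Also look for replies to this reply later.
                pending.append(tweet.id)
    return found


# Stand-in search results for illustration.
tweets = [
    SimpleNamespace(id=2, in_reply_to_status_id=1),
    SimpleNamespace(id=3, in_reply_to_status_id=2),
    SimpleNamespace(id=4, in_reply_to_status_id=99),
    SimpleNamespace(id=5, in_reply_to_status_id=None),
]

print(sorted(tweet.id for tweet in collect_thread(tweets, 1)))
# => [2, 3]
```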

Engage with a tweet

Note that you should only use these actions if you included them in your dev application otherwise you may get blocked. Also if you have a read-only app, you can upgrade to a read and write app.

Please use these sparingly. The automation policy for the Twitter API allows use of these actions as long as they are not used indiscriminately. If you favorite or retweet every tweet on a timeline or in a stream, you may get blocked for spammy low-quality behavior. If you do a search for popular tweets matching a hashtag and engage with a few of them, this will be fine.

See this guide's Twitter policies page

Favorite

tweet.favorite()

Retweet

tweet.retweet()

Reply

See Create a reply section.

Post tweet

FAQs

Important: Please understand what you are allowed to tweet before doing it.

Can I reply to a tweet or `@mention` someone?

Yes, but only if they have first messaged you. The Twitter automation policy is strict on this. Please make sure you understand it before replying to tweets.

Doing a search for tweets and replying to them without the user opting in (such as by tweeting to you) is considered spammy behavior and will likely get your account shutdown.

Can I make a plain tweet?

If you just want to make a tweet message without replying or mentioning, yes you are allowed to do this using the API. For example a bot which posts content daily from Reddit or a weather or finance service. Or posts a random message from a list or posts a message from a schedule.

Tweet a text message

msg = 'Hello, world!'

tweet = api.update_status(msg)

Tweepy docs: API.update_status
Twitter API docs: POST statuses/update

To choose a random text message:

import random

msgs = ["Foo", "Bar baz"]
msg = random.choice(msgs)

Tweet a message with media

Upload an image or animated GIF. Video upload is not supported by Tweepy yet.

media_path = 'foo.gif'
msg = 'Hello, world!'

tweet = api.update_with_media(media_path, status=msg)

Tweepy docs: API.update_with_media.

Note that this method does still work, but the Tweepy docs say it is deprecated. The preferred approach is to use api.media_upload and then pass the returned media ID in the media_ids list parameter on the api.update_status method covered above.
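A minimal sketch of that preferred two-step approach follows. It assumes Tweepy 3.x, where the upload method is named `API.media_upload` and returns an object with a `media_id` attribute; check the Tweepy docs for your version. The helper name `tweet_with_media` is hypothetical, and `api` is the authenticated object from the auth section.

```python
def tweet_with_media(api, media_path, msg):
    """Upload a media file, then post a tweet that attaches it."""
    # Step 1: upload the file and get back a media object with an ID.
    media = api.media_upload(media_path)
    # Step 2: post the status, referencing the uploaded media by ID.
    return api.update_status(status=msg, media_ids=[media.media_id])


# Example call, left commented out because it posts to a real account:
# tweet = tweet_with_media(api, "foo.gif", "Hello, world!")
```

Passing `api` in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at a live account.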

Create a reply

A reply is a tweet directed at another tweet ID or user. When you reply to a tweet, it becomes a "thread" or "threaded conversation".

Read the Twitter policies page automation rules carefully before automating replies to users. Any message directed at a user without them requesting it from your bot can be considered spam by Twitter. Twitter docs are very specific on when you may reply.

A safe way to make replies is to reply to your own tweets only. This can be used to create a tweet chain such as a 10-part tutorial with text or images.

According to the Tweepy docs for this endpoint, you must do a mention of the screen name somewhere in your message along with using the reply parameter in order for your tweet to count as a reply.

Bearing the notices above in mind, here is how to create a reply.

Read more on the Twitter policies page of this guide.

Here is the general form:

tweet = api.update_status(
    message,
    in_reply_to_status_id=target_id,
)

Reply example

If you were replying to a tweet directed at your user:

target_id = tweet.id
screen_name = tweet.author.screen_name

msg = f"@{screen_name} thank you!"

api.update_status(
    msg,
    in_reply_to_status_id=target_id,
)

Reply

Below is how to make a reply chain, aka threaded tweets. This will make an initial tweet and then post each additional message as a reply to the previous tweet.

A novel way to make replies without hitting policy restrictions is to make a tweet and then reply to yourself. This means you could chain together a list of say 10 items, perhaps with pictures, and group them together. I've seen this before and it is a great way to overcome the character limit when writing a blog post.

Untested code - it might be better to reply to the initial ID only.

screen_name = api.me().screen_name

messages = [
    "foo bar",
    "fizz buzz",
    "#tweepy #twitterapi",
]
target_id = None

for message in messages:
    if target_id is None:
        print("Initial tweet!")
    else:
        print(f"Replying to tweet ID: {target_id}")
        message = f"@{screen_name} {message}"

    tweet = api.update_status(
        message,
        in_reply_to_status_id=target_id,
    )
    target_id = tweet.id

Handle time values

Tips on dealing with time values from the Twitter API

Date and time

The Twitter API often provides a datetime value in ISO 8601 format and Tweepy returns this to you as a string still.

e.g. "2020-05-03T18:01:41+00:00".

This section covers how to parse a datetime string to a timezone-aware datetime object, to make it more useful for calculations and representations.

import datetime


TIME_FORMAT_IN = r"%Y-%m-%dT%H:%M%z"


def parse_datetime(value):
    """
    Convert from Twitter datetime string to a datetime object.

    >>> parse_datetime("2020-01-24T08:37:37+00:00")
    datetime.datetime(2020, 1, 24, 8, 37, tzinfo=datetime.timezone.utc)
    """
    dt = ":".join(value.split(":", 2)[:2])
    tz = value[-6:]
    clean_value = f"{dt}{tz}"

    return datetime.datetime.strptime(clean_value, TIME_FORMAT_IN)

When splitting, we don't need the seconds or any decimal values. Plus, these have changed style between API versions, so they are unreliable. So we ignore everything after the 2nd colon (minutes) and pick up the timezone from the last 6 characters.

The datetime value from Twitter will always be in the UTC zone (GMT+00:00), regardless of your location or profile settings. Look up the datetime docs for more info.

Example usage:

>>> dt = parse_datetime(tweet.created_at)
>>> print(dt.year)
2020
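Because the parsed value is timezone-aware, it can be converted to another zone for display. The sketch below uses only the standard library; the UTC+02:00 target zone is an arbitrary assumption for illustration, and the starting value is built by hand to match the docstring example above.

```python
import datetime

# A timezone-aware UTC value, like the one in the docstring example above.
dt_utc = datetime.datetime(2020, 1, 24, 8, 37, tzinfo=datetime.timezone.utc)

# Hypothetical target zone: UTC+02:00.
local_zone = datetime.timezone(datetime.timedelta(hours=2))
dt_local = dt_utc.astimezone(local_zone)

print(dt_local.isoformat())
# => 2020-01-24T10:37:00+02:00

# Both values refer to the same instant, so the difference is zero.
print(dt_local - dt_utc)
# => 0:00:00
```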

Timestamp

If you get any numbers which are timestamps such as from the Rate Limit endpoint, you can convert them to datetime objects.

import datetime


timestamp = "1403602426"
datetime.datetime.fromtimestamp(float(timestamp))
# => datetime.datetime(2014, 6, 24, 11, 33, 46)
# The result is in local time (UTC+02:00 here), so it varies by machine.
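If you would rather get the same result on every machine, one option is to ask for a timezone-aware UTC value. This sketch uses the standard library's `tz` argument to `fromtimestamp`, with the same sample timestamp as above.

```python
import datetime

timestamp = "1403602426"

# Passing tz makes the result timezone-aware and independent of local settings.
dt = datetime.datetime.fromtimestamp(float(timestamp), tz=datetime.timezone.utc)

print(dt.isoformat())
# => 2014-06-24T09:33:46+00:00
```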

Search API

The Twitter Search API lets you get tweets made in the past 6 to 9 days. The approaches below take you from getting 20 tweets to thousands of tweets but always bound by the time restriction.

If you want a live stream of tweets, see the Streaming section.

If you want to go back more than a week and are willing to pay, see the Batch historical tweets API docs.

Query syntax

Twitter has a flexible search syntax for using "and" / "or" logic and quoting phrases.

Twitter API docs on search:

Be sure to use the standard docs as the premium operators do not work on the free search services.

You can test a search query out in the Twitter search bar before trying it in the API.

Search query examples

Basic

Some examples to demonstrate common use of the search syntax.

Single term
- foo
- #foo
- @some_handle
Require all terms. Note that AND logic is implied. The order does not matter.
- foo bar baz
- to:some_handle foo
- from:some_handle foo
Require at least one term - uses the OR keyword.
- foo OR bar
- #foo OR bar
Exact match on phrase. i.e. all words must be used and in order.
- "foo bar"
Exclusion - Using leading minus sign.
- foo -bar
Groups
- Require all groups.
  - (foo OR bar) (spam OR eggs)
  - (foo OR bar) -(spam OR eggs)
- Require any group.
  - (foo OR bar) OR (spam OR eggs)
Exact match on a phrase
- "Foo bar"
- "Foo bar" OR "Fizz buzz" OR spam

Searching is case insensitive.

The to and from operators are provided by the Twitter docs. Using @some_handle might give the same results as to:some_handle, but I have not tested this. Using @some_handle might include tweets by the user too.

When looking up a user, you may wish to leave off the @ to get more results which are still relevant, provided the handle is not a common word. I found this increases the volume.

When combining AND and OR functionality in a single rule, the AND logic is evaluated first, such that foo OR bar fizz is equivalent to foo OR (bar fizz). Parentheses are still preferred for readability.

Note for the last example above that double-quoted phrases must be before ordinary terms, due to a known Twitter Search API bug.
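To avoid tripping over that ordering rule, a query can be assembled with a small helper that always places the quoted phrases first. This is only a sketch; `build_or_query` is a hypothetical name and not part of Tweepy.

```python
def build_or_query(phrases=(), terms=()):
    """Join phrases and terms with OR, putting double-quoted phrases first."""
    quoted = ['"{}"'.format(phrase) for phrase in phrases]
    return " OR ".join(quoted + list(terms))


query = build_or_query(phrases=["Foo bar", "Fizz buzz"], terms=["spam"])
print(query)
# => "Foo bar" OR "Fizz buzz" OR spam
```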

Advanced

See the links in Query syntax section for more details.

Query	Description
`to:some_handle`	Mentions of user `@some_handle`.
`filter:retweets #bar`	Retweets only about `#bar`.
`-filter:retweets #bar`	Exclude retweets about `#bar`.
`filter:replies #bar`	Replies only about `#bar`.
`to:some_handle filter:replies`	Replies to `@some_handle`.

Tweepy search method

Tweepy docs: API.search - that section explains how it works and what the method parameters do.
Twitter API docs: Standard search API

Define query

Create a variable which contains your query. The query should be a single string, not a list, and should match exactly what you'd put in the Twitter.com search bar (which also makes it easy to test).

Examples:

Basic.
```
query = "#python"
```
Complex - Use the rules linked above or see the Query syntax section.
```
query = "foo bar"

query = "foo OR bar"
```
An exact match phrase in quotes - just change the outside to single quotes.
```
query = '"foo bar"'
```

Basic

Return tweets for a search query. Only gives 20 tweets by default, so read on to get more.

tweets = api.search(query)

Or use q explicitly, for the same result.

tweets = api.search(q=query)

Example of iterating over the results in tweets object:

def process_tweet(tweet):
    print(tweet.id, tweet.author, tweet.text)


for tweet in tweets:
    process_tweet(tweet)

Get a page of 100 tweets

With search API, you can specify a max of up to 100 items (tweets) per page. The other endpoints like user timelines seem to mostly allow up to 200 items on a page.

tweets = api.search(
    query,
    count=100
)

If you want to get the next 100 tweets after that, you could get the ID of the last tweet and use that to start the search at the next page, with max_id=last_tweet_id-1 (results come back newest first, so the next page holds older tweets). You'd also have to check when there are no Tweets left and then stop searching. However, it is much more practical to use Tweepy's Cursor approach to do paging, covered next.
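For illustration only, the manual approach might look like the sketch below. It assumes results arrive newest first, so each following page is requested with `max_id` set just below the last tweet seen. The function name `search_pages` is hypothetical, and `api` is the authenticated object from the auth section.

```python
def search_pages(api, query, max_pages=3):
    """Collect tweets page by page, walking back from the most recent."""
    tweets = []
    max_id = None
    for _ in range(max_pages):
        kwargs = {"q": query, "count": 100}
        if max_id is not None:
            kwargs["max_id"] = max_id
        page = api.search(**kwargs)
        if not page:
            break  # No tweets left.
        tweets.extend(page)
        # Next request: only tweets older than the last one seen.
        max_id = page[-1].id - 1
    return tweets


# Example call, left commented out because it needs a live api object:
# tweets = search_pages(api, "#python", max_pages=5)
```

This does not handle rate limits, which is one more reason the Cursor approach is usually the better choice.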

Get many tweets using paging

This approach uses Paging to do multiple requests for pages of up to 100 tweets each, allowing you to get thousands of tweets.

The Twitter API imposes rate limiting against a token, to prevent abuse. So, after you've met your quota of searches in a 15-minute window (whether new searches or paging on one search), you will have to wait until it resets and then do more queries. Any requests before then will fail (though other endpoints will have their own limits). This waiting can be turned on as a config option when setting up the auth object, as covered in the Installation section.

cursor = tweepy.Cursor(
    api.search,
    query,
    count=100
)

for tweet in cursor.items():
    process_tweet(tweet)

See Paging section for more info.

Extended message

It is useful to use extended mode when doing a search.

Do this with `tweet_mode='extended'`.
Twitter by default returns messages truncated to 140 characters (with an ellipsis), even though users may enter tweets up to 280 characters. So use this option to get the full message.
Note that retweet messages might still be truncated even with this option, but there is a workaround.

When using this option, make sure to use the tweet.full_text attribute and not tweet.text, but still allow a fallback to plain tweet.text, since the Tweepy docs say:

If status is a Retweet, it will not have an extended_tweet attribute, and status.text could be truncated.

Example:

tweets = api.search(
    query,
    tweet_mode="extended",
)

for tweet in tweets:
    try:
        print(tweet.full_text)
    except AttributeError:
        print(tweet.text)

Tweepy docs: Extended mode

As a function:

def get_message(tweet):
    """
    Robustly get a message on a tweet.

    This is ideal for extended mode, but also works on standard mode when tweets
    are truncated. And it handles retweets, which ALWAYS use the `.text`
    attribute even in extended mode according to the API docs.
    """
    try:
        return tweet.full_text
    except AttributeError:
        return tweet.text


print(get_message(tweet))

Result type

Set result_type to one of the following, according to Twitter API:

mixed - A balance of the other two. Default option.
recent - The tweets that are the most recent.
popular - The tweets with the highest engagements. Note that this list might be very short (just a few tweets) - compared with running the recent query.

result_type = "popular"
count = 100

tweets = api.search(
    query,
    count=count,
    result_type=result_type,
)

Limit date range

You can specify that the tweets should be up to a date. If you don't care about tweets in the last few days or you already stored them, this can be useful to go back further.

Add until as a parameter, with the date formatted as a YYYY-MM-DD string.

e.g.

api.search(
    q=query,
    until="2020-05-07",
)

You are still bound by the search API's limit of one week, so if you set until to be a week ago you'll get close to zero tweets.
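Rather than hardcoding the date, the until value can be computed relative to today. This is a sketch using only the standard library; `until_days_ago` is a hypothetical helper name, and the fixed `today` argument in the example is there only to make the output predictable.

```python
import datetime


def until_days_ago(days, today=None):
    """Return a YYYY-MM-DD string for `days` before today (or a given date)."""
    today = today or datetime.date.today()
    return (today - datetime.timedelta(days=days)).isoformat()


print(until_days_ago(3, today=datetime.date(2020, 5, 10)))
# => 2020-05-07
```

The returned string could then be passed as the until parameter shown above, keeping the number of days within the search window.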

Filter by location

Search for tweets at a point within a radius.

You can leave the search query parameter q unset and this will still work.

Format of a geocode value:

LATITUDE,LONGITUDE,RADIUS

Example usage:

api.search(geocode="33.333,12.345,10km")

api.search(geocode="37.781157,-122.398720,1mi")

Twitter API docs: Standard Search API - see geocode under Parameters.

Returns tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taking from the Geotagging API, but will fall back to their Twitter profile.
The parameter value is specified by latitude,longitude,radius, where radius units must be specified as either mi (miles) or km (kilometers).
Note that you cannot use the near operator via the API to geocode arbitrary locations; however you can use this geocode parameter to search near geocodes directly.
A maximum of 1,000 distinct "sub-regions" will be considered when using the radius modifier.
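Since the value is easy to mistype, a small helper can build the string in the LATITUDE,LONGITUDE,RADIUS format described above. This is a sketch; `make_geocode` is a hypothetical name, and the unit check simply mirrors the two units named in the Twitter docs quote.

```python
def make_geocode(latitude, longitude, radius, unit="km"):
    """Format a geocode value as LATITUDE,LONGITUDE,RADIUS with units."""
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    return f"{latitude},{longitude},{radius}{unit}"


print(make_geocode(33.333, 12.345, 10))
# => 33.333,12.345,10km
print(make_geocode(37.781157, -122.39872, 1, unit="mi"))
# => 37.781157,-122.39872,1mi
```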

Full search example

Get tweets for a keyword search (Basic)
Excluding replies and retweets based on the query (Advanced)
Getting as many tweets as possible by setting max count using paging (Get many tweets using paging)
Using full message text (Extended message)
For a given language - this is not reliable but it is an option (Filter tweets by language)

See Authentication page of this guide for setting up the api object.

search.py

def get_message(tweet):
    """
    Robustly get a message on a tweet.

    Even if not extended mode or is a retweet (always truncated).
    """
    try:
        return tweet.full_text
    except AttributeError:
        return tweet.text


query = "-filter:retweets -filter:replies python"
lang = "en"

cursor = tweepy.Cursor(
    api.search,
    q=query,
    count=100,
    tweet_mode="extended",
    lang=lang,
)

results = []

for tweet in cursor.items():
    parsed_tweet = {
        "id": tweet.id,
        "screen_name": tweet.author.screen_name,
        "message": get_message(tweet),
    }
    print(parsed_tweet)
    results.append(parsed_tweet)

print(len(results))

Get entities on tweets

Get media

How to get images on tweets.

This example is for the Search API but can work for other methods too such as User timeline.

Add include_entities to your request. This may not be needed on some endpoints, such as .search, where the default is True. Check the Tweepy docs.

Then use the media value, if one exists on a tweet's entities.

cursor = tweepy.Cursor(
    api.search,
    query,
    count=count,
    include_entities=True,
)

for tweet in cursor.items():
    if 'media' in tweet.entities:
        for image in tweet.entities['media']:
            print(image['media_url'])
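
If you need the URLs in more than one place, the check can be wrapped in a small function. This is a sketch that assumes tweet.entities is a plain dict, as in the loop above. The name get_media_urls is hypothetical, and the stand-in objects only imitate the shape of a Tweepy status so the snippet runs without API access.

```python
from types import SimpleNamespace


def get_media_urls(tweet):
    """Return the media URLs on a tweet, or an empty list if it has none."""
    return [item["media_url"] for item in tweet.entities.get("media", [])]


# Stand-in objects shaped like Tweepy statuses, for demonstration only.
with_image = SimpleNamespace(
    entities={"media": [{"media_url": "http://example.com/a.jpg"}]}
)
text_only = SimpleNamespace(entities={"hashtags": []})

print(get_media_urls(with_image))
print(get_media_urls(text_only))
```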

Streaming

This section focuses on the standard and free "filtered" Streaming API service. There are more services available, covered in the Other streams subsection.

What is streaming and how many tweets can I get?

The Search API gives about 90% of tweets and goes back 7 days, but you have to query it repeatedly if you want "live" data and this can result in reaching API limits.

The filtered streaming API lets you connect to the firehose of Twitter tweets made in realtime. You must specify a filter to apply - either keywords or users to track.

However, the volume is much lower than the search API.

Studies have estimated that using Twitter’s Streaming API users can expect to receive anywhere from 1% of the tweets to over 40% of tweets in near real-time.
The reason that you do not receive all of the tweets from the Twitter Streaming API is simply because Twitter doesn’t have the current infrastructure to support it, nor do they don’t want to support it; hence, the Twitter Firehose. source

Streaming resources

Tweepy

Streaming tutorial in the docs.
streaming.py module in the repo. This is useful to find or override existing methods.
- See StreamListener class.
- See Stream class and Stream.filter method.
streaming.py example script in the repo.
test_streaming.py - Python tests for streaming module.

Twitter API docs

Filter realtime Tweets
- Make sure to use "POST statuses/filter" as the other endpoints are premium only.
- Note deprecation warning:
  
  This endpoint will be deprecated in favor of the filtered stream endpoint, now available in Twitter Developer Labs.
POST statuses/filter endpoint reference
- Including URL and response structure.
- Including allowed parameters.
Basic stream parameters
- Covers parameters in more detail.
- filter_level
  - The default value is none, which is all available tweets. If you don't need all tweets or performance is an issue, you can set this to low or medium.
- language
  - You can set this to a standard code like en. However, when using the Search API I found the labels were inconsistent, even across several tweets from the same person. Twitter guesses the language; it doesn't use your settings.
Premium stream operators
- Additional parameters only available on the paid tier.

Setup stream listener class

Create a class which inherits from StreamListener.

Base

To get started, define a listener class using the example from the Tweepy docs Streaming tutorial.

class MyStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        """Called when a new status arrives"""
        print(status.text)


    def on_error(self, status_code):
        if status_code == 420:
            return False

That will:

Print a tweet immediately when it happens and then return None, which will keep the stream alive.
Disconnect when throttled by rate limiting by returning False. Rate limiting is not measured as requests in a window like the search API, which means you can get a high volume of tweets in realtime. Read the Rate limits section on the Twitter Policies page for more info.

Some people name this class as _StdOutListener.

Override more methods

Most of the methods just return nothing quietly, so you will need to override the methods you care about so you can print the output to the console or write to a CSV or database. Check the link above for the available methods - the docstrings explain them well.

For example, you could override on_direct_message to handle that event.

You might want to handle some errors, or at least add printing to help with debugging.

Here are some error methods:

Method	Description
`on_exception`	Called when an unhandled exception occurs
`on_limit`	Called when a limitation notice arrives
`on_error`	Called when a non-200 status code is returned
`on_timeout`	Called when stream connection times out
`on_disconnect`	Called when twitter sends a disconnect notice.

Twitter API docs: Streaming message types - includes error codes.
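
As a rough illustration of overriding the error methods, here is a mixin that prints what happened instead of staying silent. The class name LoggingHandlersMixin is made up for this guide. The method signatures follow the StreamListener methods in the table above, and the mixin is written without importing Tweepy so that it can be run on its own.

```python
class LoggingHandlersMixin:
    """Hypothetical mixin that prints stream problems instead of hiding them."""

    def on_error(self, status_code):
        print(f"Error status code: {status_code}")
        # Returning False disconnects, so stop on 420 (rate limited)
        # and keep the stream alive for other codes.
        return status_code != 420

    def on_limit(self, track):
        print(f"Limitation notice: {track}")

    def on_timeout(self):
        print("Stream connection timed out")

    def on_disconnect(self, notice):
        print(f"Disconnect notice: {notice}")


# With Tweepy installed you might combine it like this (an assumption,
# not run here):
# class MyStreamListener(LoggingHandlersMixin, tweepy.StreamListener): ...
```

Listing the mixin first means its methods take priority over the quiet defaults on StreamListener.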

Setup stream instance

myStreamListener = MyStreamListener()
stream = tweepy.Stream(auth=auth, listener=myStreamListener)

Some people do this in one line instead:

stream = tweepy.Stream(auth=auth, listener=MyStreamListener())

Start streaming

Follow the sections below to start streaming with the stream object.

Only the .filter method is covered here as that is accessible without a premium account.

Follow tweets from or to users

Stream public tweets relating to one or more users.

According to docs this includes:

Tweets and retweets from the user.
Replies and retweets to the user's tweets.
Original messages to user. i.e. Message starts with "@handle", but not mentions with the handle later in the message.

Twitter API docs: Basic stream parameters (see follow section).

First get the user IDs of one or more Twitter users to follow.

Make sure you specify user IDs and not screen names. If you need to, see the instructions on how to Lookup user ID for a screen name.

Then pass the follow parameter using a list of strings.

e.g.

user_ids = ["1234567", "456789", "9876543"]

stream.filter(follow=user_ids)

Follow tweets matching keywords

Use the track parameter and one or more terms, like keywords or hashtags or URLs.

Example:

track = ["foo", "#bar", "fizz buzz"]

stream.filter(track=track)

OR
- The Twitter API will look for a tweet which contains any (i.e. at least one) of the items in the list, so it uses OR logic.
AND
- Use a space between words to use AND logic. e.g. "fizz buzz".

You cannot use quoted phrases. The API doc says: "Exact matching of phrases (equivalent to quoted phrases in most search engines) is not supported.".

The docs say you can track a URL but recommend including a space between the parts for the most inclusive search, e.g. "example com".

UTF-8 characters are supported but must be used explicitly in your search. e.g. 'touché', 'Twitter’s'.

Twitter API docs: Basic stream parameters (see track section).
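
To get a feel for the OR and AND rules before connecting, you can try a track list against sample text locally. The sketch below is only a rough approximation written for this guide: it is not Twitter's actual tokenizer, and the function names are hypothetical. It treats each list item as an alternative (OR) and each space-separated word inside an item as required (AND).

```python
import re


def raw_tokens(text):
    """Split text into lowercase words, keeping a leading # or @."""
    return re.findall(r"[#@]?\w+", text.lower())


def matches_track(text, track):
    """Return True if any track term has all of its words in the text."""
    tokens = set()
    for token in raw_tokens(text):
        tokens.add(token)
        # Also allow a plain keyword to match a hashtag or mention.
        tokens.add(token.lstrip("#@"))

    for term in track:
        words = raw_tokens(term)
        if words and all(word in tokens for word in words):
            return True
    return False


track = ["foo", "#bar", "fizz buzz"]
print(matches_track("Fizz, then buzz!", track))
print(matches_track("Only fizz here", track))
```

The first call prints True because both words of "fizz buzz" appear, in any order. The second prints False because only one of them does.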

Full stream examples

tweepy_docs_example.py

"""
Streaming demo - Tweepy docs example.

Based on tutorial: http://docs.tweepy.org/en/latest/streaming_how_to.html
"""
import tweepy


CONSUMER_KEY = ""
CONSUMER_SECRET = ""
ACCESS_TOKEN = ""
ACCESS_SECRET = ""


class MyStreamListener(tweepy.StreamListener):

    def on_status(self, status):
        print(status.text)

    def on_error(self, status_code):
        if status_code == 420:
            # Returning False in on_error disconnects the stream on rate limiting.
            # This is recommended.
            return False

        # Returning non-False reconnects the stream, with backoff.


auth = tweepy.auth.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=auth, listener=myStreamListener)

# Follow tweets with the word "python".
# Note that this command is blocking, so any lines after this will not execute.
myStream.filter(track=["python"])

# Use async flag so that a separate thread is used.
# myStream.filter(track=['python'], is_async=True)

# Follow user ID "2211149702"
# myStream.filter(follow=["2211149702"])

tweepy_example_repo_example.py

"""
Stream watcher - from Tweepy example repo.

Based on PY 2 script here: https://github.com/tweepy/examples/blob/master/streamwatcher.py
"""
import time
from getpass import getpass
from textwrap import TextWrapper

import tweepy


CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_TOKEN = ''
ACCESS_SECRET = ''


class StreamWatcherListener(tweepy.StreamListener):

    status_wrapper = TextWrapper(width=60, initial_indent='    ', subsequent_indent='    ')

    def on_status(self, status):
        try:
            print(self.status_wrapper.fill(status.text))
            print('\n %s  %s  via %s\n'
                % (status.author.screen_name, status.created_at, status.source))
        except Exception:
            # Catch any unicode errors while printing to console
            # and just ignore them to avoid breaking application.
            pass

    def on_error(self, status_code):
        print('An error has occurred! Status code = %s' % status_code)

        # Keep stream alive.

        return True

    def on_timeout(self):
        print('Snoozing Zzzzzz')


auth = tweepy.auth.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)

stream = tweepy.Stream(
    auth,
    StreamWatcherListener(),
    timeout=None
)

Update stream

If you want to update a stream, you must stop it and then start a new stream, according to this Twitter dev page.

One filter rule on one allowed connection, disconnection required to adjust rule.

This also means you are not allowed to have more than one stream running at a time per account, whether in the same script, on the same machine or on another machine.

One way is to stop your application, reconfigure it and then start it again.

If you want to keep the script running when switching streams, you can restart like this:

track = ["foo"]
# Start initial stream.
stream.filter(track=track, is_async=True)

time.sleep(5)

# Update. This won't get applied yet.
track.append("bar")

# Stop.
stream.running = False

# Start again.
stream.filter(track=track, is_async=True)

The gap will hopefully be very short so you don't lose much.

How do I stream faster?

The streaming API is meant to be realtime but you may still experience a delay. In one case I heard that a posted tweet took 5 seconds to appear in the stream, which I'd say is still good.

This delay might just be built into the way the Twitter API works.

Here are some ideas to improve performance when streaming:

The obvious ones - improve your internet connection speed or improve your hardware. Use a remote machine through AWS to "rent" a machine in the cloud dedicated to your application. Besides choosing higher specs than your local machine, it can also be online and run 24/7.
Run your script in unbuffered mode. Rather than waiting until the console output meets a threshold, tell Python to print immediately.
- e.g.
```
  python -u script.py
```
If your performance bottleneck is processing the tweet locally (writing to CSV or database), you can make that task asynchronous by using RabbitMQ or similar.
- This may not improve the delay, but it will make sure your application can process every tweet that the Twitter Streaming API sends you and that you don't get disconnected (which can happen if the Twitter Streaming API decides you are handling the offloaded tweets too slowly).
- Example of repo which does this (though it's archived, so it's not maintained and might not work).
  - ukgovdatascience/twitter-mq-feed
    
    A script that gets data from the Twitter real-time API, passes it to a message-queue (e.g. RabbitMQ) and stores tweets into MongoDB
If using the premium streaming API, use an advanced filter.
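
The idea of taking slow processing off the receiving path can be tried without installing a message broker. The sketch below swaps RabbitMQ for the standard library's in-process queue.Queue and a worker thread, which is a simpler stand-in and not what the linked repo does. The names slow_store and worker are made up for illustration.

```python
import queue
import threading
import time

tweet_queue = queue.Queue()
stored = []


def slow_store(text):
    """Stand-in for a slow write to a CSV file or database."""
    time.sleep(0.01)
    stored.append(text)


def worker():
    """Take items off the queue until the None sentinel arrives."""
    while True:
        text = tweet_queue.get()
        if text is None:
            break
        slow_store(text)


thread = threading.Thread(target=worker)
thread.start()

# In a listener, on_status would only do this cheap put() call.
for text in ["first tweet", "second tweet", "third tweet"]:
    tweet_queue.put(text)

tweet_queue.put(None)  # Tell the worker to stop.
thread.join()
print(stored)
```

Unlike a real message queue, anything still waiting in memory is lost if the script crashes, so treat this as a way to test the approach rather than a replacement.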

Other streams

Decahose

Enterprise stream to get 10% of tweets.

Twitter API docs: Decahose API reference

Powertrack

Enterprise stream to get 100% of tweets.

The PowerTrack API provides customers with the ability to filter the full Twitter firehose, and only receive the data that they or their customers are interested in.

Twitter API docs: Powertrack API reference

Lab streams

Experimental Twitter API endpoints.

Labs V2 Overview
Sample stream v1 (replaces Sample realtime tweets endpoint)

The sampled stream endpoint allows developers to stream about 1% of all new public Tweets as they happen. You can connect no more than one client per session, and can disconnect and reconnect no more than 50 times per 15 minute window.
Filtered stream v1

The filtered stream endpoints allow developers to filter the real-time stream of public Tweets. Developers can filter the real-time stream by applying a set of rules (specified using a combination of operators), and they can select the response format on connection.
This preview contains a streaming endpoint that delivers Tweets in real-time. It also contains a set of rules endpoints to create, delete and dry-run rule changes. During Labs, you can create up to 10 rules (each one up to 512 characters long) can be set on your stream at the same time. Unlike the existing statuses/filter endpoint, these rules are retained and are not specified at connection time.
COVID-19 stream

How do I store tweets?

You can easily write to a CSV file using the Python csv module.
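
Here is a minimal sketch of that, using csv.DictWriter from the standard library. The field names match the parsed tweet dicts used in the search example of this guide. The function name write_tweets_csv and the temporary output path are assumptions for the demo.

```python
import csv
import os
import tempfile

FIELDS = ["id", "screen_name", "message"]


def write_tweets_csv(path, rows):
    """Write a list of dicts to a CSV file with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f_out:
        writer = csv.DictWriter(f_out, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# Mock data that would be fetched from the API.
rows = [
    {"id": 123, "screen_name": "foo", "message": "Hello, world!"},
    {"id": 124, "screen_name": "bar", "message": "Hello, Tweepy!"},
]
path = os.path.join(tempfile.mkdtemp(), "tweets.csv")
write_tweets_csv(path, rows)
print(path)
```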

Here are some options for storing in a database.

Twitter MQ feed - this project stores in MongoDB.
Streaming Twitter Data into a MySQL Database

SQLite

Demo script using SQLite

"""
Python SQLite demo.

The sqlite3 library is a Python builtin. Read more in the Python 3 docs:
    https://docs.python.org/3/library/sqlite3.html

See also the SQLite docs:
    https://www.sqlite.org/docs.html
"""
import sqlite3


conn = sqlite3.connect('db.sqlite')
cur = conn.cursor()


create_sql = """
    CREATE TABLE IF NOT EXISTS tweet(
        id INTEGER PRIMARY KEY,
        status_id INTEGER,
        screen_name VARCHAR(30),
        message VARCHAR(255)
    )
"""
cur.execute(create_sql)
conn.commit()


# Mock data that would be fetched from the API.
# Note each item in the list is a list.
tweets = [
    [123, "foo", "Hello, world!"],
    [124, "bar", "Hello, Tweepy!"],
]

# Note that id is not known upfront but can be left to autoincrement by specifying NULL.
insert_sql = """
    INSERT INTO tweet VALUES (NULL, ?, ?, ?)
"""
cur.executemany(insert_sql, tweets)

fetch_sql = """
    SELECT *
    FROM tweet
"""
cur.execute(fetch_sql)
print(cur.fetchall())

conn.commit()

conn.close()

Direct messages

Methods relating to Twitter account direct messages.

Please ensure you comply with the Twitter API policies and do not spam users. See Twitter policies page to find links to appropriate docs.

Tweepy API docs: Direct message methods

Twitter API docs:

Sending and receiving events overview

Receiving messages events
You can retrieve Direct Messages from up to the past 30 days with GET direct_messages/events/list.
Consuming Direct Messages in real-time can be accomplished via webhooks with the Account Activity API.

List messages

Get direct messages to the authenticated Twitter account (such as your bot) in the last 30 days.

dms = api.list_direct_messages()

The default value for count is 20 and this can be increased to 50.

If you need to get more than that, use paging.

tweepy.Cursor(api.list_direct_messages, count=50).items(200)

Twitter API docs: List messages endpoint

Get message

Fetch a message by known ID.

dm_id = dms[0].id
dm = api.get_direct_message(dm_id)

Twitter API docs: Show message endpoint

Get attributes on a message object

Get the text of a message.

  dm.message_create['message_data']['text']

Get recipient user ID:

  dm.message_create['target']['recipient_id']

See the Direct message section on the Models page to see a preview of the full structure.

Show all data

Print the entire object, prettified with the json builtin library.

import json
print(json.dumps(dm.message_create, indent=4))

Filter to messages from a certain user

user_id = "12345"

filtered_dms = [dm for dm in dms if dm.message_create['target']['recipient_id'] == user_id]

We use a list comprehension here with an if condition, as it is faster than a standard for loop and can also be more readable (since it fits on one line and there's no .append step needed).

If you don't have a user ID, see Lookup user ID for a screen name.

Here's a more complete example:

dms = api.list_direct_messages()

screen_name = "foo"
user_id = api.get_user(screen_name).id

for dm in dms:
    if dm.message_create['target']['recipient_id'] == str(user_id):
        print(dm.message_create['message_data']['text'])
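
The examples above match on recipient_id, which is the user who received the message. The message_create data in the Twitter API also carries a sender_id, so to keep only messages sent by a certain user you could compare against that instead. This is a sketch: dms_from_sender and make_dm are hypothetical helpers, and the stand-in objects only imitate the shape of a Tweepy direct message so the snippet runs without API access.

```python
from types import SimpleNamespace


def dms_from_sender(dms, sender_id):
    """Keep only the messages whose sender_id matches the given user ID."""
    # IDs in message_create are strings, so convert before comparing.
    return [dm for dm in dms if dm.message_create["sender_id"] == str(sender_id)]


def make_dm(sender_id, recipient_id, text):
    """Build a stand-in shaped like a direct message, for the demo only."""
    return SimpleNamespace(
        message_create={
            "target": {"recipient_id": recipient_id},
            "sender_id": sender_id,
            "message_data": {"text": text},
        }
    )


dms = [make_dm("12345", "999", "Hi"), make_dm("777", "999", "Hello")]

for dm in dms_from_sender(dms, 12345):
    print(dm.message_create["message_data"]["text"])
```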

Send message

Send a direct message to given user ID.

user_id = "123"
msg = "Hello, world!"

api.send_direct_message(user_id, msg)

If you don't have a user ID, see Lookup user ID for a screen name.

Twitter API docs: Create message endpoint - see optional parameters like quick_reply and attachment.

Get rate limit status

Twitter provides an endpoint to get the rate limit status for your token across all endpoints at once.

data = api.rate_limit_status()

The response is a dict which you can look up like this:

data['resources']['statuses']['/statuses/home_timeline']
data['resources']['users']['/users/lookup']

See more on the Rate limit status section of the models page.
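
Rather than looking up one endpoint at a time, you could scan the whole response for endpoints that are running low. The sketch below assumes each endpoint entry holds limit, remaining and reset keys, as shown in the Twitter API docs. The function name low_endpoints and the sample numbers are made up.

```python
def low_endpoints(data, threshold=0):
    """Return (endpoint, remaining, limit) for endpoints at or below threshold."""
    found = []
    for group in data["resources"].values():
        for endpoint, stats in group.items():
            if stats["remaining"] <= threshold:
                found.append((endpoint, stats["remaining"], stats["limit"]))
    return sorted(found)


# Sample shaped like the documented response. The numbers are made up.
sample = {
    "resources": {
        "statuses": {
            "/statuses/home_timeline": {"limit": 15, "remaining": 0, "reset": 1403602426},
        },
        "users": {
            "/users/lookup": {"limit": 900, "remaining": 900, "reset": 1403602426},
        },
    }
}

print(low_endpoints(sample))
```

With real data you would call low_endpoints(api.rate_limit_status()) instead of using the sample.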

Twitter API docs: Get app rate limit status

There is also a way to get the rate limit stats on the response object on a successful call, though this is not covered here.