
Twitter is a treasure chest. There are 330 million of us using Twitter each month to share our thoughts, appraisals and outcries with the world. The vast corpus of wild data this produces each day is useful to data scientists in myriad ways. From training sentiment classification models to understanding public opinion about businesses and individuals, Twitter data can open up research opportunities for many different purposes.

So how do we go about collecting it? At Graphext, we developed Tractor, a desktop application that scrapes data from popular platforms, including Facebook and Twitter, without any code. Since Tractor is available only to Graphext PRO users, we created this guide and an accompanying Google Colab notebook to help everyone access all kinds of data from Twitter.

"Because words have deep meaning, tweets have power."

- Germany Kent

To collect data from Twitter, you need keys to the Twitter API. This guide walks through the process of obtaining the keys required to authenticate your requests for data from Twitter. Once you have a consumer key, a consumer secret, an access token and an access token secret, you can skip straight to the notebook. (The consumer key and consumer secret are what the developer portal calls the API key and API secret key.) Simply enter a query using the Twitter query language and watch the tweets roll in. Finally, our notebook will save your data as a CSV file.

Accessing the Twitter API

API documentation is generally horrible stuff. Stay close and follow the steps below!

To access the Twitter API, you need a Twitter account and a Twitter developer account. Sign up for a developer account here.

Creating a Twitter Developer Account

  1. Start here.
  2. Click 'Apply for developers account'.
  3. Select 'Hobbyist' or whichever category suits you best. We will use 'Hobbyist'.
  4. Select 'Exploring the API' and then 'Get Started'.
  5. Complete the text fields on the 'Basic info' form and select 'Next'.
  6. Inside the 'Intended use' form, Twitter requires some detailed information from you regarding how you intend to use their API. Take your time and fill in each text box thoroughly. When you are finished, click 'Next'.
  7. Inside the 'Review' form, review the information you've just provided.
  8. Inside the 'Terms' form, check over the terms of use and tick the checkbox.
  9. Select 'Submit Application'.
  10. Done ... Twitter will review your application and get back to you.

Once your developer account has been approved, you can create an 'app', which gives you access to the keys required to start making requests. Head over to the 'Developer Portal' to start creating an app.

Creating a Twitter Developer App

  1. Start here.
  2. Click 'Developer Portal' from the top menu.
  3. Select 'Overview' within the 'Projects & Apps' dropdown from the sidebar of your Twitter developer dashboard.
  4. Under 'Standalone Apps', click 'Create App'.
  5. Enter a name for your app.
  6. Select 'Complete'.
  7. Done ... Twitter will now present you with your API key and your API secret key. Make a note of these!

Once you've created an app, you have access to all of the keys you need to retrieve data from Twitter. If you need to regenerate these keys, navigate to your app's 'Keys and tokens' window. The same window is where you can generate your access token and access token secret.

The 'Key' Information

API Key

API Secret

Access Token

Access Token Secret
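If you would rather not paste these four values directly into a notebook cell, one option is to read them from environment variables. The sketch below is only an illustration: the variable names (TWITTER_API_KEY and so on) and the load_credentials helper are our own hypothetical choices, not something the notebook or Twitter requires.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
ENV_NAMES = {
    "api_key": "TWITTER_API_KEY",
    "api_secret": "TWITTER_API_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_credentials(environ=os.environ):
    """Return the four Twitter keys as a dict, or fail loudly if any is missing."""
    missing = [name for name in ENV_NAMES.values() if not environ.get(name)]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return {field: environ[name] for field, name in ENV_NAMES.items()}
```

Calling load_credentials() with no arguments reads from the real environment. Passing a plain dictionary instead makes it easy to check the function without touching your actual keys.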

Collecting Data from Twitter

Now that you've set up your app within your Twitter developer portal, you can use the keys provided to access data from Twitter. Our Google Colab notebook provides all of the code required to do this using a Python library called 'tweepy'. Simply add your key information into the relevant variable placeholders, set a search term using the Twitter query language and run the notebook to start collecting data.


Making a Query

Any text you save to the variable 'search_query' will be used to perform your request for data from Twitter. Tweets matched against this query will be retrieved and stored locally within the notebook before you download them as a CSV file.
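To give a rough idea of what a request like this can look like with tweepy, here is a minimal sketch. It is not the notebook's actual code: the function names (status_to_row, collect_tweets), the credentials dictionary keys and the chosen columns are hypothetical. It also assumes tweepy is installed, and it allows for the search method being named 'search' in tweepy 3.x and 'search_tweets' in tweepy 4.x.

```python
def status_to_row(status):
    """Flatten one tweet object into a plain dict suitable for a CSV row."""
    # 'full_text' is only present when tweets are requested in extended mode,
    # so fall back to the shorter 'text' attribute when it is missing.
    text = getattr(status, "full_text", None) or getattr(status, "text", "")
    return {
        "id": status.id_str,
        "created_at": str(status.created_at),
        "user": status.user.screen_name,
        "text": text,
    }


def collect_tweets(search_query, credentials, limit=100):
    """Fetch up to `limit` tweets matching `search_query` (needs valid keys)."""
    import tweepy  # third-party library; imported here so the helper above works without it

    auth = tweepy.OAuthHandler(credentials["api_key"], credentials["api_secret"])
    auth.set_access_token(
        credentials["access_token"], credentials["access_token_secret"]
    )
    api = tweepy.API(auth, wait_on_rate_limit=True)

    # tweepy 4.x names the method search_tweets; tweepy 3.x names it search.
    search = getattr(api, "search_tweets", None) or api.search
    cursor = tweepy.Cursor(search, q=search_query, tweet_mode="extended")
    return [status_to_row(status) for status in cursor.items(limit)]
```

A call such as collect_tweets(search_query, credentials, limit=500) would return a list of dictionaries, one per tweet, ready to be written out as rows.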

Twitter's query language follows the same structure used to find tweets in the search box on Twitter. Use double quotation marks to match exact phrases, and use OR as well as AND to match your query to more than one term. You can also use the Twitter query language to find tweets belonging to specific users or users within a list.

Finally, use the 'since' and 'until' parameters to set a date range for your query. The 'lang' parameter restricts your data to tweets posted in a specific language. Removing these parameters removes the corresponding date or language criteria from your query.
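As an illustration of how these pieces fit together, the sketch below assembles a query string in the form the Twitter search box accepts, with the date range and language written as 'since:', 'until:' and 'lang:' operators inside the query itself. The build_query helper is hypothetical, and the notebook may pass these values as separate parameters instead.

```python
def build_query(terms, from_user=None, since=None, until=None, lang=None):
    """Assemble a Twitter search query string from its parts.

    terms:       list of words, any of which may match (joined with OR)
    from_user:   optional screen name, to restrict results to one account
    since/until: optional dates as 'YYYY-MM-DD' strings
    lang:        optional language code such as 'en'
    """
    parts = []
    if terms:
        joined = " OR ".join(terms)
        # Parentheses keep the OR group separate from the filters that follow.
        parts.append(f"({joined})" if len(terms) > 1 else joined)
    if from_user:
        parts.append(f"from:{from_user}")
    if since:
        parts.append(f"since:{since}")
    if until:
        parts.append(f"until:{until}")
    if lang:
        parts.append(f"lang:{lang}")
    return " ".join(parts)


search_query = build_query(
    ["graphext", "tractor"], since="2020-01-01", until="2020-02-01", lang="en"
)
```

Leaving out since, until or lang simply omits that filter from the resulting string.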

Exporting the Data

Although our notebook is hosted on Google Colab, we have included a command at the bottom of the file to export your data as a CSV file to your local computer. Running the final command will open a file explorer on your computer, allowing you to save the data locally.
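For readers curious about what an export step like this involves, here is a minimal sketch using Python's built-in csv module. The save_rows_as_csv helper and the column names are hypothetical and may not match what the notebook actually writes.

```python
import csv


def save_rows_as_csv(rows, path, fieldnames=("id", "created_at", "user", "text")):
    """Write a list of tweet dictionaries to a CSV file and return its path."""
    # newline="" lets the csv module handle line endings itself, so tweets
    # containing line breaks or commas are quoted correctly.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return path


# Inside Google Colab, a browser download of the saved file can then be
# triggered with the google.colab helper module:
#   from google.colab import files
#   files.download("tweets.csv")
```

Outside Colab the file is simply written to the given path, so the download step is not needed.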

To run the notebook from your own Google Drive, first make a copy of it there. Once the copy is in your Drive, running the code will save your data inside your own Drive.

How to Make a Copy of the Notebook

  1. Open the notebook.
  2. Select 'File' from the top toolbar.
  3. Select 'Save a copy in Drive'.
  4. Done ... Your browser will open the notebook copy you just made in your own Drive. Now, run this file to access some Twitter data.

Alternatively, you can download the notebook and run it locally on your computer using a tool such as Jupyter Notebook. Using this method will also save your retrieved data to your local folders.

After you've got your hands on the CSV file full of tweets from your search query, you can analyze it any way you want, including uploading it to Graphext. With Graphext you can run prediction models to understand the relationships between tweets, cluster them based on the similarity of their topics or identify key community members.
