February 2, 2022
Data-Driven SEO: A Keyword Optimization Guide using Web Scraping & Co-occurrence Analysis (Graphext + Deepnote + Adwords)
To improve our SEO, we built a data-driven method for analyzing the content of top-ranking Google search results as part of a keyword optimization process. Starting with a single search term, our technique uses web scraping and NLP to find specific keywords that are already proven to boost the rank of similar pages.
December 8, 2021
When Dating Apps Met Survey Theory: Sampling, Weighting & Romance
Most surveys hope to paint a picture of a population. Who doesn't want to know which Tinder personality traits help a person to be successful in love? We're taking a look at the fundamentals of survey theory - sampling & weighting - through the lens of a Pew Research survey that examines American attitudes towards relationships and dating apps in 2021.
November 24, 2021
Reverse Engineering Infamous Marketing Strategies from Innocent Drinks
Why are the social media strategies of Innocent Drinks considered the gold standard for marketing teams the world over? We collected every tweet (10,521) posted by the communication department to deconstruct Innocent's content, style, reach and engagement with a simple topic analysis.
October 26, 2021
How to Perform Simple & Effective Customer Segmentation | A Walkthrough with Data from a Delicatessen
Customer segmentation involves splitting a customer base into distinct groups. These customer segments are defined by specific and shared characteristics, behaviours or preferences that help businesses to spot patterns and associate customers with one another. This article walks through the steps involved in a simple customer segmentation analysis. Using sales data from a delicatessen, we'll segment customers according to their buying preferences and behaviour. To achieve this, we'll use a powerful machine learning technique known as clustering.
September 10, 2021
Sentiment Analysis & Billboard Top 100: The Changing Mood of Popular Music
We used sentiment analysis to model 5100 Billboard chart-toppers between 1964 and 2015. Our analysis predicted whether song lyrics were positive, negative or neutral, and detected the topic and intent behind the most popular tunes in music history.
July 22, 2021
How to Study Brand Conversations with Advanced Text Analysis
How can we use text analysis of Twitter data to improve our understanding of markets? This question prompted Paul, a strategist in our business team, to scrape tweets about Lloyds Bank and conduct a Twitter topic analysis using advanced NLP and network creation. First, he collected tweets using Tractor, Graphext's scraping tool for social media analysis. Then, he analyzed the topics of the tweets using network analysis. Here's how he did it ...
July 20, 2021
A Beginner's Guide to Market Segmentation: Types, Techniques & Examples to Better Understand Your Customer Base (with Data)
Market segmentation means splitting your customer base into distinct communities based on the similarity of their features. Depending on the data you use to segment customers, clustering a market dataset results in the grouping of customers based on geographic, demographic, behavioural and psychographic factors as well as their buying preferences.
June 8, 2021
The Method Behind Our Investigation of Reports of Adverse COVID-19 Vaccine Events
Taking on an investigation into the adverse reactions associated with the COVID-19 vaccination rollout in the USA, our team were aware of the increased need for transparency whilst conducting our analysis. This article documents the methodology behind our study of Vaccine Adverse Event Reporting System (VAERS) data.
June 8, 2021
Conspiracies, Complexity and Clustering: Investigating Reports of Adverse COVID-19 Vaccine Effects
Modelling data from the Vaccine Adverse Event Reporting System (VAERS) - a US government-sponsored vaccine reaction monitoring service - our team set out to investigate reports of adverse health effects related to the seismic rollout of the COVID-19 vaccination programme in the USA.
May 6, 2021
Good Risk vs Bad Risk: Deconstructing the Features of 1000 German Loans
To discover which features of a loan application most influence risk, our team built a model that uses those features to predict whether an applicant would have a good or bad risk rating.
March 24, 2021
The Moneyball Method: Using Data to Build a Football Dream Team (On a Budget)
Our team set out to build an exceptional football team for less than 100M Euros. Using data from the FIFA 2020/2021 dataset - the video game - we built a prediction model to find the key performance attributes for each position. Then, we used this to pick out a team of excellent but undervalued players.