September 10, 2021
Sentiment Analysis & Billboard Top 100: The Changing Mood of Popular Music
We used sentiment analysis to model 5100 Billboard chart-toppers between 1964 and 2015. Our analysis predicted whether song lyrics were positive, negative or neutral, and detected the topic and intent behind the most popular tunes in music history.
August 23, 2021
The 5 Most Extreme US Office Characters
Testing out our brand spanking new integration with Hugging Face models for NLP, we analyzed speech from characters in all 9 series of the US Office. Added to our Graphext project, the language models classified the dialogue of Michael, Dwight, Pam, Jim, Darryl and all the other characters by sentiment, emotion, offensive language, irony and hate speech.
July 22, 2021
How to Study Brand Conversations with Advanced Text Analysis?
How can we use text analysis of Twitter data to improve our understanding of markets? This question prompted Paul, a strategist in our business team, to scrape tweets about Lloyds Bank and conduct a Twitter topic analysis using advanced NLP and network creation. First, he collected tweets using Tractor, Graphext's scraping tool for social media analysis. Then, he analyzed the topics of tweets using network analysis. Here's how he did it ...
July 20, 2021
A Beginner's Guide to Market Segmentation: Types, Techniques & Examples to Better Understand Your Customer Base (with Data)
Market segmentation means splitting your customer base into distinct communities based on the similarity of their features. Depending on the data you use to segment customers, clustering a market dataset results in the grouping of customers based on geographic, demographic, behavioural and psychographic factors as well as their buying preferences.
July 5, 2021
Using Mutual Information to Cluster Variables and Discover the Associations Between Survey Questions
Our team set out to build a type of analysis that could be used to measure the strength of association between variables in a dataset. Here's how we did it ...
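For readers unfamiliar with the measure named in the title, here is a minimal sketch of mutual information between two categorical columns, such as the answers to two survey questions. It is an illustration only and not the team's implementation; the function name mutual_information and the toy inputs are assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length categorical sequences.

    Hypothetical helper for illustration; probabilities are plain empirical
    frequencies with no smoothing or bias correction.
    """
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0

    count_x = Counter(xs)             # marginal counts of the first variable
    count_y = Counter(ys)             # marginal counts of the second variable
    count_xy = Counter(zip(xs, ys))   # joint counts of observed pairs

    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Two identical balanced answers share one full bit of information;
# two unrelated answers share none.
same = mutual_information(list("aabb"), list("aabb"))
unrelated = mutual_information(list("aabb"), list("xyxy"))
```

A higher value means that knowing one answer tells you more about the other. Computing the value for every pair of columns gives a matrix of association strengths that could then be clustered.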
June 29, 2021
A Market Segmentation of 1000 Supermarket Customers Using Data on Sales, Income and Demographics
Our team clustered 1000 supermarket sales in order to segment customers according to their buying habits. Our market segmentation analysis uses data on the demographics, income and geography of customers to identify key buyer personas and inform marketing strategies and campaigns.
June 11, 2021
Graphext | Graphtex | Graphnext: Grouping Similar Spellings Using Chars2Vec and Agglomerative Clustering
'España' and 'Españha' are just spelling variations. We built a way of grouping words spelt differently but referring to the same concept.
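As a rough illustration of the grouping idea, the sketch below merges similar spellings bottom-up, the way agglomerative clustering does. It deliberately swaps the Chars2Vec embeddings named in the title for plain Levenshtein edit distance so that it runs with the standard library alone. The function names and the max_distance threshold are assumptions made for this example, not the team's method.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (case-sensitive)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete a character from a
                cur[j - 1] + 1,            # insert a character into a
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def group_spellings(words, max_distance=2):
    """Single-linkage agglomerative grouping of words by edit distance.

    Hypothetical helper: starts with one cluster per word and repeatedly
    merges the two closest clusters until no pair is within max_distance.
    """
    clusters = [[w] for w in words]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members
                d = min(edit_distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if d <= max_distance and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i].extend(clusters.pop(j))


groups = group_spellings(
    ["Graphext", "Graphtex", "Graphnext", "España", "Españha"]
)
```

With these inputs the three Graphext variants end up in one group and the two España variants in another. An embedding-based distance would replace edit_distance while leaving the merging loop unchanged.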
June 8, 2021
The Method Behind Our Investigation of Reports of Adverse COVID-19 Vaccine Events
Taking on an investigation into the adverse reactions associated with the COVID-19 vaccination rollout in the USA, our team were aware of the increased need for transparency whilst conducting our analysis. This article documents the methodology behind our study of Vaccine Adverse Event Reporting System (VAERS) data.
June 8, 2021
Conspiracies, Complexity and Clustering: Investigating Reports of Adverse COVID-19 Vaccine Effects
Modelling data from the Vaccine Adverse Event Reporting System (VAERS) - a US government-sponsored vaccine reaction monitoring service - our team set out to investigate reports of adverse health effects related to the seismic rollout of the COVID-19 vaccination programme in the USA.
May 6, 2021
Good Risk vs Bad Risk: Deconstructing the Features of 1000 German Loans
Attempting to discover the most influential features of a loan application when considering risk, our team built a model using the features of a loan application to predict whether an applicant would have a good or bad risk rating.
April 26, 2021
Jake's Project: Investigating the Data Behind a Good Day
Andy and María meet with Jake to talk about a dataset he's building about himself. From skating to people he sees to whether he flosses or not - Jake's data offers a unique and deeply personal insight into his life. But what makes the difference between good and bad days?