June 11, 2021
Graphext | Graphtex | Graphnext: Grouping Similar Spellings Using Chars2Vec and Agglomerative Clustering
'España' and 'Españha' are just spelling variations. We built a way of grouping words spelt differently but referring to the same concept.
June 8, 2021
The Method Behind Our Investigation of Reports of Adverse COVID-19 Vaccine Events
In investigating the adverse reactions associated with the COVID-19 vaccination rollout in the USA, our team were aware of the heightened need for transparency in our analysis. This article documents the methodology behind our study of Vaccine Adverse Event Reporting System (VAERS) data.
June 8, 2021
Conspiracies, Complexity and Clustering: Investigating Reports of Adverse COVID-19 Vaccine Effects
Modelling data from the Vaccine Adverse Event Reporting System (VAERS) - a US government-sponsored vaccine reaction monitoring service - our team set out to investigate reports of adverse health effects related to the seismic rollout of the COVID-19 vaccination programme in the USA.
May 6, 2021
Good Risk vs Bad Risk: Deconstructing the Features of 1000 German Loans
To discover which features of a loan application most influence risk, our team built a model that uses those features to predict whether an applicant would receive a good or bad risk rating.
April 26, 2021
Jake's Project: Investigating the Data Behind a Good Day
Andy and María meet with Jake to talk about a dataset he's building about himself. From skating to people he sees to whether he flosses or not - Jake's data offers a unique and deeply personal insight into his life. But what makes the difference between good and bad days?
April 16, 2021
Simple Solutions to Prevent Customer Churn
Our team analyzed 7043 current and former customers of a telecoms provider in order to better understand what types of people are most likely to cancel their contracts.
April 7, 2021
How Data Can Help You Keep Your Workers
To showcase how a company could reduce employee turnover, our team clustered a dataset containing information about IBM employees to discover the reasons why employees left their jobs.
March 29, 2021
Menhir & Graphext: Analyzing the Intangible Value of Financial Assets
Working at the intersection of data science and finance, Menhir is using Graphext to understand the composition of financial portfolios, completing in just two days analysis that typically takes analysts two to three weeks.
March 24, 2021
The Moneyball Method: Using Data to Build a Football Dream Team (On a Budget)
Our team set out to build an exceptional football team for less than 100M Euros. Using data provided in the FIFA 2020/2021 dataset - the video game - we built a prediction model in order to find the key performance attributes for each position. Then, we used this to pick out a team of excellent but undervalued players.
March 1, 2021
Using Customer Data and RFM Analysis to Create Relevant Ad Campaigns
Using a dataset of purchases made over 4 years by 1590 customers of an online superstore, our team wanted to demonstrate the usefulness of RFM analysis.
February 9, 2021
Patriotism, Animals, Comedy and Sex: Clustering 233 Superbowl Ads
We built a model clustering 233 Superbowl ads using data from FiveThirtyEight in order to work out what content brands use to sell their products during America's most-watched sporting event.
February 2, 2021
Health vs Economy: Using Twitter to Investigate How Latin American Leaders Have Responded to Covid-19
Julián Yunez, a political communications consultant, used Graphext to investigate how messages related to public health and economics have been balanced by Latin American leaders during the pandemic.