Social media analysis involves finding information in data gathered from social channels like Twitter, Facebook, Instagram, LinkedIn and Reddit, among many others.
Revered in the data science community for the immeasurable volume of data published every second, social media channels are thought to be windows into public opinion on a topic. Studying social media data allows businesses to understand consumer opinion, track brand awareness and expose new opportunities for growth.
Our Social Media analysis types take the heavy lifting out of analyzing social media data. We've built context-specific algorithms for community network analysis, topic detection and finding influencers. You can also use Graphext to extract key language features like sentiment, entities and part-of-speech tags from social media data.
Graphext PRO users can use our custom-built tool - Tractor - to gather information from popular digital channels in an analysis-friendly format, all without writing a line of code or making any API integrations. Find out more.
Social media analysis is constantly evolving. It's more than just measuring followers, retweets or finding trends in geographic regions. Data from social media is now used to enrich decision making in boardrooms across the world and in industries of all shapes and sizes.
NLP (Natural Language Processing) is a big part of social media analysis today. It is a research field that covers any kind of computer manipulation of speech, text or other forms of natural language. Because a great deal of human activity on platforms like Twitter, LinkedIn and Facebook comes in the form of written language, data science techniques like topic detection, sentiment analysis and part-of-speech tagging can be used to derive insights from the things people post.
"I use social media as an idea generator, trend mapper and strategic compass for all of our online business ventures."
- Paul Barron | CEO, Foodable Network
As well as understanding what people are talking about, social media data can now be used to map the communities behind trending conversations. Businesses can find the influencers driving conversations as well as monitoring how key messages are moving across geographies. Community network analysis is used to understand why and how information is spread by people online and can offer essential insights for marketing and PR teams.
Businesses will also often segment data collected from social media platforms in order to analyze the demographics of communities surrounding an online conversation. Finding detailed information about sub-communities can be an especially useful way to understand the specific demographics of people engaging with topics.
The technology behind the analysis types in Social Media has been built by our team or integrated with open-source machine learning projects. There are a number of different analysis types you can choose from here including topic detection and community network analysis. Whilst we support analysis of data gathered from all popular platforms, Graphext offers a number of Twitter-specific analysis types.
Inside each Social Media analysis type, Graphext will ask you to fill in a series of questions. These questions involve connecting key columns in your data to our pre-built scripts, choosing the language features you want to extract and setting the language of the posts that you will be analyzing.
Then, executing your project will start the analysis. When your project has been built, you'll see the output variables in the data.
Social Media | Topics projects use NLP to detect the main themes of text posted on social media platforms. Built using machine learning technology that can extract the significant terms from your text values, topic projects will group all of your data according to the similarity of the thematic content of your chosen text field.
When setting up a Topics project, Graphext will automatically answer the questions needed to start your analysis. Editing this configuration involves customizing the keywords, hashtags, entities or other language features you want to extract from social posts as well as choosing the way that Graphext recognises the language of the posts in your data.
A huge volume of the content posted on social media comes in the form of written language. Topics projects make sense of this largely unstructured data by grouping social posts that reference similar ideas. In this way, Topic analysis of social media data allows analysts to quickly grasp the main themes of conversations taking place on digital platforms.
Topic detection analysis in Graphext works by embedding - or vectorizing - text in your dataset. These vectors represent your text as lists of numbers that can be compared to measure how similar two pieces of text are. The closest vectors are linked to one another, forming the basis of your clusters.
You can control the number of links each row in your data should have using the setup wizard. Using the links between text, Graphext assigns each row a position on your network visualisation - Graph. This results in the representation of groups of closely related text.
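The embedding and linking steps can be sketched in a few lines. This is only an illustration, not Graphext's implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `knn_links` with its `k` parameter (the number of links per row) is a hypothetical name.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_links(texts, k=2):
    """Link each row to its k most similar rows (hypothetical helper)."""
    vectors = [embed(t) for t in texts]
    links = set()
    for i, vi in enumerate(vectors):
        scored = sorted(
            ((cosine(vi, vj), j) for j, vj in enumerate(vectors) if j != i),
            reverse=True,
        )
        for score, j in scored[:k]:
            if score > 0:  # skip rows with nothing in common
                links.add((min(i, j), max(i, j)))
    return sorted(links)

posts = [
    "new phone camera is great",
    "the phone camera takes great photos",
    "election debate tonight",
    "watching the election debate",
]
links = knn_links(posts, k=1)
print(links)
```

Raising `k` gives each row more links, which tends to produce a denser network with fewer, larger groups.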
Next, we use the Louvain algorithm to create clusters using the links between text values. These get added to a separate variable in your project, Clusters, in which each cluster is labelled according to the most common significant terms in all of the text belonging to it.
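Louvain builds clusters by repeatedly moving each node into the neighbouring cluster that most improves modularity. The sketch below shows only that first, local-moving phase on an unweighted network; the full algorithm also merges clusters into super-nodes and repeats. The function name is hypothetical and this is not Graphext's own code.

```python
from collections import defaultdict

def louvain_local_moving(edges):
    """First phase of Louvain: greedy modularity-improving node moves."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    nodes = sorted(adj)
    degree = {n: len(adj[n]) for n in nodes}
    two_m = sum(degree.values())       # twice the number of links
    community = {n: n for n in nodes}  # every node starts alone
    total = dict(degree)               # summed degree per community

    improved = True
    while improved:
        improved = False
        for n in nodes:
            current = community[n]
            total[current] -= degree[n]  # take n out of its community
            links_to = defaultdict(int)  # links from n into each community
            for nb in adj[n]:
                links_to[community[nb]] += 1
            # Gain (up to a constant factor) of putting n into community c
            best = current
            best_gain = links_to[current] - total[current] * degree[n] / two_m
            for c, k_in in sorted(links_to.items()):
                gain = k_in - total[c] * degree[n] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            community[n] = best
            total[best] += degree[n]
            if best != current:
                improved = True
    return community

# Two triangles joined by a single bridging link (2-3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = louvain_local_moving(edges)
print(community)
```

On this small network the two triangles end up in separate clusters, because moving a node across the bridge would lower modularity.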
It's easy to build a Topic analysis of social media data in Graphext. All the NLP and heavy lifting takes place behind the scenes. Graphext will automatically complete the setup of Topic analysis projects using intelligent recognition of the correct values and language in your social media data.
Nonetheless, you can edit this default configuration if you wish to customize which language features are extracted.
The first thing we need to pay attention to when setting up a Social Media | Topics project is the Data Extraction tab. This tab allows you to specify or infer the language of the text you want to analyze as well as extracting key information from it.
All of the information that you extract will be added as a new categorical variable to your project's dataset.
Setting the language of your text is a crucial part of starting your NLP analysis because language models are trained using language-specific datasets. If you are confident that all of your text is in one language, specify this in the project setup wizard. You can also choose to use a column that contains the language of each text value or make use of a pre-trained model that can recognise the language of social posts in your dataset.
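The three language options can be pictured as three ways of producing one language code per row. In this sketch every name is hypothetical, including the mode names, `resolve_language` and the column names, and `toy_detect` is a tiny stopword-based stand-in for a real pre-trained language detection model.

```python
# Toy stopword lists; a real detector would be a trained model.
STOPWORDS = {
    "en": {"the", "and", "is", "of"},
    "es": {"el", "la", "y", "de", "es"},
}

def toy_detect(text):
    """Pick the language whose stopwords overlap most with the text."""
    words = set(text.lower().split())
    return max(sorted(STOPWORDS), key=lambda lang: len(words & STOPWORDS[lang]))

def resolve_language(row, mode, text_column="text", fixed=None,
                     language_column=None, detector=None):
    """Return the language of one row under the chosen (hypothetical) mode."""
    if mode == "fixed":       # all text is in one known language
        return fixed
    if mode == "column":      # the data already has a language column
        return row[language_column]
    if mode == "detect":      # infer the language from the text itself
        return detector(row[text_column])
    raise ValueError(f"unknown mode: {mode}")

rows = [
    {"text": "the film is great", "lang": "en"},
    {"text": "la pelicula es genial", "lang": "es"},
]
fixed_langs = [resolve_language(r, "fixed", fixed="en") for r in rows]
column_langs = [resolve_language(r, "column", language_column="lang") for r in rows]
detected_langs = [resolve_language(r, "detect", detector=toy_detect) for r in rows]
print(fixed_langs, column_langs, detected_langs)
```

Fixing the language is the simplest choice for single-language data, while detection is more useful when posts in several languages are mixed together.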
The Text tab lets you further customize the setup of your Social Media | Topics projects. Here you can choose the algorithm used to transform your data.
Choosing between speed and precision changes the algorithm used to transform your data. For the best results choose precision, which will deploy a Hugging Face transformer on social posts in your data.
Choosing to generate automatic insights in your Social Media Topic analysis projects instructs Graphext to prepare a few insight cards in your Insights panel drawing attention to key patterns in your analysis.
These include insights about the clusters detected in your data as well as insights about the key language features that Graphext was able to extract. Automatic insights can be a very useful way to kick start your analysis and point out interesting areas requiring further investigation.
Opening up a Topics project, the first thing you'll see is your Graph - network visualisation - where rows in your data are grouped together according to the thematic similarity of the text you analyzed. The clusters variable represents the topics that Graphext was able to detect in your social posts.
Each cluster is labelled using the key significant terms that define text inside the cluster. Click on the clusters and inspect the significant terms variable to check this yourself!
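As a rough picture of how such labels can be derived, the sketch below counts the significant terms of every row in a cluster and joins the most common ones. The input layout, the `label_clusters` name, the `top_n` parameter and the " / " separator are assumptions made for illustration, not Graphext's actual labelling rules.

```python
from collections import Counter, defaultdict

def label_clusters(rows, top_n=2):
    """Label each cluster with its most common significant terms."""
    counts = defaultdict(Counter)
    for cluster, terms in rows:
        counts[cluster].update(terms)
    labels = {}
    for cluster, counter in counts.items():
        # Most frequent first; ties broken alphabetically for stable output
        top = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        labels[cluster] = " / ".join(term for term, _ in top)
    return labels

# Each row: (cluster id, significant terms extracted from the post)
rows = [
    (0, ["camera", "phone"]),
    (0, ["camera", "battery"]),
    (0, ["phone", "camera"]),
    (1, ["election", "debate"]),
    (1, ["election", "vote"]),
]
labels = label_clusters(rows)
print(labels)
```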
Depending on the information that you choose to extract, Graphext will have added new variables with this data to the data inside your project. Search for them using the search bar in your right sidebar.
Keywords projects are focused on mapping the relationships between important terms posted by members of the public on social media. The underlying technology behind Keywords analysis will extract keywords from text in your data, measure the strength of association between each keyword and plot these relationships in a network visualisation, where each node in the network represents one keyword.
The key variables that emerge in Keywords projects are Keyword, Cluster and Count. Keyword contains the keywords extracted from social media posts, Count measures how many times they appear and Cluster groups keywords together according to their associations.
Keywords analysis is a form of natural language processing that is intended to reduce your dataset to a collection of related keywords. These keywords are determined by Graphext to be the significant terms within the text you are analyzing and are grouped together to form clusters of closely related terms.
These projects work by first extracting the significant terms from the text in each row of your data. Then, according to the threshold that you set, Graphext will discard keywords that don't appear at least n times. The default value of n is 3.
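The thresholding step is easy to picture in code. In this sketch, plain stopword removal stands in for Graphext's significant-term extraction, and `extract_keywords`, `min_count` and the stopword list are hypothetical names chosen for the example.

```python
import re
from collections import Counter

# Toy stopword list standing in for real significant-term extraction
STOPWORDS = {"the", "on", "this", "is", "my", "be", "could", "again"}

def extract_keywords(posts, min_count=3):
    """Count candidate keywords and keep those seen at least min_count times."""
    counts = Counter()
    for post in posts:
        tokens = re.findall(r"[a-z']+", post.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return {kw: n for kw, n in counts.items() if n >= min_count}

posts = [
    "The battery on this phone is great",
    "Phone battery died again",
    "Love the camera on my phone",
    "battery life could be better",
]
keywords = extract_keywords(posts)  # default threshold of 3
print(keywords)
```

With the threshold at 3, only "battery" and "phone" survive; lowering `min_count` to 1 would keep every remaining term.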
When setting up your Social Media Keyword analysis project, you can choose whether to perform a similarity analysis or a co-occurrence analysis. The underlying algorithms behind this choice will transform your data in different ways.
Choosing Similarity Analysis will instruct Graphext to embed - or vectorize - this collection of keywords in such a way that they can be linked according to their semantic similarity.
Choosing Co-occurrence Analysis will instruct Graphext to embed - or vectorize - this collection of keywords in such a way that links are calculated based on keywords that often occur together.
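To make the co-occurrence option concrete, the sketch below counts how often two keywords appear in the same post and keeps the pairs that do so repeatedly. The `cooccurrence_links` name and the `min_weight` cut-off are assumptions for illustration; Graphext's own weighting of links may differ.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_links(posts_keywords, min_weight=2):
    """Link keywords that appear together in at least min_weight posts."""
    pair_counts = Counter()
    for keywords in posts_keywords:
        # Each unordered pair is counted once per post
        for a, b in combinations(sorted(set(keywords)), 2):
            pair_counts[(a, b)] += 1
    return {pair: w for pair, w in pair_counts.items() if w >= min_weight}

# Keywords already extracted from four posts
posts_keywords = [
    ["battery", "phone"],
    ["battery", "phone", "camera"],
    ["camera", "photo"],
    ["camera", "photo", "phone"],
]
co_links = cooccurrence_links(posts_keywords)
print(co_links)
```

A similarity analysis would instead compare an embedding of each keyword, so two words can be linked even if they never appear in the same post.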
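Using the links between keyword vectors, Graphext assigns each keyword a position on your network visualisation - Graph. This results in a representation of the connectivity between related keywords: keywords with similar meanings in a similarity analysis, or keywords that often appear together in a co-occurrence analysis.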
Next, we use the Louvain algorithm to create clusters using the links between keywords. These get added to a separate variable in your project, Clusters, in which each cluster is labelled according to the keywords contained within it.
Keyword analysis projects are quick and easy to deploy. When you build a Social Media Keywords project, Graphext will automatically set up your analysis. Nonetheless, you can edit the default setup to customize the way language is recognised in your data and to set the minimum number of times a keyword must appear to be included.
It's also crucial to choose whether you want to study semantic similarity or co-occurrence. You can configure this inside the Analysis tab of the project setup wizard.
Because your data is transformed to represent one keyword per row, these projects are not suitable for other forms of text analysis like part-of-speech tagging, sentiment analysis and entity recognition. Instead, you should choose Keyword analysis only if you want to study the important words in your data.
Inside of your project setup wizard for Social Media Keywords analysis projects, you'll find the Data Extraction tab. This tab allows you to set the text column you want to analyze and choose how to configure the language of social media posts in your data.
Setting the language of your content is an important part of starting your NLP analysis because language models are trained using language-specific datasets. If you are confident that all of your social media data is in one language, specify this in the project setup wizard. You can also choose to use a column that contains the language of each text value or make use of a pre-trained model that can recognise the language of text.
The Analysis tab lets you further customize the setup of your Social Media Keywords projects. Here you can choose whether to conduct a similarity analysis or a co-occurrence analysis and set a threshold for the appearance of keywords in your text.
Setting a threshold for the appearance of keywords in your data is a crucial part of controlling what your final project will look like. Any keyword that appears fewer times than this threshold won't be included in your final project. If you are working with a large collection of social media posts, higher thresholds are recommended to avoid noise in your collection of keywords.
Similarity analysis groups keywords according to their semantic similarity whereas co-occurrence analysis will group keywords that occur frequently together.
Choosing between speed and precision changes the algorithm used to transform your data. For the best results choose precision, which will deploy a Hugging Face transformer on your data. This choice is only available for similarity analysis projects.
You can also choose between analyzing keywords or hashtags. Choosing hashtags will mean that your dataset will be reduced to a collection of hashtags as opposed to keywords.
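As a small illustration of the hashtag option, the sketch below pulls hashtags out of posts with a regular expression and counts them. The pattern and the lower-casing are assumptions made for the example, not a description of how Graphext parses hashtags.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")

def extract_hashtags(post):
    """Return the hashtags in a post, lower-cased and without the '#'."""
    return [tag.lower() for tag in HASHTAG.findall(post)]

posts = [
    "Loving the new #Phone camera #photography",
    "Sunset shots tonight #photography",
    "No tags in this one",
]
hashtag_counts = Counter(tag for post in posts for tag in extract_hashtags(post))
print(hashtag_counts)
```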
Opening up a Keywords project, the first thing you'll see is your Graph - network visualisation - where each node in the network represents one keyword. These are the significant terms that Graphext extracted from social media posts in your data. Find and interact with the Keywords variable in your sidebar charts to inspect the keywords in more detail.
Your data will now include only 3 variables: Count, Keyword and Cluster. This is because Graphext reduced your data to a collection of keywords.
Each keyword is linked to others. You can inspect which keywords are linked to one another by clicking on one in the Graph and choosing Select Neighbours.
Graphext uses the Count variable to apply size mapping to nodes in your Graph. Larger nodes represent keywords that appear more frequently in your text than those shown as smaller nodes.
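One simple way to think about size mapping is a linear scale from the smallest to the largest count. The sketch below assumes a linear scale and made-up minimum and maximum sizes; the scale Graphext actually applies is not documented here and may differ.

```python
def node_sizes(counts, min_size=4.0, max_size=20.0):
    """Map keyword counts linearly onto node sizes (assumed scale)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for keyword, n in counts.items():
        if span == 0:  # every keyword has the same count
            sizes[keyword] = min_size
        else:
            sizes[keyword] = min_size + (n - lo) / span * (max_size - min_size)
    return sizes

counts = {"phone": 10, "battery": 6, "camera": 2}
sizes = node_sizes(counts)
print(sizes)
```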
Your Cluster variable groups related keywords. The labels assigned to each cluster represent the keywords in that cluster.