Analyze Data

Start filtering your data by interacting with the sidebar charts that represent your variables.

Filters affect what data is shown in your Graph, Trends and Details panels. Filtering is a useful way of zooming in on aspects of your data and offers a free-flowing way to investigate details behind specific segmentations.

Types of Variable Filter

Text Filters

Depending on the type of analysis you are doing, Graphext will often extract text from your dataset. Extracted terms appear as a variable in the left or right sidebar with a list of the words and phrases presented in order of frequency under the variable title.

Text filters can be used to create segmentations of your data featuring specific words or phrases. This might be useful if you are working with a dataset of customer reviews and want to focus on reviews that mention your famous "pumpkin spiced latte".

You can reorder extracted terms in your variable list using the up-down arrow icon. Search for a specific term if a variable list is really long.

How to Create a Text Filter?

  1. Start from your project's Graph, Trends or Details panel.
  2. Locate your text variable in either the left or right sidebar.
  3. Select the text in the variable list representing your extracted word or phrase - use the search tool if there are lots of items.
  4. Your analysis will respond immediately to the filter you just applied.
  5. Hold shift and select another item if you want to include another extracted word or phrase in your filter.
  6. Done ... Your data now represents the points inside of your new filter.
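The logic behind a text filter can be sketched in a few lines of Python. This is only an illustration, not Graphext code: Graphext filters are applied through the sidebar, and the `reviews` data and `text_filter` helper below are hypothetical. The sketch also assumes that shift-selecting a second term widens the filter to points mentioning either term.

```python
# Hypothetical data: each review carries the terms extracted from its text.
reviews = [
    {"id": 1, "terms": ["pumpkin spiced latte", "queue"]},
    {"id": 2, "terms": ["espresso", "price"]},
    {"id": 3, "terms": ["price", "pumpkin spiced latte"]},
    {"id": 4, "terms": ["staff"]},
]


def text_filter(rows, selected_terms):
    """Keep rows that mention at least one of the selected terms (assumed OR logic)."""
    wanted = set(selected_terms)
    return [row for row in rows if wanted & set(row["terms"])]


# Selecting a single term.
one_term = text_filter(reviews, ["pumpkin spiced latte"])
# Shift-selecting a second term (assumed to widen the filter).
two_terms = text_filter(reviews, ["pumpkin spiced latte", "staff"])

print([row["id"] for row in one_term])   # [1, 3]
print([row["id"] for row in two_terms])  # [1, 3, 4]
```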

Quantitative Filters

Quantitative variables represent data expressing a certain numerical quantity, amount or range. Height and weight are good examples of quantitative variables. Applying a quantitative filter restricts the data shown to the points whose value for that variable falls within your filter range.

Drag your mouse inside the variable filter chart to create a filter range. Pull the boundaries left or right to adjust them. Click on the number above and below a filter range to explicitly set the filter's upper and lower boundaries respectively.

How to Create a Quantitative Filter?

  1. Start from your project's Graph, Trends or Details panel.
  2. Locate your quantitative variable in either the left or right sidebar.
  3. Drag your mouse over the bar in the chart representing the value you want to include in your filter.
  4. Your analysis will immediately respond to the filter you just applied.
  5. Drag the upper and lower boundary lines to expand the range of your filter.
  6. Done ... The data shown now represents the points inside of your new filter.
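As a rough mental model, a quantitative filter is a range check on one variable. The Python sketch below is illustrative only: the `people` data and `range_filter` helper are hypothetical, and treating both boundaries as inclusive is an assumption rather than documented Graphext behaviour.

```python
# Hypothetical data points with one quantitative variable.
people = [
    {"name": "Ana", "height_cm": 158},
    {"name": "Ben", "height_cm": 171},
    {"name": "Cleo", "height_cm": 180},
    {"name": "Dev", "height_cm": 192},
]


def range_filter(rows, variable, lower, upper):
    """Keep rows whose value lies between lower and upper (both assumed inclusive)."""
    return [row for row in rows if lower <= row[variable] <= upper]


# Dragging over the chart creates an initial range.
selected = range_filter(people, "height_cm", 165, 185)
# Pulling the lower boundary to the left widens the range.
widened = range_filter(people, "height_cm", 155, 185)

print([row["name"] for row in selected])  # ['Ben', 'Cleo']
print([row["name"] for row in widened])   # ['Ana', 'Ben', 'Cleo']
```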

Categorical Filters

Categorical variables represent types of data that are divided into groups. Nationality is a good example of categorical data. Filtering categorical data will select all data points belonging to a certain group. This allows for a closer examination of the data in that group.

Categorical variables are represented in bar charts within your project's sidebars. You can switch this display to view them as a list. To include more than one category in your filter, hold down shift whilst selecting.

How to Create a Categorical Filter?

  1. Start from your project's Graph, Trends or Details panel.
  2. Locate your categorical variable in either the left or right sidebar.
  3. Select the bar in the variable chart representing your category - use the search tool if there are lots of categories.
  4. Your analysis will respond immediately to the filter you just applied.
  5. Hold shift and select another bar if you want to include another category in your filter.
  6. Done ... Your data now represents the points inside of your new filter.
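Conceptually, a categorical filter is a membership test: a point stays in view if its category is one of those you selected. The sketch below uses hypothetical `visitors` data and a hypothetical `category_filter` helper, and is not part of Graphext itself.

```python
# Hypothetical data points with one categorical variable.
visitors = [
    {"id": 1, "nationality": "Spanish"},
    {"id": 2, "nationality": "French"},
    {"id": 3, "nationality": "Spanish"},
    {"id": 4, "nationality": "Italian"},
]


def category_filter(rows, variable, categories):
    """Keep rows whose category is one of the selected categories."""
    allowed = set(categories)
    return [row for row in rows if row[variable] in allowed]


# Selecting one bar.
spanish = category_filter(visitors, "nationality", ["Spanish"])
# Holding shift and selecting a second bar.
spanish_or_italian = category_filter(visitors, "nationality", ["Spanish", "Italian"])

print([row["id"] for row in spanish])             # [1, 3]
print([row["id"] for row in spanish_or_italian])  # [1, 3, 4]
```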


Taking Action with a Selection

Saving Selections

So you have created a filter and now you want to save the data inside of it as a segmentation?

Saving filters creates new segmentations of your data. These groups become a new variable which you can use to recreate your filter, find trends within its groups or to create new, more detailed segmentations.

Segmentations are a useful way of organising your data into important groups.

How to Save a Selection?

  1. Start from your project's Graph, Trends or Details panel.
  2. At the top of the right sidebar click 'New Segmentation'.
  3. Select 'Manual' from the menu list.
  4. Enter a name for your segmentation and click 'OK'.
  5. Make sure you have applied a filter.
  6. Within your new segmentation card, select the icon to 'Add a segment'.
  7. Enter a name for the new segment and click 'OK'.
  8. This filter has been added as a segment to your new segmentation.
  9. To add another filter to the segmentation, create a different filter then repeat steps 6-8.
  10. To save the segmentation, select the save icon in the top right corner of your new segmentation card.
  11. Done ... These new groups are now variables that you can select again at any point.
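One way to picture a manual segmentation is as a new variable that labels each data point with the name of the segment it was saved into. The Python sketch below illustrates that idea only. The `points` data, the `build_segmentation` helper and the segment names are hypothetical, and leaving unmatched points unlabelled (`None`) is an assumption, not documented Graphext behaviour.

```python
# Hypothetical data points.
points = [
    {"id": 1, "age": 23},
    {"id": 2, "age": 41},
    {"id": 3, "age": 35},
    {"id": 4, "age": 67},
]

# Each segment pairs a name with the filter that was active when it was added.
segments = {
    "Under 30": lambda row: row["age"] < 30,
    "30 to 50": lambda row: 30 <= row["age"] <= 50,
}


def build_segmentation(rows, named_filters):
    """Label each row with the first segment whose filter matches it, else None."""
    labels = {}
    for row in rows:
        labels[row["id"]] = next(
            (name for name, matches in named_filters.items() if matches(row)),
            None,
        )
    return labels


labels = build_segmentation(points, segments)
print(labels)  # {1: 'Under 30', 2: '30 to 50', 3: '30 to 50', 4: None}

# The new variable can be used to recreate a saved filter later on.
recreated = [point_id for point_id, name in labels.items() if name == "30 to 50"]
print(recreated)  # [2, 3]
```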

Deleting Segmentations

You can delete segmentations that you've created using the more options dropdown menu belonging to each variable chart.

How to Delete a Segmentation?

  1. Start from your project's Graph, Trends or Details panel.
  2. Locate the segmentation you want to remove in the right or left sidebar.
  3. Click the 3 dots icon representing 'More Options'.
  4. Select 'Remove Segmentation' from the bottom of the menu list.

Deleting Selections

So you have created a filter but the data inside is useless?

Deleting selections will remove unwanted data points from your project. It's a two-step process which recreates your project without the data you have deleted.

How to Delete Data Inside a Selection?

  1. Start from your project's Graph.
  2. Make sure you have applied a filter.
  3. Click on the trash icon inside the top menu - next to the 'Publish' button.
  4. Select 'Send to trash'.
  5. Your nodes will turn black, indicating that they are in the trash.
  6. To remove them from the project, empty the trash by clicking on the trash icon again.
  7. Select 'Empty trash' from the menu list.
  8. Select 'Accept' from the confirmation window that pops up.
  9. Done ... A new project will be created without the data inside your selection.

Clustering Selections

After making a selection using filtering, you can automatically cluster the data that lies within the selection. This involves breaking your selection up into smaller and more precise groups. This can be useful in identifying specific patterns within sub-communities and is a good way of inspecting your selection in greater detail.

When editing your segmentation, use the magic wand icon to reconfigure the strength of connections between data points in your newly clustered selection.

How to Cluster Your Selection?

  1. Start from your project's Graph, Trends or Details panel.
  2. Make sure you have applied a filter.
  3. At the top of the right sidebar click 'New Segmentation'.
  4. Select 'Automatic' from the menu list.
  5. Enter a name for your new segmentation.
  6. Click 'OK'.
  7. Graphext will create new clusters inside of your selection.
  8. To change the strength of connections inside of your newly clustered segmentation, click the magic wand icon.
  9. Then drag the slider towards 'Break' to configure looser connections and towards 'Join' to configure stronger connections.
  10. Select the 'Save' icon in the top right of the new segmentation card.
  11. Done ... Start inspecting your new clusters!

Zooming In

When you are working with quantitative filters, you control the range of data displayed inside of your project. Sometimes values in quantitative variables can be bunched together. Moreover, you might want to pick out a smaller range and examine it in greater detail. Using the dropdown menu for a quantitative variable sidebar chart, you can zoom in on small value ranges.

Clicking Zoom In whilst you have an active quantitative selection means that Graphext will create a new variable containing only the range of values inside of your selection. Inside your zoomed-in variable, your values will be distributed across a larger number of bars, meaning that you can inspect them with greater precision.

How to Zoom in on Quantitative Ranges?

  1. Start from your project's Graph, Trends or Details panel.
  2. Find the quantitative variable you want to examine.
  3. Drag your cursor over a specific range to select it.
  4. Notice that your selection has changed the active data points inside of your project.
  5. Click the 3 dots from the top right of your variable chart to open up the dropdown menu.
  6. Select Zoom In from the menu list.
  7. That's it. Graphext will automatically generate a new variable chart exclusively representing the range inside of your selection.
  8. Done. Now inspect your values.
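The effect of zooming in can be illustrated by re-binning: the same number of bars is spread over a narrower range, so values that were bunched into one bar separate out. The Python sketch below is illustrative only. The `histogram` helper, the `prices` data and the choice of five equal-width bars are assumptions, and Graphext's actual binning may differ.

```python
def histogram(values, lower, upper, bars):
    """Count values into `bars` equal-width bins spanning lower to upper."""
    width = (upper - lower) / bars
    counts = [0] * bars
    for value in values:
        if lower <= value <= upper:
            # Values on the upper boundary fall into the last bin.
            index = min(int((value - lower) / width), bars - 1)
            counts[index] += 1
    return counts


# Hypothetical variable: most values are small, with two large outliers.
prices = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 250, 900]

full = histogram(prices, 0, 1000, 5)  # the original variable chart
zoomed = histogram(prices, 0, 10, 5)  # after selecting 0 to 10 and zooming in

print(full)    # [10, 1, 0, 0, 1]
print(zoomed)  # [1, 3, 2, 2, 2]
```

In the full chart, ten of the twelve values share a single bar. After zooming in, those same ten values are spread across all five bars.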


Sorting Filter Options

When you have lots of options in your variable sidebar charts, filtering can become confusing. If there are hundreds or thousands of text or categorical filters to choose from, it can be difficult to find the value you need.

You can sort the filter options in your variable sidebar card so that more relevant options are presented first. There are four ways of sorting the categories or text filter values belonging to a variable: by everything, by selection, by TF-IDF and by uplift. To sort filter options, select the up and down arrow icon from the top right of a variable card.

Everything

Sorting by everything means that the categories or text values appearing at the top of your list or chart will be the ones most frequently appearing within your entire dataset. This order will remain consistent despite any selections that you make.

Selection

Sorting by selection means that the categories or text values appearing at the top of your list or chart will be the ones most frequently appearing within your selection. The order of the list will update dynamically for any new selections that you make.

TF-IDF

TF-IDF, or term frequency-inverse document frequency, is a method of sorting your values that is intended to reflect the importance of a single category or text value in relation to the entire set of values belonging to a variable. Sorting by TF-IDF means that categories will appear at the top of your list if they are found more often in your selection than would be expected from their occurrence in the whole dataset. The more over-represented a category is in your selection, the higher up it will appear in your list.

Uplift

Uplift measures the percentage change in frequency of a category in your selection relative to its frequency in the whole dataset. Sorting by uplift means that the values in your variable list or chart will be presented in order of how much more frequent they are in your selection than in the whole dataset. Values that don't appear often in the entire dataset but are frequent in your selection will be presented first.
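The difference between sorting by everything, by selection and by uplift can be sketched with a small worked example in Python. This is an illustration under stated assumptions, not Graphext's implementation: the data is hypothetical, "frequency" is taken to mean a value's share of the data points, and uplift is computed as the percentage change from the dataset share to the selection share. TF-IDF is left out because the exact formula Graphext uses is not given here.

```python
from collections import Counter

# Hypothetical categorical variable: the whole dataset and a selection from it.
dataset = ["tea"] * 50 + ["coffee"] * 40 + ["matcha"] * 10
selection = ["tea"] * 4 + ["coffee"] * 10 + ["matcha"] * 6


def share(counts):
    """Convert raw counts into each value's share of the total."""
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


all_counts = Counter(dataset)
sel_counts = Counter(selection)
all_share = share(all_counts)
sel_share = share(sel_counts)


def uplift(value):
    """Percentage change from dataset share to selection share (assumed formula)."""
    return (sel_share.get(value, 0.0) - all_share[value]) / all_share[value] * 100


by_everything = sorted(all_counts, key=all_counts.get, reverse=True)
by_selection = sorted(sel_counts, key=sel_counts.get, reverse=True)
by_uplift = sorted(all_counts, key=uplift, reverse=True)

print(by_everything)  # ['tea', 'coffee', 'matcha']
print(by_selection)   # ['coffee', 'matcha', 'tea']
print(by_uplift)      # ['matcha', 'coffee', 'tea']
```

Here "matcha" makes up 10% of the dataset but 30% of the selection, an uplift of roughly 200%, so it moves to the top when sorting by uplift even though it is the least common value overall.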

How to Sort Filter Options?

  1. Start from the Graph, Details or Trends panel of your project.
  2. Find the categorical or text variable inside of your sidebars.
  3. Click the up and down arrow icon representing Sorting.
  4. Choose a Sorting option from the menu list.
  5. That's it. Your filter options will appear in the order defined by your method of sorting. If you chose Uplift or Selection, they will dynamically update with new selections that you make.
  6. Done ... Toggle between Sorting options until you find a method that presents the most relevant values.


Discarding Filter Options

You can discard categories from your variable to reduce the number of filter options available to you. This can be particularly useful when you are sorting filter options using TF-IDF or uplift.

When you discard filter options, Graphext will ask you to set a threshold to limit the number of filters presented in your variable sidebar chart or list. This threshold means that filters appearing in a number of nodes smaller than your threshold will not be presented. For instance, you can specify that your filter should only contain values that appear in 10% or more of your data points. This results in your filter list being reduced to present only the values that meet these criteria.

Everything

Discarding from everything means that the threshold you set will match up against all values in your dataset regardless of any filters that you apply.

Selection

Discarding from your selection means that the threshold you set will update dynamically depending on the number of values in any new selection you make.
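The thresholding described above can be sketched in Python as follows. This is a simplified illustration, not Graphext code: the `keep_options` helper and the data are hypothetical, each data point is assumed to hold a single value, and an option is assumed to be kept when it meets or exceeds the threshold.

```python
from collections import Counter


def keep_options(values, threshold, as_percentage=False):
    """Return the options whose node count (or percentage share) meets the threshold."""
    counts = Counter(values)
    total = len(values)
    kept = []
    for option, count in counts.most_common():
        amount = 100 * count / total if as_percentage else count
        if amount >= threshold:
            kept.append(option)
    return kept


# Hypothetical variable: the whole dataset and a selection from it.
dataset = ["tea"] * 50 + ["coffee"] * 40 + ["matcha"] * 7 + ["mate"] * 3
selection = ["coffee"] * 10 + ["matcha"] * 6 + ["mate"] * 3 + ["tea"] * 1

# Discarding from everything: the threshold is measured against all data points.
from_everything = keep_options(dataset, 10, as_percentage=True)
# Discarding from selection: the same threshold is measured against the selection.
from_selection = keep_options(selection, 10, as_percentage=True)
# A threshold expressed as a number of nodes rather than a percentage.
at_least_five_nodes = keep_options(dataset, 5)

print(from_everything)      # ['tea', 'coffee']
print(from_selection)       # ['coffee', 'matcha', 'mate']
print(at_least_five_nodes)  # ['tea', 'coffee', 'matcha']
```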

How to Discard Filter Options?

  1. Start from the Graph, Details or Trends panel of your project.
  2. Find the categorical or text variable inside of your sidebars.
  3. Click the 3 dots from the top right of the variable card.
  4. Select Discard categories from the menu list.
  5. Decide whether your threshold will apply to every value in the dataset or values belonging to a selection you make.
  6. Using the Nodes dropdown within the textbox, choose whether your threshold refers to a number of values or the percentage of values.
  7. Enter a number in the text box to use as your threshold.
  8. As you type, the options falling under your threshold will be removed from the list or chart of filter options.
  9. To add another filter, select the New filter button below the text box. Then, repeat steps 6-8.
  10. To save your filter, click the save icon next to the name of your variable.
  11. Done ... Your list of filter options now includes only values that meet the criteria you just set.


Clearing Filters

There are two ways to clear filters: use the 'Clear' button at the top of your left sidebar to remove all filters, or clear a specific filter using the 'Clear filter' icon within the sidebar variable card.

How to Clear All Filters?

  1. Start from your project's Graph, Details or Trends panel.
  2. Select the 'Clear' button at the top of the left sidebar.
  3. Done ... All filters have been removed.

How to Clear a Specific Filter?

  1. Start from your project's Graph, Details or Trends panel.
  2. Locate your variable in either the left or right sidebar.
  3. Select the 'Clear Filter' icon next to the variable's name.
  4. Done ... The data presented in your project will no longer take this filter into account.
