The GROUP, GROUPBY, and COUNTDISTINCT functions are used to specify a series of unique elements to group together and to remove duplicate values from your data. They are designed to complement each other when aggregating and simplifying your data.
- GROUP enables you to remove duplicates from a column, for example when a column has multiple instances of the same value.
- GROUPBY collects values based on a specified aggregation method (like GROUP) so that the unique values align with a parallel column.
- COUNTDISTINCT counts the number of each unique value in a column.
Note: The results for these functions are ordered alphabetically by list item.
Tip: To quickly group, aggregate, filter, and sort using the options in the editor. Learn more here.
Get introduced to the GROUP and GROUPBY functions by watching this video:
Introduction to GROUP and GROUPBY
The following sections show detailed examples of how each function can be used. You can download the data source used in this article here: Regions and units.
GROUP
In this example, the Regions in column A are grouped so duplicate Regions are not displayed.
This is the formula structure:
GROUP(SLICE(values))
values: set to the Regions data in column A (A:A
), and wrap in SLICE to remove the column header.
Note: You do not need the SLICE function to remove the headers if you model your data source before entering the Klip Editor. Learn more here.
GROUPBY
The Regions in column A are grouped and the Units in column B are grouped (summed unless specified otherwise) according to the Regions. You can learn about different GROUPBY methods here.
This is the formula structure:
GROUPBY(SLICE (values1), SLICE (values2)
- values1: set to the Regions data in column A (
A:A
) and wrap in SLICE to remove the column header. - values2: set to the Units data in column B (
B:B
)and wrap in SLICE to remove the column header.
Note: You do not need the SLICE function to remove the headers if you model your data source before entering the Klip Editor. Learn more here.
Setting a different GROUPBY method (Optional)
The default method is SUM(Values)
. You can set a different method by entering a comma after the second value and choosing one from the drop-down list or typing one in as a string.
This is the formula structure:
GROUPBY(values1, values2,"method")
- values1: set to the Regions data in column A (
A:A
). - values2: set to the Units data in column B (
B:B
). - method: select from the list (as below) or type a function with values.
Watch the video below for a demonstration of how to use GROUPBY with the method parameter:
Using GROUPBY with the Method parameter
COUNTDISTINCT
COUNTDISTINCT counts the number of each unique value in a column of data. In this example, COUNTDISTINCT returns the number of times a specific region was recorded in the data to reflect the times there was a sale in that region.
This is the formula structure:
COUNTDISTINCT(values1)
- values1: set to the Regions data in column A (
A:A
).