The revenue for the two transactions on June 30th were added together. This data set now groups the revenue field by day.
These changes have occurred because we changed the granularity of the data set. We also notice that the transaction ID has changed to a count of transaction IDs. The first thing we notice is that the data set has shrunk. Let's now look at this data set in a less granular form. The granularity here is simply one transaction per line. Let's look at the example of a simple sales data set. Let's start by examining what exactly data granularity is.Įvery data set has a level of granularity, or level of detail represented by a single row. Third we'll prepare our data for aggregation.įinally we'll aggregate our data and union it with the remaining data set. Second we'll investigate two data sets with different granularity. First we'll define what exactly data granularity is.
In this lesson we're going to look at how to reduce the granularity of a complex data set by aggregating the data. In those lessons all of the data sets shared the same level of granularity, or level of detail. In the previous lesson we merged data sets with data unions and data joins. We then drag Sales to the Aggregated fields, ensuring that the aggregation type is set to SUM. In the Aggregate step, we drag Year and Product to the Grouped fields.
Once we’ve prepared the data, the process of aggregating the data is quite simple. In this case, we used the Right formula to extract the first 4 characters from the right of the transaction date field. We could use this field to create our year field by creating a calculated field which would take the year from the date field. It does have a date field, but that is too specific as it’s based on specific dates. This dataset has a product field but no year field. In order to do this, we need a year field to tell Tableau Prep to group our aggregation by year and product. In this lesson, we’d like to aggregate by product and year. Often when aggregating a dataset we must do a little preparation first. It can be sales by region, sales per individual salesperson and so on. Also, less granular datasets aren’t always grouped according to a period of time. It can also be the average, minimum, count, etc. This is called aggregating the data.ĭata is not always aggregated with a sum of values. In this case, the sum of all sales in a quarter. To make each row represent sales for an entire quarter, we have to create a calculation. A sales dataset grouping data by all sales in a given quarter would not be considered very granular. For example, a sales dataset with each row representing a single transaction would be considered a highly granular dataset. The more detailed the information, the more granular it is. Reduce the granularity of a complex dataset by aggregating the data.ĭata granularity refers to the level of detail per row.