Business intelligence tools offer a ton of out-of-the-box visualizations – line charts, pie charts, bar and stacked bar charts, and others. However, sometimes this variety of charts is not enough to analyze data for a specific use case. Two such use-cases are cohort analysis and rank/bump charts. In this post, we’ll take a look at the SQL queries behind these wacky charts – regular cohort charts, qualitative cohort charts, and rank/bump charts so that you can build them even if it isn’t offered as an out-of-the-box chart option, in a tool like the SQL Report Builder

Cohort charts

Cohorts are useful in comparing “groups” or “cohorts” of entities together. Examples of commonly-used cohorts are: customers who made their first order in the same month, customers who made their first order of a particular product in the same month, products that were first released in the same time period, and so on. Below is an example of revenue generation by cohorts of customers who made their first orders from January 2016 onwards on a monthly basis.

Each line in the chart represents a different cohort of customers. The x-axis indicates the number of months since the customers’ first orders, and the y-axis shows the revenue generated in that month.

cohort1

Here is the associated query to build a chart like that:

A cohort can also be analyzed on a cumulative basis as well. Below is the corresponding chart and query:

cohort2

Lastly, to define cohorts using qualitative variables like acquisition sources, locations, or customer segments, simply replace the first column “cohort” with the required variable. Here’s an example of the cumulative cohort with a “shipping state” variable used to define each cohort.

cohort3

And the associated query:

Rank charts

Rank charts (also known as bump charts) are often used to measure the rank finite number of groupings over a period of time. The criss-crossing of lines is key to reading these charts. A crossing (or, a bump) means one entity has surpassed the other entity in absolute terms, even though we are looking at the relative rank. This graphic provides no information about the share of the total quantity taken up by each entity, only which entity has the higher share.

Below is the rank of the geographic region of terrorist attacks since 1971. The rank for each region is determined by the number of incidents that occurred in that year.

rank

The associated query is below. The data was obtained from the Global Terrorism Database.

There are two cool, non-standard chart types that we are aware of. If you have written SQL logic for other chart types, or would like us to recommend SQL logic for other chart types, we’re definitely interested in hearing from you. Please leave your comments below!