In this post we will explain how to save Google Analytics (GA) acquisition channel information into your own database – namely the source, medium, term, content, campaign, and gclid parameters that were present on a user’s first visit to your website. For an explanation of these parameters, check out the Google Analytics documentation. Then, we will explore some of the powerful marketing analyses that can be performed with this information in RJMetrics.

Why?

If you’re just looking at the default Google Analytics conversion and acquisition metrics, you aren’t getting the whole picture. Seeing the number of conversions from organic search versus paid search is interesting, but what can you do with that information? Should you spend more money on paid search? That depends on the value of customers coming from that channel, which is not something Google Analytics provides. [Note: Google Analytics eCommerce Tracking does mitigate this problem by storing transaction data in GA, but this solution doesn’t work for non-eCommerce sites, and certain analyses, such as cohort analysis, are not easy to perform in the GA interface.]

What if you want to email a follow-up deal to all customers acquired from a certain email campaign? Or integrate acquisition data with your CRM system? This is impossible in GA – in fact, it is against the Google Analytics Terms of Service to store any data in GA that identifies an individual. But that doesn’t mean you can’t store this data yourself.

The Method

(Special Note: If you are using Magento to power your eCommerce site, we’ve already done the work for you. Check out our free acquisition source tracker extension. Not only does it track each user’s acquisition source data, it also tracks each order’s source data.)

Google Analytics stores visitor referral information in a cookie called __utmz. After this cookie is set (by the Google Analytics tracking code), its contents will be sent with every subsequent request to your domain from that user. So in PHP, for example, you could inspect $_COOKIE['__utmz'] and see a string that looks something like this:

100000000.12345678.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=rj metrics

There is clearly some acquisition source data encoded into the string, and I have done some testing to confirm that this is the visitor’s first acquisition source. Now we just need to know how to extract the data. Luckily, Justin Cutroni has previously described how this encoding works, and shared some JavaScript code to extract the key bits of information.

We took this code and translated it into a PHP library hosted on GitHub. To use the library, include a reference to ReferralGrabber.php and then call:

$data = ReferralGrabber::parseGoogleCookie($_COOKIE['__utmz']);

The returned $data array will be a map of the keys source, medium, term, content, campaign, gclid and their respective values.
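If your site is not written in PHP, the same parsing is easy to reproduce. The following is a minimal Python sketch, not a port of ReferralGrabber.php: the function name parse_utmz is hypothetical, and it assumes the cookie layout shown above (four dot-separated numeric fields followed by pipe-separated utm* key/value pairs). It also URL-decodes each value, on the assumption that your framework hands you the raw cookie string.

```python
from urllib.parse import unquote

# Mapping from the field names inside __utmz to the readable names used in this post.
UTMZ_FIELDS = {
    "utmcsr": "source",
    "utmcmd": "medium",
    "utmctr": "term",
    "utmcct": "content",
    "utmccn": "campaign",
    "utmgclid": "gclid",
}


def parse_utmz(cookie):
    """Return a dict of source, medium, term, content, campaign, gclid.

    Fields that are absent from the cookie are returned as None.
    """
    result = {name: None for name in UTMZ_FIELDS.values()}
    if not cookie:
        return result

    # Split off the four leading numeric fields. maxsplit=4 keeps dots that
    # appear inside values (e.g. a source of "techcrunch.com") intact.
    parts = cookie.split(".", 4)
    if len(parts) < 5:
        return result

    for pair in parts[4].split("|"):
        key, sep, value = pair.partition("=")
        if sep and key in UTMZ_FIELDS:
            result[UTMZ_FIELDS[key]] = unquote(value)
    return result


example = "100000000.12345678.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=rj metrics"
data = parse_utmz(example)
print(data["source"], data["medium"], data["term"])
```

A missing or malformed cookie yields a dict of None values rather than an error, so the sign-up flow is never blocked by a visitor who has cookies disabled.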

We recommend adding a new table to your database called, for example, user_referral, with columns like: id INT PRIMARY KEY, user_id INT NOT NULL, source VARCHAR(255), medium VARCHAR(255), term VARCHAR(255), content VARCHAR(255), campaign VARCHAR(255), gclid VARCHAR(255). Whenever a user signs up, grab the referral information and store it in this table.
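As a rough illustration of that sign-up step, here is a Python sketch that creates the user_referral table and writes one row. It uses an in-memory SQLite database as a stand-in for your real database, and the helper name save_referral and the sample user id are hypothetical. The dictionary passed in is assumed to have the same keys as the parsed cookie data described above.

```python
import sqlite3

# Table layout from the paragraph above. SQLite is only a stand-in here;
# adjust types and auto-increment syntax for your own database.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_referral (
    id INTEGER PRIMARY KEY,
    user_id INT NOT NULL,
    source VARCHAR(255),
    medium VARCHAR(255),
    term VARCHAR(255),
    content VARCHAR(255),
    campaign VARCHAR(255),
    gclid VARCHAR(255)
);
"""

COLUMNS = ("source", "medium", "term", "content", "campaign", "gclid")


def save_referral(conn, user_id, data):
    """Insert one referral row for a newly signed-up user.

    Keys missing from `data` are stored as NULL.
    """
    values = [data.get(column) for column in COLUMNS]
    conn.execute(
        "INSERT INTO user_referral "
        "(user_id, source, medium, term, content, campaign, gclid) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        [user_id] + values,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sign-up: user 42 arrived via an organic Google search.
save_referral(conn, 42, {"source": "google", "medium": "organic", "term": "rj metrics"})
stored = conn.execute(
    "SELECT user_id, source, medium, term, gclid FROM user_referral"
).fetchall()
print(stored)
```

Using parameter placeholders (the ? marks) instead of string concatenation matters here, because every value in the cookie is supplied by the visitor's browser and should be treated as untrusted input.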

How to use this data

Now that we’re saving user acquisition source, how can we use it?

Let’s suppose we are using a SQL database and have a users table with the following structure:

id  email         join_date   acq_source  acq_medium
1   john@abc.com  2012-01-24  google      organic
2   jim@abc.com   2012-01-24  google      cpc
3   joe@def.com   2012-01-25  direct
4   jess@ghi.com  2012-01-26  referral    techcrunch.com
5   jen@ghi.net   2012-01-30  other       organic

For starters, we can count the number of users coming from each referral channel by running the following query against the database:

SELECT acq_source, COUNT(id) AS user_count FROM users GROUP BY acq_source ORDER BY user_count DESC;

The result will look something like this:

acq_source  user_count
google      294
direct      156
referral    55
other       16

This is interesting, but of limited use. What we would really like to know is the growth rate of these numbers over time, the amount of revenue generated by each acquisition source, a cohort analysis of users coming from each source, and the probability that a user from one of these channels will return as a customer in the future. The queries required to do these analyses are complex – which is why we built RJMetrics. Armed with this information, we can determine our most profitable acquisition channels and focus our marketing time and money accordingly.
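To give a feel for the simplest of these, the sketch below counts sign-ups per acquisition source per month, which is the raw input for a growth-rate chart. It is only an illustration under stated assumptions: it loads the five sample users from the table above into an in-memory SQLite database, and it relies on SQLite's strftime function, so the month expression would need adapting for MySQL or PostgreSQL. Revenue and cohort analyses would need order data that is not shown in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, email TEXT, join_date TEXT, "
    "acq_source TEXT, acq_medium TEXT)"
)

# The five sample rows from the users table above.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "john@abc.com", "2012-01-24", "google", "organic"),
        (2, "jim@abc.com", "2012-01-24", "google", "cpc"),
        (3, "joe@def.com", "2012-01-25", "direct", None),
        (4, "jess@ghi.com", "2012-01-26", "referral", "techcrunch.com"),
        (5, "jen@ghi.net", "2012-01-30", "other", "organic"),
    ],
)

# Sign-ups per month and acquisition source (strftime is SQLite-specific).
MONTHLY_SIGNUPS = """
SELECT strftime('%Y-%m', join_date) AS join_month,
       acq_source,
       COUNT(id) AS user_count
FROM users
GROUP BY join_month, acq_source
ORDER BY join_month, acq_source;
"""

rows = conn.execute(MONTHLY_SIGNUPS).fetchall()
for join_month, acq_source, user_count in rows:
    print(join_month, acq_source, user_count)
```

With real data spanning several months, comparing each source's count against its previous month gives the growth rate for that channel.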

My colleague Xiao has also written a blog post, which is worth checking out, detailing how to generate these and other useful marketing analyses using RJMetrics.
