Twitter Vine Flying Past Competition Despite Low Overall Adoption

On January 24, 2013, Twitter released Vine, a mobile service that lets you capture and share short looping videos.  We set out to learn just how popular Vine has become in its first month of existence and how its performance has stacked up against competitors like Viddy and Socialcam.

We loaded data from Twitter’s API into RJMetrics and here’s what we found:

  • Overall, video creation is still an extremely underdeveloped market.  Only about 4% of highly active users shared a video through Vine or a top competitor during Vine’s first month on the market.  In that same period, 98% of the same group shared at least one photo through a leading photo sharing service.
  • In its first month, Vine steadily gained market penetration to 2.8% of Twitter’s highly active users, blowing past competitors Viddy and Socialcam, which were used by 0.5% and 0.2% of the same population, respectively.
  • Twitter’s built-in tools for photo and video sharing are dominating the competition.  Vine.co and pic.twitter.com are the most popular tools in their respective categories by a wide margin.

Vine Adoption

Vine showed impressively stable adoption growth over the course of its first month.  We were expecting to see a spike in adoption around the time of the announcement followed by a leveling-off period, but instead the percent of new users each day has remained consistent.  This is a good sign for future growth because the rate of adoption does not appear to be slowing as time goes by.

Vine vs. Viddy vs. SocialCam

In Vine’s first month, the percent of highly active users who used Vine was meaningfully higher than the percent who used competitors Viddy and SocialCam.

We were concerned that this might not be an apples-to-apples comparison since many users may have just been “trying out” Vine in this month.  As a check, we looked at the average number of times each of these tools was used during the month.  As it turned out, repeat usage was actually more likely with Vine than with the other apps.

Video vs Photo

While Vine’s performance is impressive relative to its competitors, it’s still a tiny player in the universe of media sharing on Twitter.  We looked at the percentage of users that linked to various media sharing services and found that photos represent the vast majority of the links sent out by highly active users.

As you can see in the chart above, native Twitter photo hosting (pic.twitter.com) is the dominant player, followed by Instagram and then a number of less prominent competitors.

When you remove Twitter and Instagram, you can see just how small a player Vine is when it comes to sharing media.

About The Data

We decided to sample from Twitter’s most active users to find early-adopter activity.  We used Twitter’s API to identify and download the Twitter streams of about 2,500 randomly selected “highly active” users, each of whom had tweeted at least 100 times so far in 2013.

The result was 2.3 million tweets that were sent between January 24th and February 24th.  320,000 of these tweets contained links, which we followed through any link shorteners to find their final destinations.
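For readers who want to try something similar, here is a minimal sketch of how resolved links could be bucketed by service and turned into a penetration figure. It is an illustration under stated assumptions, not the pipeline we ran: the `SERVICE_BY_HOST` mapping, the function names, and the sample links are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical mapping from final hostnames to service names (illustrative only).
SERVICE_BY_HOST = {
    "vine.co": "Vine",
    "viddy.com": "Viddy",
    "socialcam.com": "Socialcam",
    "pic.twitter.com": "Twitter Photos",
    "instagram.com": "Instagram",
}


def service_for(url):
    """Return the service name for a resolved URL, or None if unrecognized."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return SERVICE_BY_HOST.get(host)


def penetration(links, total_users):
    """Percent of sampled users who linked to each service at least once.

    links is an iterable of (user_id, final_url) pairs, where final_url is
    the destination reached after following any link shorteners.
    """
    users_by_service = defaultdict(set)
    for user_id, url in links:
        service = service_for(url)
        if service is not None:
            users_by_service[service].add(user_id)
    return {
        service: 100.0 * len(users) / total_users
        for service, users in users_by_service.items()
    }


# Made-up sample: four users, two of whom shared recognizable media links.
sample_links = [
    (1, "https://vine.co/v/abc"),
    (1, "https://vine.co/v/def"),
    (2, "http://www.instagram.com/p/xyz"),
    (3, "http://example.com/post"),
]
print(penetration(sample_links, total_users=4))
```

Counting distinct users per service, rather than raw links, is what keeps one prolific poster from inflating a service's share.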

The data was then loaded into RJMetrics, where we generated this analysis with just a few clicks.

Conclusions

Twitter’s efforts to add native photo and video sharing into its service are proving fruitful.  These tools have quickly become the most popular options for end users, with a major impact on the market for third-party apps.

Vine appears to be establishing itself as the de facto tool for short video creation and sharing.  However, the significance of this move will only be felt as its market matures.  Today, Vine is a service only used by a small minority of even the most highly active users.

 

The Human Element in Automated Software Testing

As with any startup, the excitement of improving and adding to our current product often overshadows more mundane aspects of software development, such as automated software testing. In response to this, here at RJMetrics we have recently been examining our current suite of automated tests and reevaluating our strategy towards testing. There are already lots of great blog posts about unit testing philosophy and best practices, so instead I’d like to share some of our personal experiences navigating the human elements of automated software testing.
  • Testing can be a divisive issue. Each developer subscribes to his or her own philosophy of software development, and this includes testing. You will never get everyone to agree on these issues and could spend hours debating the goals and merits of testing, so my best advice is to decide upon and document a team philosophy early on. This includes how your team defines the basic terms such as unit and integration test, and classifying the types of tests that are important for your product. You should also clarify the main goals of your tests (to catch bugs, aid in refactoring, etc), as this affects how tests are written and what is tested. Once this is in place and everyone is on the same page, you can move forward with more meaningful discussions.
  • A smaller number of thoughtful unit tests is infinitely better than a large number of poorly written unit tests. In addition to failing to catch real bugs, bad unit tests break as a result of unrelated code changes. Fixing such tests slows down product development by taking time away from other projects, and it takes an even worse psychological toll on the team. Spending minutes or hours debugging a failing unit test is enough to turn the most avid supporter sour on testing. To avoid such tests, everyone should be familiar with unit testing “best practices” and any testing pitfalls specific to your codebase should be addressed as early on as possible.
  • People are more likely to write and run unit tests if it’s easy to do. Try to simplify the process as much as possible by providing a collection of helper functions and objects that will allow developers to focus on testing the code at hand. For us, this means providing a set of “stock” objects, ready to be used in a test.
Bearing in mind the experiences above, we are moving forward to improve and simplify our automated testing. There is still much more for us to learn and we will periodically reevaluate and keep everyone posted on the success of our testing initiative in catching bugs, refactoring old code and aiding in new code development.
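
To make the “stock object” idea from the list above concrete, here is a small sketch using Python’s built-in unittest module. Everything in it is hypothetical: the `Customer` class, the `lifetime_value` function, and the `make_stock_customer` helper are stand-ins for illustration, not code from our product.

```python
import unittest
from dataclasses import dataclass, field


@dataclass
class Customer:
    """Hypothetical domain object used only for this illustration."""
    name: str
    orders: list = field(default_factory=list)


def lifetime_value(customer):
    """Hypothetical code under test: total revenue from a customer."""
    return sum(customer.orders)


def make_stock_customer(**overrides):
    """Return a ready-to-use customer; tests override only what they care about."""
    defaults = {"name": "Stock Customer", "orders": [10.0, 20.0]}
    defaults.update(overrides)
    return Customer(**defaults)


class LifetimeValueTest(unittest.TestCase):
    def test_sums_all_orders(self):
        self.assertEqual(lifetime_value(make_stock_customer()), 30.0)

    def test_no_orders_is_zero(self):
        self.assertEqual(lifetime_value(make_stock_customer(orders=[])), 0)


# Run the tests programmatically so the example works when pasted into a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(LifetimeValueTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test states only the detail it depends on (here, the list of orders), which is one way to keep tests from breaking when unrelated fields change.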

Our Start-Up’s First Trade Show: A Data-Driven Recap

Last week, Jake and I attended the 2011 Internet Retailer Conference and Exhibition (IRCE 2011) in San Diego. This four-day event is the world’s largest e-commerce convention, with over 7,200 attendees and 500 exhibiting vendors.

This was our first “real” trade show with RJMetrics and it was a new experience for both of us. Our main objectives were to generate sales leads and raise awareness of RJMetrics in the internet retail community. To help achieve these goals, we purchased a 10’x10’ booth in the exhibit hall to peddle our wares.

This post is a summary of our experience as newbies to the trade show circuit. It is the result of data collection and note-taking on both our and other exhibitors’ behavior and performance. The data taught us three main lessons:

  • It pays to be aggressive.
  • We have the most success with conference attendees who don’t look like us.
  • One of us has a very bright future as a carnival barker.

Who wouldn’t buy software from these handsome, slightly blurry gentlemen?

Exhibitor Social Dynamics

Within the first hour of the show, we observed quite a bit. Most notable was that trade show exhibitors appear to all fall somewhere on the “aggression spectrum,” with the most prominent approaches including:

  • Disengaged: exhibitors sitting behind their booths, not making eye contact with anyone, and waiting for attendees to approach them.
  • Passive: exhibitors standing, saying hello, or otherwise giving physical signs that they are available to speak if passers-by are interested.
  • Passive-Question: exhibitors attempting to engage passers-by with easy-to-reject questions such as “Can I give you some information on our company?”
  • Aggressive: exhibitors approaching passers-by with out-of-the-blue questions related to their pitch, such as “do you ship product?” or “who is your domain name registrar?”

Additionally, many exhibitors made use of “enhancers” to help attract attendee interest. (We had none of these to offer.) The most popular ones included:

  • Swag: Pens! Stress balls! T-Shirts! Nothing softens a sales pitch like free junk with your company’s logo on it. While some attendees were lugging bags of this stuff around, we didn’t perceive any disadvantage by not offering it. (This is based on casual observations of our performance relative to nearby booths that were aggressively distributing swag.)
  • Sweepstakes: There was a huge promotion at the conference in which attendees who collected “stickers” from about 40 specific booths would be entered to win a new car. (We could have been one of the 40 booths by shelling out $7,600 to the conference organizers.) We were definitely losing leads to this promotion, as many passers-by used “I have to get my stickers” as a reason for not staying to learn more about us.
  • Hot Girls: The vast majority of attendees were male, and some exhibitors hired models to engage passers-by and lure them in for a conversation. This appeared to be effective. However, I can see how certain attendees might find this patronizing or otherwise offensive.

Data Collection

For that first hour, Jake and I waited for people to come to us. We quickly decided that it would be worth A/B testing other approaches to see how we would do. For the next three days, we recorded the following data points about every interaction we had:

  • Date and Time
  • Badge Type (attendee or exhibitor)
  • Gender
  • Approximate Age (by decade)
  • Ethnicity
  • Number of People in Group (when applicable)
  • Discussion Starter (Jake or Bob)
  • Success or Failure (success was defined as being able to give a 30-second pitch on what we do and learn where the attendee worked and what their role was at the company)
  • Approach Used (see below)

We collected data on four different outbound conversation-starting approaches, plus unsolicited walk-ups:

  • “What is your average customer lifetime value?” This heavy question was designed to stop people in their tracks and be a lead-in to a conversation about the benefits we provide.
  • “How much of your revenue is from repeat customers?” This was a less intimidating question with the same intention as above.
  • “Does Your Company Generate Data?” This was a question that we knew people should always say “yes” to.
  • “Can I give you some information on RJMetrics?” This was the passive-question approach.
  • The Walk-Up (when people came up to us unsolicited).

At the end of the three days, we collected 330 data points, 230 of which were successful conversations and exactly 100 of which were rejections.

Inbound vs. Outbound

One thing was clear: it pays to have an outbound strategy. Only 28% of our conversations were walk-ups. This means that employing an outbound strategy allowed us to extract between 3 and 4 times as much value from the show as we would have otherwise.

We initially speculated that the quality of walk-up traffic would be higher than that of random passers-by. However, we observed (unscientifically) that this was not the case. While some very high-value prospects did approach us as walk-ups, we ultimately derived more qualified leads from our outbound conversations.

Effectiveness by Outbound Approach

Naturally, 100% of walk-ups converted into conversations. Below, we show the conversion rates on the other outbound approaches.

Clearly, the more aggressive methods (i.e. asking a pointed question that is difficult to brush off) were the most effective. Asking questions about repeat purchase rates such as “What Percent of Your Customers Come Back to Purchase a Second Time?” was the most effective method, with a whopping 79% conversion rate.

The least effective method of the outbound approaches was the passive-question technique (i.e. “Can I give you some information on RJMetrics?”)
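
If you want to run the same comparison on your own booth data, a conversion rate per approach takes only a few lines to compute. The sketch below assumes each interaction is logged as an (approach, succeeded) pair. The approach labels and the sample log are made up and do not reproduce our actual data.

```python
from collections import Counter


def conversion_rates(interactions):
    """Return {approach: share of attempts that became conversations}."""
    attempts = Counter()
    successes = Counter()
    for approach, succeeded in interactions:
        attempts[approach] += 1
        if succeeded:
            successes[approach] += 1
    return {approach: successes[approach] / attempts[approach] for approach in attempts}


# Made-up log of (approach, succeeded) pairs, for illustration only.
log = [
    ("repeat-revenue", True),
    ("repeat-revenue", True),
    ("repeat-revenue", False),
    ("repeat-revenue", True),
    ("passive-question", True),
    ("passive-question", False),
]

for approach, rate in conversion_rates(log).items():
    print(f"{approach}: {rate:.0%}")
```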

Age

The most common passer-by was in their 30s, and the populations steadily dropped off with each additional decade of age. The percentage of attendees in their 20s (like us) was surprisingly small.

Our success rates were lowest with attendees in their 40s, but increased substantially at each extreme end of the age spectrum. We weren’t surprised by our success with 20-somethings, but would not have predicted that we’d have such strong performance with much older attendees.

Race and Gender

We were interested in the breakdown of attendees at the conference and our relative success levels with different groups of people. While there is a potential for selection bias here, we feel that we spoke with a random sample of the conference population. As such, the breakdown of our interactions is likely representative of the conference population as a whole.

72% of our interactions were with white males. Interestingly, it was about as likely that we interacted with someone who was non-white (15%) as with someone who was non-male (13%).

The conversion rates of different race and gender combinations are also quite interesting. We were least likely to convert white males (58%) and most likely to convert non-white females (87%). Overall, our conversion rates on females and non-white attendees were higher than their counterparts.

Jake vs. Bob

So, who was the better pitchman? Despite heroic trash talking, Jake put me to shame. 66% of his attempts converted into conversations (compared to my paltry 55%).

We also looked at conversion attempts for females only. This data revealed that, while Jake still converted attempts into conversations at a higher rate than I did, the gap was significantly smaller. Unfortunately, these rates did not carry over to the after parties.

The ROI of An Aggressive Pitch

All-in, our booth space, display materials, meals, travel, and accommodations added up to about $8,000 of total expense. (This does not include the opportunity cost of our time.)

The exhibit hall was open for a total of 20.5 hours across 3 days, which works out to about $6.50 per minute to participate in the exhibition. As the co-founders of a bootstrapped company, that figure weighed heavily in our minds every time we considered getting some lunch or taking a bathroom break.

If you look at our total interactions, the numbers get even larger. Based on our count of 220 conversations, we paid around $36 per pitch. This is where our aggressive sales strategy really makes a significant impact. If we had only interacted with the 65 walk-ups from the conference, that effective rate would have been $123 per pitch.

In other words, we extracted more than 3 times as much value from our conference experience by simply getting out in front of the booth and selling more aggressively.
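
The arithmetic behind these figures is simple enough to check in a few lines. This sketch uses only the rounded numbers quoted in this section (about $8,000 of expense, 20.5 exhibit hours, 220 conversations, 65 walk-ups). The variable names are ours, and the results are approximate because the inputs are.

```python
TOTAL_COST = 8000.0   # approximate all-in expense, in dollars
HOURS_OPEN = 20.5     # total exhibit hall hours across the show
CONVERSATIONS = 220   # conversation count used for the per-pitch figure
WALK_UPS = 65         # conversations that started as unsolicited walk-ups

cost_per_minute = TOTAL_COST / (HOURS_OPEN * 60)
cost_per_pitch = TOTAL_COST / CONVERSATIONS
cost_per_walk_up = TOTAL_COST / WALK_UPS

# How many times more pitches the same spend bought with an outbound strategy.
value_multiple = cost_per_walk_up / cost_per_pitch

print(f"Cost per minute:       ${cost_per_minute:.2f}")
print(f"Cost per pitch:        ${cost_per_pitch:.2f}")
print(f"Cost per walk-up only: ${cost_per_walk_up:.2f}")
print(f"Value multiple:        {value_multiple:.2f}x")
```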

Given these data points, it becomes clear why some companies are willing to invest money in novelties and gimmicks to increase traffic to their booths. Spending a few thousand dollars could double or triple the effectiveness of your experience while increasing your costs by a significantly smaller percentage.

Was it Worth It?

Given the price point of our product, we will break even on this conference if we convert a single lead into a long-term customer. Based on the leads generated and our typical sales cycle, it seems very likely that we will do much better than that.

There were also less tangible benefits to participating in the show. We increased brand awareness, strengthened our relationships with existing customers, scoped out the competition, and kept up to date with emerging trends and technologies.

Perhaps most importantly, however, this experience has given us a new perspective on how to think about the costs of customer acquisition and spending money to acquire new business. If this show turns out to be a profitable endeavor, we will have a great baseline for the cost of putting people into the top of our sales funnel. With this data, we can feel much more confident about making investments in advertising and other forms of lead generation.

Foursquare Outpacing Gowalla as it Approaches 2 Million Users

[This post, written by our CEO Robert J. Moore, originally appeared on TechCrunch as a guest column. You can find that post here.]

Location-based social networks Foursquare and Gowalla are accumulating users (and headlines) with impressive momentum. While both companies have been vocal about reaching major milestones, we wanted to take a closer look at the data behind these accomplishments.

For the past four weeks, we’ve been monitoring the Foursquare and Gowalla APIs to track growth rates and sample users and venues. This data was loaded into an RJMetrics Dashboard, which provided the results found here with just a few clicks. We will keep these estimates up-to-date with fresh data and you can view them any time at our Startup Data page.

Here are a few highlights from our findings:

  • As of today, Foursquare has just over 1.9 Million users. Gowalla has around 340,000.
  • At its current pace, Foursquare will surpass 2 Million users within a week.
  • Foursquare is adding almost 10x as many new users per day as Gowalla and, despite a significantly larger base, has a daily percentage growth rate that is 75% higher than Gowalla’s.
  • Currently, Foursquare has about 5.6 Million venues and Gowalla has 1.4 Million venues.
  • 1 in 3 venues on Foursquare have been checked into only once or never. That number is 1 in 4 on Gowalla.
  • The most popular venue name is “Home,” followed by national chains like “McDonald’s” and “Burger King.”
  • On Foursquare, men outnumber women almost 2-to-1. Exact gender breakouts are not available for Gowalla, but the most popular first names suggest a similar distribution.

User Growth

As of today, Foursquare has just over 1.9 Million users. Gowalla has around 340,000.

Recent new user acquisition by day for each service is shown in the chart below.

Foursquare is clearly acquiring users at a much higher rate than Gowalla, and this ratio of new Foursquare users to new Gowalla users is shown below. It averages almost 10-to-1.

The numbers become even more interesting when you consider each company’s daily growth rate. This is the number of new users in a given day divided by the total user population from the previous day.

Since Foursquare is growing off of a much larger base, you might expect their percentage growth to be smaller than Gowalla’s. However, as shown below, their daily growth rate averages about 75% higher than Gowalla’s.
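
As a quick illustration of the daily growth rate defined above (new users in a day divided by the previous day’s total), here is a minimal sketch. The function names and the user totals are made up for the example. They are not the figures behind our charts.

```python
def daily_growth_rates(total_users_by_day):
    """Growth rate for each day: new users that day / previous day's total."""
    return [
        (today - yesterday) / yesterday
        for yesterday, today in zip(total_users_by_day, total_users_by_day[1:])
    ]


def percent_higher(rate_a, rate_b):
    """How much higher rate_a is than rate_b, as a fraction of rate_b."""
    return rate_a / rate_b - 1


# Made-up cumulative user totals on three consecutive days.
totals = [1_000_000, 1_010_000, 1_020_100]
print(daily_growth_rates(totals))    # both days grow by roughly 1%

# A 1.75% daily rate is 75% higher than a 1% daily rate.
print(percent_higher(0.0175, 0.01))
```

Dividing by the previous day’s total is what makes the two services comparable despite their very different sizes.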

Venue Growth

We see similar trends when we look at daily venue growth. Currently, Foursquare has about 5.6 Million venues (or about 3 per user) and Gowalla has 1.4 Million venues (or around 4 per user). The rate at which new venues are being added is shown below:

User Characteristics

Foursquare and Gowalla share different information about their users via the public API, revealing different types of statistics about each population.

On Foursquare:

  • 64% of users are male, 33% are female, and 3% did not specify a gender
  • 55% of users have uploaded a photo
  • 28% of users have linked their Foursquare account to their Facebook account

On Gowalla:

  • 38% of users have linked their Gowalla account to their Facebook account and 53% have linked to their Twitter account
  • 57% of users have zero friends and another 13% have only one friend

Interestingly, across both services, the five most popular first names are identical:

  • Chris
  • Michael
  • David
  • John
  • Jason

Venue Characteristics

As with users, the available data differs between the two services.

On Foursquare:

  • 18% of venues have at least one “tip” associated with them
  • 3% of venues offer “specials”
  • 32% of venues have been checked into only once or never
  • The two most used venue categories are “Home” and “Corporate/Office”

On Gowalla:

  • 25% of venues have been checked into only once or never
  • 0.5% of venues have a Twitter username associated with them

Across both services, the most popular venue names are:

  • Home
  • Subway
  • Starbucks
  • McDonald’s
  • Burger King
  • Walgreens

How We Did It

In most cases, this level of detail wouldn’t be accessible from the outside looking in. However, Foursquare and Gowalla have a few common characteristics that made it possible:

  • Both companies use auto-incrementing ID numbers (1,2,3,4…) for both users and venues.
  • Both companies have an API that allows us to access basic user and venue information by ID number.
  • The central limit theorem tells us, among other things, that a large enough random subset of a large data set will behave like its parent set with a high degree of statistical confidence.

Our scripts tracked the maximum registered user and venue IDs each hour, along with randomly sampling data points throughout the population. This gave us a “density factor” so that we could adjust the absolute numbers to reflect deactivated accounts, deleted venues, and other skipped IDs.

In the end, our sample size consisted of about 82,000 data points from Foursquare and 36,000 data points from Gowalla. As with all such analyses, the results in this report are only estimates and could be skewed by flaws in our sampling methods or unconsidered outside factors.
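
For the curious, the ID-sampling approach described above can be sketched roughly as follows. This is a simplified illustration, not our production scripts: `is_live` stands in for an API lookup by ID, the simulated service (where every tenth ID is dead) is invented, and the 95% interval uses a textbook normal approximation.

```python
import math
import random


def estimate_population(max_id, sampled_ids, is_live):
    """Estimate live records as max_id * density, with a rough 95% interval.

    is_live is a callable that reports whether a given ID resolves to a real
    record (a stand-in for an API lookup by ID).
    """
    n = len(sampled_ids)
    live = sum(1 for record_id in sampled_ids if is_live(record_id))
    density = live / n  # the "density factor": share of IDs that are real
    margin = 1.96 * math.sqrt(density * (1 - density) / n)
    low = max(0.0, density - margin) * max_id
    high = min(1.0, density + margin) * max_id
    return density * max_id, (low, high)


# Simulated service: IDs run from 1 to 1,000,000 and every tenth ID is dead.
MAX_ID = 1_000_000


def simulated_lookup(record_id):
    return record_id % 10 != 0


rng = random.Random(0)  # fixed seed so the example is repeatable
sample = rng.sample(range(1, MAX_ID + 1), 5000)
estimate, (low, high) = estimate_population(MAX_ID, sample, simulated_lookup)
print(f"Estimated live records: {estimate:,.0f} (roughly {low:,.0f} to {high:,.0f})")
```

In the simulation the true answer is 900,000, so the printed estimate gives a feel for how close a few thousand random lookups can get.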

Conclusion

Both services are showing impressive growth and are accumulating mountains of valuable, fascinating data. However, Foursquare is clearly the dominant player and their lead is increasing every day.

Be sure to keep an eye on our Startup Data page to track how these numbers progress over time. With Foursquare approaching the 2 Million member mark, it appears that this may only be the beginning.

RJMetrics is a hosted business intelligence tool that allows online businesses to quickly and easily capture the value within their data. To learn more about how we can help your business measure, manage, and monetize better, go to RJMetrics.com and follow us on Twitter.

Top 5 Startup Tips from Jay-Z

Here at RJMetrics, we try to learn everything we can from the works of accomplished business leaders. While guys like Warren Buffett usually top that list, there is another visionary we thought was worth mentioning: rapper Jay-Z. And since we know everyone loves business lessons from rap stars, we thought we’d share some of his insights here.

While he’s never done a song about business intelligence or cohort analysis, Jay-Z’s discography is a rich library of business advice that has been topping the charts for over a decade. Don’t believe me? Here are the top five lyrical excerpts that have helped shape our business philosophy.

5. Adapt to Serve Your Market

“I dumbed down for my audience to double my dollars
They criticized me for it yet they all yell ‘HOLLA!’”

Song: Moment of Clarity
Album: The Black Album
Year: 2003

Here, Jay-Z admits that he has compromised his artistic vision to make his music more commercially accessible. Mainstream artists are often accused of “selling out” for doing exactly this, but here Jay-Z openly admits to the practice and justifies it with a simple fact: it made him rich.

As a business owner, your vision may not always appeal to the largest possible market. The key is to keep an open mind about new opportunities and be ready to adjust your plans to seize them. Like Jay-Z, many entrepreneurs have made their millions by constantly refining their vision and adjusting their strategy to reach the largest possible market.

4. Be a Renegade

“No lie, just know I chose my own fate
I drove by the fork in the road and went straight”

Song: Renegade
Album: The Blueprint
Year: 2001

In 2001, Jay-Z and Eminem were young stars, each several albums away from the iconic statuses they hold today. Renegade, their rare collaboration from that year, casts the two as “Renegades” who risk becoming outcasts as they attempt to change the face of hip hop. Today, we know that these renegades succeeded in their mission.

This track is a lesson in innovation and the rewards that are possible if you take the right risks. While it’s often safer and easier to use your talents to feed the existing machine, there is far more opportunity in disrupting it.

3. Stay in Food and Beverage

“Shoulda stayed in food and beverage
Too much flossing
Too much Sam Rothstein”

Song: Lost One
Album: Kingdom Come
Year: 2006

Here, Jay draws a powerful business lesson from the Scorsese classic Casino. In the film, Robert De Niro plays Sam “Ace” Rothstein, a handicapper who is chosen by the mob to run a new casino in 1970s Las Vegas. Due to his criminal record, Rothstein is forced to run things under the lowly title of “Food and Beverage Director.” When he begins to pursue a more public image in the interest of personal fame, we see his downfall as it parallels the downfall of mob rule in Vegas.

As Jay’s lyrics suggest, if Ace had stayed heads-down and focused on his original goal of running a profitable enterprise, he might have never fallen. The business lesson is clear: don’t start a company to become famous; start a company to build a company. Keep your eye on the best interests of your business, and don’t let the tempting distraction of personal fame compromise your original goals.

2. Be a Business, Man

“I’m not a businessman
I’m a business, man
Let me handle my business, damn”

Song: Diamonds From Sierra Leone (Remix)
Album: Kanye West’s Late Registration
Year: 2005

With this classic lyric, Jay-Z delivers a valuable lesson for business operators by drawing a distinction between the entrepreneur and the working stiff. His business’s success has made him something much more than just an average man, and that has enabled him to do far greater things with his life and the lives of those around him.

In the lines that follow, Jay-Z calls out to all of the family members and employees whose livelihoods depend on his continued success and whose lives are better because of what he has become. Jay-Z reminds us that running your own business will both consume and enrich your life– and that the tradeoff is definitely worth it.

1. Data is King

“Men Lie.
Women Lie.
Numbers Don’t.”

Song: Reminder
Album: The Blueprint 3
Year: 2009

As one of the best-selling recording artists of all time, Jay-Z has a pretty good response to anyone who challenges his dominance: check the numbers. This same comeback is also a foolproof way for him to dismiss unaccomplished rivals.

Similarly, in business, there is nothing more important than the numbers behind a company. When the hype dies down, the companies with the strongest fundamentals are the fiercest competitors. And the companies with the strongest understanding of their data are the best-equipped to steer those numbers in the right direction. If you think there is value in your data that might be going uncaptured, it’s probably time to learn more about RJMetrics.

Twitter Data: An Investor’s Perspective

[This article was also featured as a guest post on TechCrunch on October 5th, 2009.]

A few weeks ago, my former employer led a $100 million investment into Twitter and I must admit that I was quite jealous of my former colleagues. Chances are they got the opportunity to do some very cool analytics on Twitter’s data.

Rather than wonder about what I missed, I decided to figure out what I could from the outside looking in. Using some statistical trickery, the Twitter API, and my RJMetrics dashboard, I uncovered a ton of astonishing new information about Twitter. Here are some highlights:

  • Twitter’s user growth is no longer accelerating. The rate of new user acquisition has plateaued at around 8 million per month.
  • Over 14% of users don’t have a single follower, and over 75% of users have 10 or fewer followers.
  • 38% of users have never sent a single tweet, and over 75% of users have sent fewer than 10 tweets.
  • 1 in 4 registered users tweets in any given month.
  • Once a user has tweeted once, there is a 65% chance that they will tweet again. After that second tweet, however, the chance of a third tweet goes up to 81%.
  • If someone is still tweeting in their second week as a user, it is extremely likely that they will remain on Twitter as a long-term user.
  • Users who joined in more recent months are less likely to stop using the service and more likely to tweet more often than users from the past.

Read on for some detailed charts and a deeper dive into the data.

How We Did It

In most cases, this kind of outside-looking-in exercise wouldn’t be possible. Twitter, however, is a special case for a few reasons:

  • The company is pre-revenue, so its value is wrapped up in user activity and engagement
  • A Twitter user’s activity data (tweets, followers, etc) is all public by default
  • Twitter’s API allowed me to automatically download up to 20,000 data points per hour
  • Twitter uses auto-incrementing ID numbers (1,2,3,4…) for both users and tweets
  • The central limit theorem tells us, among other things, that a large enough random subset of a large data set will behave like its parent set with a high degree of statistical confidence

In the end, our sample size consisted of about 85,000 users and just over 3 Million tweets. By piecing all of these things together and pulling the data into the RJMetrics Dashboard, I was able to chart loads of information about Twitter’s user base and user behavior. I’ve looked around, and this appears to be the largest public analysis of Twitter’s user base online. Enjoy!

Number of Twitter Users

This analysis leverages the fact that Twitter uses auto-incrementing ID numbers for both users and tweets. We identified the range of IDs that were consumed by the system in any given month and the percentage of them actually tied to real Twitter accounts. (“Dead” IDs are likely canceled accounts, SPAM accounts, test accounts, etc.) In combination, these numbers give us a reliable approximation of how many new users joined Twitter each month:

[Chart: New Twitter users by month]

This shows us the exponential growth experienced by Twitter in 2009. In Q3, this plateaus at a rate of about 8 million new users per month. A chart of total cumulative users is below:

[Chart: Cumulative Twitter users]

Hockey, anyone? As of September 1st, the actual number of live Twitter accounts was just above 50 million.
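
To show how the monthly figures come together, here is a small sketch of the calculation described above: the size of each month’s ID range multiplied by the share of those IDs tied to real accounts, then summed into a running total. The ID ranges and live fractions below are invented for illustration and are far smaller than Twitter’s real numbers.

```python
from itertools import accumulate


def new_users_by_month(id_ranges, live_fractions):
    """Estimated new users per month.

    id_ranges holds (first_id, last_id) pairs consumed in each month, and
    live_fractions holds the sampled share of those IDs tied to real accounts.
    """
    return [
        round((last_id - first_id + 1) * fraction)
        for (first_id, last_id), fraction in zip(id_ranges, live_fractions)
    ]


# Made-up ID ranges and live fractions for three consecutive months.
id_ranges = [(1, 1000), (1001, 3000), (3001, 7000)]
live_fractions = [0.9, 0.8, 0.75]

monthly = new_users_by_month(id_ranges, live_fractions)
cumulative = list(accumulate(monthly))
print(monthly)
print(cumulative)
```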

Average Number of Followers

According to the data, the average Twitter user has 42 followers. It’s interesting to see the distribution of users by the number of people following them:

[Chart: Distribution of users by number of followers]

As you can see, the vast majority of users have ten or fewer followers, and over 20% have no followers at all! As we know, most users have been on the system for less than a year and, as shown in the chart below, the number of followers is proportional to the user’s time since joining:

[Chart: Average followers by time since joining]

Number of Tweets

It’s also interesting to look at the number of status updates, or “tweets” made by the average user. Obviously, the number of tweets from any given user grows over time (per the trend shown in the chart below):

[Chart: Updates by join date]

When we look at the distribution of tweets by user, we see a very surprising trend: over 75% of all Twitter users have tweeted fewer than ten times.

[Chart: Distribution of users by number of tweets]

“Protected” (Private) Twitter Profiles

Before moving on to analyses at the tweet level, it’s important to note that some of the users we identified have “protected” their tweets, meaning we were able to see how many followers they had and how many times they had tweeted, but were unable to download specific tweets (and, more importantly, tweet times).

The chart below shows how many users in our data set are “protected” by the month they joined. The overall number sits around 10% (and dropping):

ProtectedAccounts

Also interesting is how “protected” Twitter users differ from public users. As shown in the charts below, protected users tend to tweet far more often, but have far fewer followers:

AvgUpdates-protected

AvgFollowers-protected

Power Users

Another limitation of the API is that it can only return the 3,200 most recent tweets for any given user. This is not a big deal for most users, but some users have passed that mark. Our sample data set showed that fewer than 0.02% of Twitter users have sent more than 3,200 tweets. These users will have incomplete data sets in our study, but the population is so small that they should not have any meaningful impact on our conclusions.

Tweets by Source

It’s interesting to see how different tweeting methods have risen up over time. Below I show the most popular methods and what percent of Twitter traffic came through them each month since 2007:

TweetsbySource

The web clearly dominates this list. Let’s exclude it to get a closer look at which other sources are driving tweets:

tweetsbysourcenoweb

Twitterrific has clearly seen better days, and text messages (txt) have been declining as a channel as well. Meanwhile, TweetDeck appears to be aggressively gobbling up market share.

Time Between Tweets

Since we know the timestamp of every tweet in our sample data set, we can study the time between tweets and the recency of tweets from the userbase.

Remarkably, the average time between any two consecutive tweets from the same user is exactly 24 hours.
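For readers who want to reproduce this kind of figure, the sketch below shows one way to compute it in Python. It assumes the gaps are pooled across all users before averaging, which is one reasonable reading of the statistic but not necessarily how it was calculated here. The function names and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def gaps_between_tweets(timestamps):
    """Return the time gaps between consecutive tweets of one user."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def average_gap(tweets_by_user):
    """Average gap between consecutive tweets, pooled over all users.

    Users with fewer than two tweets contribute no gaps.
    Returns None if no user has two or more tweets.
    """
    all_gaps = []
    for timestamps in tweets_by_user.values():
        all_gaps.extend(gaps_between_tweets(timestamps))
    if not all_gaps:
        return None
    return sum(all_gaps, timedelta()) / len(all_gaps)


# Hypothetical sample: gaps of 12h and 36h for user "a", 24h for user "b".
sample = {
    "a": [datetime(2009, 1, 1, 0), datetime(2009, 1, 1, 12), datetime(2009, 1, 3, 0)],
    "b": [datetime(2009, 1, 5, 0), datetime(2009, 1, 6, 0)],
    "c": [datetime(2009, 1, 7, 0)],  # a single tweet, so no gap
}
print(average_gap(sample))
```

Averaging per user first and then across users would weight light tweeters more heavily and could give a different number.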

The chart below shows the average amount of time between tweets for a user’s first ten tweets (when applicable). The x-axis contains the time of the tweet in question, and the value is the average amount of time since the previous tweet.

TimeSincePreviousTweet

Surprisingly, the time between tweets actually drops as users do more tweeting. However, this could be biased by the fact that most users have tweeted fewer than ten times. To clear things up, let’s look at the average time between tweets based on how many times the user has tweeted:

TBTUsage

Indeed, as you might expect, users who send more tweets also tweet more frequently, and the dropoff is quite significant.

Probability of Incremental Tweets

Since there is such a huge dropoff in tweeting activity up until the ten-tweet mark, we thought it might be interesting to look at the “probability of an incremental tweet” based on how many tweets a given user has completed. This can be calculated with just a few clicks in RJMetrics:

ProbInc

As you might expect, with every tweet a user performs, their chance of tweeting again goes up.
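Outside of RJMetrics, the same idea can be sketched in a few lines of Python. The sketch treats the “probability of an incremental tweet” at n as the share of users with at least n tweets who went on to send at least n + 1. That definition, the function name, and the sample counts are our assumptions for illustration only.

```python
def incremental_tweet_probability(tweet_counts, max_n=10):
    """Estimate P(user sends tweet n+1 | user has sent n tweets) for n = 1..max_n.

    tweet_counts: total number of tweets for each user in the sample.
    Values of n that no user has reached are left out of the result.
    """
    probabilities = {}
    for n in range(1, max_n + 1):
        reached_n = sum(1 for count in tweet_counts if count >= n)
        reached_next = sum(1 for count in tweet_counts if count >= n + 1)
        if reached_n:
            probabilities[n] = reached_next / reached_n
    return probabilities


# Hypothetical sample of ten users and their lifetime tweet counts.
counts = [1, 1, 1, 1, 2, 2, 3, 5, 8, 20]
for n, p in incremental_tweet_probability(counts, max_n=3).items():
    print(f"after tweet {n}: {p:.0%} chance of another")
```

One caveat of this simple version is that users who only recently sent their nth tweet are counted as if they had stopped, which understates the true probability somewhat.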

Active Tweeters

We know that Twitter has 50 million registered users, but we also know that the vast majority of them have tweeted fewer than ten times. Let’s investigate just how many of these registered users are actually actively tweeting.

Using our tweet data, we can identify what percent of the user base sent out at least one tweet in any given month. This “unique tweeters” statistic is charted below (to get a fair statistic we excluded protected accounts from our denominator):

PercentTweeting

The number seems to hover in the 25% range. In other words, only about 1 in 4 registered users is actually tweeting in any given month. (Although it’s worth noting that some users may only be using Twitter to read others’ tweets, meaning they are not full-fledged “zombie” accounts.)

Notice the bump in early 2009, right around the time when new user growth began to accelerate aggressively. This suggests the obvious: on average, a newer user is more likely to tweet than an older user. When new user growth exploded in early 2009, the concentration of new users became denser, driving this average up. To illustrate this (and get a better look at how users behave over their lifetime), we turn to cohort analysis.
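The “unique tweeters” percentage described above can be sketched in Python as follows. The sketch assumes the denominator for each month is the number of public (non-protected) accounts registered by that month. The data layout, with months stored as "YYYY-MM" strings, and the function name are hypothetical.

```python
from collections import defaultdict


def percent_tweeting_by_month(users, tweets):
    """Percent of public registered users who tweeted at least once per month.

    users:  dict of user_id -> {"joined": "YYYY-MM", "protected": bool}
    tweets: iterable of (user_id, "YYYY-MM") pairs, one per tweet
    """
    # Protected accounts are excluded from both numerator and denominator.
    public = {uid: info["joined"] for uid, info in users.items()
              if not info["protected"]}

    active = defaultdict(set)  # month -> set of users who tweeted that month
    for uid, month in tweets:
        if uid in public:
            active[month].add(uid)

    result = {}
    for month, tweeters in sorted(active.items()):
        # "YYYY-MM" strings compare correctly in lexicographic order.
        registered = sum(1 for joined in public.values() if joined <= month)
        if registered:
            result[month] = 100.0 * len(tweeters) / registered
    return result


# Hypothetical sample data.
users = {
    "a": {"joined": "2009-01", "protected": False},
    "b": {"joined": "2009-01", "protected": False},
    "c": {"joined": "2009-01", "protected": False},
    "d": {"joined": "2009-02", "protected": False},
    "e": {"joined": "2009-01", "protected": True},
}
tweets = [("a", "2009-01"), ("a", "2009-01"), ("e", "2009-01"),
          ("a", "2009-02"), ("d", "2009-02")]
print(percent_tweeting_by_month(users, tweets))
```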

Cohort Analysis

A cohort analysis is a great way to look at user behavior and loyalty over time. Each line in the chart below represents a different “cohort” of Twitter users based on the month they joined (we chose 7 cohorts from different time periods to avoid clutter). In the chart below, we monitor what percent of the users in each cohort come back to tweet again in each month after having tweeted in the first month. Obviously, month 1 is 100% by definition:

MonthlyCohort

This is quite a telling chart:

  • There is an expected usage dropoff in month 2, but after that point usage holds predictably steady. This is great news for anyone trying to forecast user activity early on in a new user’s lifetime.
  • The newer cohorts, despite being significantly larger in size, actually consist of more loyal users. The two highest lines are also the two most recent, meaning that users who joined in 2009 are actually more likely to keep tweeting after their first month than those who joined in the same month in 2008.
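For readers who want to build a similar retention table themselves, here is a minimal Python sketch of the monthly cohort calculation. It follows the definition above: the base of each cohort is the set of users who tweeted in their joining month, so month 1 is always 100%. The data layout and function names are hypothetical.

```python
from collections import defaultdict


def month_index(year_month):
    """Convert "YYYY-MM" into a running month number for easy subtraction."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month)


def cohort_retention(join_month, tweets):
    """Percent of each cohort's month-1 tweeters who tweet in later months.

    join_month: dict of user_id -> "YYYY-MM" (the cohort the user belongs to)
    tweets:     iterable of (user_id, "YYYY-MM") pairs, one per tweet
    Returns {cohort: {period: percent}}, where period 1 is the joining month.
    """
    active = defaultdict(set)  # (cohort, period) -> users who tweeted
    for uid, month in tweets:
        cohort = join_month[uid]
        period = month_index(month) - month_index(cohort) + 1
        if period >= 1:
            active[(cohort, period)].add(uid)

    retention = {}
    for (cohort, period), tweeters in sorted(active.items()):
        base = active.get((cohort, 1))
        if base:
            share = 100.0 * len(tweeters & base) / len(base)
            retention.setdefault(cohort, {})[period] = share
    return retention


# Hypothetical cohort of four users who all joined in January 2009.
join_month = {"a": "2009-01", "b": "2009-01", "c": "2009-01", "d": "2009-01"}
tweets = [("a", "2009-01"), ("b", "2009-01"), ("c", "2009-01"), ("d", "2009-01"),
          ("a", "2009-02"), ("b", "2009-02"), ("a", "2009-03")]
print(cohort_retention(join_month, tweets))
```

Plotting one line per cohort with the period on the x-axis gives a chart of the same shape as the one discussed above.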

Since the dropoff in Month 2 is quite pronounced, let’s zoom in and look at weekly cohorts to see if we can see how usage drops off at the weekly level:

WeeklyCohort

We see a similar pattern here, although more recent cohorts don’t stand out as much as in the monthly analysis. Again, however, the dropoff in the second period doesn’t seem to further decline as time goes on. This means that by the second week of a cohort’s lifetime, Twitter can reliably predict its users’ future behavior as a group.

Another cohort analysis that might be interesting is to look at how many tweets a cohort makes each month after joining. This metric will incorporate both the dropoff in usage from the users who churn in the first month and the uptick in activity from users who stay on the platform:

TweetCohorts

Wow! This is a remarkable image. Despite the massive dropoff in users after the first month, the tweeting activity from the users who are left is so voluminous that each cohort’s “tweets per month” averages over 100% of its first-month volume (and, as before, the more recent cohorts are the more loyal)!

In other words, the users who stick around actually tweet so frequently (and at such a rapid pace compared to their first month) that they more than make up for the lost activity of those who churned after the first month. This is a very powerful and unexpected statistic.
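A small Python sketch makes the arithmetic behind this concrete. It expresses each cohort’s monthly tweet count as a percentage of that cohort’s first-month tweet count, which is our reading of the metric. The data layout, function names, and sample numbers are hypothetical.

```python
from collections import Counter


def month_index(year_month):
    """Convert "YYYY-MM" into a running month number for easy subtraction."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month)


def relative_tweet_volume(join_month, tweets):
    """Tweets per month for each cohort, as a percent of month-1 volume.

    join_month: dict of user_id -> "YYYY-MM" (the cohort the user belongs to)
    tweets:     iterable of (user_id, "YYYY-MM") pairs, one per tweet
    Returns {cohort: {period: percent}}, where period 1 is the joining month.
    """
    counts = Counter()  # (cohort, period) -> number of tweets
    for uid, month in tweets:
        cohort = join_month[uid]
        period = month_index(month) - month_index(cohort) + 1
        if period >= 1:
            counts[(cohort, period)] += 1

    volume = {}
    for (cohort, period), n in sorted(counts.items()):
        base = counts.get((cohort, 1), 0)
        if base:
            volume.setdefault(cohort, {})[period] = 100.0 * n / base
    return volume


# Hypothetical cohort: four users tweet once each in month 1, then three
# churn and the one who stays sends six tweets in month 2.
join_month = {"a": "2009-01", "b": "2009-01", "c": "2009-01", "d": "2009-01"}
tweets = [(uid, "2009-01") for uid in "abcd"] + [("a", "2009-02")] * 6
print(relative_tweet_volume(join_month, tweets))
```

In this toy example 75% of users churn, yet month-2 volume is 150% of month 1, which is the same effect described above on a much smaller scale.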

Conclusion

Everyone has their own feelings about Twitter’s reported $1 billion valuation. I hope this article gave you a taste of what its new investors likely considered before coming up with that number.
