NewsCred is a global company. Its team is global, made up of 200 people in 7 global offices. Its client base is global, serving customers across 70 countries, and boasting a lineup that includes Pepsi, Visa, Dell, and HP. And its platform is designed to serve the needs of a global content marketing team.
Like many startups, NewsCred takes pride in having a lean engineering team. All of their engineers in both NYC and Dhaka, Bangladesh are 100% dedicated to building the product. That’s why, when it became apparent last year that NewsCred needed data on marketing and product performance, it was never an option for the tech team to take on a data engineering project.
At the time, Product Manager Tom Lowe had never built an analytics stack, but he recognized that the number one priority was self-service – every team at the company needed to have access to data without any barriers. Tom set out to build an internal analytics program that would deliver these insights without substantial engineering time, and without major distraction from his core product job. Here’s what he learned.
Three approaches to building an analytics stack
1. Out-of-the-box business intelligence
The NewsCred team was already set up with a point-and-click business intelligence tool. The tool was simple to set up, and promised to aggregate and visualize all of the company’s data, but as the questions became more complex, Tom ran into limitations. “We needed to monitor median activity metrics across clients and correlate product activity with commercial information like the industry of the client, how much they’re paying us, and what products they’ve purchased.”
The questions were just too difficult for an out-of-the-box solution, requiring Tom to go outside the tool and spend hours downloading CSV files, managing Google spreadsheets, and running far too many vlookups. An out-of-the-box business intelligence tool just wasn’t flexible enough to do the type of analysis NewsCred needed.
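To make the shape of that question concrete, here is a minimal Python sketch of the analysis Tom describes: computing median activity per client and grouping it by a commercial attribute. The account IDs, activity numbers, and field names (`industry`, `plan`) are invented for illustration and are not NewsCred's data or schema.

```python
from collections import defaultdict
from statistics import median

# Hypothetical weekly activity counts per client (e.g. from a product export).
weekly_publishes = {
    "acct_1": [4, 6, 5],
    "acct_2": [1, 0, 2],
    "acct_3": [9, 7, 8],
    "acct_4": [3, 3, 4],
}

# Hypothetical commercial attributes per client (e.g. from a CRM export).
accounts = {
    "acct_1": {"industry": "Finance", "plan": "enterprise"},
    "acct_2": {"industry": "Retail", "plan": "starter"},
    "acct_3": {"industry": "Finance", "plan": "enterprise"},
    "acct_4": {"industry": "Retail", "plan": "enterprise"},
}


def median_activity_by(attribute):
    """Median of each client's median weekly activity, grouped by a CRM attribute."""
    groups = defaultdict(list)
    for account_id, counts in weekly_publishes.items():
        info = accounts.get(account_id)
        if info is None:  # activity with no matching CRM record is skipped
            continue
        groups[info[attribute]].append(median(counts))
    return {key: median(values) for key, values in groups.items()}


print(median_activity_by("industry"))
print(median_activity_by("plan"))
```

This takes a dozen lines once both datasets sit side by side, which is exactly what a point-and-click tool with siloed sources could not offer.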
2. Build a data warehouse with bespoke scripts
At this point, NewsCred’s CTO said it was time to stop relying on other people to store their data—it was time to take ownership of their data by building a data warehouse. Having data centralized in a data warehouse was the only way to get the control they needed. After considering their options, they landed on Amazon Redshift.
Redshift is perfect for NewsCred’s use case; it’s secure, stable, and optimized for analyzing massive amounts of data. “As soon as we had our data in Redshift, we were able to start writing the SQL for complex queries that we couldn’t answer with a basic business intelligence platform,” Tom said. “We were exploring product questions like how long does it take for a client to go from first login to hitting publish? Which areas of the product is a client most invested in? Then we were joining these user activity metrics to Salesforce data to analyze behavior by account.”
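As an illustration of the kind of query Tom mentions, the sketch below measures days from first login to first publish per client and joins the result to account data. It uses Python's built-in `sqlite3` as a stand-in for Redshift so it can run anywhere. The table names (`events`, `sf_accounts`), columns, and rows are assumptions made for the example, and on Redshift the date arithmetic would likely be written with `DATEDIFF` rather than SQLite's `julianday`.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (account_id TEXT, event TEXT, occurred_at TEXT);
CREATE TABLE sf_accounts (account_id TEXT, industry TEXT);
INSERT INTO events VALUES
  ('a1', 'login',   '2015-01-01'),
  ('a1', 'publish', '2015-01-04'),
  ('a1', 'publish', '2015-01-09'),
  ('a2', 'login',   '2015-02-01'),
  ('a2', 'publish', '2015-02-11'),
  ('a3', 'login',   '2015-03-01');
INSERT INTO sf_accounts VALUES
  ('a1', 'Finance'), ('a2', 'Retail'), ('a3', 'Retail');
""")

QUERY = """
WITH firsts AS (
  -- Earliest login and earliest publish per account.
  SELECT account_id,
         MIN(CASE WHEN event = 'login'   THEN occurred_at END) AS first_login,
         MIN(CASE WHEN event = 'publish' THEN occurred_at END) AS first_publish
  FROM events
  GROUP BY account_id
)
SELECT s.industry,
       f.account_id,
       julianday(f.first_publish) - julianday(f.first_login) AS days_to_publish
FROM firsts f
JOIN sf_accounts s ON s.account_id = f.account_id
WHERE f.first_publish IS NOT NULL   -- accounts that never published are excluded
ORDER BY f.account_id
"""

rows = conn.execute(QUERY).fetchall()
for industry, account_id, days in rows:
    print(industry, account_id, days)
```

The join to `sf_accounts` is the step that turns a raw activity metric into behavior by account.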
Of course, tapping into the analytical powers of Redshift requires actually getting your data into Redshift. Tom solved this problem by writing numerous scripts to connect the data sources to the data warehouse. It didn’t take long for the cracks to start appearing. “We had automated so much, but I was still having issues with my scripts. They were taking too long, and would crash. We didn’t have the data engineering resources to build set-and-forget solutions.” On top of that, Tom was creating a bottleneck. Every time a script broke, someone was knocking on his door to fix it.
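The article doesn't show Tom's scripts, so the following is only a hypothetical Python sketch of what a "set-and-forget" loader has to handle beyond the happy path: remembering where the last run stopped and retrying transient failures. The function `load_incrementally`, its callbacks, and the `updated_at` field are all invented for illustration.

```python
import time


def load_incrementally(fetch_since, write_rows, bookmark,
                       max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Copy rows newer than `bookmark` and return the new bookmark.

    fetch_since(bookmark) -> list of dicts with an 'updated_at' key (hypothetical source API)
    write_rows(rows)      -> writes rows to the warehouse (hypothetical sink)
    """
    for attempt in range(1, max_attempts + 1):
        try:
            rows = fetch_since(bookmark)
            if rows:
                write_rows(rows)
                # Advance the bookmark only after a successful write.
                bookmark = max(row["updated_at"] for row in rows)
            return bookmark
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the failure
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Even this small sketch leaves out schema changes, API rate limits, and duplicate handling, which hints at why a lean team would struggle to maintain such scripts for every source.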
3. Build a data warehouse with RJMetrics Pipeline
Tom’s third (and final) approach to solving this problem was to own data management, but to replace his custom data integration scripts with RJMetrics Pipeline. Pipeline connected to NewsCred’s existing data sources and began streaming that data to Redshift, essentially replacing Tom’s self-described “sketchy code.”
“I set RJMetrics Pipeline up really quickly,” Tom said. “Our devops team whitelisted an IP then I set up my credentials to Salesforce and Zendesk. Within an hour we were streaming all of our data into Amazon Redshift.”
The analytics layer
To visualize the data in Amazon Redshift, NewsCred uses Periscope. “Periscope is great because you write a SQL framework and then you’re able to use an interface on top of that.” This functionality makes it easy for an analyst like Tom to do the heavy lifting, but then hand it over to a business user to do additional filtering or data aggregation.
With this in place, Tom was finally able to go back to his real job – building the NewsCred product. He lists three core benefits of this analytics stack:
- Automated reporting: Tom sends out a weekly all-company email that tracks high-level KPIs related to product usage, which he says gives his team the ability to see whether their customers are behaving as they expect.
- Data independence: Redshift plays beautifully with SQL, a language that can be picked up by anyone with general analytics skills. This provides Tom’s team with a high degree of what Tom calls data independence. He spends some time onboarding people to his process, but he’s not a data gatekeeper. “Anybody can get the information that they need without going through anybody else. As soon as you get between the person who needs the information and the information, then you’ve got inefficiency.”
- No strain on developer resources: Best of all, this requires zero dev time. Any analyst can start building reports and visualizations on top of the data without waiting for an engineer. “Since we got all of our data into Redshift, it’s been gravy.”
Tom set out with a simple goal: get everybody in the organization the information they need, without needing to invest time that could be better spent working directly on the product. His experience is a roadmap for anyone looking to solve a similar problem. “If I would have known how easy it was going to be to get RJMetrics Pipeline pushing my data to Redshift, I wouldn’t have spent all that time writing scripts. I would have set this up earlier.”