Introduction and the state of analytics

With the explosion of new software and other data-collecting applications, companies receive and store more data than ever before, and as you’d expect, the amount of data collected globally increases exponentially each year. In corporate statements and media outlets, executives consistently emphasize the importance of enabling a data-driven workforce.

You’d assume that, with all this data and attention, it’d be easy for a company to understand its business and customer behavior at a meaningful level. However, that’s rarely the case. In fact, many modern companies have an extremely difficult time creating and sharing critical metrics. Multi-touch marketing attribution, 360-degree view of the customer, cohorting, predictive pipeline forecasting, ROI on ad spend, weighted customer health scores: these represent just a sampling of essential analytics that many companies struggle to properly calculate.

Typically, companies resort to using the native reporting capabilities of each of their separate applications, or just dump everything into local, one-off spreadsheets. In these cases, employees get sales data from a CRM, marketing data from an automation tool, payment transaction data from a finance application, and so forth. This means employees have to jump between multiple browser tabs and desktop applications to get the data they need, then manually download and reconcile the data, and finally arrive at a result. The operational resource requirements, not to mention the headache of constant discrepancies between data reports across teams, leave many employees exhausted and unable to devote time to meaningful analysis.

On top of that, employees can’t link information about customers from these disparate applications in a consistent and accurate fashion, which confines them to standard, inflexible metrics that show only high-level performance and rarely deliver meaningful insights. Employees can’t create custom metrics to help other employees make educated decisions based on customer or product data.

Challenges of creating and utilizing analysis

Waiting too long for access to data, dealing with rigid, predefined analysis, being unable to get everyone the data they need: do these constraints sound familiar? They're the same issues companies face when working with legacy data architectures, which force the creation of disparate data silos to serve different business users. Many of the same issues crop up with business intelligence vendors: they take ownership of your data, deliver predetermined metrics, and, consequently, can't answer the open-ended questions that matter most.

If these are known issues, why aren’t people solving them? Why aren’t companies unifying their data sources and using analytics tools to give all their employees everything they need? Well, there’s good reason for this. Most existing approaches to proper data management present hurdles to instituting the three fundamental pillars of utilizing data to drive better business:

  1. Unify separate data sources
  2. Allow for customization of metrics to provide actionable insights
  3. Enable users to access data and analytics quickly and flexibly

In order to effectively utilize data and analytics, each employee should have the capacity to access actionable data quickly. Only then will data result in competitive advantages for a company.

Advances in ETL and analytics technology

How are modern companies confronting these issues? Recently, several solutions have emerged. These modern ETL and analytics tools resolve the issues presented by today’s multi-application data environment, while creating a data architecture that enables a data-driven business.

Modern-day ETL providers and effective solutions

ETL tools, such as RJMetrics Pipeline, make it easy for companies to unify their data by extracting it from disparate sources, loading it into a central data warehouse, and maintaining that flow of data.

These tools export the data from each application in a standardized, well-defined schema. This means companies don’t need to spend limited engineering or data science resources writing complicated or inconsistent API calls and transformation scripts aimed at simply getting the data into one place, in a structure that’s optimized for analytics. Data integration tools handle everything.
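To make that concrete, here is a hedged sketch of what working with centralized data can look like once each application's tables sit side by side in one warehouse. The schema and column names below are illustrative assumptions, not the actual output of any particular pipeline.

```sql
-- Illustrative only: schema and column names are assumptions, not a vendor's actual output.
-- Once loaded, data from separate applications is just a set of tables in one warehouse,
-- so cross-application questions become ordinary joins.
SELECT
  accounts.account_name,
  SUM(subscriptions.monthly_amount) AS mrr
FROM crm.accounts AS accounts
JOIN billing.subscriptions AS subscriptions
  ON subscriptions.account_id = accounts.account_id
WHERE subscriptions.status = 'active'
GROUP BY accounts.account_name;
```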

While some companies could certainly centralize data using internal resources, sometimes that just doesn't make business sense. Once the initial centralization effort is complete, your data or engineering team still needs to build the appropriate transformation logic and continuously maintain the data flows and a well-structured schema, which is usually a larger resource commitment than typically envisioned at the outset of a project.

The team's time might be better spent analyzing data and refining your product rather than building analytics infrastructure. Weigh the cost of the team's time against the cost of a relatively inexpensive data integration tool built by an experienced data and engineering team that understands the difficulty of data consolidation. Then recognize that your competitors are increasingly taking advantage of these tools.

Modern analytics technology

Data unification is step one on the path to effective analytics. Step two is creating meaningful, custom metrics. And step three is empowering analysts and business users.

Data transformation and custom metrics
Once you’ve loaded your data into a warehouse, you’ll need to apply business logic on top of it. Log event streams don’t easily translate into business insights; use SQL to create views of this data with additional meaning. Translate status codes into meaningful statuses, exclude test records, create calculated fields with accepted definitions. At Looker, we call this feature of our product the modeling layer.
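As a minimal sketch of that kind of business logic, assuming a hypothetical raw orders table with numeric status codes and a test-account flag, such a SQL view might look like the following. All table, column, and code names are illustrative.

```sql
-- Hypothetical raw.orders table; names and status codes are assumptions for illustration.
CREATE VIEW analytics.orders AS
SELECT
  order_id,
  customer_id,
  created_at,
  -- translate raw status codes into meaningful statuses
  CASE status_code
    WHEN 1 THEN 'pending'
    WHEN 2 THEN 'shipped'
    WHEN 3 THEN 'refunded'
    ELSE 'unknown'
  END AS order_status,
  -- calculated field with an accepted definition of net revenue
  gross_amount - discount_amount - refund_amount AS net_revenue
FROM raw.orders
-- exclude internal test records
WHERE is_test_account = FALSE;
```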

Applying this type of business logic after data has been loaded into the warehouse is a very “modern” approach, often referred to as ELT: Extract, Load, Transform. ELT has achieved popularity in the last few years due to the release of performant and affordable analytic data warehouses like Amazon Redshift. ELT has a significant advantage over the traditional ETL approach because it’s far more iterative and flexible. As a result, analysts can now move much faster, and tools like Looker are paving the way to help them do exactly this.
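In practice, ELT means the raw application data sits in the warehouse untouched and each transformation is just another SQL statement run against it, so reworking a definition is a matter of re-running a query rather than re-engineering a pipeline. A rough sketch, using a hypothetical raw payments table:

```sql
-- ELT sketch: raw.payments is assumed to be loaded as-is by the pipeline.
-- The transformation runs inside the warehouse and can be rebuilt at any time.
CREATE TABLE analytics.daily_revenue AS
SELECT
  DATE_TRUNC('day', paid_at) AS day,
  SUM(amount)                AS revenue
FROM raw.payments
WHERE status = 'succeeded'
GROUP BY 1;
```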

Tools for analysts
Once all your data is centralized and your metrics are set up, you'll need to give every analyst who needs them ongoing access to those metrics. Luckily, modern tools translate point-and-click commands from a graphical interface into optimized SQL queries that execute directly against your centralized warehouse and its unified data sources.
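For instance, a point-and-click request for "net revenue by month over the last year" might be compiled into a query along these lines. This is a hypothetical illustration of the pattern, not the exact SQL any particular tool emits.

```sql
-- Hypothetical SQL a BI tool might generate from a GUI selection of a month
-- dimension and a net-revenue measure (names reuse the illustrative view above).
-- DATEADD is Redshift-style date arithmetic.
SELECT
  DATE_TRUNC('month', orders.created_at) AS order_month,
  SUM(orders.net_revenue)                AS total_net_revenue
FROM analytics.orders AS orders
WHERE orders.created_at >= DATEADD(month, -12, CURRENT_DATE)
GROUP BY 1
ORDER BY 1;
```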

Reports for business users
You'll also want to give business users access to these reports, along with the ability to customize the core analysis. For example, because Looker is connected directly to your data warehouse, it lets users drill into any data point or visualization, down to an individual event or transaction. This turns business users into “mini-analysts”, able to self-serve by standing on the shoulders of their analyst teams.
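Under the hood, a drill-down like this typically amounts to removing the aggregation and applying the clicked value as a filter. A hedged sketch against the illustrative view from earlier:

```sql
-- Drill-down sketch: the clicked point (one month's revenue) becomes a row-level
-- filter, returning the individual transactions behind the aggregate.
SELECT *
FROM analytics.orders
WHERE DATE_TRUNC('month', created_at) = DATE '2016-03-01'
ORDER BY created_at;
```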

Building your modern analytics stack

Integrating data and providing company-wide analytics is a rising priority among businesses of every size, giving engineers the opportunity to write about the details of their data stack. And it’s easier than ever for anyone to enjoy the benefits of modern analytics technology.

With RJMetrics Pipeline and Looker Blocks you can go from siloed reporting in separate applications to a unified data model, complete with both standardized and customized metrics. What once took months (if not years) to build is now something you can have up and running in a handful of days.
