A few weeks ago, the first-ever RJMetrics hackathon took place at our Philadelphia headquarters. I decided to throw my hat into the ring with a project I’d been thinking about for a while: a MySQL to MongoDB query translator.

This was a unique challenge because MongoDB and MySQL are very different technologies that store data in very different ways. To some, translating between them might seem like a non-sequitur. However, I knew there was a use case because of my personal experience learning MongoDB. I would often think about queries in terms of SQL syntax, and a translator like this would have greatly softened the learning curve.

The final product is available at our Query Mongo site, and I encourage you to give it a try. It’s not perfect, but we hope it will be a helpful learning tool for the many people who have SQL experience and are getting started with MongoDB.

In this blog post, I’ll provide some insights into how this tool works.

My Starting Point

Going into the hackathon, there were a few pre-built pieces of our source code that would prove critical to this project. (RJMetrics helps online businesses make smarter decisions using their data, so we’ve built up quite a few libraries related to query management.)

The first piece is our “query object” system, which allows us to represent relational database queries as objects that can be interpreted by our code. These objects allow us to represent most SQL-style queries independent of the platform on which they will ultimately be run. In other words, the same exact query object could be built by our code, and then used to extract data from MySQL, PostgreSQL or Microsoft SQL Server.

Obviously, these query objects require a “query renderer” to translate them into the SQL language of our choice. Our rendering system is pretty robust for the SQL platforms I mentioned, but was extremely weak for MongoDB (only the simplest SELECT queries would work).

The Missing Pieces

First and foremost, I knew that one of the keys to actually finishing this project was to limit its scope to the most common use cases. I decided that my translator would only support SELECT queries, would not have support for JOINs across multiple tables and would have limited support for complex WHERE clauses including embedded parentheses. These limitations may change over time but, in the scope of the hackathon, they were a necessity.

With a focused scope, I was ready to start coding. First, I significantly improved the query renderer’s support of MongoDB queries to include support for query properties such as:

aggregate functions like COUNT(), SUM(), AVG(), MIN(), and MAX()
WHERE clauses, including IN statements and the OR operator
GROUP BY clauses
ORDER BY clauses
LIMIT clauses

Next, I needed a system for taking a MySQL query string and translating it into one of our query objects. Before the hackathon, these query objects could only be built programmatically. Now, we needed to be able to parse human input so that we could build query objects that then rendered into MongoDB query strings.

I found a sweet open-source MySQL parser written in PHP that served as a starting point for tokenizing MySQL strings. Once I had a traversable representation of the query string, I had to interpret each of the query components, translate them into our query object components and build in robust error handling for query properties our objects don’t support.

Suddenly, I had the ability to translate MySQL query strings into query objects, and then render those query objects into MongoDB query commands. All that was left was a nice user interface. Thanks to awesome tools like bootstrap, codemirror and js-beautifier, getting an attractive proof-of-concept together by the end of the hackathon was a cinch.

The Final Product

You can play with the translator by visiting Query Mongo. I did my best to issue helpful error messages when the translator doesn’t support something or your input query is malformed. Comments or suggestions are always welcome, and I’ll do my best to continue improving on the translator over time. Happy querying!

Tags: life at rjmetrics, sql

	RFM Analysis for Online Businesses: Part One Salvatore Calvo\|January 12, 2017
	Understanding your Customer Concentration Jim Roddy\|May 18, 2017
	Understanding your Magento Series: Taking Stock of your Inventory Chris Schmid\|April 20, 2017
	Introducing Magento Business Intelligence Essentials Johanna Richardson\|April 10, 2017
	Quantifying Customer Loyalty Michael Kreshtool\|March 30, 2017
	Optimizing Customer Support Michael Kreshtool\|March 23, 2017

MySQL to MongoDB Query Translator

Robert J. Moore | December 12, 2012

My Starting Point

The Missing Pieces

The Final Product

About The Author

Robert J. Moore

CEO and Co-Founder, RJMetrics

mwasley
2 Posts

Chris Schmid
3 Posts

Tommy Baw
1 Posts

Anita Andrews
12 Posts

Ryan Williams
6 Posts

Ben Garvey
4 Posts

My Starting Point

The Missing Pieces

The Final Product

About The Author

Robert J. Moore

CEO and Co-Founder, RJMetrics

You Might Also Enjoy

RFM Analysis for Online Businesses: Part One

Salvatore Calvo|January 12, 2017

Understanding your Customer Concentration

Jim Roddy|May 18, 2017

Understanding your Magento Series: Taking Stock of your Inventory

Chris Schmid|April 20, 2017

Introducing Magento Business Intelligence Essentials

Johanna Richardson|April 10, 2017

Quantifying Customer Loyalty

Michael Kreshtool|March 30, 2017

Optimizing Customer Support

Michael Kreshtool|March 23, 2017