Derek Watson

On software development

MySQL to Redis in One Step

In moving a relatively large table from MySQL to Redis, you may find that extracting, transforming and loading a row at a time can be excruciatingly slow. Here’s a quick trick you can use that pipes the output of the mysql command directly to redis-cli, bypassing middleware and allowing both data stores to operate at their peak speed.

Using this technique I was able to reduce my load time of ~8 million rows from 90 minutes down to two. Shabby? I think not.

Redis Protocol Output from MySQL

redis-cli has a mass-insert mode, specially designed to execute bulk commands as quickly as possible. Input is expected to be formatted in raw redis protocol, which is simple enough to be generated from MySQL SELECT queries. Here we go!

My stats table:

CREATE TABLE events_all_time (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  action varchar(255) NOT NULL,
  count int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY uniq_action (action)
);

Desired Redis command to be run for each row:

HSET events_all_time [action] [count]

SQL statement to produce Redis-protocol-formatted output:

-- events_to_redis.sql

SELECT CONCAT(
  "*4\r\n",
  '$', LENGTH(redis_cmd), '\r\n',
  redis_cmd, '\r\n',
  '$', LENGTH(redis_key), '\r\n',
  redis_key, '\r\n',
  '$', LENGTH(hkey), '\r\n',
  hkey, '\r\n',
  '$', LENGTH(hval), '\r\n',
  hval, '\r'
)
FROM (
  SELECT
  'HSET' as redis_cmd,
  'events_all_time' AS redis_key,
  action AS hkey,
  count AS hval
  FROM events_all_time
) AS t

Tying it all together:

mysql stats_db --skip-column-names --raw < events_to_redis.sql | redis-cli --pipe

It’s important to use the --raw switch so that MySQL doesn’t rewrite the crucial newline characters, and --skip-column-names keeps the formatted output clear of column headers.

Comments