MySQL, data partitioning, and me

So I have this database of bytes transferred from many of the devices on our network.

It queries these devices every 5 minutes storing many pieces of data, from the number of packets in and out to the number of bytes (or octets in network speak) in and out on an interface. (it also gathers statistics on errors, Cisco environment items, discards, etc). I am mostly concerned with the bytes transferred, so while the other data is collected – I am not using it at this point.

But what happens when you have 16 months worth of data?

Before you say ‘i dunno’, I’ll tell you: it is slow to query it back out, even with minimal indexes.

I have ~54 GB of data (including indexes), and 250 tables (each device has multiple tables, each table stores one type of data only).

So, how to get some speed back? Read on!

I learned how to use data partitioning in MySQL and it wasn’t as horrible as I thought it would be.

Initially I thought I would need to take things down (losing data in the process), dump the tables (that’s gonna take a while), drop the tables, then after all of that, create them with partitioning, then start the long process of restoring the data (and index creation).

But I learned something, though it wasn’t very clear in the documentation, but you can do this LIVE on an unpartitioned table. I had found the holy grail!

Summary for those that don’t want to read further..I was able to converted the ~54 GB of data (and indexes) into the partitioned tables and it took approximately an hour of time to do so, and without taking down the collection service or causing an outage for the database.

If you had to do it the hard way, the steps would be like such:

  • for each table…
  • mysqldump -ustatsamongus -pREMEMBERME DB TABLENAME > TABLENAME.SQL
  • echo ‘drop table TABLENAME’ | mysql -ustatsamongus -pREMEMBERME DB

and I’d have the first step done but I’d still have a lot more time to invest (and a ton of disk I/O).

Now to create the table…

CREATE TABLE tablename {id int, dtime datetime, animal varcar(40))
PARTITION BY RANGE (TO_DAYS(dtime))
(PARTITION p2009xx VALUES LESS THAN (TO_DAYS('2010-01-01 00:00:00')),
PARTITION p201001 VALUES LESS THAN (TO_DAYS('2010-02-01 00:00:00')),
...
PARTITION p201312 VALUES LESS THAN (TO_DAYS('2014-01-01 00:00:00')),
PARTITION p2014 VALUES LESS THAN MAXVALUE);

Then take the time to do a full restore of the tables (and I won’t bore you with those details).

Instead, something I didn’t expect to find; using ALTER TABLE to partition an unpartitioned table!

I was excited to try this but the documentation was sparse and examples even more difficult to find, but after about 30 minutes of trial and error I got it, and in the end, I (LIVE) changed my monolithic tables into partitioned tables based on year and month out to 2013 (I’ll run out of disk before we hit that, so I’ll be doing some data pruning and such long beforehand and can add more partitions later).

The tricky part was the issue of ‘number of partitions’ which just wasn’t clear anywhere. With the data I have so far, and going to encompass all of 2012 as well, I needed a total of 38 partitions. Once I had that figured out … things started to fall into place and now my data is partitioned.

Here it is and I hope it helps someone else out there who always wanted to this – and was afraid to try (or just didn’t have the time to do it the old fashioned way).

alter table tablename PARTITION BY RANGE (to_days(dtime)) partitions 38
(
partition p2009xx values less than (to_days('2010-01-01 00:00:00')),
partition p201001 values less than (to_days('2010-02-01 00:00:00')),
partition p201002 values less than (to_days('2010-03-01 00:00:00')),
partition p201003 values less than (to_days('2010-04-01 00:00:00')),
partition p201004 values less than (to_days('2010-05-01 00:00:00')),
partition p201005 values less than (to_days('2010-06-01 00:00:00')),
partition p201006 values less than (to_days('2010-07-01 00:00:00')),
partition p201007 values less than (to_days('2010-08-01 00:00:00')),
partition p201008 values less than (to_days('2010-09-01 00:00:00')),
partition p201009 values less than (to_days('2010-10-01 00:00:00')),
partition p201010 values less than (to_days('2010-11-01 00:00:00')),
partition p201011 values less than (to_days('2010-12-01 00:00:00')),
partition p201012 values less than (to_days('2011-01-01 00:00:00')),
partition p201101 values less than (to_days('2011-02-01 00:00:00')),
partition p201102 values less than (to_days('2011-03-01 00:00:00')),
partition p201103 values less than (to_days('2011-04-01 00:00:00')),
partition p201104 values less than (to_days('2011-05-01 00:00:00')),
partition p201105 values less than (to_days('2011-06-01 00:00:00')),
partition p201106 values less than (to_days('2011-07-01 00:00:00')),
partition p201107 values less than (to_days('2011-08-01 00:00:00')),
partition p201108 values less than (to_days('2011-09-01 00:00:00')),
partition p201109 values less than (to_days('2011-10-01 00:00:00')),
partition p201110 values less than (to_days('2011-11-01 00:00:00')),
partition p201111 values less than (to_days('2011-12-01 00:00:00')),
partition p201112 values less than (to_days('2012-01-01 00:00:00')),
partition p201201 values less than (to_days('2012-02-01 00:00:00')),
partition p201202 values less than (to_days('2012-03-01 00:00:00')),
partition p201203 values less than (to_days('2012-04-01 00:00:00')),
partition p201204 values less than (to_days('2012-05-01 00:00:00')),
partition p201205 values less than (to_days('2012-06-01 00:00:00')),
partition p201206 values less than (to_days('2012-07-01 00:00:00')),
partition p201207 values less than (to_days('2012-08-01 00:00:00')),
partition p201208 values less than (to_days('2012-09-01 00:00:00')),
partition p201209 values less than (to_days('2012-10-01 00:00:00')),
partition p201210 values less than (to_days('2012-11-01 00:00:00')),
partition p201211 values less than (to_days('2012-12-01 00:00:00')),
partition p201212 values less than (to_days('2013-01-01 00:00:00')),
PARTITION p2013 VALUES LESS THAN MAXVALUE);

Comments are closed.