Although, there are many ways to setup a reporting server depending on the requirements. Sharing one of those.
Setup process has been breakup into few steps and explained below.
1. Storage Engine Selection:
The first and the most important is storage engine selection. Since, the data can be used for billing purpose. Therefore, it should be accurate and loosing data by crash is not acceptable at all. Mostly select, insert and delete (data purging discussed below) is required in this kind of setup. No update is required. Below command should be used for inserting data.
LOAD DATA INFILE ...
Since, this command is 30% faster than normal inserts.
Concurrent operations (select, insert) is required. As, innodb uses MVCC to achieve high concurrency and it’s crash safe. Therefore, I’ve used innodb storage engine and recommend other’s to use for critical activity like billing. Inoodb is a great Storage Engine !!!
2. Table Design:
CREATE TABLE mt ( c1 bigint unsigned not null, c2 varchar(30), c3 datetime, . . c9 date default NULL) engine=innodb partition by range (to_days(c9)) (PARTITION p0 VALUES LESS THAN (to_days('2011-01-01')), PARTITION p1 VALUES LESS THAN (to_days('2011-01-05')), PARTITION p2 VALUES LESS THAN (to_days('2011-01-10')), PARTITION p3 VALUES LESS THAN (to_days('2011-01-15')), PARTITION p2 VALUES LESS THAN (to_days('2011-01-20')), . . PARTITION p25 VALUES LESS THAN (to_days('2011-03-25')), PARTITION p26 VALUES LESS THAN (to_days('2011-03-30')), );
3. Purging Data:
In order to make query operation faster, Old data should be purged on a regular basis. Dropping partiton is the best and fastest way to purge old data. But, make sure that this operation is performed at off peak hours. Otherwise, it can block the whole table. While dropping partition from the top and adding partition from the bottom, we can maintain some specific duration data for eg 3 months. In most cases alter table command rebuild the whole table. But, not in below cases.
ALTER TABLE mt DROP PARTITION ... ALTER TABLE mt ADD PARTITION ...
The above task can be automated (details can be found in information_schema) using procedure and cronjob or mysql event.
4. Replication:
Mysql doesn’t support many to one replication. Therfore, script is required to transfer the data from many servers (services) to one servers (Report Server)
5: Intelligent Application:
As, we know that count is slow in innodb. Therefore, application can be designed in such a way that report for more than a month can be scheduled. The result can be send to client using sms or email.
I’ve build a reporting server with 2TB+ data and still growing. The ratio of data and index is 6:4