  #5  
Old 25th April 2013, 02:44 PM
UselessBettor
Member
 
Join Date: Sep 2011
Posts: 1,474

Do not use Microsoft Access or Excel to store the data. You're crazy if you do, as neither of these is a professional database.

Preferably, if you have multiple machines, set up a distributed database such as Teradata. This is basically a database with its data spread over multiple machines, which speeds up querying. It is a lot more complex but worthwhile in the end.
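
As a rough illustration of the "data over multiple machines" idea, here is a toy sketch in Python. It is not how Teradata works internally; it just assumes three in-memory SQLite databases standing in for machines, a hypothetical form_lines table, and a simple hash on the horse name to decide which "machine" stores each row. A query is sent to every node and the partial answers are added together.

```python
import sqlite3
import zlib

NUM_NODES = 3  # hypothetical cluster size, chosen only for illustration


def make_node():
    # Each "node" is an in-memory SQLite database standing in for a machine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE form_lines (horse TEXT, finish_pos INTEGER)")
    return conn


nodes = [make_node() for _ in range(NUM_NODES)]


def node_for(horse):
    # Stable hash, so the same horse always lands on the same node.
    return nodes[zlib.crc32(horse.encode("utf-8")) % NUM_NODES]


def insert(horse, finish_pos):
    node_for(horse).execute(
        "INSERT INTO form_lines VALUES (?, ?)", (horse, finish_pos)
    )


def count_wins():
    # Scatter: every node counts its own winners. Gather: sum the partial counts.
    return sum(
        n.execute(
            "SELECT COUNT(*) FROM form_lines WHERE finish_pos = 1"
        ).fetchone()[0]
        for n in nodes
    )


# Made-up data: 3000 form lines spread over 100 horses.
for i in range(3000):
    insert(f"horse{i % 100}", (i % 10) + 1)

total_wins = count_wins()
total_rows = sum(
    n.execute("SELECT COUNT(*) FROM form_lines").fetchone()[0] for n in nodes
)
```

In this sketch the nodes are queried one after another. A real distributed database runs the per-node work in parallel on separate machines, which is where the speed-up comes from.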

If this is beyond your skill level then I suggest going with Oracle or MySQL. You can get a free copy of both for non-business purposes. At least these will be able to handle your queries much more efficiently than Access and Excel if you set them up properly. Indexes and correctly partitioned disks will be very important.
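
To show why indexes matter, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so it runs anywhere; the same CREATE INDEX idea applies in MySQL or Oracle, though their syntax and query planners differ. The form_lines table, its columns, and the data are all made up for the example.

```python
import sqlite3

# Hypothetical schema: one row per form line. Column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE form_lines ("
    " id INTEGER PRIMARY KEY,"
    " horse TEXT, track TEXT, race_date TEXT, finish_pos INTEGER)"
)

# Made-up data: 10,000 form lines across 500 horses and 20 tracks.
rows = [
    (f"horse{i % 500}", f"track{i % 20}", f"2013-01-{(i % 28) + 1:02d}", (i % 12) + 1)
    for i in range(10000)
]
conn.executemany(
    "INSERT INTO form_lines (horse, track, race_date, finish_pos)"
    " VALUES (?, ?, ?, ?)",
    rows,
)


def plan(sql, params=()):
    # Ask SQLite how it intends to run the query; the 4th column is the description.
    return " ".join(
        row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    )


query = "SELECT race_date, finish_pos FROM form_lines WHERE horse = ?"

plan_before = plan(query, ("horse42",))  # no index yet: whole table is scanned
conn.execute("CREATE INDEX idx_form_lines_horse ON form_lines (horse)")
plan_after = plan(query, ("horse42",))   # planner can now use the index

print(plan_before)
print(plan_after)
```

The first plan should report a scan of the whole table and the second should mention idx_form_lines_horse; the exact wording varies between SQLite versions. On a table with millions of form lines that difference is what separates a quick lookup from a long wait, so index the columns you filter on most often.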

Your problem is going to be what to collect and where to collect it from. Once you get started it can get a bit crazy. My current database holds over 24 million form lines and collects data from a lot of sources (RISA, AAP, Betfair, totes, bookies, and other form sites).

Running queries on this was becoming slow in MySQL (5-10 minutes a query on indexed data and 30+ minutes on obscure data), which is why I set up a cluster of machines and moved my data to a distributed database. It runs a lot better now, although I'm still learning the ins and outs of it.

I wish I had set it up using a distributed database from the beginning, as it was a lot of work to transfer the data across. So it's worth the effort, in my opinion, if you're serious about collecting data.
