Smartgambler
Pro-Punter

Go Back   OZmium Sports Betting and Horse Racing Forums > Public Forums > General Topics
User Name
Password
Register FAQ Search Today's Posts Mark all topics as read

To advertise on these
forums, e-mail us.

Reply
 
Thread Tools Search this Thread Display Modes
  #1  
Old 24th April 2013, 10:12 PM
<AvgPunter <AvgPunter is offline
Member
 
Join Date: Apr 2013
Posts: 2
Default Multinomial Logit Models and MS Excel

Hi All

I have never really been a punter but always enjoyed a night out at the races. What fascinates me greatly is the analysis of a race.

I have read papers published by Bolton, Chapman and Benter. An interesting read if your into that stuff. I hope to implement the model contained within the papers.

During my research I came across the thread Proportional bet sizes which started out as a good discussion and an interesting read. As there were a few members interested in this topic I thought I would share my progress with this project and maybe get some good (friendly) discussion going about the merits/flaws of such an approach, implementation and any updates which build upon the model discussed in those papers.

I have begun the process of data gathering, well not actually getting the data, but writing the code to get the data. I have found that RISA provides excellent data, but not in the most user friendly manner. To date I have not found a free source of reliable sectional data. Though, if any knows of a place feel free to let me know. There is no shortage of subscription versions, but that is cost prohibitive.

I am now able to generate a list of meets for all states, generate the various URLs and get the race day data. i am still formating the race day data in to something usable. Unfortunately RISA provides data in two ways, tables (nice and easy to work with) and text (not easy to work with).

I am considering two issues:
  1. Storage of the data collected
  2. How to format some of the data collected
Storing data is something that Excel really isn't very good at. I developed a sharing trading spreadsheet for the ASX share trading game. Excel seemed to struggle with storing and retrieving data for ~110 shares. This lead me to developing a system which stored share data individual workbooks and called when needed. This worked fine for the ASX game, but I think this type of system will be completely inadequate for this project. My options at this stage are to use an Access or a SQL database, both of which are happy to talk to Excel. One concern I do have with Access is its ability to talk to other programs if Excel struggles and I need to move on.

Another issue for consideration is how to format data such as rail position. I have seen the rail position at +3m, neutral, +6m for first half of track and all sorts of other variations. I will need to compile a list of rail position data and have a good look at how to break it down into something that can be analysed.

I will try to keep this thread updated if there is any interest.

Cheers

Reply With Quote
  #2  
Old 25th April 2013, 08:33 AM
Chrome Prince Chrome Prince is offline
Member
 
Join Date: Jan 1970
Posts: 4,425
Default

That's an interesting, but massive project you're undertaking there.
When you say "shares", do you mean Access accessing 110 workbooks or 110 people sharing the hosted data?
Perhaps you could write a routine that imports the data in the spreadsheets to tables in Access, rather than linking to Excel. Excel is great, but not good at storing significant amounts of data. When I say not good, I mean more prone to corruption or locking up or freezing.
You might want to investigate, Alpha 5 database or Filemaker Pro Advanced or Advanced Server, all have free trials.
One problem you will encounter using sectional times is dirty data supplied by the race clubs. The transponder drops out leaving gaps or no data at all.
Often there are serious errors in the times recorded, I've seen impossible times (fast and slow) recorded. Sometimes the clubs fail to pass on vital information such as barriers relocation making the race distance and times way out.
But you can only work with what's supplied.
Some people go to the extent of timing of video, but this is a huge amount of work and has it's own problems such as camera angles and video compression.

You could of course take this all one step further when you have enough quantitative data. If you look up SPSS by IBM, it can give you some insight into predictive and financial analysis of data which can be used to identify hidden patterns or areas of profit, even predict with some accuracy chance.
Blue Chip financial businesses use similar data models to identify risk, but this can be tailored to identify chance.

Best of luck with the project, be interested to hear of your progress.
__________________
RaceCensus - powerful system testing software.
Now with over 409,000 Metropolitan, Provincial and Country races!
http://www.propun.com.au/horse_raci...ng_systems.html
*RaceCensus now updated to 31/10/2024
Video overview of RaceCensus here:
http://www.youtube.com/watch?v=W821YP_b0Pg
Reply With Quote
  #3  
Old 25th April 2013, 09:52 AM
Puntz Puntz is offline
Member
 
Join Date: Jan 2012
Posts: 292
Default

A welcome thread,
Reply With Quote
  #4  
Old 25th April 2013, 10:02 AM
<AvgPunter <AvgPunter is offline
Member
 
Join Date: Apr 2013
Posts: 2
Default

Quote:
That's an interesting, but massive project you're undertaking there.

Certainly is, but I am in no hurry.

Quote:
When you say "shares", do you mean Access accessing 110 workbooks or 110 people sharing the hosted data?

Share was a poor choice of words, I should have used the word stock. Each stock had it's own workbook, each containing the downloaded data and various calculations. As I wanted to view various charts of a stock the workbook would be opened and the charts produced. This was all done in excel alone. I should have used access for the data storage.

Quote:
You might want to investigate, Alpha 5 database or Filemaker Pro Advanced or Advanced Server, all have free trials.

I am thinking MySQL as the database, but Access may be fine.

Quote:
One problem you will encounter using sectional times is dirty data supplied by the race clubs. The transponder drops out leaving gaps or no data at all.
Often there are serious errors in the times recorded, I've seen impossible times (fast and slow) recorded. Sometimes the clubs fail to pass on vital information such as barriers relocation making the race distance and times way out.

WOW, they don't publicise that on the various wewbsites. I when I was looking for sectional data, I looked at a paid service and I worked out that on average it would cost me $4k per week to obtain the sectional times for all races. To find out that the info contained within the reports could be flawed, WOW!

Quote:
You could of course take this all one step further when you have enough quantitative data. If you look up SPSS by IBM, it can give you some insight into predictive and financial analysis of data which can be used to identify hidden patterns or areas of profit, even predict with some accuracy chance.
Blue Chip financial businesses use similar data models to identify risk, but this can be tailored to identify chance.

This sort of system is well beyond the scope of what I am trying to achieve and the budget in which I am trying to achieve it.

Quote:
Best of luck with the project, be interested to hear of your progress.

Thank you, I will let you know how I am progressing

Cheers

Reply With Quote
  #5  
Old 25th April 2013, 01:44 PM
UselessBettor UselessBettor is offline
Member
 
Join Date: Sep 2011
Posts: 1,534
Default

Do not use microsoft access or excel to store the data. Your crazy if you do as both of these are not professional databases.

Preferably if you have multiple machines then setup a distributed database such as Teradata. This is basically a database with its data over multiple machines which speeds up the querying. A lot more compelx but worthwhile in the end.

If this is beyond your skill level then I suggest going with Oracle or MySQL. You can get a free copy of both for non business purposes. At least these will be able to handle your queries much more efficiently than access and excel if you set them up properly. Indexes and correctly partitioned disks will be very important.

Your problem is going to be what to collect and where to collect it from. Once you get started it can get a bit crazy. My current database holds over 24 million form lines and collects data from a lot of sources(risa, aap, betfair, totes, bookies, and other form sites).

Running queries on this was becoming slow in MySQL (5-10mins a query on indexed data and 30+ minutes on obscure data)which is why I set up a cluster of machines and moved my data to a distributed database. It runs a lot better now although I'm still learning the ins and outs of it.

I wish I had set it up using a distributed database from the beginning as it was a lot of work to transfer the data across. So its worth the effort in my opinion if your serious about collecting data.



Reply With Quote
  #6  
Old 25th April 2013, 02:42 PM
Puntz Puntz is offline
Member
 
Join Date: Jan 2012
Posts: 292
Default

The database and server that has capabilities to "talk" with Excel (from some experience a while back) was a horse race program designed by engineering programmer/s.
From what I recall, it was Firebird Server with a relational database, script editor and some form of GUI. http://www.firebirdsql.org/

I'm not a programmer, but had a script written according to my specs. to access the database in real time, do my selections, and a whole lot of other things.
Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is On
Forum Jump



All times are GMT +10. The time now is 03:36 AM.


Powered by: vBulletin Version 3.0.3
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
©2008 OZmium Pty. Ltd. All rights reserved . ACN 091184655