Programming your own database of results
wesmip1
7th August 2006, 06:37 PM
For anyone interested.
I don't mind helping people out if they wish to learn Java to program their own databases for test purposes.
I won't be giving out code, but I will definitely guide people through some of the more error-prone parts.
Anyone who is interested leave a note here.
Good luck.
Chuck
7th August 2006, 06:57 PM
i've always been interested but never followed through
if you could drop me a line at chucky_s_3000 at yahoo dot com dot au that would be great
system
8th August 2006, 01:51 AM
cheers wes
cb_fleming@hotmail.com
Chinbok
8th August 2006, 03:08 PM
Hi Wesmip,
I take it your database comes from web scraping. Do you have an online source for race results or are you using a form guide?
wesmip1
8th August 2006, 05:08 PM
Chinbok,
At the moment I just rip the results from unitab (easiest site to scrape).
I was thinking about getting them either from "virtual formguide" or from "expert form" or from "oze form".
You got any other suggestions?
wesmip1
8th August 2006, 05:09 PM
Oh for form ... I grab those from the form guides at unitab and the ones at expert form and the ones at virtual formguide.
I check they all match up... quite often they don't.
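For anyone wondering what a "check they all match up" step might look like in Java, here is a minimal sketch. It is only an illustration under assumptions: the class name FormCrossCheck, the key format and the sample values are made up, and it assumes each source has already been scraped into a map from a runner key to a value (a finishing position, say).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares the same data taken from two sources.
class FormCrossCheck {

    // Returns every key whose value differs between the two sources,
    // or that is present in only one of them.
    static List<String> mismatches(Map<String, String> a, Map<String, String> b) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : a.entrySet()) {
            String other = b.get(e.getKey());
            if (other == null || !other.equals(e.getValue())) {
                out.add(e.getKey());
            }
        }
        for (String key : b.keySet()) {
            if (!a.containsKey(key)) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Key format (date|track|race|runner number) is an assumption.
        Map<String, String> sourceA = new LinkedHashMap<>();
        sourceA.put("2006-08-08|ROSEHILL|1|3", "1");
        sourceA.put("2006-08-08|ROSEHILL|1|5", "2");

        Map<String, String> sourceB = new LinkedHashMap<>();
        sourceB.put("2006-08-08|ROSEHILL|1|3", "1");
        sourceB.put("2006-08-08|ROSEHILL|1|5", "4");

        System.out.println("Needs a manual look: " + mismatches(sourceA, sourceB));
    }
}
```

Anything the method returns is worth eyeballing by hand before it goes into the database.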
Chinbok
8th August 2006, 05:15 PM
Chinbok,
You got any other suggestions?
No. I use Unitab too because the URLs are easy to program. I also scrape the neurals and winner in six.
DR RON
8th August 2006, 05:54 PM
Funny you should mention databases, wes, as I was only thinking of starting my own not long ago. A couple of questions for you or any others who can help: should I use Excel or Access to store the info in, and which form provider is the easiest to use while still having plenty of info, such as position in running, especially for non-metro races, as these are where most of my betting is at the moment? I don't mind paying for form, as I would like to get all of a horse's starts, but not a huge amount; under $50 a month would be my budget. I don't want to use it as a system creator, more as an in-depth analysis tool. For example, I would like to be able to pull up every horse's run from any individual barrier and see if there are any common factors among the horses that won or finished close up from that particular barrier. I'd also like to be able to pull up a list of all horses in the db that are trained by a certain trainer and see if I can pick up his/her methods, et cetera. Thanks in advance from the doc.
Chrome Prince
8th August 2006, 06:20 PM
Dr Ron,
Don't use Excel; as the data grows you risk corruption. Access will be far more reliable and you can perform queries much better. Also, the data can be split into tables, which means you can import form to one table and results to another.
The best form provider is the "virtual" one.
I don't want to use it as a system creator, more as an in-depth analysis tool. For example, I would like to be able to pull up every horse's run from any individual barrier and see if there are any common factors among the horses that won or finished close up from that particular barrier. I'd also like to be able to pull up a list of all horses in the db that are trained by a certain trainer and see if I can pick up his/her methods, et cetera.
My database can do this. Also you can export the data to almost any format you want. Might be a starting point as all the data is already in there.
Just a suggestion, as in the long run it will save you a great deal of expense in past form and results.
DR RON
8th August 2006, 06:35 PM
Thanks for the comment chrome, I actually gave your "one" some consideration but would ask the question, is there a chance that you would add non-metro stuff to it as I prefer to have family time at the weekends and get most of my betting time midweek? If the answer is yes then you may have a potential customer. The price seems quite reasonable given the amount of time a novice like me would have to spend on doing my own.
wesmip1
8th August 2006, 06:55 PM
Dr Ron,
Just grabbing 1.5 years from unitab gives 268,537 horses in races. This is going to start getting slow in Access when you start running into the millions.
I use Oracle. There is a free version for personal use and it is quite good. It does have some limitations, but they are much less restrictive than the Access limitations.
Good Luck
Chrome Prince
8th August 2006, 06:56 PM
DrRon,
The database contains fields and results for Metro races only being the "current" run.
However, there is a formguide which covers all Metro, country racing, and also the past runs of horses at country tracks are included.
There are some 500,000 horse runs you can filter.
Bear in mind you cannot update the database itself, updates are provided on CD.
tcmill
13th August 2006, 09:56 PM
I use an Access database with 5 years of Oz race data in it. It's around 700 MB in size, 3 million records. It performs fine with a bit of indexing. Building queries is a breeze for non-programmers in Access. It comes with Office Professional.
It is a fantastic product for a desktop database.
Your best bet is to subscribe to some db service for your answers. But if you want a hobby, build your own.
Jimmy
13th August 2006, 10:13 PM
appreciate the help wesmip if you can give it
jamrad64 at yahoo dot com dot au
cheers
Chrome Prince
13th August 2006, 11:49 PM
I use Oracle. There is a free version for personal use and it is quite good. It does have some limitations but they are much better than the access limitations.
Hi Wesmip1,
Can Oracle perform rankings based on date, track & race number?
Curious to know, as most db's find this a problem or it gets mindblowingly complex and confusing.
dave3000
14th August 2006, 08:14 AM
Hi, I am using Office 2000. I have never worked with a database, however I have just decided to start building one. I know there will be plenty of hiccups on the way, but I am sure there is plenty of info on the net to help me, and Office also has a wizard to help... :)
cheers
OZDOC
14th August 2006, 12:57 PM
Dr Ron,
Don't use Excel, as the data grows you risk corruption.
100% incorrect. If done properly there is no more risk than with any other application.
I am not saying it is the best or worst application for data, but to say you risk corruption is wrong. (If I am wrong on this, could you explain how I risk corruption over and above another app?)
Chrome Prince
14th August 2006, 02:47 PM
You risk corruption in Excel just like Outlook Express.
The data files are not optimized to handle LARGE amounts of data.
By corruption, I mean the file system collapsing (not being able to read the data).
Try storing 60,000 rows of data in Excel, it will crash eventually.
If you think I am wrong and want proof, just google "Excel corrupt" and "Excel crash" and see the hundreds of postings and "tools" available to try and reclaim data.
There is even an obvious hint by Microsoft in the Microsoft Application Recovery tool, the saved document library, (which is why by default, Excel saves itself every 20 minutes or so) and the Microsoft Knowledge Base.
Excel is designed for spreadsheets, analysis and reporting, not storing large amounts of data, which is why you are limited to 65,536 rows of data, the absolute edge of the precipice.
I know from experience: when I first started using Excel as a database, a file with 45,000 rows of hand-typed data exploded, was unrecoverable, and a year's work disappeared forever into the ether.
I researched it fully and was amazed how commonplace it is, and switched to Access.
I do however store each year's tweaked data with my own input in Excel spreadsheets (around 42,000 rows each year) as a template, but I have one copy of each spreadsheet on my computer, one copy burned to disc and one copy on a USB drive.
I also have a copy within my Access database, and a copy within my Filemaker database, should anything go wrong with any of them.
OZDOC
14th August 2006, 05:16 PM
The data files are not optimized to handle LARGE amounts of data.
??? The new version of Excel has over 1 million rows (may be wrong), but even the current system can handle large data, but I may be wrong again.
By corruption, I mean the file system collapsing (not being able to read the data.
incorrect i have many large files that do not break down
Try storing 60,000 rows of data in Excel, it will crash eventually.
i do / have and no problems
If you think I am wrong and want proof, just google "Excel corrupt" and "Excel crash" and see the hundreds of postings and "tools" available to try and reclaim data.
I don't wish to go down the "you're wrong, I am right" path, just to say it has proven stable for me when set up right.
There is even an obvious hint by Microsoft in the Microsoft Application Recovery tool, the saved document library, (which is why by default, Excel saves itself every 20 minutes or so) and the Microsoft Knowledge Base.
This is an option that can be turned off, and Excel has had autosave for some time.
Excel is designed for spreadsheets, analysis and reporting, not storing large amounts of data, which is why you are limited to 65,536 rows of data, the absolute edge of the precipice.
as above new version 1 million
I know from experience, when I first started using Excel as a database with 45,000 rows of handtyped data exploded to be unrecoverable and a year's work disappeared forever into the ether.
??? not sure why it happened to you
I also have a copy within my Access database, and a copy within my Filemaker database, should anything go wrong with any of them
Agree, copies of anything are important, but stating that Excel is open to corruption is not, from my experience, true, and I push it to every boundary known. All is peace here; I just was not sure regarding your comment, but will restate: if set up well, Excel will not corrupt a file (from my experience).
Chrome Prince
14th August 2006, 06:59 PM
Ozdoc,
I am stating from my experience too.
Isn't it better to advise people on what's gone wrong than what hasn't?
The new version does not have one million rows and Excel never will have one million rows, the file system cannot handle it.
65,536 rows is in the new Beta version which I have also road tested last month, being a Microsoft software agent.
You have been lucky.
You have gone down that path, by stating that I am wrong, and I supplied proof of what is inevitable.
Yes, you can turn off autosave, that wasn't the statement, the statement was the reason it is there built in in the first place.
I know why it happened to me, because when I lost my data, I contacted Microsoft who responded:
"Dear Sir,
Microsoft Excel is part of the Microsoft Office package.
Within that package is a flat file Excel spreadsheet program, and Access a relational database.
While we have addressed many issues with Excel crashing through the release of Service Packs, the Excel package was never designed to hold large amounts of data. Large amounts of data should be stored within the Access program and for reporting purposes exported to the Excel program for reports, summaries, graphs etc.
It is recommended that large amounts of data not be stored in Excel as the file system is not as advanced as Access in coping with large numbers of records.
Regards,
Microsoft Office Support"
I am correct in my statements, just because it hasn't happened to you, doesn't mean it won't or can't happen to someone else.
My advice was offered as help and to be cautious. I don't want to be responsible for someone losing a year's work.
Just last week I was called into an organization which lost all its yearly cashflow projections due to file corruption.
wesmip1
14th August 2006, 08:07 PM
Chrome,
Oracle is industry standard for major applications and is one of the leading databases. It can easily handle rankings based on date, track & race number.
Doing a search of over 400,000 entries takes a little over 5 secs for the majority of searches I do ( which includes counts, sums, multiplication, mins, averages, substrings all in the same query).
Good Luck.
wesmip1
14th August 2006, 08:14 PM
OK, for those on here I missed sending an email to, here is what I sent to a couple of people on how I would get started with Java and learning to program your own database:
--------------------
Have you done any programming before ?
Do you know Java or HTML or SQL ?
If not, don't worry, it isn't hard. It just takes a bit of dedication to learn the first bit, then it's easy.
If you are interested, go to http://java.sun.com
Download the SDK for 1.4.2, as it's the most commonly used Java version.
It's available at: http://java.sun.com/j2se/1.4.2/download.html
Read the installation instructions, as they are fairly important.
Then run through the first few tutorials at:
http://java.sun.com/docs/books/tutorial/index.html
Especially focus on Getting Started (http://java.sun.com/docs/books/tutorial/getStarted/index.html), Learning the Java Language (http://java.sun.com/docs/books/tutorial/java/index.html), Essential Java Classes (http://java.sun.com/docs/books/tutorial/essential/index.html) and Collections (http://java.sun.com/docs/books/tutorial/collections/index.html). Don't worry about Swing or Deployment .. they can come later if you want to learn them.
The first thing you should do is write a "Hello World" program. The tutorials will show you how to do it. It's always nice to get the program printing something out.
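For reference, a "Hello World" program looks roughly like the sketch below. The tutorial's own version may differ slightly; splitting the text out into a greeting() method is just a choice made here so there is something to call and check.

```java
// A minimal "Hello World" sketch; the class and method names are illustrative.
class HelloWorld {

    // Kept separate from main so the text can be reused or checked.
    static String greeting() {
        return "Hello World";
    }

    public static void main(String[] args) {
        // Prints the greeting to the console.
        System.out.println(greeting());
    }
}
```

Save it as HelloWorld.java, compile it with javac HelloWorld.java and run it with java HelloWorld.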
If you have any questions send them my way. Once you have covered the above let me know and I will go through downloading from the web with you and also setting up the Database and how to put data into it.
My email address is aussiegreyhound @ yahoo com au
Chrome Prince
14th August 2006, 09:31 PM
Thanks wesmip1,
I'll look into it, rankings into tables have been a bugbear of mine for quite some time.
lomaca
14th August 2006, 09:39 PM
Can Oracle perform "RANKINGS" based on date, track & race number?
Hi CP!
I am sure it's only a matter of us, using different terminology, but would you please tell me what you mean by "rankings"?
Thanks
Chrome Prince
14th August 2006, 10:22 PM
Iomaca,
Ranking by career prizemoney for example:
$1,000,000 1
$900,000 2
$750,000 3
$200,000 4
$100,000 5
etc
It can of course be done, but not easily enough for my liking and is cumbersome.
The problem is getting Access to recognize the difference between two records, not between two fields.
I don't want to create a report with the info or even a query, I want to update tables, which is near impossible. Of course, I could perform a sort and then insert the ranking, but the program does not recognize a different race or venue or date as per previous / next record.
Ideally what is needed is a database that recognizes Excel formulae and can calculate between records.
At present I have a query within a query within a query, and it's getting very difficult to make changes or add queries when there are so many "layers"
lomaca
14th August 2006, 10:49 PM
OK I see what you mean.
It's just that you mentioned date-track-racenumber in your original post.
Creating a primary key based on these fields would be a basic prerequisite of designing a database table, I would have thought, thus keeping table integrity (eliminating duplicate records etc.).
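To illustrate the idea outside any particular database, here is a minimal Java sketch of a date-track-racenumber key rejecting duplicate records. Everything in it is an assumption for illustration: the RaceTable class, the "|" separated key format and the upper-casing of track names are not taken from anyone's actual database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory "table" of races keyed by date + track + race number.
class RaceTable {

    private final Map<String, String> rows = new LinkedHashMap<>();

    // Builds the composite key; the track is upper-cased so that
    // "Randwick" and "RANDWICK" count as the same venue.
    static String key(String date, String track, int raceNumber) {
        return date + "|" + track.toUpperCase() + "|" + raceNumber;
    }

    // Returns true if the race was stored, false if a race with the
    // same date, track and race number was already present.
    boolean insert(String date, String track, int raceNumber, String raceName) {
        return rows.putIfAbsent(key(date, track, raceNumber), raceName) == null;
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        RaceTable table = new RaceTable();
        System.out.println(table.insert("2006-08-12", "Randwick", 3, "Maiden Plate"));
        System.out.println(table.insert("2006-08-12", "RANDWICK", 3, "Maiden Plate"));
        System.out.println("Rows stored: " + table.size());
    }
}
```

A real database enforces the same rule itself once the primary key is declared on those columns, so a second import of the same race simply fails instead of creating a duplicate.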
As for queries, at work we only use databases as a repository of data and all query work is done programmatically via VB, C++ etc. This makes writing applications portable, you only have to make a reference to the type of database you are using in code (like "mdb, dbs etc.") as a matter of fact even Excel can be referenced this way.
I have to agree with you re Excel: it's very good at what it was made for, but a database it ain't. Still, if it does what you want it to do, why not use it?
Cheers
Wunfluova
14th August 2006, 11:14 PM
Chrome, I had the same type of problems trying to incorporate things like pace and ability rankings in my Access greyhound databases. It was beyond my programming abilities so in the end I was exporting all the form into Excel, doing all my rankings etc using array formula then reimporting into Access. Terribly inefficient and time consuming.
Another thing that initially bugged me was the inability to simply select the last n records of each ***/horse to produce a form guide. The 'top n records' function only produced the top records for the whole dataset, not for each individual runner. Of course I was able to eventually get around this problem with programming but didn't think it should have been necessary for something so simple and basic.
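For comparison, here is how the "last n records for each runner" selection might look when done in program code rather than in a query. It is a minimal Java sketch under assumptions: the LastRuns and Run names are invented, dates are assumed to be stored as yyyy-MM-dd strings so they sort correctly as text, and all the runs are assumed to fit in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: pick the most recent n runs for every runner.
class LastRuns {

    static final class Run {
        final String runner;
        final String date;   // assumed format yyyy-MM-dd
        final int finish;

        Run(String runner, String date, int finish) {
            this.runner = runner;
            this.date = date;
            this.finish = finish;
        }
    }

    // Returns, for each runner, up to n runs ordered newest first.
    static Map<String, List<Run>> lastN(List<Run> runs, int n) {
        List<Run> sorted = new ArrayList<>(runs);
        // Newest first across the whole dataset.
        sorted.sort(Comparator.comparing((Run r) -> r.date).reversed());

        Map<String, List<Run>> perRunner = new TreeMap<>();
        for (Run r : sorted) {
            List<Run> list = perRunner.computeIfAbsent(r.runner, k -> new ArrayList<>());
            // Keep only the first n runs seen for this runner.
            if (list.size() < n) {
                list.add(r);
            }
        }
        return perRunner;
    }

    public static void main(String[] args) {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("Runner A", "2006-06-01", 4));
        runs.add(new Run("Runner A", "2006-07-01", 2));
        runs.add(new Run("Runner A", "2006-08-01", 1));
        runs.add(new Run("Runner B", "2006-08-05", 3));

        for (Map.Entry<String, List<Run>> e : lastN(runs, 2).entrySet()) {
            for (Run r : e.getValue()) {
                System.out.println(e.getKey() + " " + r.date + " finished " + r.finish);
            }
        }
    }
}
```

The point is that the cut-off is applied per runner inside the loop, which is exactly what a plain "top n" over the whole dataset does not do.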
Wesmip, I have never looked at Oracle before and know nothing of Java but am reading this thread with some interest as I am presently trying to motivate myself to build a couple of new databases so that I can query data in ways that can't be done by the commercial dbs I am using. Thanks for your input. You just might motivate this tired old brain into trying something new.
Wunfluova
Chrome Prince
15th August 2006, 12:07 AM
Wunfluova,
I find array formulae extremely slow and resource hungry amongst large amounts of data.
The way I do it is by sorting by A (Date) B (Track) C (Race Number) and of course the data I am trying to rank.
Then run a formula such as:
=IF(A1&B1&C1<>A2&B2&C2,1,N1+1)
This gives you a ranking, but of course it needs to be modified for tied top values. I usually use a second category: for example, when ranking career prizemoney alongside the API, if horses are equal top on career prizemoney, priority goes to the horse with the best API.
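The same sort-then-number idea can be sketched in Java for anyone doing this in code instead of a spreadsheet. This is a minimal sketch under assumptions, not anyone's actual method: the RaceRanker and Runner names are invented, the API is treated as a plain number where higher is better, and ties on prizemoney are broken by API as described above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rank runners within each race by career prizemoney.
class RaceRanker {

    static final class Runner {
        final String date;       // assumed format yyyy-MM-dd
        final String track;
        final int race;
        final String horse;
        final long prizemoney;
        final double api;        // tie-breaker, assumed "higher is better"
        int rank;                // filled in by rank()

        Runner(String date, String track, int race, String horse, long prizemoney, double api) {
            this.date = date;
            this.track = track;
            this.race = race;
            this.horse = horse;
            this.prizemoney = prizemoney;
            this.api = api;
        }
    }

    // Sorts by date, track, race, then prizemoney (highest first) and API
    // (highest first), and numbers the runners 1, 2, 3... within each race.
    // The rank is written onto the Runner objects themselves.
    static List<Runner> rank(List<Runner> runners) {
        List<Runner> sorted = new ArrayList<>(runners);
        sorted.sort(Comparator.comparing((Runner r) -> r.date)
                .thenComparing((Runner r) -> r.track)
                .thenComparingInt((Runner r) -> r.race)
                .thenComparing(Comparator.comparingLong((Runner r) -> r.prizemoney).reversed())
                .thenComparing(Comparator.comparingDouble((Runner r) -> r.api).reversed()));

        Runner prev = null;
        for (Runner r : sorted) {
            // Same test as the spreadsheet formula: restart at 1 when the race changes.
            boolean sameRace = prev != null
                    && prev.date.equals(r.date)
                    && prev.track.equals(r.track)
                    && prev.race == r.race;
            r.rank = sameRace ? prev.rank + 1 : 1;
            prev = r;
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<Runner> runners = new ArrayList<>();
        runners.add(new Runner("2006-08-05", "RANDWICK", 1, "Horse A", 100000L, 1.0));
        runners.add(new Runner("2006-08-05", "RANDWICK", 1, "Horse B", 900000L, 2.0));
        runners.add(new Runner("2006-08-05", "RANDWICK", 1, "Horse C", 900000L, 3.0));
        runners.add(new Runner("2006-08-05", "RANDWICK", 2, "Horse D", 5000L, 0.5));

        for (Runner r : rank(runners)) {
            System.out.println(r.date + " " + r.track + " R" + r.race + " " + r.horse + " rank " + r.rank);
        }
    }
}
```

Because the tie-break is part of the sort order, no two runners in a race end up with the same rank, and the ranks can be written straight back to a table.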
OZDOC
15th August 2006, 01:53 AM
Ozdoc,
I am stating from my experience too.
appreciated
Isn't it better to advise people on what's gone wrong than what hasn't?.
yes
The new version does not have one million rows and Excel never will have one million rows, the file system cannot handle it.
I apologise for this; I was given wrong information, am not happy about that, and stand corrected.
You have gone down that path, by stating that I am wrong, and I supplied proof of what is inevitable.
I said it proved stable for me when set up right, not implying anything different.
I am correct in my statements, just because it hasn't happened to you, doesn't mean it won't or can't happen to someone else.
You stated that it failed due to high volumes of data and was prone to corruption. Yes, you also provided some information to that effect, and I will not argue with that. What I can say is that I run very large files (and have for many years) and they have not been corrupted, and I doubt they ever will be. As said, I do not say Excel is the worst or best option, but I can say that if set up right there are no dramas and it is a viable option for people to use.
crash
15th August 2006, 05:48 AM
'OpenOffice.org' [free, open-source project] includes Calc, which is pretty much a copy of Excel [works with Excel files]. The office package also has Writer, Draw, Impress, Math and Base.
http://www.openoffice.org/
Crackone
15th August 2006, 10:17 AM
Hi, been thinking of a database for a while, just didn't know how to do it.
Would you or anybody else have a template for a database? I would be using Access 2003. If it is not asking too much, it will save me a lot of work and time. My email is salesgpo at bigpond dot net dot au.
Thanks
OZDOC
15th August 2006, 01:11 PM
The new version does not have one million rows and Excel never will have one million rows, the file system cannot handle it.
65,536 rows is in the new Beta version which I have also road tested last month, being a Microsoft software agent.
I know I responded that I stood corrected on this, but I have done some further checking and it seems 1 million plus is correct; the beta version may be at 65,536, but the full version has the higher limit.
Originally Posted by xxx
What about when the new version of Excel comes out?
More than 1 million rows apparently.
my question
Is this fact or fiction, are there 1 million rows + in the new version ?
my answer from a respected source
Fact.
CP lets just hope they have also improved on the stability issues as well and it proves a better package for all.
OZDOC
15th August 2006, 01:19 PM
Hi, been thinking of a database for a while, just didn't know how to do it.
Would you or anybody else have a template for a database? I would be using Access 2003. If it is not asking too much, it will save me a lot of work and time. My email is salesgpo at bigpond dot net dot au.
Thanks
have a look for templates in office there should be an example template to learn from, or type examples into the help section ( from access ) and there are many that can be downloaded
cheers
Chrome Prince
15th August 2006, 01:53 PM
I stand corrected, the Beta version did have 65,536 rows.
Excel 2007 will have "support for" 1 million rows and 16,000 columns.
I certainly won't be upgrading!
Let's have a guess how many Service Packs they release for it?
Two, Three?
I reckon Four at least, the file system is just not built to handle this kind of storage. They have improved the theme, the feel, the look, and a few add ons, but the basic file structure is the same.
Isn't it strange that M$ know themselves that the spreadsheet file structure is not suitable for large storage, and told me so on more than one occasion, by phone and email, recommended large amounts of data be kept in Access, then release 1 million rows with no changes to the actual file system, just feel-good stuff.
Welcome to Microsoft Online Crash Analysis ;)
Anyway, I was wrong on that issue, 1 million it is.
lomaca
15th August 2006, 03:40 PM
Hi, been thinking of a database for a while, just didn't know how to do it.
Would you or anybody else have a template for a database? I would be using Access 2003. If it is not asking too much, it will save me a lot of work and time. My email is salesgpo at bigpond dot net dot au.
Thanks
Hi!
Just tell me what fields (like horse-name, race number, venue etc) you want included and I'll design one for you and send it over as an empty, ready-to-use database. It's easy as.
Cheers
Wunfluova
15th August 2006, 05:30 PM
Thanks for the suggestion Chrome, but unfortunately what I was doing was a little too complex for such an elegant solution.
I agree that arrays can be painfully slow and also at times I found them very tricky to use. On more than one occasion they have downright refused to work correctly for me and blowed if I could ever work out the source of the problem.
Wunfluova