Python: get_rands.zip from jazzy


Toil
27th March 2012, 03:39 PM
'Cause I'm such a nice guy :), here's a quick and dirty Python script I wrote a while ago for a mate of mine... Dunno if he still uses it...

It gets all the worksheets for the day and writes the favourites to a spreadsheet. Note, it is a quick and dirty script: there is no error checking, and if it has a problem with something it raises an exception and closes (default Python behaviour).

If you want to try it out you will need to install Python from http://www.python.org (get 2.7, not 3.x, as the script is not compatible with 3.x) and pywin32 (for the Excel stuff) from http://sourceforge.net/projects/pywin32/files/pywin32/Build216/. Select the pywin32 version that matches the version of Python you've got.

Anyway, it shows what can be done.

Does anyone know if this get_rands script from jazzy still works, or am I not using Python properly?

Raven
27th March 2012, 04:04 PM
It worked the last time I used it, Toil, in early Jan.

You need those downloads he suggests.

The Ocho
27th March 2012, 04:24 PM
Is your wife a..."goer"... eh? Know what I mean? Know what I mean? Nudge nudge. Wink wink! Know what I mean? Say no more...Know what I mean?

A nod's as good as a wink to a blind bat, eh?.

The python were pretty funny.

:D

Toil
27th March 2012, 10:15 PM
Thanks Raven, I'll play around with it a bit more.

jazzy
28th March 2012, 03:00 PM
G'day Toil,

I've since moved on to Python 3, so although it's still installed, Python 2.7 isn't on my system's path anymore. I just tested the script, and it got all the pages OK, but fell over when creating the Excel sheet. Something to do with the path. If you really want me to, I'll have a go at fixing it...

What happens when you run it?

Toil
28th March 2012, 05:02 PM
Hi Jazzy, thanks for replying.

I'm using PythonWin 2.7.2 (pywin32 build 217).

I click Open File and open get_rands, and then all I see is the code:

import urllib2, re, datetime, win32com.client, sys, os.path, decimal


usragent = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5; .NET CLR 2.0.50727; ffco7) Gecko/2008120122 Firefox/3.0.5"
xlpath = "%s\\%s" % (os.path.dirname(sys.argv[0]) , "rands.xlsx")



tooday = datetime.date.today()
tomorrow = tooday+datetime.timedelta(days=1)






#href="meeting.asp?meeting=13493"> Cowra</a>
match_mtgs = re.compile("href=\"meeting.asp\?meeting=(\d+)\"> (.*?)</a>")
match_races = re.compile("<a href=\"worksheet.asp\?raceno=(\d+)&meetingid=\d+")
#>Race 1<br>13:15 <font style="font-size:11px;color:#999999;">(local)</font><br><div id="timezone">13:15</div><select class='nfinput' name="timezone" style="width: 70px;font-size: 9px;" onchange="javascript:adjClock('timezone', 3, 15, this.value);">
match_rn_lt_GMT = re.compile(">Race (\d\d?)<br>(\d\d?:\d\d).*?onchange=\"javascript:adjClock\('timezone', (\d\d?), (\d\d?), this\.value\);")






def stripMarkUp(src):
    # drop HTML tags and non-breaking spaces, then trim whitespace
    src = re.sub(r"<[^>]*>", '', src)
    src = re.sub(r"&nbsp;", '', src)
    return src.strip()




def addRace(raceList, buf, trk, rn_st):
    #('1', '13:15', '3', '15')
    gmt = "%02i:%02i" % (int(rn_st[2]), int(rn_st[3]))
    tmpList = []
    trs = buf.split("<tr")
    ths = trs[1].split("<th")
    # check the columns are where we expect them to be
    assert stripMarkUp("< " + ths[2]) == "TAB"
    assert stripMarkUp("< " + ths[3]) == "HORSE"
    assert stripMarkUp("< " + ths[18]) == "DIV"
    for tr in trs[2:]:
        tds = tr.split("<td")
        if len(tds) > 18:
            div = stripMarkUp("< " + tds[18])
            if div != "scr":
                decDiv = decimal.Decimal(re.sub("\$|,", '', div))
                tmpList.append((decDiv, stripMarkUp("< " + tds[2]), stripMarkUp("< " + tds[3])))

    tmpList.sort()
    #[(Decimal('4.00'), '8', 'Ohmy Dubai'), (Decimal('5.00'), '1', 'Cyclone Bella'),
    # count the runners sharing the lowest dividend (joint favourites)
    favs = 1
    for i in range(1, len(tmpList)):
        if tmpList[i][0] == tmpList[0][0]:
            favs += 1
        else:
            break
    for i in range(favs):
        raceList.append((gmt, trk, int(rn_st[0]), rn_st[1], tmpList[i][1], tmpList[i][2], tmpList[i][0]))
        print tmpList[i]



url = "http://www.racingandsports.com.au/form-guide/"
txheaders = {"User-agent" : usragent}

print xlpath
print tooday

req = urllib2.Request(url, None, txheaders)
buf = urllib2.urlopen(req).read()

raceList = []

###buf = open("rands.html", "rb").read()
#fp = open("rands.html", "wb")
#fp.write(buf)
#fp.close()
i = buf.find(tooday.strftime("%A"))
j = buf.find(tomorrow.strftime("%A"))
buf2 = buf[i:j]
mtgs = match_mtgs.findall(buf2)
for m in mtgs:
    print m
    # the meeting page is also the worksheet for the first race
    url = "http://www.racingandsports.com.au/form-guide/worksheet.asp?meetingid=%s" % (m[0],)
    req = urllib2.Request(url, None, txheaders)
    buf = urllib2.urlopen(req).read()
    rn_st = match_rn_lt_GMT.findall(buf)
    #[('1', '13:15', '3', '15')]
    fp = open("%s-%s-R%02i.html" % (tooday.strftime("%Y-%m-%d"), m[1], int(rn_st[0][0])), "wb")
    fp.write(buf)
    fp.close()
    races = match_races.findall(buf)
    k = buf.find("class='mainHeader'")
    i = buf.find("<table", k-100, k)
    j = buf.find("</table", i)
    addRace(raceList, buf[i:j], m[1], rn_st[0])

    #print races
    #['2', '3', '4', '5', '6', '7', '8']
    for r in races:
        url = "http://www.racingandsports.com.au/form-guide/worksheet.asp?raceno=%s&meetingid=%s" % (r, m[0],)
        req = urllib2.Request(url, None, txheaders)
        buf = urllib2.urlopen(req).read()
        rn_st = match_rn_lt_GMT.findall(buf)
        #[('1', '13:15', '3', '15')]
        fp = open("%s-%s-R%02i.html" % (tooday.strftime("%Y-%m-%d"), m[1], int(rn_st[0][0])), "wb")
        fp.write(buf)
        fp.close()
        k = buf.find("class='mainHeader'")
        i = buf.find("<table", k-100, k)
        j = buf.find("</table", i)
        addRace(raceList, buf[i:j], m[1], rn_st[0])



raceList.sort()

xl = win32com.client.Dispatch("Excel.Application")
xl.Visible = 1
try:
    wb=xl.Workbooks.Open(xlpath)
except:
    # no workbook yet, so create one and save it under that name
    wb=xl.Workbooks.Add()
    wb.SaveAs(xlpath)
#wb=xl.Workbooks.Open(r"rands.xls")

ws = wb.Worksheets("Sheet1")
r = 1
while 1:
    if ws.Range("A%i" % (r,)).Value is None:
        r += 1
        if ws.Range("A%i" % (r,)).Value is None:
            break
    r += 1


#[('03:22', 'Eagle Farm', 3, '13:22', '5', 'Golden Hut', Decimal('4.20')), ('03:22', 'Eagle Farm', 3, '13:22', '9', 'Western Run', Decimal('4.20')), ('05:12', 'Belmont Park', 2, '13:12', '8', 'Ohmy Dubai', Decimal('4.00'))]
for rl in raceList:
    ws.Range("A%i" % (r,)).Value = tooday
    ws.Range("B%i" % (r,)).Value = rl[0]
    ws.Range("C%i" % (r,)).Value = rl[1]
    ws.Range("D%i" % (r,)).Value = rl[2]
    ws.Range("E%i" % (r,)).Value = rl[3]
    ws.Range("F%i" % (r,)).Value = rl[4]
    ws.Range("G%i" % (r,)).Value = rl[5]
    ws.Range("H%i" % (r,)).Value = rl[6]
    r += 1

wb.Save()

Not sure what I'm doing wrong.

jazzy
28th March 2012, 09:32 PM
OK, sounds like you are running IDLE (Python GUI)?

I normally just run my scripts from a command prompt...

Just tried running the script from IDLE and it works - the path problem disappears :)
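A guess at why the path behaves differently, not something confirmed in this thread: the script builds xlpath from os.path.dirname(sys.argv[0]), and if it is started as "python get_rands.py" that dirname is an empty string, so xlpath comes out as "\rands.xlsx". A minimal sketch of one way around it, assuming that is the cause (workbook_path is a made-up helper name, not part of the original script):

```python
import os.path

def workbook_path(script_arg, filename="rands.xlsx"):
    """Return an absolute path for filename in the script's own folder.

    script_arg is whatever sys.argv[0] holds; making it absolute first
    means a bare "get_rands.py" no longer yields an empty directory.
    """
    script_dir = os.path.dirname(os.path.abspath(script_arg))
    return os.path.join(script_dir, filename)

# What the original expression produces for a bare script name:
old_style = "%s\\%s" % (os.path.dirname("get_rands.py"), "rands.xlsx")
assert old_style == "\\rands.xlsx"   # no folder in front, just a backslash

# The helper always returns an absolute path instead:
new_style = workbook_path("get_rands.py")
assert os.path.isabs(new_style)
```

In the script itself this would replace the xlpath line with xlpath = workbook_path(sys.argv[0]).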

So... from IDLE - open the file, then hit Run Module from the Run menu (or F5)

Bear in mind, it is/was a QUICK AND DIRTY script for a specific job, so it only takes some of the data. I only really put it up to show what was possible using Python.

Cheers

Toil
28th March 2012, 11:23 PM
Thanks jazzy, that worked. That's pretty cool what it did.

I would be interested later in maybe paying someone to write me some scripts.

For instance, would it be possible to get Python to fetch these panels from R&S: Rating, Neurals, Times, Switches, Trn, $, ST/LT, and Crs, and import them into Excel side by side all the way along the sheet, with the TAB numbers and each runner's info on the same line as below? So after the CLS LS column you would see the TAB numbers again from the next chart, and whatever else is in that chart.

TAB HORSE WT TS +- BP TS LS DIST LS +- CLS LS
1 Finding Water 58.0 2 14 5 1700m -25 MDN
2 Ihts Riveting 58.0 0 4 5 1675m 0 MDN
3 Bar Lover 58.0 0 15 1 1400m 275 MDN
4 Flip Mccool 58.0 3 12 2 2013m -338 MDN
5 Falcon Strike 58.0 2 6 6 1300m 375 MDN
6 Greek God 57.5 0 18 14 1400m 275 MDN

jazzy
29th March 2012, 12:11 PM
Yep. Sounds doable.
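
A rough sketch of the side-by-side idea, assuming each panel has already been scraped into a list of headers plus a list of rows whose first cell is the TAB number. The function name merge_panels, the second panel and its "x"/"y" values are placeholders made up for illustration, not real R&S data:

```python
from collections import OrderedDict

def merge_panels(panels):
    """Join panels side by side, lining rows up on TAB number.

    panels is a list of (headers, rows) pairs. A runner missing from a
    panel gets blank cells for that panel so the columns stay aligned.
    """
    widths = [len(headers) for headers, _ in panels]
    merged = OrderedDict()  # TAB number -> one cell list per panel
    for p, (_, rows) in enumerate(panels):
        for row in rows:
            tab = row[0]
            if tab not in merged:
                merged[tab] = [[""] * w for w in widths]
            # pad short rows so every panel keeps its full width
            cells = list(row) + [""] * (widths[p] - len(row))
            merged[tab][p] = cells
    table = [[h for headers, _ in panels for h in headers]]
    for parts in merged.values():
        table.append([cell for part in parts for cell in part])
    return table

# Placeholder panels: names from the post above, made-up second panel
form = (["TAB", "HORSE", "WT"],
        [("1", "Finding Water", "58.0"), ("2", "Ihts Riveting", "58.0")])
rating = (["TAB", "RATING"], [("2", "x"), ("1", "y")])

table = merge_panels([form, rating])
for line in table:
    print("\t".join(line))
```

Each list in table is one spreadsheet row, so it could be written out cell by cell with the same ws.Range(...) approach the script already uses.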