Breeding: web scraping - Attempting (and failing) to scrape financial data from Google Finance using Python Beautiful Soup

Saturday, 15 January 2011

I am new to this and am having a problem. It would be great if someone could show me where I am going wrong (rather than just giving a solution).

So far, this is self-explanatory:

import urllib2
from bs4 import BeautifulSoup

# Fetch the page and parse it into a BeautifulSoup tree
url = 'http://www.google.co.uk/finance?q=nasdaq%3aaapl&fstype=ii&ei=_dupu6dgfmtgwapr6yhqda'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)

The information I am looking for is easy to locate:

soup.find_all("tr", {"class": "hilite"})

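Entering this in the console returns the right information.

Where I am stuck is how to make the loop work (I am new-ish to programming).

I know the headers are in the first td of each row (class = lft lm bld) and the values are in the td elements with class = rbld, but I have no idea how to get them into arrays. Any help in understanding the concepts behind this would be great.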

The simplest example is to iterate over the tr tags and use find_all() to get the td tags in every row:

for row in soup.find_all("tr", {'class': "hilite"}):
    # Each row holds one label cell followed by the value cells
    for cell in row.find_all('td'):
        print cell.text
    print "-----"

This prints:

Total Revenue
45,646.00
57,594.00
37,472.00
35,323.00
43,603.00
-----
Gross Profit
17,947.00
...
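To get from printed cells to the "arrays" the question asks about, one option is to collect each row into a dictionary keyed by its label. The following is only a sketch under some assumptions: the SAMPLE_HTML string is a made-up stand-in for the real page (only the labels and the first figures are taken from the output above, the rest is placeholder data), rows_to_dict is a hypothetical helper name, and the first td of every row is assumed to hold the label. It uses the built-in "html.parser" backend and single-argument print(), so it should run on Python 2 and Python 3 alike.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the downloaded page so the sketch runs offline.
# The non-highlighted row and the second Gross Profit figure are placeholders.
SAMPLE_HTML = """
<table>
  <tr class="hilite">
    <td class="lft lm bld">Total Revenue</td>
    <td>45,646.00</td>
    <td>57,594.00</td>
  </tr>
  <tr>
    <td class="lft lm">Some other row</td>
    <td>1.00</td>
    <td>2.00</td>
  </tr>
  <tr class="hilite">
    <td class="lft lm bld">Gross Profit</td>
    <td>17,947.00</td>
    <td>0.00</td>
  </tr>
</table>
"""


def rows_to_dict(soup):
    """Map each highlighted row's label to a list of its values as floats."""
    table = {}
    for row in soup.find_all("tr", {"class": "hilite"}):
        # Text of every cell in this row, with surrounding whitespace removed
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if not cells:
            continue
        # Assumption: the first cell is the label, the rest are numbers
        label, values = cells[0], cells[1:]
        # Drop thousands separators before converting, e.g. "45,646.00"
        table[label] = [float(v.replace(",", "")) for v in values]
    return table


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
data = rows_to_dict(soup)
print(data["Total Revenue"])
```

The idea is that find_all() returns a list-like result, so the outer loop walks the rows and the list comprehension walks the cells within one row. On the real page some cells may hold a dash or be empty instead of a number, in which case float() would raise a ValueError and the conversion would need a guard.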

python web-scraping beautifulsoup
