web scraping - Attempting (and failing) to scrape financial data from Google Finance using Python Beautiful Soup
I am new to this and am having a problem. It would be great if someone could show me where I am going wrong (rather than just giving a solution).
What I have so far is self-explanatory:
import urllib2
from bs4 import BeautifulSoup

url = 'http://www.google.co.uk/finance?q=nasdaq%3aaapl&fstype=ii&ei=_dupu6dgfmtgwapr6yhqda'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
The info I am looking for is easy to locate:
soup.find_all("tr", {"class": "hilite"})
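Entering this in the console returns the right info.
Where I am stuck is how to make the loop work (I am new-ish to programming).
I know the headers are in the first td, with class = lft lm bld, and the info is in the td elements with class = rbld.
I have no idea how to handle the arrays. Any help in understanding the concepts behind this would be great.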
The simplest example is to iterate over the tr tags and use find_all() to get the td tags for every row:
for row in soup.find_all("tr", {'class': "hilite"}):
    for cell in row.find_all('td'):
        print cell.text
    print "-----"
This prints:
Total Revenue
45,646.00
57,594.00
37,472.00
35,323.00
43,603.00
-----
Gross Profit
17,947.00
...
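For the "arrays" part of the question, the same nested loop can collect each row into a dictionary instead of printing it. The following is only a rough sketch in Python 3, assuming bs4 is installed. SAMPLE_HTML is a made-up stand-in for the real page (the non-highlighted row is invented, and only values shown in the output above are reused), and the names hilite_rows and parse_number are hypothetical helpers, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
# The highlighted values are copied from the printed output above.
SAMPLE_HTML = """
<table>
  <tr class="hilite">
    <td class="lft lm bld">Total Revenue</td>
    <td>45,646.00</td>
    <td>57,594.00</td>
  </tr>
  <tr>
    <td class="lft lm">Some other row (invented)</td>
    <td>0.00</td>
    <td>0.00</td>
  </tr>
  <tr class="hilite">
    <td class="lft lm bld">Gross Profit</td>
    <td>17,947.00</td>
  </tr>
</table>
"""


def parse_number(text):
    """Turn a string such as '45,646.00' into a float."""
    return float(text.strip().replace(",", ""))


def hilite_rows(html):
    """Map each highlighted row's first cell (its header) to its numeric cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = {}
    for row in soup.find_all("tr", {"class": "hilite"}):
        # One list of cell texts per row: this is the "array" for that row.
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if not cells:
            continue  # skip rows without any td cells
        label, values = cells[0], cells[1:]
        table[label] = [parse_number(value) for value in values]
    return table


data = hilite_rows(SAMPLE_HTML)
print(data["Total Revenue"])  # prints [45646.0, 57594.0]
```

The outer loop visits each highlighted row and the list comprehension plays the role of the inner loop over its cells. Treating the first cell as the header is an assumption based on the layout described in the question.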
python web-scraping beautifulsoup