Python - re.compile not obeying the case of text when used with BeautifulSoup
I'm using BeautifulSoup and looping through a series of li objects. The two causing me issues are the following:

    <li><span class="prefix">teams</span>6</li>
    <li><span class="prefix">new teams</span>4</li>

I'm matching based on .find, as seen below:

    if newdetail.find(text=re.compile("teams")):

However, for some reason re.compile is registering each of these li objects under the if statement. I want to create a case-sensitive find that matches only the following:

    <li><span class="prefix">teams</span> 6</li>

Has anyone got ideas on how to solve this issue?
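The behaviour described in the question can be reproduced with the re module alone. As far as I know, BeautifulSoup tests a compiled pattern with search semantics, so an unanchored pattern matches anywhere inside the text. The following is a minimal sketch under that assumption; the names labels, loose and exact are made up for illustration.

```python
import re

# The two span labels from the question.
labels = ["teams", "new teams"]

# Unanchored: succeeds if "teams" occurs anywhere in the string.
loose = re.compile("teams")
# Anchored: the whole string must be exactly "teams".
exact = re.compile(r"^teams$")

loose_hits = [s for s in labels if loose.search(s)]
exact_hits = [s for s in labels if exact.search(s)]

print(loose_hits)  # both labels contain the substring "teams"
print(exact_hits)  # only the label that is exactly "teams"
```

Neither pattern passes re.IGNORECASE, so both are already case sensitive; the extra match comes from substring matching, not from case handling.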
The problem is that the HTML I'm parsing doesn't have the same HTML parts.
I'm not sure if you mean that the info you need is not in lists and spans, or what, but here's how I parsed the info and extracted the totals you wanted.
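    from bs4 import BeautifulSoup

    page_filename = "tester.html"
    # Read the saved page from disk
    with open(page_filename, 'r') as f:
        html_file = f.read()
    soup = BeautifulSoup(html_file, "html.parser")

    lists = soup.find_all('li')
    for item in lists:
        span = item.find('span')
        # Skip items that have no span or whose span has no single string
        if span is not None and span.string and "teams" in span.string:
            # Remove the label so that only the total remains
            span.replace_with('')
            print(item.text)

If the team and total in each line are not in lists or spans, or are not consistently associated in the same way within the line, you will have problems getting what you want. Ideally, you would determine what patterns the team and total can be found with, use bs4's built-in methods to find those matches, and use regex the rest of the way if needed.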
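One way to combine bs4's built-in methods with a regex, as suggested above, is to pass an anchored pattern through the string argument of find so that only the span whose entire text is "teams" matches. This is a hedged sketch, not the answer's own method: the inline HTML, the pattern name exact_teams and the totals list are assumptions for illustration, and the string keyword assumes a reasonably recent bs4 (older releases spell it text).

```python
import re
from bs4 import BeautifulSoup

# Sample markup taken from the question, inlined so the sketch is self-contained.
html = """
<ul>
<li><span class="prefix">teams</span>6</li>
<li><span class="prefix">new teams</span>4</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Anchored pattern: the span's whole text must be "teams" (case sensitive,
# since re.IGNORECASE is not passed). Surrounding whitespace is tolerated.
exact_teams = re.compile(r"^\s*teams\s*$")

totals = []
for item in soup.find_all("li"):
    # Match only a prefix span whose .string satisfies the anchored pattern.
    span = item.find("span", class_="prefix", string=exact_teams)
    if span is None:
        continue
    span.extract()  # remove the label so only the total is left in the li
    totals.append(item.get_text(strip=True))

print(totals)
```

With the sample markup, the "new teams" item is skipped because its span text does not match the anchored pattern, so only the total from the "teams" item is collected.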
python regex beautifulsoup