As Klaus mentioned, the clear consensus in the community is to use BeautifulSoup for these tasks:
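    import BeautifulSoup  # BeautifulSoup 3 module

    # Parse the document, then pull every <script> element out of the tree.
    soup = BeautifulSoup.BeautifulSoup(html)
    for script_elt in soup.findAll('script'):
        script_elt.extract()
    html = str(soup)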
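
The snippet above uses the older BeautifulSoup 3 API (`BeautifulSoup.BeautifulSoup`, `findAll`). If you are on the newer `bs4` package, the same idea might look roughly like the sketch below. The helper name `strip_scripts` and the sample string are mine, purely for illustration, and I'm assuming the built-in `"html.parser"` backend is good enough for your input.

```python
from bs4 import BeautifulSoup


def strip_scripts(html: str) -> str:
    """Return the markup with every <script> element removed.

    strip_scripts is a hypothetical helper name, not part of bs4.
    """
    # "html.parser" is the parser from the standard library, so no
    # extra dependency such as lxml is assumed here.
    soup = BeautifulSoup(html, "html.parser")
    for script_elt in soup.find_all("script"):
        # decompose() removes the tag and its contents from the tree;
        # extract() would also work and additionally returns the tag.
        script_elt.decompose()
    return str(soup)


if __name__ == "__main__":
    sample = "<p>Hello</p><script>alert('x')</script><p>World</p>"
    print(strip_scripts(sample))
```

Using `decompose()` instead of `extract()` is just a choice for when you don't need the removed tags afterwards; either one takes the `<script>` elements out of the output.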