
A Regular Expression for Extracting URLs

Date: 2011-01-30 | Source: unknown | Author: redice

>>> import re

>>> html = "<div><a href='http://www.redicecn.com/plus/search.php?keyword=python&submit.x=0&submit.y=0'>Python</a> <a href='http://www.google.com'>google</a></div>"

>>> re.compile(r'''(http(s)?://([\w\-]+\.)+[\w\-]+(/[\w\- \./\?%&=]*)?)''').findall(html)

[('http://www.redicecn.com/plus/search.php?keyword=python&submit.x=0&submit.y=0', '', 'redicecn.', '/plus/search.php?keyword=python&submit.x=0&submit.y=0'), ('http://www.google.com', '', 'google.', '')]
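Because the pattern contains capturing groups, findall() returns one tuple per match, and only the first element of each tuple is the full URL. If only the URLs themselves are wanted, one possible variant is to turn the inner groups into non-capturing groups. The sketch below assumes the same character classes as the pattern above; the names URL_PATTERN and extract_urls are made up for illustration and are not part of the original post.

```python
import re

# Same character classes as the original pattern, but the inner groups are
# non-capturing (?:...), so findall() returns plain strings instead of tuples.
# URL_PATTERN and extract_urls are hypothetical names used for illustration.
URL_PATTERN = re.compile(r"https?://(?:[\w\-]+\.)+[\w\-]+(?:/[\w\- ./?%&=]*)?")


def extract_urls(text):
    """Return every URL-like substring found in text."""
    return URL_PATTERN.findall(text)


html = (
    "<div><a href='http://www.redicecn.com/plus/search.php"
    "?keyword=python&submit.x=0&submit.y=0'>Python</a> "
    "<a href='http://www.google.com'>google</a></div>"
)

for url in extract_urls(html):
    print(url)
```

Note that the path character class still includes a space, as in the original, so a URL followed by ordinary text rather than a closing quote may pick up trailing words. Dropping the space from the class is a reasonable adjustment if that matters for your input.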


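[Post information]

This post was published on 2011-01-30 00:05 by redice on redice's Blog. Besides leaving a comment, you are welcome to repost "A Regular Expression for Extracting URLs" on your own site or blog, but please keep the source address and author information. Thank you! (Respect other people's work; let's all do our part.)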
