python - Using urlparse to remove a certain string?


I have this URL:

www.domain.com/a/b/c/d,authorised=false.html 

and want to convert it to:

www.domain.com/a/b/c/d.html 

Please note that I am using Python 2.7.

from urlparse import urlparse
# urlparse only fills in hostname when the URL has a scheme,
# and only splits off the query string after a "?"
url = "http://www.domain.com/a/b/c/d,authorised=false.html?_i_location=http%3a%2f%2fwww.domain.com%2fcms%2fs%2f0%2ff416e134-2484-11e4-ae78-00144feabdc0.html%3fsiteedition%3dintl&siteedition=intl&_i_referer=http%3a%2f%2fwww.domain.com%2fhome%2fus"
o = urlparse(url)
url = o.hostname + o.path
print url

This returns www.domain.com/a/b/c/d,authorised=false.html. I don't know how to remove the ,authorised=false part of the URL.

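One way is a regular expression that replaces everything from the comma up to the last dot with a single dot:

import re
print re.sub(r',.+\.', '.', 'www.domain.com/a/b/c/d,authorised=false.html')
# www.domain.com/a/b/c/d.html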
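The regex above runs on the whole string, so if it were applied to the full URL it could also match commas or dots in the query string. A minimal sketch that combines both ideas is shown here: parse first, then clean only the path. The helper name `strip_path_params` is hypothetical, and the sketch assumes the unwanted part always starts at a comma in the last path segment and ends just before the file extension. The import fallback is only there so the same code runs on Python 2.7 and Python 3.

```python
import re

try:
    from urlparse import urlparse  # Python 2.7
except ImportError:
    from urllib.parse import urlparse  # Python 3


def strip_path_params(url):
    """Return host + path with a ',key=value' run removed before the extension.

    Hypothetical helper for illustration; not part of urlparse.
    """
    # Without a scheme, urlparse puts everything in .path and .hostname is None,
    # so prepend one when it is missing.
    if not re.match(r'^[A-Za-z][A-Za-z0-9+.-]*://', url):
        url = 'http://' + url
    parts = urlparse(url)
    # Remove the text from a comma up to (not including) the final ".ext".
    # [^/]* keeps the match inside a single path segment.
    path = re.sub(r',[^/]*(?=\.[^./]*$)', '', parts.path)
    return parts.hostname + path


print(strip_path_params('www.domain.com/a/b/c/d,authorised=false.html'))
# www.domain.com/a/b/c/d.html
```

Because the substitution only sees `parts.path`, the query string is dropped by the parser rather than by the regex. A path with no comma, or with no extension after the comma, is returned unchanged.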
