Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
361 views
in Technique[技术] by (71.8m points)

javascript - Regular expression to remove hostname and port from URL?

I need to write some javascript to strip the hostname:port part from a url, meaning I want to extract the path part only.

i.e. I want to write a function getPath(url) such that getPath("http://host:8081/path/to/something") returns "/path/to/something"

Can this be done using regular expressions?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

RFC 3986 ( http://www.ietf.org/rfc/rfc3986.txt ) says in Appendix B

The following line is the regular expression for breaking-down a well-formed URI reference into its components.

  ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(?([^#]*))?(#(.*))?
   12            3  4          5       6  7        8 9

The numbers in the second line above are only to assist readability; they indicate the reference points for each subexpression (i.e., each paired parenthesis). We refer to the value matched for subexpression as $. For example, matching the above expression to

  http://www.ics.uci.edu/pub/ietf/uri/#Related

results in the following subexpression matches:

  $1 = http:
  $2 = http
  $3 = //www.ics.uci.edu
  $4 = www.ics.uci.edu
  $5 = /pub/ietf/uri/
  $6 = <undefined>
  $7 = <undefined>
  $8 = #Related
  $9 = Related

where <undefined> indicates that the component is not present, as is the case for the query component in the above example. Therefore, we can determine the value of the five components as

  scheme    = $2
  authority = $4
  path      = $5
  query     = $7
  fragment  = $9

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...