Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
264 views
in Technique[技术] by (71.8m points)

java - Extract main domain name from a given url

I used the following to extract the domain from a url: (They are test cases)

String regex = "^(ww[a-zA-Z0-9-]{0,}\.)";
ArrayList<String> cases = new ArrayList<String>();
cases.add("www.google.com");
cases.add("ww.socialrating.it");
cases.add("www-01.hopperspot.com");
cases.add("wwwsupernatural-brasil.blogspot.com");
cases.add("xtop10.net");
cases.add("zoyanailpolish.blogspot.com");

for (String t : cases) {  
    String res = t.replaceAll(regex, "");  
}

I can get the following results:

google.com
hopperspot.com
socialrating.it
blogspot.com
xtop10.net
zoyanailpolish.blogspot.com

The first four cases are good. The last one is not good. What I want is: blogspot.com for the last one, but it gives zoyanailpolish.blogspot.com. What am I doing wrong?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Using Guava library, we can easily get domain name:

InternetDomainName.from(tld).topPrivateDomain()

Refer API link for more details

https://google.github.io/guava/releases/14.0/api/docs/

http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/net/InternetDomainName.html


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...