Use Java Pattern to extract the word between HTML tags with attributes
I am using Java Pattern and Matcher to extract the words between two tags.
My code looks like this:
    final Pattern pattern = Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*>(.*?)</\\1>");
    List<String> topicArray = new ArrayList<String>();
    final Matcher matcher = pattern.matcher("<city count='1' relevance='0.304' normalized='shanghai,china'>shanghai</city>");
    while (matcher.find()) {
        topicArray.add(matcher.group(1));
    }
The program outputs city instead of shanghai. What is wrong with it?
Thanks
You can try the following:
    private static final Pattern REGEX_PATTERN = Pattern.compile("<[^>]*>([^<>]*)<[^>]*>");

    public static void main(String[] args) {
        String input = "<city count='1' relevance='0.304' normalized='shanghai,china'>shanghai</city>";
        System.out.println(
            REGEX_PATTERN.matcher(input).replaceAll("$1")
        ); // prints "shanghai"
    }
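As for why the original code printed city: in the question's pattern the first pair of parentheses captures the tag name (it is what the \1 backreference refers to), and the second pair captures the text between the tags. Reading group(2) instead of group(1) should therefore return shanghai. Below is a minimal sketch of that fix; the class name TagTextExtractor and the method name extractText are made up for illustration, and a regex like this only suits simple, non-nested markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class TagTextExtractor {
    // Group 1 captures the tag name (reused by the \1 backreference);
    // group 2 captures the text between the opening and closing tags.
    private static final Pattern TAG_PATTERN =
            Pattern.compile("<([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*>(.*?)</\\1>");

    // Collects the inner text of every matching tag pair in the input.
    static List<String> extractText(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TAG_PATTERN.matcher(input);
        while (matcher.find()) {
            // group(1) would be the tag name, e.g. "city"
            result.add(matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "<city count='1' relevance='0.304' normalized='shanghai,china'>shanghai</city>";
        System.out.println(extractText(input)); // prints [shanghai]
    }
}
```

Unlike the replaceAll approach, this keeps the find() loop from the question, so it collects one entry per tag when the input contains several of them.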