Scenario:
Assume there is a string '--opt1=1 --opt2="2 3"'. We want to divide it into two tokens: '--opt1=1' and '--opt2="2 3"'.

Define the regular expression:
Let's define three alternatives for now, tried in this order:
\S*".*"\S*: a pair of double quotes optionally surrounded by non-space characters (e.g. --opt2="2 3").
\S*'.*'\S*: a pair of single quotes optionally surrounded by non-space characters (e.g. --opt2='2 3').
\S+: a run of non-space characters (e.g. --opt1=1).

Usage:
In Python, use re.findall to divide the tokens:
>>> import re
>>> m = re.findall(r"\S*\".*\"\S*|\S*'.*'\S*|\S+", '-1=2 "--3=3 5"')
>>> m
['-1=2', '"--3=3 5"']

In GWT (Java), use RegExp with the global flag:
    // The array to store the tokens.
    ArrayList<String> args = new ArrayList<String>();
    String regularExpression = "\\S*\".*\"\\S*|\\S*'.*'\\S*|\\S+";
    // The "g" (global) flag is needed so that exec advances lastIndex.
    RegExp regExp = RegExp.compile(regularExpression, "g");
    if (regExp == null) return args;
    // Now, parse the tokens in the string text.
    // This is similar to strtok: each call to RegExp.exec automatically
    // updates lastIndex, so the next call resumes after the previous match.
    for (MatchResult matchResult = regExp.exec(text); matchResult != null; matchResult = regExp.exec(text))
        args.add(matchResult.getGroup(0));
    return args;

References:
http://www.gwtproject.org/javadoc/latest/com/google/gwt/regexp/shared/RegExp.html
http://stackoverflow.com/questions/1520800/why-regexp-with-global-flag-in-javascript-give-wrong-results
http://docs.python.org/2/library/re.html#re.findall
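Note on greedy matching: the quoted alternatives use .*, which is greedy, so a string that contains two quoted options can come back as a single token. The Python sketch below wraps the pattern from this memo in a small helper and compares it with a non-greedy variant (.*?). The names split_args, PATTERN and LAZY_PATTERN are made up for illustration, and the non-greedy variant is an assumption of this note, not part of the usage shown above; it has not been checked against nested or escaped quotes.

```python
import re

# Pattern from this memo: double-quoted, single-quoted, or plain non-space run.
PATTERN = r"\S*\".*\"\S*|\S*'.*'\S*|\S+"

# Hypothetical variant: .*? stops at the nearest closing quote instead of the last one.
LAZY_PATTERN = r"\S*\".*?\"\S*|\S*'.*?'\S*|\S+"


def split_args(text, pattern=PATTERN):
    """Return the list of tokens that `pattern` finds in `text`, left to right."""
    return re.findall(pattern, text)


if __name__ == "__main__":
    # The scenario string: one plain option and one quoted option.
    print(split_args('--opt1=1 --opt2="2 3"'))

    # Two quoted options: the greedy pattern merges them into one token,
    # while the non-greedy variant keeps them apart.
    two_quoted = '--a="1 2" --b="3 4"'
    print(split_args(two_quoted))
    print(split_args(two_quoted, LAZY_PATTERN))
```

If command lines with more than one quoted option are expected, the non-greedy form is probably the safer starting point; for a single quoted option both patterns give the same tokens.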