[Closed] .net RegExp matches

Hey Folks,

I’m really enjoying everything .net is bringing to the party at the moment, even if some of it is a bit hard to get going with sometimes.

Current head-scratcher is how to read valid matches from a regular expression match into something usable. As with quite a few .net constructs, certain items seem to be recursive in nature, which is frustrating when you’re trying to find something.

Here’s my source text…

name "MaxScript Packager"
description "Easily create zipped MaxScript (.mzp) packages"

And here’s the Regular Expression code I’m using to extract the variable/value pairs:


rx	 = dotNetClass "System.Text.RegularExpressions.RegEx"
pattern = "(name|description)\s+\"([^\"]+)\""
matches = rx.matches str pattern
for i = 0 to matches.count - 1 do print matches[i]

No matter what I do, I can only seem to extract the original text, not the parenthesized captures. What I would expect is something like:

#("name", "MaxScript Packager")
#("description", "Easily create zipped MaxScript (.mzp) packages")

Any help would be appreciated!
Dave

6 Replies

be afraid – be very afraid


 (
 	result = #()
 	str = "name \"MaxScript Packager\" name
description \"Easily create zipped MaxScript (.mzp) packages\""
 	rx	 = dotNetClass "System.Text.RegularExpressions.RegEx"
 	pattern  = "(?<1>name|description)\s+\"(?<2>[^\"]+)\""
 
 	m = rx.match str pattern
 	while m.Success do (
 		append result (
 			for i = 1 to (m.Groups.count - 1) collect (
 				m.groups.item[i].value
 			)
 		)
 		m = m.nextMatch()
 	)
 	result
 )
 

Edit: begone, ye devil print statement

How strange to use the Iterator pattern! And where the hell are people supposed to find that out?

On the plus side at least that code can be packaged up into something rather useful, so I thank you graciously, Sir!

You know, I often wonder how you know as much as you seem to. Are you actually a distant cousin of the Enigma Machine?

Thanks again,
Dave

And a nice tidy function for those following in my footsteps:

function regexMatch source pattern options: =
(
	local groups
	local results	= #()
	local rx		= dotNetClass "System.Text.RegularExpressions.RegEx"
	local matches	= if options == unsupplied then rx.match source pattern else rx.match source pattern options
	while matches.Success do
	(
		groups		= for i = 1 to matches.Groups.count - 1 collect matches.groups.item[i].value
		matches		= matches.nextMatch()
		append results groups
	)
	results
)
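
For anyone wanting to try it, here is a rough usage sketch. It assumes the regexMatch function above has already been evaluated, and it reuses the source text and pattern from the first post. The variable names (sourceText, pairs, ignoreCase) are only for illustration.

```maxscript
-- Usage sketch: assumes regexMatch() as posted above is already defined.
(
	local sourceText = "name \"MaxScript Packager\"\ndescription \"Easily create zipped MaxScript (.mzp) packages\""
	local pattern    = "(name|description)\s+\"([^\"]+)\""

	-- Without options: one sub-array of captured groups per match.
	local pairs = regexMatch sourceText pattern
	for p in pairs do format "% = %\n" p[1] p[2]

	-- With options: RegexOptions is a .NET enum, fetched through dotNetClass.
	local ignoreCase = (dotNetClass "System.Text.RegularExpressions.RegexOptions").IgnoreCase
	regexMatch sourceText "(NAME|DESCRIPTION)\s+\"([^\"]+)\"" options:ignoreCase
)
```

If the function behaves as posted, each element of the result should be a two-item array holding the variable name and its value, which is the shape asked for in the original question.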

just as an aside, you don't have to use rx.match's own iterator. You could have gone with rx.matches, if you wanted....

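	....
	matches = rx.matches source pattern
	for matchIdx = 0 to matches.count - 1 do
	(
		-- .NET collections are zero-based and are read through .item[]
		currentMatch = matches.item[matchIdx]
		-- group 0 is the whole match; the parenthesized captures start at 1
		for groupIdx = 0 to currentMatch.groups.count - 1 do
		(
			currentGroup = currentMatch.groups.item[groupIdx]
			append results currentGroup.value
		)
	)
	....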
    
but ZB's / your new way is nicer. The only problem with your original was that the MatchCollection and GroupCollection don't map straight to MXS arrays, so their entries have to be read through .item[i].

correct – it’s just nice to not have downstream code break due to a change in the regex pattern later on
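
One way to take that a step further is to give the groups names instead of numbers, so the downstream code asks for a capture by name and does not care where it sits in the pattern. This is only a sketch: the group names "key" and "value" are made up for the example, and it assumes MAXScript resolves the string indexer on the .NET GroupCollection the same way it resolves the integer one.

```maxscript
-- Sketch only: "key" and "value" are hypothetical group names.
(
	local rx      = dotNetClass "System.Text.RegularExpressions.Regex"
	local str     = "name \"MaxScript Packager\"\ndescription \"Easily create zipped MaxScript (.mzp) packages\""
	local pattern = "(?<key>name|description)\s+\"(?<value>[^\"]+)\""
	local result  = #()

	local m = rx.match str pattern
	while m.Success do
	(
		-- Look each capture up by name rather than by position
		-- (assumes the string overload of Groups.Item is reachable from MAXScript).
		append result #(m.groups.item["key"].value, m.groups.item["value"].value)
		m = m.nextMatch()
	)
	result
)
```

With named lookups, adding another parenthesized group to the pattern later on should not shift which captures the calling code picks up.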

bytheby, dave, I think your function is not defining rx? might be good to inline that in the function? (or have the user pass one as a parameter, I suppose)

Good spot! Had test code running in global scope…