[Closed] better MemStream.readblock

Hi,
I had some trouble reading code blocks that start with a ‘(’ and end with a ‘)’ using MemStream.readblock when the ‘(’ or ‘)’ is “touching” other characters, as in this case:

(a = b)

The problem gets even worse with nested blocks, because it’s not enough for the outer block to be properly separated; all the nested blocks must be separated too. So I wrote my own function:

(
	fn readBlock ms =
	(
		local token = ms.readChar()
		while not ms.eos() and not matchPattern token pattern:"(*" do
			token = ms.readToken()
		
		local n = 1
		local blockString = token
		while not ms.eos() and n > 0 do (
			local c = ms.readChar()
			blockString += c
			
			if c == "(" then
				n += 1
			else if c == ")" then
				n -= 1
		)
		
		if n > 0 then
			blockString = undefined
		
		blockString
	)
	
	local testString = "test = (test (a=b) and (b=c) do something) dsad"
	
	local ms = MemStreamMgr.openString testString
	if ms != undefined then (
		local b = readBlock ms
		MemStreamMgr.close ms
		print b
	)
)

It works a bit differently, which suits my case better: it only looks for blocks of type ‘()’, and the stream doesn’t have to be positioned exactly at the beginning of the block.
The main problem with my function is its efficiency. Can anyone give me some ideas on how to make it faster?

 lo1

What’s killing you is building your string character by character. Because += does not modify the string in place, every iteration builds a whole new string.

Here’s a more efficient implementation:

fn readBlock2 str =
(
	local ms = MemStreamMgr.openString str
	if ms == undefined do return undefined
	
	local start = findString str "("
	if start == undefined do
	(
		MemStreamMgr.Close ms
		return undefined
	)
			
	ms.seek start #seek_set 
	
	local p1 = "("
	local p2 = ")"
	
	local n = 1
	while not ms.eos() and n > 0 do (
		local c = ms.readChar()
				
		if c == p1 then
			n += 1
		else if c == p2 then
			n -= 1
	)
	
	local end = if n > 0 then -1 else ms.pos()
	MemStreamMgr.Close ms
	if end == -1 then undefined else
		subString str start (end - start + 1)
)

It uses the MemStream only to scan the string and record the start and end positions, and then extracts the block with a single subString call.
The savings in both processing time and memory grow with the size of the data.
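
The position-recording idea does not strictly need a stream. As a minimal sketch, assuming the text is already available as a plain string, the same depth counting can run over string indices and finish with one subString call. The name readBlockFromString is hypothetical and not part of the code posted in this thread.

```maxscript
(
	-- Hypothetical helper: returns the first balanced "(...)" block in str,
	-- or undefined if there is no "(" or the block is never closed.
	fn readBlockFromString str =
	(
		local p1 = "("
		local p2 = ")"
		local start = findString str p1   -- 1-based index, or undefined
		if start == undefined then undefined
		else
		(
			local depth = 1
			local i = start
			local len = str.count
			-- walk forward, tracking nesting depth only (no string building)
			while i < len and depth > 0 do
			(
				i += 1
				local c = str[i]
				if c == p1 then depth += 1
				else if c == p2 then depth -= 1
			)
			-- a single subString call extracts the whole block
			if depth > 0 then undefined else subString str start (i - start + 1)
		)
	)

	local testString = "test = (test (a=b) and (b=c) do something) dsad"
	-- expected block: (test (a=b) and (b=c) do something)
	print (readBlockFromString testString)
)
```

Indexing with str[i] still creates a one-character string per step, so this is meant to illustrate the approach rather than to be the fastest possible variant.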

Thanks dude, I didn’t know that was my bottleneck. The problem with your solution is that it needs the original string, and I’m working with a MemStream built from an external file. Too bad there is no substring method for MemStreams.

I would do it using StringStream and readDelimitedString or skipToString.
If you are parsing a file, use FileStream.

(@denist)

It’s a single function for StringStream (or FileStream): skipToString.
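
A rough sketch of how that suggestion might look, assuming the input is a StringStream (or FileStream). skipToString only finds the first "(" and does not track nesting by itself, so this sketch still counts parentheses character by character afterwards. The name readBlockSS is hypothetical.

```maxscript
(
	-- Hypothetical helper: reads the first balanced "(...)" block from a
	-- StringStream or FileStream, or returns undefined if none is found.
	fn readBlockSS ss =
	(
		local p1 = "("
		local p2 = ")"
		-- skipToString positions the stream just after the match and
		-- returns undefined if the end of the stream is reached first
		if (skipToString ss p1) == undefined then undefined
		else
		(
			local result = stringStream ""
			format "%" p1 to:result
			local depth = 1
			while not (eof ss) and depth > 0 do
			(
				local c = readChar ss
				format "%" c to:result
				if c == p1 then depth += 1
				else if c == p2 then depth -= 1
			)
			if depth > 0 then undefined else (result as string)
		)
	)

	local ss = stringStream "test = (test (a=b) and (b=c) do something) dsad"
	print (readBlockSS ss)
)
```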

 lo1
(@lo1)

FileStream would be better of course. Does the function have to receive a memstream?

 lo1

Not a problem! Just use a StringStream instead of a string.
Also cache the “(” and “)” literals in local variables. You’d be surprised how inefficient MAXScript is in this regard.

fn readBlock ms =
(
	local token = ms.readChar()
	while not ms.eos() and not matchPattern token pattern:"(*" do
		token = ms.readToken()
	
	local ss = stringStream ""
	-- pass the text as an argument so a "%" in it is not treated as a placeholder
	format "%" token to:ss
	local n = 1
	local p1 = "("
	local p2 = ")"
	
	while not ms.eos() and n > 0 do (
		local c = ms.readChar()
		format "%" c to:ss
		
		if c == p1 then
			n += 1
		else if c == p2 then
			n -= 1
	)
	
	if n > 0 then undefined else ss as string
)
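
To get a feel for the difference on a given machine, a rough timing sketch like the following can help. The iteration count is an arbitrary assumption, and the actual numbers will vary with the 3ds Max version and hardware.

```maxscript
(
	local n = 20000  -- arbitrary workload size, chosen only for illustration

	-- Concatenation: each += produces a new string holding everything so far
	local t0 = timeStamp()
	local s = ""
	for i = 1 to n do s += "x"
	format "string concatenation: % ms\n" (timeStamp() - t0)

	-- StringStream: characters are appended to the stream, and the
	-- result is converted to a string only once at the end
	local t1 = timeStamp()
	local ss = stringStream ""
	for i = 1 to n do format "%" "x" to:ss
	local s2 = ss as string
	format "stringStream: % ms\n" (timeStamp() - t1)
)
```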

(@denist)

I would use the StringStream instead of the MemStream.

(@matanh)

I finally got to try this out, and it gave me a big performance boost, so thanks for all the help! For now I will stay with the MemStream interface because it helps me in other parts of the script. I also added handling for quotes and comments, so I’m posting the updated function in case anyone in the far future ever finds it useful.

fn readBlock ms =
(
	local p1 = "("
	local p2 = ")"
	local s1 = "\""
	local s2 = "\\"
	local s3 = "-"
	local s4 = "/"
	local s5 = "*"
	local s6 = "\n"
	
	local token = ms.readChar()
	while not ms.eos() and not token[1] == p1 do
		token = ms.readToken()
	
	local ss = stringStream ""
	format "%" token to:ss
	local n = 1
	
	local inString = false
	local inComment1 = false
	local inComment2 = false
	local last1 = ""
	local last2 = ""
	
	while not ms.eos() and n > 0 do (
		local c = ms.readChar()
		format "%" c to:ss
		
		if not inString and not inComment1 and not inComment2 then (
			case c of (
				s1: if last1 != s2 or last2 == s2 then inString = true
				s3: if last1 == s3 then inComment1 = true
				s5: if last1 == s4 then inComment2 = true
				p1: n += 1
				p2: n -= 1
			)
		) else (
			case c of (
				s1: if inString and last1 != s2 or last2 == s2 then inString = false
				s6: inComment1 = false
				s4: if inComment2 and last1 == s5 then inComment2 = false
			)
		)
		
		last2 = last1
		last1 = c
	)
	
	if n > 0 then undefined else ss as string
)

Thanks a lot!
I will try both directions and see how it goes. The problem with using StringStream is that this is just one part of a bigger system that uses other MemStream features, like the readblock function, which plays a big role in a different part. I will try to convert it all and see whether the pros outweigh the cons. Too bad I won’t be in the office for the next 3 weeks, so my update on this will come eventually…
Thanks again this really helps!

I thought that MemStream.readblock was exactly the problem. I can’t believe that a function that doesn’t work can play a big role. Maybe a negative one… but it might be fixed.

I compared the performance of MemStream, StringStream, and FileStream… today I don’t see any difference. Probably all three streams stay in memory all the time now. So the only reason to prefer one stream type over another is its specific functions.

 lo1

Glad it worked out.

Btw, this might be overkill – depending on the position of the first ‘(’ in the stream, but caching

pattern:"(*"

might also help.

(@matanh)

I want to allow cases like this:

fn foo = (print "foo")

but ignore cases like that:

fn foo arg:(random 1 10 > 5) = (print (if arg then "A" else "B"))

so my blocks must start after a white space.

 lo1

I’m not sure what you mean. I just meant you could do this:

local pat = "(*"
while not ms.eos() and not matchPattern token pattern:pat do
		token = ms.readToken()

Like you did with the other symbols.

(@matanh)

got you now,
fixed it to something even simpler:

while not ms.eos() and not token[1] == p1 do

Thanks again.
