[Closed] better MemStream.readblock
Hi,
I had some trouble reading code blocks that start with ‘(’ and end with ‘)’ using MemStream.readblock when the ‘(’ or ‘)’ is “touching” other characters, as in this example:
(a = b)
The problem gets even worse with nested blocks, because it is not enough for the outer block to be properly separated; all the nested blocks must be separated too. So I wrote my own function:
(
    fn readBlock ms =
    (
        local token = ms.readChar()
        while not ms.eos() and not matchPattern token pattern:"(*" do
            token = ms.readToken()
        local n = 1
        local blockString = token
        while not ms.eos() and n > 0 do (
            local c = ms.readChar()
            blockString += c
            if c == "(" then
                n += 1
            else if c == ")" then
                n -= 1
        )
        if n > 0 then
            blockString = undefined
        blockString
    )

    local testString = "test = (test (a=b) and (b=c) do something) dsad"
    local ms = MemStreamMgr.openString testString
    if ms != undefined then (
        local b = readBlock ms
        MemStreamMgr.close ms
        print b
    )
)
It works a bit differently, which suits my case better: it only looks for ‘()’ blocks, and the stream doesn’t have to be positioned exactly at the beginning of the block.
The main problem with my function is efficiency. Can anyone give me some ideas on how to make it faster?
What’s killing you is building your string character by character. Because strings are immutable, every append builds a whole new string.
Here’s a more efficient implementation:
fn readBlock2 str =
(
    local ms = MemStreamMgr.openString str
    if ms == undefined do return undefined
    local start = findString str "("
    if start == undefined do
    (
        MemStreamMgr.Close ms
        return undefined
    )
    ms.seek start #seek_set
    local p1 = "("
    local p2 = ")"
    local n = 1
    while not ms.eos() and n > 0 do (
        local c = ms.readChar()
        if c == p1 then
            n += 1
        else if c == p2 do
            n -= 1
    )
    local end = if n > 0 then -1 else ms.pos()
    MemStreamMgr.Close ms
    if end == -1 then undefined else
        subString str start (end - start + 1)
)
It uses the memstream only to read the string and record the start and end positions, and then uses a single subString call to extract the data.
The savings in both processing time and memory grow as the data gets larger.
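For anyone who wants to try the idea outside 3ds Max, here is a minimal Python sketch of the same approach: scan once to find where the block starts and ends, then take a single slice. It is not MAXScript and does not use MemStream, and the name read_block is made up for this illustration.

```python
def read_block(text):
    """Return the first balanced '(...)' block in text, or None."""
    start = text.find("(")
    if start == -1:
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                # One slice at the end instead of one new string per character.
                return text[start:i + 1]
    return None  # input ended before the block was closed


print(read_block("test = (test (a=b) and (b=c) do something) dsad"))
```

Only two integers are tracked while scanning, so the amount of copying no longer depends on how many characters the block contains.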
Thanks dude, I didn’t know that was my bottleneck. The problem with your solution for me is that it needs the original string, and I’m working with a MemStream built from an external file. Too bad there is no substring method for MemStreams.
I would do it using StringStream and readDelimitedString or skipToString.
If you are parsing a file, use FileStream.
Not a problem! Just accumulate into a StringStream instead of a string.
Also cache the “(” and “)” in local variables. You’d be surprised how inefficient MAXScript is in this regard.
fn readBlock ms =
(
    local token = ms.readChar()
    while not ms.eos() and not matchPattern token pattern:"(*" do
        token = ms.readToken()
    local ss = stringStream ""
    format token to:ss
    local n = 1
    local p1 = "("
    local p2 = ")"
    while not ms.eos() and n > 0 do (
        local c = ms.readChar()
        format c to:ss
        if c == p1 then
            n += 1
        else if c == p2 then
            n -= 1
    )
    if n > 0 then undefined else ss as string
)
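The same accumulate-into-a-stream idea, sketched in Python for illustration only. Here io.StringIO stands in for the StringStream, and the input is read one character at a time from a stream, so the original string is never needed. The function name is hypothetical, and unlike the MAXScript version above it starts at the first "(" character rather than at a token.

```python
import io


def read_block_from_stream(stream):
    """Read from a text stream up to the end of the first balanced '(...)' block."""
    # Skip ahead to the first opening parenthesis.
    ch = stream.read(1)
    while ch and ch != "(":
        ch = stream.read(1)
    if not ch:
        return None  # no "(" anywhere in the stream

    out = io.StringIO()  # growable buffer, written to instead of concatenating
    out.write(ch)
    depth = 1
    while depth > 0:
        ch = stream.read(1)
        if not ch:
            return None  # stream ended with the block still open
        out.write(ch)
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
    return out.getvalue()  # build the result string once, at the end
```

After the call the stream is positioned just past the closing parenthesis, so the caller can keep reading from there.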
I finally got to try this out, and it gave me a big performance boost, so thanks for all the help! For now I will stay with the MemStream interface because it helps me in other parts of the script. I also added handling for quotes and comments, so I’m posting the updated function in case anyone ever finds it useful.
fn readBlock ms =
(
    -- cached single-character strings used in the comparisons below
    local p1 = "("
    local p2 = ")"
    local s1 = "\""  -- string delimiter
    local s2 = "\\"  -- escape character
    local s3 = "-"   -- two in a row start a line comment
    local s4 = "/"   -- with s5, opens and closes a block comment
    local s5 = "*"
    local s6 = "\n"  -- a newline ends a line comment
    local token = ms.readChar()
    while not ms.eos() and not token[1] == p1 do
        token = ms.readToken()
    local ss = stringStream ""
    format token to:ss
    local n = 1
    local inString = false
    local inComment1 = false
    local inComment2 = false
    local last1 = ""
    local last2 = ""
    while not ms.eos() and n > 0 do (
        local c = ms.readChar()
        format "%" c to:ss
        if not inString and not inComment1 and not inComment2 then (
            case c of (
                s1: if last1 != s2 or last2 == s2 then inString = true
                s3: if last1 == s3 then inComment1 = true
                s5: if last1 == s4 then inComment2 = true
                p1: n += 1
                p2: n -= 1
            )
        ) else (
            case c of (
                s1: if inString and last1 != s2 or last2 == s2 then inString = false
                s6: inComment1 = false
                s4: if inComment2 and last1 == s5 then inComment2 = false
            )
        )
        last2 = last1
        last1 = c
    )
    if n > 0 then undefined else ss as string
)
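A rough Python sketch of the same scanning logic, written with one explicit state variable instead of three flags plus the last two characters. It assumes the syntax the function above handles: double-quoted strings with backslash escapes, -- line comments, and /* */ block comments. The name read_block_aware is hypothetical, and the sketch works on a plain string rather than a MemStream.

```python
def read_block_aware(text):
    """Return the first balanced '(...)' block, ignoring parentheses
    that appear inside string literals or comments. None if not found."""
    start = None
    depth = 0
    state = "code"  # "code", "string", "line_comment" or "block_comment"
    i = 0
    while i < len(text):
        ch = text[i]
        pair = text[i:i + 2]
        if state == "code":
            if ch == '"':
                state = "string"
            elif pair == "--":
                state = "line_comment"
                i += 1  # consume the second "-"
            elif pair == "/*":
                state = "block_comment"
                i += 1  # consume the "*"
            elif ch == "(":
                if depth == 0:
                    start = i
                depth += 1
            elif ch == ")" and depth > 0:
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
        elif state == "string":
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                state = "code"
        elif state == "line_comment":
            if ch == "\n":
                state = "code"
        elif state == "block_comment":
            if pair == "*/":
                state = "code"
                i += 1  # consume the "/"
        i += 1
    return None  # input ended before the block was closed
```

Keeping a single state makes it harder to end up in two modes at once, for example a quote character inside a comment flipping the string flag.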
Thanks a lot!
I will try both directions and see how it goes. The problem with using StringStream is that this is just one part of a bigger system that uses other MemStream features, like the readblock function, which plays a big role in a different part. I will try to convert it all and see whether the pros outweigh the cons. Too bad I won’t be in the office for the next 3 weeks, so my update on this will come eventually…
Thanks again this really helps!
I thought that MemStream.readblock was exactly the problem. I can’t believe that a function that doesn’t work can play a big role. Maybe a negative one… but it might be fixed.
I compared the performance of MemStream, StringStream, and FileStream… today I don’t see any difference. Probably all three streams stay in memory all the time now, so the only reason to prefer one stream type over another is its specific functions.
Glad it worked out.
Btw, this might be overkill – depending on the position of the first ‘(’ in the stream, but caching
pattern:"(*"
might also help.
I want to allow cases like this:
fn foo = (print "foo")
but ignore cases like that:
fn foo arg:(random 1 10 > 5) = (print (if arg then "A" else "B"))
so my blocks must start after a white space.
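To make that rule concrete, here is a small Python sketch of it: a block may only start at a "(" that sits at the very beginning of the text or directly after whitespace. The function name find_block_start is made up for this illustration and is not part of any MAXScript or MemStream API.

```python
def find_block_start(text):
    """Index of the first '(' at the start of text or preceded by
    whitespace, or -1 if there is none."""
    for i, ch in enumerate(text):
        if ch == "(" and (i == 0 or text[i - 1].isspace()):
            return i
    return -1


line = 'fn foo arg:(random 1 10 > 5) = (print (if arg then "A" else "B"))'
print(line[find_block_start(line):])
```

In the second example from this post, the "(" after arg: is skipped because it follows a colon, and the scan stops at the "(" that follows the "= ".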
I’m not sure what you mean. I just meant you could do this:
local pat = "(*"
while not ms.eos() and not matchPattern token pattern:pat do
    token = ms.readToken()
Like you did with the other symbols.
Got you now.
I changed it to something even simpler:
while not ms.eos() and not token[1] == p1 do
Thanks again.