[Closed] Micro-Challenge. #Split a String

Page 2 / 3 Prev Next

Sep 04, 2011 4:53 pm

improved memory footprint

fn mySplit str tok =
(
	local space = " "
	local tab = "	"
	local res = #()
	local tmpStr = ""
	local wasToken = false
	local lastChar = ""
	local emptyStr = ""
	local istoken	
	local tokens = for i = 1 to tok.count collect tok[i]
	for s in (for i = 1 to str.count collect str[i]) do
	(		
		if lastChar == s then tmpStr+=s
		else
		(
			if s == tab do s = space
			lastChar = s
			istoken = findItem tokens s > 0
			if wasToken or isToken then
			(
				if tmpStr!=emptyStr do append res tmpStr
				tmpStr = s
			)
			else 
			(
				tmpStr += s
			)
			wasToken = isToken
		)	
	)
	append res tmpStr
	res	
)

28mb

apparently the “” empty string literal is not a static value and takes up precious memory as well.

denisT

Sep 04, 2011 4:53 pm

thank you all for participating in this #challenge.

first of all, i want to answer the question “how can this be used in practice?”

these or similar functions are used for parsing text for compilation or interpretation.
very similar algorithms are used for parsing executable strings.

there are some other tasks where the algorithm might be used:

#1: Start any new word after a token with Capital Letter (for nice UI name for example)
– Sample: get_new_test-function –> Get New Test-Function

#2: Find “mirror” names in list:
– Sample:
source: “L_UpperArm”
targets: #( “UpperArm_R”, “UpperArm(Right)”, “UpperArm_Right”, “Right_UpperArm”, “R_UpperArm”, “R UpperArm”)

#3: Colorize string … http://forums.cgsociety.org/showpost.php?p=7097821&postcount=1
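
Task #1 from the list above could be sketched roughly like this. This is not code from the thread: the function name niceName, the default token set and the rule that "_" becomes a space are all assumptions made for illustration. It builds the result with +=, which is fine for short UI names but is exactly the kind of string work this thread shows to be costly on long strings.

```maxscript
-- Hypothetical helper (not from the thread): upper-case the first letter after
-- every token and turn "_" into a space; other token characters are kept.
fn niceName str tokens:"_- " =
(
	local res = ""
	local startOfWord = true
	for i = 1 to str.count do
	(
		local c = str[i]
		if findString tokens c != undefined then
		(
			-- token character: emit it (or a space for "_") and start a new word
			res += (if c == "_" then " " else c)
			startOfWord = true
		)
		else
		(
			-- ordinary character: capitalize only the first one of each word
			res += (if startOfWord then toUpper c else c)
			startOfWord = false
		)
	)
	res
)

niceName "get_new_test-function"  -- "Get New Test-Function"
```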

go back to the algorithms…

denisT

Sep 04, 2011 4:53 pm

here are my versions:

String version:


fn stringSplitString str tokens = 
(
	local cuts = #(), token = on
	local c -- previous char
	for i=1 to str.count do
	(
		if findstring tokens (s = str[i]) != undefined then
		(
			if c != s or not token do append cuts i
			token = on
		)
		else 
		(
			if token do append cuts i
			token = off
		)
		c = s
	)
	append cuts 0
	for k=1 to cuts.count-1 collect substring str cuts[k] (cuts[k+1]-cuts[k])  
)

and Stream version:


fn streamSplitString str tokens = 
(
	local cuts = #(), token = on 
	local c -- previous char
	ss = stringstream str
	while not eof ss do
	(
		s = readChar ss
		if findstring tokens s != undefined then
		(
			if c != s or not token do append cuts (filePos ss)
			token = on
		)
		else 
		(
			if token do append cuts (filePos ss)
			token = off
		)
		c = s
	)
	append cuts (filePos ss + 1)
	seek ss 0
	for k=1 to cuts.count-1 collect readChars ss (cuts[k+1]-cuts[k])  
)

denisT

Sep 04, 2011 4:53 pm

results that i have for my machine for 10000 iterations:

Panayot            1302ms   51152344L
TzMtN              1213ms   46828512L
lo                 1123ms   36752928L
denisT (string)    1035ms   32588448L
denisT (stream)    1093ms   34108136L

but these are results for short strings…

check the situation for a long string and fewer iterations (10):


 str = "--> Hello,	 << World >>!"
 tokens = "-<>! "
 
 stt = ""
 for k=1 to 1000 do stt += str
 
 splitString stt tokens

Panayot            2361ms   50981656L
TzMtN              4218ms   63543672L
lo                 1979ms   27547584L
denisT (string)    2799ms   30507176L
denisT (stream)    1038ms   30508552L
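
The thread does not show how the time and memory figures were taken. One common way to get numbers of this shape is to compare timeStamp() and heapFree before and after the loop. The sketch below assumes that approach; benchSplit and builtinSplit are hypothetical names, and the memory figure is only meaningful if no garbage collection happens during the run.

```maxscript
-- Hypothetical harness: splitFn is whichever split function is being measured.
fn benchSplit splitFn str tokens iterations =
(
	gc()                        -- start from a collected heap
	local t0 = timeStamp()      -- milliseconds
	local m0 = heapFree         -- free MAXScript heap, in bytes
	for i = 1 to iterations do splitFn str tokens
	format "% ms, % bytes\n" (timeStamp() - t0) (m0 - heapFree)
)

-- Stand-in so the sketch runs by itself; swap in any function from the thread.
fn builtinSplit s t = filterString s t

benchSplit builtinSplit "--> Hello, << World >>!" "-<>! " 10000
```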

denisT

Sep 04, 2011 4:53 pm

What do I want to show with this #challenge?

Any string operation in MXS is very expensive in memory use…
All built-in string operations are very slow for long strings…
Some simple tricks can make your function 10 times faster…

denisT

Sep 04, 2011 4:53 pm


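fn easySplitString str tokens =
(
	-- sx/ex are unprintable marker characters; assumed to be STX (2) and ETX (3)
	local sx = bit.intAsChar 2
	local ex = bit.intAsChar 3
	for i=1 to tokens.count do 
	(
		t = tokens[i]
		str = substitutestring str t (ex + t + ex)	-- wrap every token char in end markers
		str = substitutestring str (ex + ex) ""		-- assumed: merge runs of the same token
		str = substitutestring str ex sx		-- turn the remaining markers into delimiters
	)
	filterstring str sx
)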

it's an example where “Short” doesn’t mean “Good”

lo1

Sep 04, 2011 4:53 pm

I think that last one is really clever. The results (for short strings at least) are not bad.

denisT

Sep 04, 2011 4:53 pm

Reply to

lo1

 the real version that i used is:


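fn easySplitString str tokens =
(
	-- the marker characters are unprintable; assumed here to be STX (2) and ETX (3)
	local sx = bit.intAsChar 2	/* start of text symbol*/
	local ex = bit.intAsChar 3	/* end of text symbol*/
	local xx = ex + ex		-- assumed: two adjacent end markers, left between repeated tokens
	local em = ""

	for i=1 to tokens.count do 
	(
		t = tokens[i]
		sss = substitutestring str t (ex+t+ex)
		if sss.count != str.count do	-- skip tokens that do not occur in the string
		(	
			str = substitutestring sss xx em
			str = substitutestring str ex sx
		)
	)
	filterstring str sx
)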

i used it in a search algorithm for similar names in a scene

it gives for our test condition:
1146 ms
30181280L

lo1

Sep 04, 2011 4:53 pm

how do you produce the text symbols in maxscript editor?

denisT

Sep 04, 2011 4:53 pm

Reply to

lo1

bit.intaschar <code>, copy, and paste
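
As a small illustration of that answer: the character codes below are an assumption based on the "start of text" and "end of text" comments in the function above, which match ASCII STX (2) and ETX (3). Keeping the characters in variables also avoids pasting invisible symbols into the editor.

```maxscript
sx = bit.intAsChar 2   -- ASCII STX, "start of text" (assumed code)
ex = bit.intAsChar 3   -- ASCII ETX, "end of text" (assumed code)

bit.charAsInt sx       -- 2, converts the character back to its code
```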

lo1

Sep 05, 2011 5:19 pm

oh, duh

Panayot

Sep 05, 2011 5:19 pm

Oh, this challenge ended so fast… no time to react. Now if I say that I had an idea similar to Denis's, it would sound speculative. (I mean the last function for short strings, as I saw this approach somewhere on programming forums.) With my first question I tried to get more info, to know whether we're after a function for a concrete purpose or we're in the next Denis lesson.

Please don't get me wrong, Denis. I respect your knowledge, I like your coding style and learn a lot from you, and I also like your teaching style: how you start, lead and close your subjects. You do it in a manner well proven since ancient times, a style used by philosophers, religious gurus, etc.; they all start a question with some ruse (usually a non-strict definition).

Do not worry, nothing bad here. I just think you are at a level of knowledge (and state of spirit) where you need students. Yep, at the moment you fill that need in the forum, and we all gain something by the way, but maybe it is time to think about the next step (make a course, write a book)? Just my friendly thoughts. Sometimes I need to say what I think, and this is my weakness.

Regards,
Panayot
