OK, back to the beginning, and it gets worse.
As I mentioned earlier, I found this “slow down” issue while working with real code and real models. The drop in performance was easily noticeable by eye, so that made me start measuring the performance.
To my surprise, things were running not 1.5x slower in Max 2017 but over 2x slower, and it happened with about 30-40 different pieces of code.
As I was using modifiers, callbacks, etc., it was really hard to profile the code, as I was unsure what was causing the slowdown; it could be anything when you have 10K lines of code. So I managed to reduce the problem to a single empty loop. The performance hit was not as bad as with real code, but it was enough to make me look further.
As stated before, it is logical to think that any overhead in the iterator itself would be diluted when doing real work inside the loop. For example, if we have 200 ms of overhead across 1 million iterations, doing some work that takes 10 seconds would “dissolve” those 200 ms and make them unnoticeable compared with the 10 seconds.
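As a sketch of that reasoning (the loop body and the expected proportions are illustrative assumptions, not measurements):

```maxscript
(
	-- pure iterator overhead: an empty loop
	local st = timestamp()
	for i = 1 to 1000000 do ()
	format "empty loop: %ms\n" (timestamp() - st)

	-- the same loop with some work in the body; the iterator overhead
	-- is still there but should become a small fraction of the total
	st = timestamp()
	for i = 1 to 1000000 do (sqrt (i as float))
	format "with work: %ms\n" (timestamp() - st)
)
```

If the first timing is ~200 ms and the work itself dominates the second, the iterator overhead all but disappears in the total; the point of this thread is that in Max 2017 it does not.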
Although I already had evidence that this is not always the case, I wanted the developers' word in order to understand whether this was caused by a bug or a feature. Unfortunately, according to the developers, this is not a bug.
I hope you trust me when I say that the following code was not created intentionally to make Max 2017 fail compared with previous versions. It is based on real code that, in fact, performs better than this example.
I was so surprised by the results that it took me some time to test it in all possible ways before posting, but no matter what I do, I get the same results.
These tests were performed on a fresh Max startup for every version and are the average of 3 consecutive runs (1 run should be enough, though).
The drop in performance in Max 2017 compared with any previous version tested is over 360%.
There is also over 1GB of memory leaked in any Max version, including Max 2017.
I won’t claim this happens on all machines or that it is conclusive, but these are real results from real code. Try it for yourself before drawing any conclusions.
I hope someone finds this useful.
(
	fn GetElements node =
	(
		local GetFaceVerts = polyop.getfaceverts
		local faces = #{1..node.numfaces}
		local verts = #{1..node.numverts}
		local geoverts = for v in verts collect #()
		for f in faces do append geoverts (GetFaceVerts node f)
		local elementsVerts = #()
		for face in faces do
		(
			farray = #(face)
			varray = #{}
			for f in farray where faces[f] do
			(
				for idx in (GetFaceVerts node f) do
				(
					if geoverts[idx] != undefined do
					(
						join farray geoverts[idx]
						append varray idx
						geoverts[idx] = undefined
					)
				)
				faces[f] = false
			)
			append elementsVerts (varray as bitarray)
		)
		return elementsVerts
	)
	delete objects
	obj = converttopoly (plane length:100 width:100 lengthsegs:50 widthsegs:50)
	polyop.breakverts obj #all
	gc(); st = timestamp(); sh = heapfree
	sysRAM = (sysinfo.getSystemMemoryInfo())[3]
	GetVertPos = polyop.getvert
	SetVertPos = polyop.setvert
	elements = GetElements obj
	undo "Randomize" on
	(
		for j in elements do
		(
			center = [0,0,0]
			for i in j do center += GetVertPos obj i
			tm = matrix3 1
			translate tm center
			prerotatez tm (random 0 180)
			pretranslate tm -center
			for i in j do
			(
				pos = GetVertPos obj i
				pos *= tm
				SetVertPos obj i pos -- This leaks over 1GB of RAM
			)
		)
	)
	format "3DS MAX %\n" (((maxversion())[1]/1000)+1998)
	format "time:%ms heap:% RAM:%MB\n" (timestamp()-st) (sh-heapfree) ((sysRAM-(sysinfo.getSystemMemoryInfo())[3]) / 1024.^2)
)
3DS MAX 2011
time:1429ms heap:3471592L RAM:1138.97MB
3DS MAX 2014
time:1360ms heap:3460208L RAM:1147.47MB
3DS MAX 2016
time:1312ms heap:3499760L RAM:1139.0MB
3DS MAX 2017
time:5015ms heap:3704136L RAM:1138.08MB
If you run this test, please publish the results so we can all see whether this happens in other scenarios as well and not just on my end.
This sounds like exactly my situation and thought process too!
Here are my results:
3DS MAX 2017
time:5077ms heap:3485764L RAM:1146.23MB
3DS MAX 2016
time:1415ms heap:3321464L RAM:1140.9MB
Have you broken this code down to identify if the slowdown is specific to a particular loop or operation?
I still don’t think the problem is related specifically to loop operations. We just associate it with loops because that’s the simplest way to run many iterations.
I’m pretty sure that the code:
(
	a = [0,0,0]
	( a += 1 )
	( a += 1 )
	( a += 1 )
	( a += 1 )
	-- ...the same block repeated another 1000 times
	( a += 1 )
)
will be slower in 2017.
My first hypothesis was that this was caused by the iterator.
I thought there was a gc() kicking in during the loop that caused the 50% slowdown, but Denis suggested it is due to a scope change, and I think he is right. This situation is obviously emphasized in loops, as nobody will unroll a loop just to make the code run in Max 2017 the way it does in previous versions.
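To make the scope hypothesis concrete, here is a minimal sketch (the variable names are mine, and whether a gap shows up depends on the Max version) comparing a loop body that is a single expression against a parenthesized body that opens a scope and declares a fresh local each pass:

```maxscript
(
	local x
	local st = timestamp()
	for i = 1 to 1000000 do x = i             -- single-expression body: assigns to an outer local
	format "no block: %ms\n" (timestamp() - st)

	st = timestamp()
	for i = 1 to 1000000 do ( local y = i )   -- block body: opens a scope with its own local each pass
	format "block with local: %ms\n" (timestamp() - st)
)
```

If the scope-change explanation is right, the second timing should show a noticeably larger regression in Max 2017 than the first.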
However, there is a worse problem that drops the performance in Max 2017 immensely.
Thanks to Vojtech’s suggestion, I started looking at where exactly the problem is, and I found an interesting situation.
The biggest problem does seem to be iterating over bitarrays.
How bad is the performance drop in Max 2017 in these situations? Bad enough, so far.
I wouldn’t have found this myself if it weren’t for the help of Denis and Vojtech, so thank you so much, guys, for your contributions and knowledge.
Is this a new feature or a bug? You tell me.
(
	st = timestamp(); sh = heapfree
	elements = for j = 1 to 10000 collect #{j}
	for j in elements do
	(
		for i in j do ()
	)
	format "time:% ram:%\n" (timestamp()-st) (sh-heapfree)
)
MAX 2016
time:261 ram:663624L
MAX 2017
time:6540 ram:680204L
I can confirm that the code above is written in very good style with good practices.
Any optimization of this code can only be done with tricks, which shouldn’t be normal practice.
I think it’s because of a change in how values are handled in different scopes.
Just guessing, but let’s play a game: what results do you get if you replace the line
for i in j do center += GetVertPos obj i
with a while loop, something along the lines of
k = j as array
count = 0
while count < k.count do center += GetVertPos obj k[count+=1]
It adds a bitarray-to-array conversion (which might or might not make it perform even worse) and some extra checking, but it doesn’t open a new scope, which seems to be the main issue here. I don’t have a Max machine to test on right now.
To answer my own question: in Jorge’s code, the while loop, even with the added overhead of the array conversion, is 1.5x as fast as the for loop (so still about 2x slower than the for loop in 2016). And to comment on what denisT suggested:
(
	local stream = stringStream ""
	local iter = "( a += 1 )\n"
	append stream "(\na = [0,0,0]\n"
	for i = 1 to 10000 do append stream iter
	append stream ")\n"
	t = timeStamp()
	execute (stream as string)
	format "time: %\n" (timeStamp() - t)
)
2016: 59 ms
2017: 83 ms
IMHO, if there were a for-loop syntax that used a predeclared iterator/loop variable and didn’t open a new scope, it would be at worst 1.5x slower no matter what you do in the loop body and how many loops you nest.
And with these changes (eliminating the conversion overhead in 2017):
varray = #()
...
count = 0
while count < j.count do center += GetVertPos obj j[count+=1]
3DS MAX 2016
time:1236ms heap:2643720L RAM:1165.67MB
3DS MAX 2017
time:1355ms heap:3033904L RAM:992.391MB
For me, even without the last example (which proves my thoughts very well), it was not a bug. It’s much worse: it’s a BAD feature. And it will be very hard to fix, because the developers would have to admit their mistake.
Yes, this is the thing. It’s not nested loops, as I thought, and it’s not passing a variable from one scope to another; it’s passing an iterable variable from one loop to another. Great find, Jorge!
No, it’s partly nested loops (and it might get worse when a variable from a different scope gets assigned to), but this is a whole different situation specific to bitarrays; see for example:
(
	st = timestamp(); sh = heapfree
	local almostEmptyBitarray = #{10000}
	for i = 1 to 10000 do
		for j in almostEmptyBitarray do ()
	format "time:% ram:%\n" (timestamp()-st) (sh-heapfree)
)
Max 2016 time:701 ram:184L
Max 2017 time:8996 ram:196L
Could anyone do this test:
a = #{1..1000}
for k in a do for i in a do ()
PS: it would be more interesting to change `a` somehow… let’s say:
a = #{1..1000}
for k in a do for i in a do (a[i] = on)
Just this loop, as it is, not nested?
2016: 187 ms
2017: 298 ms
With changing it:
2016: 292 ms
2017: 480 ms
It only gets extremely bad when it happens in the inner loop.
WOW! That’s a pretty commonly used code combination.
But this is different from Jorge’s finding.
Well, yeah; say you have a 1M-vert mesh and do ONLY 100 loops of something, so let’s have a look:
(
	st = timestamp(); sh = heapfree
	local emptyBitarray = #{}
	emptyBitarray.count = 1000000 -- so that it's like the bitarray returned from a 1M-vert mesh,
	                              -- for example the selected verts that you get once at the beginning
	for i = 1 to 100 do
		for j in emptyBitarray do ()
	format "time:% ram:%\n" (timestamp()-st) (sh-heapfree)
)
The result is exactly the same as before, i.e. a bit more than half a second in 2016 vs. eight seconds in 2017.
Don’t worry, those results are from testing in local scope; it’s a bit slower in global scope, but not by much.
Ahh, my bad, I forgot to put it in local scope:
( local a = #{1..1000}; for k in a do for i in a do (a[i] = on))