Still very nice!
I think the slowdown in my general approach is that I now have to call a function for every iteration. I’m looking into turning the function into a string and ‘baking’ a proper for loop with all the code embedded… stay tuned.
gc()
seed(timestamp())
mem = heapfree
start = timestamp()
--------------
-- Multi threaded for-loop:
--
-- syntax equivalent: for i=<start> to <end> do ( <func> i ), spread over <cnt> threads and when completed call the function <comp>
--
-- Jonathan de blok - www.jdbgraphics.nl
-------------
fn multiloop start end cnt func comp=
(
if (cnt=="auto") do ( cnt=sysInfo.cpucount )
cb="complete=0; fn callback a=( complete=complete+1 ; if (complete=="+(cnt as string)+") do ( "+comp+"(); ) )"
execute cb
end=end+1
chunk= (float end- float start)/cnt*1.0
st=start
end=chunk
complete=0
trs=#()
for i=1 to cnt do
(
trs[i]=dotnetobject "System.ComponentModel.BackgroundWorker"
n=i as string
run = "fn trfunc_"+n+" = ( "+func+" "+(int st as string)+" "+(int end as string)+" "+n+" ); "
run=run+"dotNet.addEventHandler trs["+n+"] \"RunWorkerCompleted\" callback; "
run=run+"dotNet.addEventHandler trs["+n+"] \"DoWork\" trfunc_"+n
execute run
trs[i].RunWorkerAsync()
st=st+chunk
end=end+chunk
)
)
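For reference, the range-splitting that multiloop does with float chunks can also be done with exact integer arithmetic, which avoids any rounding at the chunk boundaries. A minimal Python sketch of the same idea (the function name is made up, not part of the MaxScript code above):

```python
def chunk_ranges(start, end, cnt):
    """Split the inclusive range start..end into cnt contiguous sub-ranges.

    Mirrors the multiloop logic: each worker gets its own (st, en) slice,
    and together the slices cover the whole range with no gaps or overlap.
    """
    total = end - start + 1
    base, extra = divmod(total, cnt)
    ranges = []
    st = start
    for i in range(cnt):
        # the first `extra` chunks absorb the remainder when cnt doesn't divide total
        size = base + (1 if i < extra else 0)
        ranges.append((st, st + size - 1))
        st += size
    return ranges
```

For example, `chunk_ranges(1, 10, 3)` gives `[(1, 4), (5, 7), (8, 10)]`, and `chunk_ranges(1, 1000000, 8)` gives eight slices of 125000 elements each.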
--------------
nr=1000000
results = for i = 1 to nr collect i
-- This function is called by every thread, <start> and <end> is where you want to start and end your for loop iteration.
fn shuffleArray start end thread=
(
for i=start to end do
(
ind=random 1 nr
tmp=results[ind]
results[ind]=results[i]
results[i]=tmp
)
)
--This gets called when all threads have completed
fn CompleteEvent2 =
(
end = timestamp()
format "processing: % seconds heap: %\n" ((end - start) / 1000.0) (mem-heapfree)
)
-- do a test run:
multiloop 1 nr "auto" "shuffleArray" "CompleteEvent2"
Made a small change to the generic multi-threading tool. Instead of calling a function for every iteration, a function now gets called with a <start> and <end> value and has to handle the iteration by itself; see the ‘shuffleArray’ function. Also added the ‘auto’ option, which picks the number of threads based on the available CPU cores.
To compare apples to apples I’ve run Kameleon’s code on my system as well:
My generic code running on 8 threads:
processing: 2.405 seconds heap: 56189672L
Kameleon’s dedicated code running on 2 threads:
processing: 2.465 seconds heap: 56127896L
Edit… hmm, speed doesn’t change much when trying different numbers of cores, but CPU usage does: 13% for 1 thread, around 40% when doing 8. The timings stay about the same.
There is a much more elegant solution for this. You can call RunWorkerAsync with a user value (any value) that the thread will have access to.
Example:
fn dowork s e =
(
local myUserValue = e.argument
)
--CREATE WORKER
theworker.runworkerasync someValue
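The user-value idea maps directly onto thread-pool APIs in other languages as well. A minimal Python sketch of passing a per-worker payload instead of baking values into generated code (the names and the summing work are illustrative, not part of the MaxScript API):

```python
from concurrent.futures import ThreadPoolExecutor

def do_work(args):
    # args plays the role of e.Argument: one payload tuple per worker
    st, en, thread_id = args
    return sum(range(st, en + 1))  # stand-in for the real per-chunk work

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(do_work, (1, 50, 1)),
               pool.submit(do_work, (51, 100, 2))]
    results = [f.result() for f in futures]
```

Each worker reads its own start/end out of the payload, so no per-thread function has to be generated with `execute`.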
Thanks, much cleaner now, and it shaved a bit off the timing as well.
processing: 2.374 seconds heap: 56169184L
What I don’t get is that the number of threads doesn’t seem to affect the timings a lot…
gc()
seed(timestamp())
mem = heapfree
starts = timestamp()
--------------
-- Multi threaded for-loop:
--
-- syntax equivalent: for i=<start> to <end> do ( <func> i ), spread over <cnt> threads and when completed call the function <comp>
--
-- Jonathan de blok - www.jdbgraphics.nl
-------------
fn multiloop start end cnt func comp=
(
if (cnt=="auto") do ( cnt=sysInfo.cpucount )
cb="complete=0; fn callback a=( complete=complete+1 ; if (complete=="+(cnt as string)+") do ( "+comp+"(); ) )"
execute cb
end=end+1
chunk= (float end- float start)/cnt*1.0
st=start
end=chunk
complete=0
trs=#()
for i=1 to cnt do
(
trs[i] = dotnetobject "System.ComponentModel.BackgroundWorker"
dotNet.addEventHandler trs[i] "RunWorkerCompleted" callback
execute ("dotNet.addEventHandler trs["+(i as string)+"] \"DoWork\" " + func)
trs[i].RunWorkerAsync #(st,end,i)
st=st+chunk
end=end+chunk
)
)
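The completion pattern in multiloop (a shared `complete` counter bumped by each RunWorkerCompleted handler, firing `<comp>` once it reaches `cnt`) is a common pattern in any thread-pool setting. A Python sketch of the same counting idea, with illustrative names and a lock guarding the counter:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_chunks(chunks, work, on_complete):
    """Run `work` on each chunk; call `on_complete` exactly once, after the
    last worker finishes (mirrors the complete == cnt check in multiloop)."""
    remaining = len(chunks)
    lock = threading.Lock()

    def done_callback(_future):
        nonlocal remaining
        with lock:  # the decrement must be atomic across worker threads
            remaining -= 1
            if remaining == 0:
                on_complete()

    pool = ThreadPoolExecutor(max_workers=len(chunks))
    for ch in chunks:
        pool.submit(work, ch).add_done_callback(done_callback)
    pool.shutdown(wait=True)
```

Without the lock, two threads can read the same counter value and the completion callback can fire twice or never; the MaxScript version relies on the handlers not racing on `complete`.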
--------------
nr=1000000
results = for i = 1 to nr collect i
-- This function is called by every thread; <e.Argument[1]> and <e.Argument[2]> are where you want to start and end your for loop iteration.
fn shuffleArray s e=
(
start=e.Argument[1];
end=e.Argument[2];
for i=start to end do
(
ind=random 1 nr
tmp=results[ind]
results[ind]=results[i]
results[i]=tmp
)
)
--This gets called when all threads have completed
fn CompleteEvent2 =
(
ends = timestamp()
format "processing: % seconds heap: %\n" ((ends - starts) / 1000.0) (mem-heapfree)
)
-- do a test run:
multiloop 1 nr "auto" "shuffleArray" "CompleteEvent2"
Well, I just did the same… I distributed the filling of the array over all my 8 cores and then distributed the randomization over 8 cores again… got it to run in 4 secs, lol, 4 times slower.
My bet is that it’s because I’m creating the functions on the fly, detecting the number of cores and creating doWork functions based on that… I’m adding more for loops to the script, and the execute calls don’t help for sure… I’ll post my results, either a fixed version or this super slow multithreaded version.
I saw your PM, I might do that
Yeah… so… I got similar results. It said it was done processing, but it was still going, then finished a while later. It seemed like doing a single-core operation on something like getVert was faster. But it might be the way it was all being processed and assigned.
I will keep watching this thread and see what happens
Note to self: calling a recursive function 1000000 times on itself makes Max disappear in 1 second…
Could use it as a quick shutdown method.
Well, I was able to do a very slow multithreaded version… dunno why, but here it goes:
(
print "V7"
gc()
global wDone1, wDone2, fillArray, shuffleArray, cThreads, runThreads
asize = 1000000
wThreads = sysInfo.cpucount
--wThreads = 1
rThreads = wThreads
adiv = asize / wThreads
startg = timestamp()
endg = 0
mem = heapfree
seed(timestamp())
results =#()
results[asize]=asize
--results = for i = 1 to asize collect i
fn fillArray sender eargs =
(
for i = eargs.Argument[1] to eargs.Argument[2] do results[i] = i
rThreads = rThreads - 1
)
fn shuffleArray sender eargs =
(
for i = eargs.Argument[1] to eargs.Argument[2] do
(
idx = random i eargs.Argument[2]
swap results[i] results[idx]
)
rThreads = rThreads - 1
)
fn wDone1 =
(
if rThreads == 0 then
(
rThreads = wThreads
cThreads "shuffleArray" "wDone2"
runThreads "shuffleArray"
)
)
fn wDone2 =
(
if rThreads == 0 then
(
endg = timestamp()
format "processing: % seconds heap: %\n" ((endg - startg) / 1000.0) (mem-heapfree)
format "Num unique elem %\n" (results as bitarray).numberset
)
)
fn cThreads cFunction wDoneCompleted =
(
for i = 1 to wThreads do
(
run = ""
run += cFunction + "_Thread" + i as string + " = dotnetObject \"System.ComponentModel.BackgroundWorker\"\n"
run += "dotnet.addEventHandler " + cFunction + "_Thread" + i as string + " \"DoWork\" " + cFunction + "\n"
run += "dotnet.addEventHandler " + cFunction + "_Thread" + i as string + " \"RunWorkerCompleted\" " + wDoneCompleted
execute run
)
)
fn runThreads cFunction =
(
fStart = 1
fEnd = adiv
for i = 1 to wThreads do
(
run = cFunction + "_Thread" + i as string + ".RunWorkerAsync #(" + fStart as string + "," + fEnd as string + ")"
execute run
fStart += adiv
fEnd += adiv
)
)
cThreads "fillArray" "wDone1"
runThreads "fillArray"
)
Results:
"V7"
OK
processing: 5.262 seconds heap: 240129560L
Num unique elem 1000000
It’s a long story… I wanted to create a generic function to work with the fill and shuffle array functions… I tried a version similar to jonadb’s, but it was even slower.
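One thing worth noting about the two shuffle variants in this thread: swapping each element with `random 1 nr` (the earlier version) is the well-known “naive shuffle”, which does not produce a uniformly random permutation, while `random i eargs.Argument[2]` (the V7 version) is the Fisher-Yates step, although applied per chunk it means elements never leave their own chunk. For reference, a single-threaded Fisher-Yates sketch in Python:

```python
import random

def fisher_yates(a, rng=random):
    """Uniform in-place shuffle: swap a[i] with a random j in i..len(a)-1.

    Same swap rule as the V7 shuffleArray, but over the whole array,
    so every permutation is equally likely.
    """
    for i in range(len(a) - 1):
        j = rng.randrange(i, len(a))
        a[i], a[j] = a[j], a[i]
    return a
```

Also note that both MaxScript versions have multiple workers writing into the shared `results` array (and decrementing `rThreads`) without any locking, which is racy in most runtimes and may explain some of the odd timings.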