here is a time to show my hand
as you see it’s c# solution. I needed to make the algorithm fast and memory safe as possible. Also I need smooth distribution of all shuffled elements across whole array.
here is what I have:
fn CreateArrAssembly =
(
source = ""
source += "using System;
"
source += "using System.Collections;
"
source += "namespace MathOps
"
source += "{
"
source += " public class ArrayClass
"
source += " {
"
source += " public int[] RandomIndexes(int count)
"
source += " {
"
source += " Random list = new Random();
"
source += " int[] numbers = new int[count];
"
source += " int[] randoms = new int[count];
"
source += " for (int i = 0; i < count; i++)
"
source += " {
"
source += " randoms[i] = list.Next();
"
source += " numbers[i] = i + 1;
"
source += " }
"
source += " Array.Sort(randoms, numbers);
"
source += " randoms = null;
"
source += " return numbers;
"
source += " }
"
/*
--------------------------- seed support ------------------------------
source += " public int[] RandomIndexes(int count, int seed)
"
source += " {
"
source += " Random list = new Random(seed);
"
source += " int[] numbers = new int[count];
"
source += " int[] randoms = new int[count];
"
source += " for (int i = 0; i < count; i++)
"
source += " {
"
source += " randoms[i] = list.Next();
"
source += " numbers[i] = i + 1;
"
source += " }
"
source += " Array.Sort(randoms, numbers);
"
source += " randoms = null;
"
source += " return numbers;
"
source += " }
"
*/
source += " }
"
source += "}
"
csharpProvider = dotnetobject "Microsoft.CSharp.CSharpCodeProvider"
compilerParams = dotnetobject "System.CodeDom.Compiler.CompilerParameters"
compilerParams.ReferencedAssemblies.AddRange #("System.dll")
compilerParams.GenerateInMemory = on
compilerResults = csharpProvider.CompileAssemblyFromSource compilerParams #(source)
assembly = compilerResults.CompiledAssembly
assembly.CreateInstance "MathOps.ArrayClass"
)
global ArrayClass = CreateArrAssembly()
maxlimit = 1000000
fn indexDensity list parts:5 = -- you can check distribution for maxlimit ~10000
(
sums = 0
for k=0 to parts-1 do
(
sum = 0
for i=0 to list.count/parts-1 do
(
sum += list[k*parts + i + 1]
)
format " %: % ? %
" k sum ((maxlimit+1)*(maxlimit/2)/parts)
sums += sum
)
format " =: % % % = %
" sums "%" ((maxlimit+1)*(maxlimit/2)) (sums*1.0/((maxlimit+1)*(maxlimit/2)))
)
(
gc()
limit = maxlimit
startM = heapfree
startT = timestamp()
a = ArrayClass.RandomIndexes limit asdotnetobject:on
format "
denisT code:
Time: %s, Mem: %Kb
" ((timestamp()-startT)/1000.0) (((startM-heapfree) as integer)/1024)
-- format "Num unique elem %
" (a as bitarray).numberset
-- indexDensity a
)
biddle’s c# algorithm beats mine by performance and has the same well distribution…
(
gc()
limit = maxlimit
startM = heapfree
startT = timestamp()
a = myrandomIndexes limit
format "
Biddle algorithm (thank you wikipedia):
Time: %s, Mem: %Kb
" ((timestamp()-startT)/1000.0) (((startM-heapfree) as integer)/1024)
format "Num unique elem %
" (a as bitarray).numberset
indexDensity a
)
well. let see his hand…
looking at the filesize of your .mse I figured there was some compiling going on…
in my mse file there were several different algorithms. i left the only one.
as you see i can handle 10,000,000 and more, and more… the method doesn’t touch max script memory
Here is an optimized version of the multithreaded script. you’ll notice that:
- the distribution is not well, due to the segmentation.
- there are 0.062% duplicates, I am guessing it is due to error of arrays in maxscript not being threadsafe. Can someone confirm this?
(
gc()
global wDone2, shuffleArray, runThreads
theThreads=#()
asize = 1000000
wThreads = sysInfo.cpucount
adiv = asize / wThreads
startg = timestamp()
endg = 0
mem = heapfree
seed(timestamp())
global results =#()
results[asize]=asize
fn shuffleArray sender eargs =
(
local st = eargs.Argument[1]
local en = eargs.Argument[2]
for i = st to en do
(
local idx = random i en
swap results[i] results[idx]
)
deleteItem theThreads (findItem theThreads sender)
)
fn wDone =
(
if theThreads.count == 0 then
(
endg = timestamp()
format "processing: % seconds heap: %
" ((endg - startg) / 1000.0) (mem-heapfree)
format "Num unique elem %
" (results as bitarray).numberset
)
)
fn runThreads =
(
local fStart = 1
for i = fStart to asize do results[i] = i
print ("Starting "+wThreads as string+" threads...")
for i = 1 to wThreads do
(
local th = dotnetobject "System.ComponentModel.BackGroundWorker"
append theThreads th
dotNet.addEventHandler th "DoWork" shuffleArray
dotNet.addEventHandler th "RunWorkerCompleted" wDone
th.RunWorkerAsync #((fStart*i),(fStart*i+aDiv))
)
)
runThreads()
)
"Starting 8 threads..."
OK
processing: 0.957 seconds heap: 56083856L
Num unique elem 999380
you are guys a little mess-upped with multithreading. It’s not the same as multiprocessing. You can have as many threads as you want. Technically it doesn’t depend on how many processors you have.
Though I imagine at some point you will lose efficiency, depending on the task and amount of cores
Gotcha, thanks
Dont know if gen class is completely random, but well unless i use qubits nothing else will be.
(
ts = timestamp()
mem = heapfree
arr = #()
for i = 1 to 6 do
(
append arr (execute((genClassID returnValue:True)[1] as string)[i])
)
format "Time: %, Memory: %
" (timestamp()-ts) (mem-heapfree)
arr
)
Time: 0, Memory: 4356L
sorry i missed a point a bit.
GetClassID probably more complicated thing than Random. It’s something about CRC32 or other hash generator. It’s much more slower than Random.
The code provided must be very slow and takes a lot of memory. Handling of string arrays in max it’s probably worse thing that I know.
you don’t need to go so far… .net has everything: http://msdn.microsoft.com/en-us/library/system.security.cryptography.randomnumbergenerator.aspx
i just couldn’t understand the idea of using max ID generator for the task…
Yeah, but it’s cool to try
lo, the duplicates you’re having is because of your fstart and fend being sent to the threads, try this:
fn runThreads =
(
local fStart = 1
local fEnd = adiv
for i = fStart to asize do results[i] = i
print ("Starting "+wThreads as string+" threads...")
for i = 1 to wThreads do
(
local th = dotnetobject "System.ComponentModel.BackGroundWorker"
append theThreads th
dotNet.addEventHandler th "DoWork" shuffleArray
dotNet.addEventHandler th "RunWorkerCompleted" wDone
th.RunWorkerAsync #(fStart,fEnd)
fStart += adiv
fEnd += adiv
)
)
This way, there’s no duplicates.
So what’s the best time?
I wrote a C# dll to call from Mxs and got 0.673s. It takes 0.043s if I call it from C#, but that’s not allowed I guess.
Can post the code if interested but it’s basically an implementation of Fisher-Yates shuffle.
Light
it will be cool if you post a c# code here. we could compare yours and biddle’s. these two methods definitely beat mine. the only question how good shuffling is.