I think this is the way. The time is the same as my first proposal, but the results I am not sure if are so good.
Have you tried with the demo tool you have?
I’ve tried an animation increasing the numbers of bead from 1 to 100 and I am seeing beads “jumping” al over. Perhaps I didn’t modify the code good, I don’t know.
BTW, you may want to check this, just out of the code:
distributeBeads num_beads:21 num_threads:10
-2 >>>>>>>> beads:23 -> threads:10 = time:0 heap:2056L
Anyway, I think this is the correct approach. It all boils down to correctly round those decimals
yep, i found problems after… but i believe it’s the right direction. on high numbers it works very well and with no any extra time usage.
on small numbers we need to find a ‘cheap’ way to add ‘lost’ or remove ‘extra’ beads happened because of a float rounding…
I faced the same problem. For large numbers the result is correct, as well as for most combinations of low numbres, but there are a few that causes problems.
Those decimals are hard to round up if you want to keep adding beads to a thread but not removing any.
But the time is correct as it is O(n) on the number of threads. There are many ways to round the numbers, but I couldn’t find one that produces the results you want. There must be one though.
Also, I think the code should be very simple.
To simplify the description of the algorithm based on the code you showed here, I think it is:
- If number of beads <= number of threads, each thread must get 1 bead, from longest to shortest.
- Else, the number of beads on each thread must be >= than any previous iteration.
So if the number of beads is 50, and on one thread you have 4 beads, for any number of beads greater than 50, the number of beads on that thread must be >= 4.
This is correct description. I think it makes sense.
i will show tomorrow my best i recently have. it works absolutely correct, and takes only ~12ms for 1,000,000 beads for 1,000 threads. but… it uses my c++ help for sorting.
there was a thread that showed how QSORT in mxs slow. yep… this is the thing that i re-wrote first of all
here is my most recent attempt, which works pretty well:
fn distributeBeads num_beads:10000 num_threads:100 =
(
fn p_sort p1 p2 index:1 = if p1[index] < p2[index] then 1 else if p1[index] > p2[index] then -1 else 0
fn mod_f val &int_part =
(
int_part = val as integer
flt_part = val - int_part
)
fn q_sort tab index:1 =
(
p = tab[1]
val = p[index]
deleteitem tab 1
i = 1
j = tab.count
k =
(
if (val >= tab[i][index]) then i
else if (val <= tab[j][index]) then j
else
(
local pivot
while (j - i > 1) do
(
pivot = (i + j)/2
if (val > tab[pivot][index]) then j = pivot
else if (val < tab[pivot][index]) then i = pivot
else i = j
)
if (val >= tab[pivot][index]) then pivot else (pivot + 1)
)
)
insertitem p tab k
)
seed 0
all_len = 0
t = timestamp()
h = heapfree
thread_data = #()
for i=1 to num_threads do
(
len = random 10.0 200.0
all_len += len
append thread_data [i, len, 0, len]
)
qsort thread_data p_sort index:4
all_beads = 0
bcount = amin num_threads num_beads
new_len = 0
for k=1 to bcount do
(
local v = thread_data[k]
v[3] = 1
v[4] -= v[2]/2
new_len += v[4]
thread_data[k] = v
)
num_beads -= bcount
all_beads += bcount
if num_beads > 0 do -- and not keyboard.escpressed
(
bcount = 0
all_len = new_len
new_len = 0
qsort thread_data p_sort index:4
beads = num_beads
for i=1 to num_threads do --while beads > 0 do
(
local v = thread_data[i]
num = (v[4]/all_len * num_beads) as integer
if (num > 0) do
(
v[3] += num
v[4] = v[2] - v[2]*v[3]/(v[3]+1)
bcount += num
beads -= num
)
new_len += v[4]
thread_data[i] = v
)
extra = bcount - num_beads
format "extra >> %
" extra
if extra < 0 do
(
qsort thread_data p_sort index:4
while extra < 0 do
(
extra += 1
local v = thread_data[1]
v[3] += 1
v[4] = v[2] - v[2]*v[3]/(v[3]+1)
thread_data[1] = v
bcount += 1
beads -= 1
if extra < 0 do q_sort thread_data index:4
)
)
num_beads = beads
all_beads += bcount
)
all_beads = 0
for i=1 to num_threads do all_beads += thread_data[i][3] as integer
format "% >>>>>>>> beads:% -> threads:% % = time:% heap:%
" num_beads all_beads num_threads thread_data.count (timestamp() - t) (h - heapfree)
)
/*
distributeBeads num_beads:100000 num_threads:1000
*/
[B]
extra >> -501
0 >>>>>>>> beads:100000 -> threads:1000 1000 = time:43 heap:405392L
extra >> -505
0 >>>>>>>> beads:1000000 -> threads:1000 1000 = time:38 heap:405952L
[/B]
as you can see i use simple quick_sort algorithm to find sort position for only first element … i do it in ‘extra’ loop where we don’t need full sort for array
during the algorithm code i’m counting some numbers for debugging purpose, we can remove it, but it’s unlikely that it can change performance
fn distributeBeads num_beads:10000 num_threads:100 =
(
fn p_sort p1 p2 index:1 = if p1[index] < p2[index] then 1 else if p1[index] > p2[index] then -1 else 0
fn q_sort tab index:1 =
(
local p = tab[1]
local val = p[index]
deleteitem tab 1
local i = 1
local j = tab.count
local k =
(
if (val >= tab[i][index]) then i
else if (val <= tab[j][index]) then j
else
(
local pivot
while (j - i > 1) do
(
pivot = (i + j)/2
if (val > tab[pivot][index]) then j = pivot
else if (val < tab[pivot][index]) then i = pivot
else i = j
)
if (val >= tab[pivot][index]) then pivot else (pivot + 1)
)
)
insertitem p tab k
)
seed 0
t = timestamp()
h = heapfree
local thread_data = #()
for i=1 to num_threads do
(
len = random 10.0 200.0
append thread_data [i, len, 0, len]
)
qsort thread_data p_sort index:4
local all_len = 0
local bcount = amin num_threads num_beads
for k=1 to bcount do
(
local v = thread_data[k]
v[3] = 1
v[4] -= v[2]/2
all_len += v[4]
thread_data[k] = v
)
num_beads -= bcount
if num_beads > 0 do
(
qsort thread_data p_sort index:4
local bcount = 0
for i=1 to num_threads do
(
local v = thread_data[i]
num = (v[4]/all_len * num_beads) as integer
if (num > 0) do
(
v[3] += num
v[4] = v[2] - v[2]*v[3]/(v[3]+1)
bcount += num
)
thread_data[i] = v
)
k = extra = bcount - num_beads
if k < 0 do
(
qsort thread_data p_sort index:4
do
(
k += 1
local v = thread_data[1]
v[3] += 1
v[4] = v[2] - v[2]*v[3]/(v[3]+1)
thread_data[1] = v
q_sort thread_data index:4
)
while (k < 0)
)
)
t = timestamp() - t
h = h - heapfree
all_beads = 0
for i=1 to num_threads do all_beads += thread_data[i][3] as integer
format "% >>>>>>>> beads:% -> threads:% % = time:% heap:%
" extra all_beads num_threads thread_data.count t h
)
all unnecessary counting was removed… slightly faster, slightly better memory usage
Almost there. Still has some errors.
I created 10 Threads with the Arcs script:
For 42-43 and 62-63, thread 6 misses 1 bead.
For 74-75, thread 9 misses 1 bead
For 107-108. thread 6 misses 1 bead
There are other errors as well with larger numbers of beads.
there is almost no difference how many beads used for this algorithm. only number threads makes a difference
Here are some results for 10 threads with missing beads, tested from 1 to 500:
beads:43 current:4 previous:5
beads:63 current:6 previous:7
beads:75 current:11 previous:12
beads:108 current:11 previous:12
beads:128 current:17 previous:18
beads:140 current:10 previous:11
beads:141 current:21 previous:22
beads:173 current:18 previous:19
beads:189 current:14 previous:15
beads:205 current:15 previous:16
beads:206 current:31 previous:32
beads:238 current:25 previous:26
beads:254 current:27 previous:28
beads:270 current:20 previous:21
beads:271 current:41 previous:42
beads:303 current:32 previous:33
beads:335 current:25 previous:26
beads:336 current:51 previous:52
beads:368 current:39 previous:40
beads:384 current:47 previous:48
beads:400 current:61 previous:62
beads:433 current:46 previous:47
beads:451 current:62 previous:63
beads:465 current:35 previous:36
beads:466 current:71 previous:72
beads:498 current:53 previous:54