[Closed] Mini Challenge
it's a good idea to use (dotNetClass "System.Array").BinarySearch… it deserves a "Thanks!" for sure
BinarySearch gives very, very good performance! Kudos to Swordslayer.
DenisT's routine wins if the number of items in 'points' is big (num = 50000 and up).
On my computer:
1000 items:
0.370469 #(#("d6b1a1", 0.370021), #("b9154e", 0.370525)) >> time:32 memory:128120L (DenisT)
0.370469 #(#("d6b1a1", 0.370021), #("b9154e", 0.370525)) >> time:24 memory:128392L (Swordslayer)
20000 items:
0.421854 #(#("cfb5ce", 0.421824), #("514085", 0.421863)) >> time:311 memory:128120L (DenisT)
0.421854 #(#("cfb5ce", 0.421824), #("514085", 0.421863)) >> time:297 memory:128392L (Swordslayer)
50000 items:
0.872149 #(#("e4acca", 0.872148), #("f375c2", 0.872164)) >> time:794 memory:128120L (DenisT)
0.872149 #(#("e4acca", 0.872148), #("f375c2", 0.872164)) >> time:807 memory:128464L (Swordslayer)
100000 items:
0.6579 #(#("fc2469", 0.657889), #("6d3ea0", 0.657903)) >> time:1718 memory:128120L
0.6579 #(#("fc2469", 0.657889), #("6d3ea0", 0.657903)) >> time:1782 memory:128464L
here is the cleaned-up MXS BSearch version:
fn biSearchSides points t =
(
	local found = off, k, v
	local s = 1, e = points.count
	-- classic bisection over points sorted ascending by their second element
	while not found and s <= e do
	(
		k = (s + e)/2
		v = points[k][2]
		case of
		(
			(t > v): s = k + 1
			(t < v): e = k - 1
			default: found = on
		)
	)
	-- an exact hit returns the same point twice, otherwise the two bracketing points
	if found then #(points[k], points[k])
	else if v < t then #(points[k], points[amin (k+1) points.count])
	else #(points[amax (k-1) 1], points[k])
)
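To try it, a minimal usage sketch, assuming the points entries are #(key, value) pairs and the array is sorted ascending by value before searching (sortByValue here is just the pair version of the sortfloats comparator further down):
fn sortByValue a b = if a[2] < b[2] then -1 else if a[2] > b[2] then 1 else 0
pts = for k = 1 to 1000 collect #(bit.intAsHex (random 0x100000 0xFFFFFF), random 0.0 1.0)
qsort pts sortByValue -- biSearchSides expects the array sorted ascending by the value in [2]
biSearchSides pts 0.5 -- returns the #(lower, upper) pair of points bracketing 0.5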
I've got interesting results sorting with DotNet. It's faster for a medium-sized points array and relatively few searches, but it takes more memory.
The function to plug into the test:
binSearch = (dotNetClass "System.Array").BinarySearch
CSharpSort = (dotNetClass "System.Array").Sort
fn bsearchSides2 floats keys t =
(
	local index = binSearch floats t
	-- index is a 0-based .NET array index; a negative value is the bitwise complement of the insertion point
	case of
	(
		(index <= -(floats.Length-1)) : #(#(keys.getvalue (abs index-2), floats.getvalue (abs index-2)), #(keys.getvalue (abs index-2), floats.getvalue (abs index-2)))
		(index < -1) : (j = abs index-2; #(#(keys.getvalue j, floats.getvalue j), #(keys.getvalue (j+1), floats.getvalue (j+1))))
		(index == floats.Length-1) : #(#(keys.getvalue (index), floats.getvalue (index)), #(keys.getvalue (index), floats.getvalue (index)))
		default : #(#(keys.getvalue (abs index), floats.getvalue (abs index)), #(keys.getvalue (abs index+1), floats.getvalue (abs index+1)))
	)
)
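For reference, Array.BinarySearch returns the 0-based index of the element on a hit, and the bitwise complement of the insertion point on a miss, which is what the abs index-2 decoding above relies on. A minimal sketch:
arr = dotNet.ValueToDotNetObject #(0.1, 0.3, 0.7) (dotNetClass "System.Single[]")
binSearch arr 0.3 -- expected 1: the 0-based index of the hit
binSearch arr 0.5 -- expected -3: bitwise complement of the insertion point 2 (between 0.3 and 0.7)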
The test part:
pts = deepcopy points
t1 = timestamp()
m1 = heapfree
floats = dotNet.ValueToDotNetObject (for p in pts collect p[2]) (dotNetClass "System.Single[]")
keys = dotNet.ValueToDotNetObject (for p in pts collect p[1]) (dotNetClass "System.string[]")
CSharpSort floats keys
seed _seed
for k=1 to count do v = bsearchSides2 floats keys (t = random 0.0 1.0)
format "% % >> time:% memory:%
" t v (timestamp() - t1) (m1 - heapfree)
Result for 20000 items and 1000 searches:
0.174569 #(#("1fadf1", 0.174531), #("45b55b", 0.174573)) >> time:308 memory:128120L DenisT
0.174569 #(#("1fadf1", 0.174531), #("45b55b", 0.174573)) >> time:301 memory:128392L Swordslayer
0.174569 #(#("1fadf1", 0.174531), #("45b55b", 0.174573)) >> time:192 memory:864664L aaandres
your method is faster only because you don't use mxs qsort, which is slow.
your method loses memory because of casting mxs values to .net values.
see the difference:
fn sortfloats f1 f2 = if f1 < f2 then -1 else if f1 > f2 then 1 else 0
floats = for k=1 to 100000 collect (random 0.0 1.0)
(
	-- built-in sort (native)
	arr = deepcopy floats
	t1 = timestamp()
	m1 = heapfree
	sort arr
	format "time:% memory:%\n" (timestamp() - t1) (m1 - heapfree)
	-- qsort with an MXS comparator (slow)
	arr = deepcopy floats
	t1 = timestamp()
	m1 = heapfree
	qsort arr sortfloats
	format "time:% memory:%\n" (timestamp() - t1) (m1 - heapfree)
)
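And to see the memory side of it (the casting of MXS values to .NET), a rough sketch along the same lines, converting the same floats array with ValueToDotNetObject the way the test above does:
(
	arr = deepcopy floats
	t1 = timestamp()
	m1 = heapfree
	netArr = dotNet.ValueToDotNetObject arr (dotNetClass "System.Single[]")
	format "conversion time:% memory:%\n" (timestamp() - t1) (m1 - heapfree)
)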
Yes, DenisT. Of course that's why it's faster. The amount of memory is another matter.
And this one, compiling C# on the fly, seems to be the fastest for big amounts of data, sorting and searching in C#. Most of the memory and time goes into converting the data between MXS and .NET.
global classIter
global pairValueArray_classname
(
seed 0
num = 100000
count = 20000
points = for k=1 to num collect #(bit.intashex (random 0x100000 0xFFFFFF) , random 0.0 1.0)
tss = timestamp()
m1 = heapfree
(
if (classIter == undefined) then classIter = 100
pairValueArray_classname = "pairValueArray_" + classIter as string
classIter += 1
if classof (dotnet.GetType pairValueArray_classname) != dotNetObject then
(
classStr = "
using System;
using System.Collections;
using System.Collections.Generic;
public class " + pairValueArray_classname + "
{
public float[] floats;
public string[] keys;
public int dim;
public KeyValuePair<string, float> lowerResult;
public KeyValuePair<string, float> upperResult;
/// CONSTRUCTOR
public " + pairValueArray_classname + "(string[] inKeys, float[] inFloats)
{
Array.Sort(inFloats, inKeys);
floats = inFloats;
keys = inKeys;
dim = floats.Length;
}
public void searchItem(float valueToFind)
{
int index = Array.BinarySearch(floats, valueToFind);
int lowerIndex = -1;
int upperIndex = -1;
if (index <= -(dim - 1))
{
lowerIndex = upperIndex = Math.Abs(index) - 2;
lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
return;
}
if (index < -1)
{
lowerIndex = Math.Abs(index) - 2;
upperIndex = Math.Abs(index) - 1;
lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
return;
}
if (index == dim - 1)
{
lowerIndex = upperIndex = index;
lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
return;
}
lowerIndex = Math.Abs(index);
upperIndex = Math.Abs(index) + 1;
lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
}
}
"
compilerParams = dotnetobject "System.CodeDom.Compiler.CompilerParameters"
dotnet.setlifetimecontrol compilerParams #dotnet
compilerParams.ReferencedAssemblies.Add("System.dll");
compilerParams.ReferencedAssemblies.Add((getdir #maxroot) + @"Autodesk.Max.dll");
compilerParams.GenerateInMemory = on
csharpProvider = dotnetobject "Microsoft.CSharp.CSharpCodeProvider"
compilerResults = csharpProvider.CompileAssemblyFromSource compilerParams #(classStr)
dotnet.setlifetimecontrol compilerResults #dotnet
if (compilerResults.Errors.Count > 0 ) then
(
local errs = stringstream ""
for i = 0 to (compilerResults.Errors.Count-1) do
(
local err = compilerResults.Errors.Item[i]
format "Error:% Line:% Column:% %
" err.ErrorNumber err.Line err.Column err.ErrorText to:errs
)
format "%
" errs
undefined
)
)
)
seed (timestamp())
_seed = random 1 1e9
v = undefined
t = undefined
fn convertDotNetKeyValuePair pairValueArray =
(
it0 = pairValueArray.lowerResult
it1 = pairValueArray.upperResult
vr = #(#(it0.key, it0.value) , #(it1.key,it1.value))
)
floats = (for p in points collect p[2])
keys = (for p in points collect p[1])
pairValueArray = dotnetobject pairValueArray_classname keys floats
tsf = timestamp()
seed _seed
for k=1 to count do (pairValueArray.searchItem (t = random 0.0 1.0); v = convertDotNetKeyValuePair pairValueArray)
tef = timestamp()
print ("Time per lookup: " + ((((tef-tsf)/1000.0)/count) as string) + " seconds")
print ("Total time for " + count as string + " lookups: " + ((tef-tss)/1000.0) as string + " seconds")
format "% % >> time:% memory:%
" t v (tef-tss) (m1 - heapfree)
OK
)
Results for 100,000 items and 20,000 searches:
0.830654 #(#("d1f04d", 0.830652), #("71b7b0", 0.830658)) >> time:2543 memory:2560120L DenisT
0.830654 #(#("d1f04d", 0.830652), #("71b7b0", 0.830658)) >> time:2117 memory:2560464L Swordslayer
0.830654 #(#("d1f04d", 0.830652), #("71b7b0", 0.830658)) >> time:2292 memory:17280736L aaandres1
0.830654 #(#("d1f04d", 0.830652), #("71b7b0", 0.830658)) >> time:1263 memory:14884032L aaandres2
Results for 100,000 items and 1,000 searches:
0.41745 #(#("29c0e3", 0.417436), #("5ac837", 0.417452)) >> time:1713 memory:128120L DenisT
0.41745 #(#("29c0e3", 0.417436), #("5ac837", 0.417452)) >> time:1769 memory:128392L Swordslayer
0.41745 #(#("29c0e3", 0.417436), #("5ac837", 0.417452)) >> time:634 memory:864664L aaandres1
0.41745 #(#("29c0e3", 0.417436), #("5ac837", 0.417452)) >> time:363 memory:747856L aaandres2
A KD-tree is slower, as it sorts in MXS, and it takes more memory (25 MB).
an algorithmically well-organized kd-tree should beat anything based on the bisection method in the case of big lists… I don't see how bsearch can beat it
With Python now a first-class citizen, here is the same using the bisect module; the function first:
bisect = python.import "bisect"
fn bisectSides points floats t =
(
	-- bisect.bisect returns the 0-based insertion point, which is also the 1-based MXS index of the last value <= t
	local index = bisect.bisect floats t
	case of
	(
		(index == 0 OR index == points.count - 1) : #(points[index+1], points[index+1])
		default : #(points[index], points[index+1])
	)
)
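In case the bisect convention is new to anyone, a quick sketch of what bisect.bisect returns (as in the function above, a plain MXS array can be passed straight in):
bisect.bisect #(0.1, 0.3, 0.7) 0.5 -- expected 2: the 0-based insertion point (two values are <= 0.5)
bisect.bisect #(0.1, 0.3, 0.7) 0.9 -- expected 3: greater than everything, so it would go at the end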
the test part:
pts = deepcopy points
t1 = timestamp()
m1 = heapfree
qsort pts sortByValue
floats = (for p in pts collect p[2])
seed _seed
for k=1 to count do v = bisectSides pts floats (t = random 0.0 1.0)
format "% % >> time:% memory:%
" t v (timestamp() - t1) (m1 - heapfree)
and the complete results:
directSearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:2428 memory:136588L
sortedSearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:703 memory:-561980L
quicksearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:53 memory:119444L
bsearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:41 memory:115232L
bisectSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:299 memory:205264L