Notifications
Clear all

[Closed] Mini Challenge

it’s a good idea to use (dotNetClass “System.Array”).BinarySearch… it deserves “Thanks!” for sure

BinarySearch gives very very good performance! :keenly: for Swordslayer
DenisT rutine wins if the number of items in ‘points’ is big (num = 50000 and upper).

In my computer:
1000 items:
0.370469 #(#(“d6b1a1”, 0.370021), #(“b9154e”, 0.370525)) >> time:32 memory:128120L (DenisT)
0.370469 #(#(“d6b1a1”, 0.370021), #(“b9154e”, 0.370525)) >> time:24 memory:128392L (Swordslayer)

20000 items:
0.421854 #(#(“cfb5ce”, 0.421824), #(“514085”, 0.421863)) >> time:311 memory:128120L (DenisT)
0.421854 #(#(“cfb5ce”, 0.421824), #(“514085”, 0.421863)) >> time:297 memory:128392L (Swordslayer)

50000 items:
0.872149 #(#(“e4acca”, 0.872148), #(“f375c2”, 0.872164)) >> time:794 memory:128120L (DenisT)
0.872149 #(#(“e4acca”, 0.872148), #(“f375c2”, 0.872164)) >> time:807 memory:128464L (Swordslayer)

100000 items:
0.6579 #(#(“fc2469”, 0.657889), #(“6d3ea0”, 0.657903)) >> time:1718 memory:128120L
0.6579 #(#(“fc2469”, 0.657889), #(“6d3ea0”, 0.657903)) >> time:1782 memory:128464L

in case for many items (>10000) the best method is a using of kdtree

here is cleaned up mxs BSearch version:

fn biSearchSides points t =
(
	local found = off, k, v
	local s = 1, e = points.count	
	while not found and s <= e do
	(
		k = (s + e)/2
		v = points[k][2]
		case of
		(
			(t > v): s = k + 1
			(t < v): e = k - 1
			default: found = on
		)
	)
	if found then #(points[k], points[k])
	else if v < t then #(points[k], points[amin (k+1) points.count])
	else #(points[amax (k-1) 1], points[k])
)

I’ve got interesting results sorting with DotNet. It’s faster for medium size points items and relative few searches, but takes more memory.

Function to insert:


binSearch = (dotNetClass "System.Array").BinarySearch
CSharpSort = (dotNetClass "System.Array").Sort


fn bsearchSides2 floats keys t =
(
	local index = binSearch floats t
	case of
	(
		(index <= -(points.count-1)) : #(#(keys.getvalue (abs index-2), floats.getvalue (abs index-2)),#(keys.getvalue (abs index-2), floats.getvalue (abs index-2)))
		(index < -1) : (j = abs index-2;#(#(keys.getvalue j, floats.getvalue j),#(keys.getvalue (j+1), floats.getvalue (j+1))))
		(index == points.count - 1) : #(#(keys.getvalue (index), floats.getvalue (index)),#(keys.getvalue (index), floats.getvalue (index)))
		default : #(#(keys.getvalue (abs index), floats.getvalue (abs index)),#(keys.getvalue (abs index+1), floats.getvalue (abs index+1)))
	)
)

The test part:

	pts = deepcopy points	
	t1 = timestamp()
	m1 = heapfree
	floats = dotNet.ValueToDotNetObject (for p in pts collect p[2]) (dotNetClass "System.Single[]")
	keys =  dotNet.ValueToDotNetObject (for p in pts collect p[1]) (dotNetClass "System.string[]")
	CSharpSort floats keys
	seed _seed
	for k=1 to count do v = bsearchSides2 floats keys (t = random 0.0 1.0)
	format "% % >> time:% memory:%
" t v (timestamp() - t1) (m1 - heapfree)

Result for 20000 items and 1000 searches:

0.174569 #(#(“1fadf1”, 0.174531), #(“45b55b”, 0.174573)) >> time:308 memory:128120L DenisT
0.174569 #(#(“1fadf1”, 0.174531), #(“45b55b”, 0.174573)) >> time:301 memory:128392L Swordslayer
0.174569 #(#(“1fadf1”, 0.174531), #(“45b55b”, 0.174573)) >> time:192 memory:864664L aaandres

your method is faster only because you not use mxs qsort which is slow

your method loses memory because of casting mxs to net values.

see the difference:

fn sortfloats f1 f2 = if f1 < f2 then -1 else if f1 > f2 then 1 else 0
floats = for k=1 to 100000 collect (random 0.0 1.0)	
(
	arr = deepcopy floats
	t1 = timestamp()
	m1 = heapfree
	sort arr
	format "time:% memory:%
" (timestamp() - t1) (m1 - heapfree)

	arr = deepcopy floats
	t1 = timestamp()
	m1 = heapfree
	qsort arr sortfloats
	format "time:% memory:%
" (timestamp() - t1) (m1 - heapfree)
)

Yes, DenisT. Of course that’s why it’s faster. The amount of memory is a !’?’

And this one, compiling in the fly, seems the faster one for big amount of data, sorting and searching with C#. Most memory and time is for converting data to MXS.


global classIter	
global pairValueArray_classname

(
	seed 0
	num = 100000
	count = 20000
	points = for k=1 to num collect #(bit.intashex (random 0x100000 0xFFFFFF) , random 0.0 1.0) 
	
	tss = timestamp()
	m1 = heapfree
	
	(
		if (classIter == undefined) then classIter = 100

		pairValueArray_classname = "pairValueArray_" + classIter as string
		classIter += 1

		if classof (dotnet.GetType pairValueArray_classname) != dotNetObject then
		(
			classStr = "
			using System;
			using System.Collections;
			using System.Collections.Generic;


			public class " + pairValueArray_classname + "
			{
				public float[] floats;
				public string[] keys;
				public int dim;
				public KeyValuePair<string, float> lowerResult;
				public KeyValuePair<string, float> upperResult;

				/// CONSTRUCTOR
				public " + pairValueArray_classname + "(string[] inKeys, float[] inFloats)
				{
					Array.Sort(inFloats, inKeys);
					floats = inFloats;
					keys = inKeys;
					dim = floats.Length;
				}


				public void searchItem(float valueToFind)
				{
					int index = Array.BinarySearch(floats, valueToFind);

					int lowerIndex = -1;
					int upperIndex = -1;

					if (index <= -(dim - 1))
					{
						lowerIndex = upperIndex = Math.Abs(index) - 2;
						lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
						upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
						return;
					}
					if (index < -1)
					{
						lowerIndex = Math.Abs(index) - 2;
						upperIndex = Math.Abs(index) - 1;
						lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
						upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
						return;
					}
					if (index == dim - 1)
					{
						lowerIndex = upperIndex = index;
						lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
						upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
						return;
					}

					lowerIndex = Math.Abs(index);
					upperIndex = Math.Abs(index) + 1;
					lowerResult = new KeyValuePair<string, float>(keys[lowerIndex], floats[lowerIndex]);
					upperResult = new KeyValuePair<string, float>(keys[upperIndex], floats[upperIndex]);
				}
			}
			"

			compilerParams = dotnetobject "System.CodeDom.Compiler.CompilerParameters"
			dotnet.setlifetimecontrol compilerParams #dotnet

			compilerParams.ReferencedAssemblies.Add("System.dll");
			compilerParams.ReferencedAssemblies.Add((getdir #maxroot) + @"Autodesk.Max.dll");
			compilerParams.GenerateInMemory = on
			csharpProvider = dotnetobject "Microsoft.CSharp.CSharpCodeProvider"

			compilerResults = csharpProvider.CompileAssemblyFromSource compilerParams #(classStr)
			dotnet.setlifetimecontrol compilerResults #dotnet

			if (compilerResults.Errors.Count > 0 ) then
			(
				local errs = stringstream ""
				for i = 0 to (compilerResults.Errors.Count-1) do
				(
					local err = compilerResults.Errors.Item[i]
					format "Error:% Line:% Column:% %
" err.ErrorNumber err.Line err.Column err.ErrorText to:errs
				)
				format "%
" errs
				undefined
			)			
		)

	)

	
	seed (timestamp()) 
	_seed = random 1 1e9
	v = undefined 
	t = undefined 
		
	fn convertDotNetKeyValuePair pairValueArray =
	(
		it0 = pairValueArray.lowerResult
		it1 = pairValueArray.upperResult
		vr = #(#(it0.key, it0.value) , #(it1.key,it1.value))
	)

	floats = (for p in points collect p[2])
	keys =  (for p in points collect p[1])
	pairValueArray = dotnetobject pairValueArray_classname keys floats
	
	tsf = timestamp()
	seed _seed
	for k=1 to count do (pairValueArray.searchItem (t = random 0.0 1.0); v = convertDotNetKeyValuePair pairValueArray)

	tef = timestamp()
	
	print ("Time per lookup: " + ((((tef-tsf)/1000.0)/count) as string) + " seconds")
	print ("Total time for " + count as string + " lookups: " + ((tef-tss)/1000.0) as string + " seconds")
	format "% % >> time:% memory:%
" t v (tef-tss) (m1 - heapfree)
	OK
)

Results for 100.000 items and 20.000 searches:
0.830654 #(#(“d1f04d”, 0.830652), #(“71b7b0”, 0.830658)) >> time:2543 memory:2560120L DenisT
0.830654 #(#(“d1f04d”, 0.830652), #(“71b7b0”, 0.830658)) >> time:2117 memory:2560464L Swordslayer
0.830654 #(#(“d1f04d”, 0.830652), #(“71b7b0”, 0.830658)) >> time:2292 memory:17280736L aaandres1
0.830654 #(#(“d1f04d”, 0.830652), #(“71b7b0”, 0.830658)) >> time:1263 memory:14884032L aaandres2

Results for 100.000 items and 1.000 searches:
0.41745 #(#(“29c0e3”, 0.417436), #(“5ac837”, 0.417452)) >> time:1713 memory:128120L DenisT
0.41745 #(#(“29c0e3”, 0.417436), #(“5ac837”, 0.417452)) >> time:1769 memory:128392L Swordslayer
0.41745 #(#(“29c0e3”, 0.417436), #(“5ac837”, 0.417452)) >> time:634 memory:864664L aaandres1
0.41745 #(#(“29c0e3”, 0.417436), #(“5ac837”, 0.417452)) >> time:363 memory:747856L aaandres2

KD-TREE is slower as it sorts with MXS and takes more memory (25M)

algorithmicly well organized kdtree should beat anything based on bisection method in case of big lists… i don’t see how bsearch can beat it

1 Reply
(@aaandres)
Joined: 1 year ago

Posts: 0

Yes. In “Time per lookup” it’s 4 to 5 times faster than bsearch.

With python now a first-class citizen, here is the same using the bisect module, the function first:

bisect = python.import "bisect"

fn bisectSides points floats t =
(
	local index = bisect.bisect floats t

	case of
	(
		(index == 0 OR index == points.count - 1) : #(points[index+1], points[index+1])
		default : #(points[index], points[index+1])
	)
)

the test part:

	pts = deepcopy points	
	t1 = timestamp()
	m1 = heapfree
	qsort pts sortByValue
	floats = (for p in pts collect p[2])
	seed _seed
	for k=1 to count do v = bisectSides pts floats (t = random 0.0 1.0)
	format "% % >> time:% memory:%
" t v (timestamp() - t1) (m1 - heapfree)

and the complete results:

directSearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:2428 memory:136588L
sortedSearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:703 memory:-561980L
quicksearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:53 memory:119444L
bsearchSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:41 memory:115232L
bisectSides: 0.627017 #(#("786bbf", 0.626951), #("b146d5", 0.628033)) >> time:299 memory:205264L
Page 2 / 2