[Closed] Bsearch to replace finditem with huge array?

Page 1 / 2 Next

Dec 30, 2020 7:30 pm

I want to use bsearch to replace finditem function,blow is the origin way:

Fn Find_string_in_array arr str=
(
	count=0
	for i=1 to arr.count do
	(
		if findstring arr[i] str!=undefined then
		(
			count+=1
		)
	)
	count
)

key_string="C:\Users\Administrator\Personal\WeChat Files"
need_to_find=#(
"C:\Users\Administrator\Personal",
"C:\Users\Administrator\Personal\3ds Max 2020",
"C:\Users\Administrator\Personal\3ds Max 2021",
"C:\Users\Administrator\Personal\3dsMax",
"C:\Users\Administrator\Personal\Apowersoft",
"C:\Users\Administrator\Personal\Autodesk Application Manager",
"C:\Users\Administrator\Personal\WeChat Files",
"C:\Users\Administrator\Personal\WeChat Files\test.max",
"C:\Users\Administrator\Personal\Windows Live"
)

print (Find_string_in_array need_to_find key_string)

–results
–2 matches
/*
“C:\Users\Administrator\Personal\WeChat Files”,
“C:\Users\Administrator\Personal\WeChat Files\test.max”,
*/

It will too slow for a huge array,I am trying to use bsearch to do the same job,but I don’t know how to use it.

Fn bsearch_fn=
(
	--how to compare the string?
)
Fn Find_string_use_bsearch arr str=
(
	count=0
	compare=bsearch str need_to_find Comp_fn
	if compare!=undefined then count+=1
	count
)
print (Find_string_use_bsearch need_to_find key_string)

Any help would be appericate!

18 Replies

Klvnk

Dec 30, 2020 7:30 pm

this is one way to do it, if you can guarantee the strings are already sorted you can forgo the qsort stage…

  (
    	
    	key_string="C:\Users\Administrator\Personal\WeChat Files";
    	strings = #(    "C:\Users\Administrator\Personal", "C:\Users\Administrator\Personal\3ds Max 2020",
    						"C:\Users\Administrator\Personal\3ds Max 2021", "C:\Users\Administrator\Personal\3dsMax",
    						"C:\Users\Administrator\Personal\Apowersoft", "C:\Users\Administrator\Personal\Autodesk Application Manager",
    						"C:\Users\Administrator\Personal\WeChat Files", "C:\Users\Administrator\Personal\WeChat Files\test.max",
    						"C:\Users\Administrator\Personal\Windows Live" );
    	
    	fn string_indexing_sort_compfn v1 v2 string_array: = stricmp string_array[v1] string_array[v2];
    	fn string_indexing_search_compfn v1 v2 string_array: = stricmp v1 string_array[v2];
    	indexing = for i = 1 to strings.count collect i;
    	
    	qsort indexing string_indexing_sort_compfn string_array:strings;
    	key_index = bsearch key_string indexing string_indexing_search_compfn string_array:strings;

    )

batchrender

Dec 30, 2020 7:30 pm

If the
key_string="C:\Users\Administrator\Personal\3ds Max"

then the key_index returns undefined,in my example with findstring function,returns

"C:\Users\Administrator\Personal\3ds Max 2020"
"C:\Users\Administrator\Personal\3ds Max 2021"

cgt139746852

Dec 30, 2020 7:30 pm

i’m afraid it’s difficult to do with besearch , the function just search the first element matched
If i need to do this , I will use regular expression ， here is a reference

findstrings=@"C:\Users\Administrator\Personal\WeChat Files"
aa=#(
@"C:\Users\Administrator\Personal",
@"C:\Users\Administrator\Personal\3ds Max 2020",
@"C:\Users\Administrator\Personal\3ds Max 2021",
@"C:\Users\Administrator\Personal\3dsMax",
@"C:\Users\Administrator\Personal\Apowersoft",
@"C:\Users\Administrator\Personal\Autodesk Application Manager",
@"C:\Users\Administrator\Personal\WeChat Files",
@"C:\Users\Administrator\Personal\WeChat Files\test.max",
@"C:\Users\Administrator\Personal\Windows Live"
)
bb=aa as string
cc=substituteString bb "\\" "/"
ss=dotNetObject "System.Text.RegularExpressions.Regex" "" (dotnetclass "System.Text.RegularExpressions.RegexOptions").IgnoreCase
dd=ss.Split cc (substituteString findstrings "\\" "/")
sel=#()
for i = 2 to dd.count do 
if dd[i][1] ==  "\"" then append sel findstrings
	 else append sel (findstrings + substituteString (filterString dd[i] "\",")[1] "/"  "\\"  ) 
sel

denisT

Dec 30, 2020 7:30 pm

how “huge” are your arrays?

do you sure you need “find a sub string” instead of “check matching a pattern”?

in many cases matchpattern works faster than findstring

batchrender

Dec 30, 2020 7:30 pm

Over 500,000 items ,I’ve tried matchpattern,it’s faster than findstring,I want to cost less time to finish the job if possible.

denisT

Dec 30, 2020 7:30 pm

are you always looking from the beginning of path?

at least it looks like from your examples

(
	t0 = timestamp()
	h0 = heapfree

	x = "hjihfidjsfhjihfidjsfhjihfidjsfhjihfidjsfhjihfidjsfhjihfidjsfhjihfidjsfidjsfhjihfidjsfhjihfidjsfhjihfidjsfhjihfidjsf"
	
	for k=1 to 500000 do
	(
		(substring x 1 x.count) == x
	)

	format "time:% heap:%\n" (timestamp() - t0) (h0 - heapfree)
)

this is the worst scenario in our case… but it’s only ~1 sec. Is one sec a problem?

denisT

Dec 30, 2020 7:30 pm

I don’t know how an exact string comparison is done in MXS … maybe using memcmp it might be faster … if we can’t use the SDK, at least .NET methods can give us comparable speed.