[Closed] Collect unique arrays/compound values?
I need a function that works like AppendIfUnique, but for arrays. So for example, it would go through the following:
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1)
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1)
#(1, true, 1, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
#(1, true, 3, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
and ignore any duplicates, returning the following compound array:
#(
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1),
#(1, true, 1, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 3, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
)
I’d convert them to text and use AppendIfUnique, but I’m going to need the resulting arrays back in their original format, and short of using execute() there wouldn’t be any way to do that with the resulting strings.
There may be cases where I will end up needing to check thousands of arrays of 20+ values each, so I want something that will run as quickly as possible. I’m fairly certain that every value I’m going to be using is going to be either a boolean or an integer under 30, in case that opens up any other options…
This is what I have so far, FWIW:
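for i = arr.count to 1 by -1 do (
    s = arr[i] as string -- convert once per outer iteration
    for j = (i - 1) to 1 by -1 do (
        -- stop scanning once arr[i] is deleted, otherwise arr[i] refers to a different item
        if s == (arr[j] as string) do (deleteItem arr i; exit)
    )
)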
I would search on hash values instead. The easiest way to get the hashes is:
aa =
#(
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1),
#(1, true, 1, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 3, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
)
vv = for a in aa collect (gethashvalue (a as string) 0)
After that you can sort the hash values and use bsearch if you need faster lookups.
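A minimal sketch of the sort-then-bsearch idea, in case it helps. The compareInts function and the three short sample arrays are made up for illustration; qsort and bsearch both take a comparison function that returns a negative, zero or positive integer.

```maxscript
-- Hypothetical comparison function shared by qsort and bsearch.
fn compareInts a b = (if a < b then -1 else if a > b then 1 else 0)

-- Small sample data, shortened for readability.
aa = #(#(1, true, 1), #(1, true, 2), #(1, false, 2))

-- One hash per array, then sort the hashes in place.
vv = for a in aa collect (getHashValue (a as string) 0)
qsort vv compareInts

-- Look up the hash of a candidate array in the sorted list.
key = getHashValue (#(1, true, 2) as string) 0
found = bsearch key vv compareInts   -- the stored hash, or undefined if not present
isKnown = (found != undefined)
```

Note that vv is only sorted once here. If new hashes are appended later, the array has to be sorted again before the next bsearch.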
It’s incredibly fast!
dataList = #(
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1),
#(1, true, 1, 1, 1, false, false, false, false, false, true, true, 0, false, false, 1),
#(1, true, 1, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 2, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 3, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1),
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
)
dataList2 = #(
#(1, true, 2, 1, 1, false, false, false, false, true, false, false, 0, false, false, 1),
#(1, true, 3, 1, 1, false, false, false, false, true, false, false, 0, false, false, 1),
#(1, true, 5, 1, 1, false, false, false, false, true, true, false, 0, false, false, 1)
)
-- MakeUnique Array:
hashArray = #()
uniqueArray = for dt in dataList where (hs = gethashvalue (dt as string) 0; appendIfUnique hashArray hs) collect dt
-- AppendIfUnique:
for dt in dataList2 do
(
hs = gethashvalue (dt as string) 0; if (appendIfUnique hashArray hs) do append uniqueArray dt
)
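The two snippets above could be wrapped in a single function, roughly like this. The name collectUniqueArrays is hypothetical. One caveat for arrays of 20+ values: as far as I know, converting an array to a string prints only the first 20 elements unless printAllElements is on, so this sketch assumes the printAllElements context is needed and enables it around the conversion.

```maxscript
-- Hypothetical helper: returns the unique sub-arrays of 'arrays', keeping first occurrences.
fn collectUniqueArrays arrays =
(
    local hashes = #()   -- hash values seen so far
    local result = #()   -- original arrays, untouched, in first-seen order
    -- Assumption: without this context, long arrays are truncated when converted to string.
    with printAllElements on
    (
        for a in arrays do
        (
            local h = getHashValue (a as string) 0
            -- appendIfUnique returns true only when the hash was not already present
            if (appendIfUnique hashes h) do append result a
        )
    )
    result
)

-- Usage with the arrays defined above:
-- uniqueArray = collectUniqueArrays (dataList + dataList2)
```

The original arrays are collected directly, so nothing has to be converted back from strings.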
I’ve done some tests with over 5000 items and it takes less than 300 ms!
All my attempts through C# were far worse in both memory and time.
You don’t need to append to the hash array at the end. You can just use findItem to check…
But as I said, if you need it for many items, it’s better to simply sort the hash array and use bsearch.
For big arrays that might make the whole search-and-append about 10 times faster.
I may be wrong, but for further checks of new items you need to keep the array of used hash values up to date.
Using findItem + append gives me the same performance as appendIfUnique.
If all items are known from the beginning, sorting + bsearch is the key.
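For the case where every item is known up front, here is a rough sketch of a sort-based variant. It is not the exact method described above: it uses sorting only, with no bsearch, pairing each hash with its original index so the duplicates end up next to each other. The names compareByHash and uniqueBySortedHash are hypothetical, and the printAllElements context is an assumption for arrays longer than 20 elements.

```maxscript
-- Hypothetical comparison for #(hash, index) pairs: by hash, then by original index.
fn compareByHash a b =
(
    if a[1] < b[1] then -1
    else if a[1] > b[1] then 1
    else (a[2] - b[2])
)

fn uniqueBySortedHash arrays =
(
    local pairs = #()
    -- Assumption: printAllElements keeps long arrays from being truncated as strings.
    with printAllElements on
    (
        pairs = for i = 1 to arrays.count collect #(getHashValue (arrays[i] as string) 0, i)
    )
    qsort pairs compareByHash

    -- Equal hashes are now adjacent; keep the first index of each run.
    local keep = #()
    local lastHash = undefined
    for p in pairs do
    (
        if p[1] != lastHash do (append keep p[2]; lastHash = p[1])
    )

    sort keep   -- restore the original order of the survivors
    for i in keep collect arrays[i]
)

-- Usage: uniqueArray = uniqueBySortedHash dataList
```

I have not timed this against the appendIfUnique version, so treat it as something to benchmark rather than a guaranteed speed-up.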