Starting array:
a = np.array([1,1,1,2,3,4,5,5])
and filter:
m = np.array([1,5])
I am building a mask with:
b = np.in1d(a,m)
That correctly returns:
array([ True,  True,  True, False, False, False,  True,  True], dtype=bool)
I need to limit the number of boolean True values per unique value to a maximum of 2 (here, 1 is masked 2 times instead of three). The resulting mask could be any of the following (it does not matter which of the original True values are kept):
array([ True,  True, False, False, False, False,  True,  True], dtype=bool)
or
array([ True, False,  True, False, False, False,  True,  True], dtype=bool)
or
array([False,  True,  True, False, False, False,  True,  True], dtype=bool)
Ideally this would be some kind of "random" masking with a limited frequency per value. So far I have tried randomly selecting from the original unique elements in the array, but the mask then selects True values no matter their frequency.
For the generic case of an unsorted input array, here's one approach based on np.searchsorted -
n = 2 # parameter to decide how many duplicates are allowed
sortidx = a.argsort()
idx = np.searchsorted(a,m,sorter=sortidx)[:,None] + np.arange(n)
lim_counts = (a[:,None] == m).sum(0).clip(max=n)
idx_clipped = idx[lim_counts[:,None] > np.arange(n)]
out = np.in1d(np.arange(a.size),idx_clipped)[sortidx.argsort()]
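To make the individual steps easier to follow, here is a commented sketch of the same searchsorted idea wrapped in a function. The name `limited_mask` is hypothetical, and two things differ from the snippet above by assumption: `np.isin` is used in place of `np.in1d` (assuming NumPy 1.13 or later), and the sort is requested as stable so that the earliest occurrences of each value are the ones kept.

```python
import numpy as np

def limited_mask(a, m, n):
    """Hypothetical helper: True where a is in m, at most n times per value."""
    a = np.asarray(a)
    m = np.asarray(m)
    # Stable sort (an assumption of this sketch) so ties keep original order.
    sortidx = a.argsort(kind="stable")
    # First position of each filter value in sorted order, plus offsets 0..n-1.
    idx = np.searchsorted(a, m, sorter=sortidx)[:, None] + np.arange(n)
    # Number of occurrences of each filter value in a, capped at n.
    lim_counts = (a[:, None] == m).sum(0).clip(max=n)
    # Keep only the offsets that are below the capped count for that value.
    idx_clipped = idx[lim_counts[:, None] > np.arange(n)]
    # Mark the kept positions in sorted order, then map back to input order.
    sorted_mask = np.isin(np.arange(a.size), idx_clipped)
    return sorted_mask[sortidx.argsort()]

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
print(limited_mask(a, m, 2))
```

With the arrays from the question this keeps the first two 1s and both 5s. Note that the `a[:, None] == m` comparison builds an intermediate array of shape `(len(a), len(m))`, which may matter for large inputs.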
Sample run -
In [37]: a
Out[37]: array([5, 1, 4, 2, 1, 3, 5, 1])
In [38]: m
Out[38]: [1, 2, 5]
In [39]: n
Out[39]: 2
In [40]: out
Out[40]: array([ True,  True, False,  True,  True, False,  True, False], dtype=bool)
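The question also asks for a "random" choice of which duplicates stay masked, which the approach above does not provide. A minimal sketch of one way to get that is below. It is not part of the original answer: it swaps the vectorized searchsorted technique for a plain loop over the filter values, the function name `random_limited_mask` and its `rng` parameter are hypothetical, and it assumes NumPy 1.17 or later for `np.random.default_rng`.

```python
import numpy as np

def random_limited_mask(a, m, n, rng=None):
    """Hypothetical helper: like np.isin(a, m), but at most n random Trues per value."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    out = np.zeros(a.size, dtype=bool)
    # np.unique guards against repeated entries in the filter.
    for value in np.unique(m):
        positions = np.flatnonzero(a == value)
        if positions.size > n:
            # Pick n of the matching positions without replacement.
            positions = rng.choice(positions, size=n, replace=False)
        out[positions] = True
    return out

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
print(random_limited_mask(a, m, 2, rng=np.random.default_rng(0)))
```

The loop runs once per filter value, so this is likely fine for a small `m` but would be slower than a fully vectorized approach when `m` is large. Passing a seeded generator makes the selection reproducible.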