
Answer by Paul Panzer for Get indices of N maximum values in a numpy array without sorting them?

It probably depends a bit on the sizes of a and k, but often the fastest approach appears to be combining np.partition with np.flatnonzero or np.where:

>>> import numpy as np
>>> from timeit import timeit
>>> a = np.random.random(10000)
>>> k = 5
>>>
>>> timeit("np.flatnonzero(a >= np.partition(a, len(a) - k)[len(a) - k])", globals=globals(), number=10000)
0.8328661819687113
>>> timeit("np.sort(np.argpartition(a, len(a) - k)[len(a) - k:])", globals=globals(), number=10000)
1.0577796879806556
>>> np.flatnonzero(a >= np.partition(a, len(a) - k)[len(a) - k])
array([2527, 4299, 5531, 6945, 7174])
>>> np.sort(np.argpartition(a, len(a) - k)[len(a) - k:])
array([2527, 4299, 5531, 6945, 7174])
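For reuse, the two one-liners can be wrapped as functions. This is only a minimal sketch: the function names are made up for illustration, and it assumes 1 <= k <= a.size. One caveat that follows from the comparison itself: if several elements are equal to the k-th largest value, the threshold version returns all of them, so it can yield more than k indices, whereas the argpartition version always returns exactly k.

```python
import numpy as np


def top_k_indices_partition(a, k):
    """Indices of the k largest values, in ascending index order.

    Finds the k-th largest value with np.partition, then compares the
    whole array against it. Ties at the threshold are all included.
    """
    a = np.asarray(a)
    pivot = a.size - k
    # After partitioning, position `pivot` holds the value it would
    # have in a fully sorted array, i.e. the k-th largest value.
    threshold = np.partition(a, pivot)[pivot]
    return np.flatnonzero(a >= threshold)


def top_k_indices_argpartition(a, k):
    """Same result via np.argpartition; always returns exactly k indices."""
    a = np.asarray(a)
    pivot = a.size - k
    # The last k entries of the index array point at the k largest
    # values, in no particular order, hence the final sort.
    return np.sort(np.argpartition(a, pivot)[pivot:])


rng = np.random.default_rng(0)
a = rng.random(10000)
k = 5
idx_partition = top_k_indices_partition(a, k)
idx_argpartition = top_k_indices_argpartition(a, k)
```

On data without ties, such as the random floats here, both functions should return the same index array.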

Note 1: the argpartition variant being slower highlights the significant performance cost of indirect indexing.

Note 2: as we only use the pivot element and discard the actual partition, np.percentile should in theory be at least as fast, but in practice it is much slower.
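For reference, the percentile-based variant mentioned in Note 2 could look roughly like the sketch below. The function name is hypothetical, and the code assumes a recent NumPy in which np.percentile accepts the method keyword (older releases called it interpolation) and an array with more than one element. method="nearest" is chosen so that the threshold is an actual array element rather than a value interpolated between two neighbours.

```python
import numpy as np


def top_k_indices_percentile(a, k):
    """Indices of the k largest values, using np.percentile for the threshold."""
    a = np.asarray(a)
    n = a.size
    # Percentile whose position in the sorted array is n - k,
    # i.e. the k-th largest value (positions run from 0 to n - 1).
    q = 100.0 * (n - k) / (n - 1)
    threshold = np.percentile(a, q, method="nearest")
    return np.flatnonzero(a >= threshold)


rng = np.random.default_rng(0)
a = rng.random(1000)
idx_percentile = top_k_indices_percentile(a, 5)
```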

