A Fast Way to Find the Largest N Elements in a NumPy Array
The bottleneck module has a fast partial sort method that works directly with NumPy arrays: bottleneck.partition(). Note that bottleneck.partition() returns the actual values, partitioned rather than fully sorted. If you want the indices of those values (the counterpart of what numpy.argsort() returns), use bottleneck.argpartition().

I've benchmarked:

    z = -bottleneck.partition(-a, 10)[:10]
    z = a.argsort()[-10:]
    z = heapq.nlargest(10, a)

where a is a random 1,000,000-element array. The timings were as follows:

    bottleneck.partition(): 25.6 ms per loop
    np.argsort(): 198 ms per loop
    heapq.nlargest(): 358 ms per loop

How do I get indices of N maximum values in a NumPy array?

Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

    >>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
    >>> a
    array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
    >>> ind = np.argpartition(a, -4)[-4:]
    >>> ind
    array([1, 5, 8, 0])
    >>> top4 = a[ind]
    >>> top4
    array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need that too, sort them afterwards:

    >>> ind[np.argsort(a[ind])]
    array([1, 8, 5, 0])

Getting the top-k elements in sorted order this way takes O(n + k log k) time.

Quickest way to find the nth largest value in a NumPy matrix

You can flatten the matrix and then sort it:

    >>> k = np.array([[ 35,  48,  63],
    ...               [ 60,  77,  96],
    ...               [ 91, 112, 135]])
    >>> flat = k.flatten()
    >>> flat.sort()
    >>> flat
    array([ 35,  48,  60,  63,  77,  91,  96, 112, 135])
    >>> flat[-2]
    112
    >>> flat[-3]
    96

N largest values in each row of ndarray

You can use np.partition in the same way as in the question you linked, because it already operates along the last axis:

    In [2]: a = np.array([[ 5,  4,  3,  2,  1],
                          [10,  9,  8,  7,  6]])
    In [3]: b = np.partition(a, -3)    # top 3 values from each row
    In [4]: b[:, -3:]
    Out[4]:
    array([[ 3,  4,  5],
           [ 8,  9, 10]])

How to get the index of the largest n values in a multi-dimensional NumPy array

I don't have access to bottleneck, so in this example I am using argsort, but you should be able to use bottleneck in the same way:

    #!/usr/bin/env python
    import numpy as np

    N = 4
    a = np.random.random(20).reshape(4, 5)
    print(a)

    # Convert it into a 1D array
    a_1d = a.flatten()

    # Find the indices of the N largest values in the 1D array
    idx_1d = a_1d.argsort()[-N:]

    # Convert idx_1d back into index arrays for each dimension
    x_idx, y_idx = np.unravel_index(idx_1d, a.shape)

    # Check that we got the largest values
    for x, y in zip(x_idx, y_idx):
        print(a[x][y])

How to get first 5 maximum values from NumPy array in Python?

You can do this (each step is commented for clarity):

    import numpy as np
    x = np.array([3, 4, 2, 1, 7, 8, 6, 5, 9])
    y = x.copy()  #
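The partial-sort approaches discussed above can be gathered into a few small helpers. What follows is only a minimal sketch, not code from any of the answers: the function names top_k_indices, top_k_per_row and top_k_nd_indices are hypothetical, and it assumes NumPy 1.8 or later so that np.partition and np.argpartition are available. The test values are chosen without ties, since the order of tied entries is not guaranteed.

```python
import numpy as np


def top_k_indices(a, k):
    """Indices of the k largest entries of a 1-D array, largest first."""
    a = np.asarray(a)
    if not 0 < k <= a.size:
        raise ValueError("k must be between 1 and a.size")
    # argpartition moves the k largest entries into the last k slots,
    # in no particular order.
    idx = np.argpartition(a, -k)[-k:]
    # Sort only those k candidates, then reverse for descending order.
    return idx[np.argsort(a[idx])][::-1]


def top_k_per_row(a, k):
    """The k largest values of each row of a 2-D array, largest first."""
    a = np.asarray(a)
    # Partition along the last axis and keep the last k columns.
    part = np.partition(a, -k, axis=-1)[:, -k:]
    return np.sort(part, axis=-1)[:, ::-1]


def top_k_nd_indices(a, k):
    """Index tuples of the k largest entries of an N-D array, largest first."""
    a = np.asarray(a)
    flat_idx = top_k_indices(a.ravel(), k)
    # unravel_index maps flat positions back to one index array per axis.
    per_axis = np.unravel_index(flat_idx, a.shape)
    return [tuple(int(i) for i in pos) for pos in zip(*per_axis)]


if __name__ == "__main__":
    x = np.array([3, 4, 2, 1, 7, 8, 6, 5, 9])
    print(x[top_k_indices(x, 5)])        # [9 8 7 6 5]

    m = np.array([[35, 48, 63], [60, 77, 96], [91, 112, 135]])
    print(top_k_nd_indices(m, 2))        # [(2, 2), (2, 1)]
```

Sorting only the k selected candidates keeps the overall cost at O(n + k log k), in line with the bound quoted above, whereas a full argsort costs O(n log n).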