A Fast Way to Find the Largest N Elements in a NumPy Array

A fast way to find the largest N elements in a NumPy array

The bottleneck module has a fast partial sort function that works directly with NumPy arrays: bottleneck.partition().

Note that bottleneck.partition() returns the values themselves, partially ordered rather than fully sorted. If you want the indices of those values instead (the counterpart of what numpy.argsort() returns for a full sort), use bottleneck.argpartition().
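To make the values-versus-indices distinction concrete, here is a minimal sketch using NumPy's own np.partition and np.argpartition. It assumes the bottleneck functions of the same names follow the same calling convention, which is worth checking against the bottleneck documentation for your version. The array and the value of k are made up for illustration.

```python
import numpy as np

a = np.array([7, 1, 9, 3, 8, 2, 6])  # illustrative data, not the benchmark array
k = 3

# Values: after partitioning at position -k, the last k slots hold the
# k largest values, in no guaranteed order.
largest_values = np.partition(a, -k)[-k:]

# Indices: same idea, but the result indexes into the original array.
largest_indices = np.argpartition(a, -k)[-k:]

print(sorted(largest_values.tolist()))      # [7, 8, 9]
print(sorted(a[largest_indices].tolist()))  # [7, 8, 9]
```

Both calls select the same three elements; only the second tells you where they sit in a.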

I've benchmarked:

z = -bottleneck.partition(-a, 10)[:10]
z = a.argsort()[-10:]
z = heapq.nlargest(10, a)

where a is a random 1,000,000-element array.

The timings were as follows:

bottleneck.partition(): 25.6 ms per loop
np.argsort(): 198 ms per loop
heapq.nlargest(): 358 ms per loop

How do I get indices of N maximum values in a NumPy array?

Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])

>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])

>>> top4 = a[ind]
>>> top4
array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need them in sorted order as well, sort them afterwards:

>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])

To get the top-k elements in sorted order in this way takes O(n + k log k) time.
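The two steps above can be wrapped in a small helper. This is only a sketch: the function name top_k_indices_sorted is hypothetical, and it returns the indices largest-first, which is a choice made here rather than something the answer prescribes.

```python
import numpy as np

def top_k_indices_sorted(a, k):
    """Return indices of the k largest entries of a 1-D array, largest first."""
    a = np.asarray(a)
    if not 0 < k <= a.size:
        raise ValueError("k must be between 1 and a.size")
    # O(n): the last k positions hold the indices of the k largest values, unordered.
    candidates = np.argpartition(a, -k)[-k:]
    # O(k log k): order just those k candidates by value, descending.
    order = np.argsort(a[candidates])[::-1]
    return candidates[order]

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
idx = top_k_indices_sorted(a, 4)
print(a[idx])  # [9 9 6 4]
```

When values are tied (the two 9s, or the three 4s here), which of the tied indices is returned, and in what order, is not guaranteed.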

Quickest way to find the nth largest value in a numpy Matrix

You can flatten the matrix and then sort it:

>>> k = np.array([[ 35,  48,  63],
...               [ 60,  77,  96],
...               [ 91, 112, 135]])
>>> flat = k.flatten()
>>> flat.sort()
>>> flat
array([ 35,  48,  60,  63,  77,  91,  96, 112, 135])
>>> flat[-2]
112
>>> flat[-3]
96

N largest values in each row of ndarray

You can use np.partition in the same way as the question you linked: the sorting is already along the last axis:

In [2]: a = np.array([[ 5, 4, 3, 2, 1], [10, 9, 8, 7, 6]])
In [3]: b = np.partition(a, -3)  # top 3 values from each row
In [4]: b[:,-3:]
Out[4]:
array([[ 3,  4,  5],
       [ 8,  9, 10]])

How to get the index of the largest n values in a multi-dimensional numpy array

I don't have access to bottleneck, so in this example I am using argsort, but you should be able to use it in the same way:

#!/usr/bin/env python
import numpy as np

N = 4
a = np.random.random(20).reshape(4, 5)
print(a)

# Convert it into a 1D array
a_1d = a.flatten()

# Find the indices in the 1D array
idx_1d = a_1d.argsort()[-N:]

# Convert the idx_1d back into indices arrays for each dimension
x_idx, y_idx = np.unravel_index(idx_1d, a.shape)

# Check that we got the largest values.
for x, y in zip(x_idx, y_idx):
    print(a[x][y])
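The same flatten-then-unravel idea can be sketched with np.argpartition in place of argsort, which avoids fully sorting the flattened array. The fixed seed and the optional ordering step are additions made here for illustration and are not part of the answer above.

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
a = rng.random((4, 5))

# ravel() returns a flat view where possible; argpartition avoids a full sort.
flat = a.ravel()
idx_1d = np.argpartition(flat, -N)[-N:]

# Optional: order the N hits from largest to smallest.
idx_1d = idx_1d[np.argsort(flat[idx_1d])[::-1]]

# Map flat positions back to (row, column) index arrays.
rows, cols = np.unravel_index(idx_1d, a.shape)

for r, c in zip(rows, cols):
    print((int(r), int(c)), a[r, c])
```

For an array with more than two dimensions, np.unravel_index returns one index array per axis, so the tuple unpacking would need a matching number of names.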

How to get first 5 maximum values from numpy array in python?

You can do this (each step is commented for clarity):

import numpy as np
x = np.array([3, 4, 2, 1, 7, 8, 6, 5, 9])

y = x.copy() #
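The snippet above stops after copying the array. As a hedged sketch, and not necessarily the approach that answer goes on to take, the five largest values can be obtained with np.sort, which returns a sorted copy and so makes an explicit copy unnecessary:

```python
import numpy as np

x = np.array([3, 4, 2, 1, 7, 8, 6, 5, 9])

# np.sort returns a new sorted array, so x itself is left untouched.
# Take the last five (the largest) and reverse them to get largest-first.
top5 = np.sort(x)[-5:][::-1]
print(top5)  # [9 8 7 6 5]
```

For a large array where only a few values are needed, np.partition(x, -5)[-5:] would avoid the full sort, at the cost of returning the five values in no particular order.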


