Binning data with NumPy and pandas


2024-07-09 13:20 | Source: compiled from the web

Binning data in NumPy and computing the mean within each bin:

    import numpy as np

    data = np.arange(100)
    bins = np.linspace(0, 50, 10)
    bins = np.append(bins, np.inf)  # the last bin extends to infinity
    # np.digitize returns the index of the bin to which each input value belongs
    digitized = np.digitize(data, bins)

    # Method 1: select the values of each bin with a boolean mask and average them
    bin_means = [data[digitized == i].mean() for i in range(1, len(bins))]

    # Method 2: per-bin sum (histogram weighted by the data) divided by per-bin count
    bin_means1 = np.histogram(data, bins, weights=data)[0] / np.histogram(data, bins)[0]

    # Source: https://stackoverflow.com/questions/6163334/binning-data-in-python-with-scipy-numpy
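As a quick sanity check, the two methods can be compared on the same input. This is a minimal sketch that reuses the data and bin edges from the snippet above; the names `means_digitize` and `means_hist` are introduced here only for illustration.

```python
import numpy as np

data = np.arange(100)
# Same edges as above: 10 evenly spaced edges on [0, 50], plus an open-ended last bin
bins = np.append(np.linspace(0, 50, 10), np.inf)

# Method 1: mask the values of each bin and average them
digitized = np.digitize(data, bins)
means_digitize = np.array(
    [data[digitized == i].mean() for i in range(1, len(bins))]
)

# Method 2: weighted histogram (per-bin sums) divided by plain histogram (per-bin counts)
means_hist = np.histogram(data, bins, weights=data)[0] / np.histogram(data, bins)[0]

# Both approaches should give one mean per bin and agree with each other
print(np.allclose(means_digitize, means_hist))
```

Every bin here contains at least one value. With an empty bin, both methods would divide by zero or average an empty selection and yield NaN along with a runtime warning.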

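If the values passed to numpy.digitize(data, bins) fall outside the bin edges, the function effectively adds an extra bin at each end: values below bins[0] get index 0, and values at or above bins[-1] get index len(bins). For example: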

    import numpy as np

    data = np.array([-1, 0.5, 1.5, 2.5, 3.5, 4.5, 5, 6])
    bins = np.linspace(0, 5, 6)
    print(bins)
    di = np.digitize(data, bins)
    dt = np.c_[data, di]  # pair each value with its bin index
    print(dt)
    '''
    [0. 1. 2. 3. 4. 5.]
    [[-1.   0. ]
     [ 0.5  1. ]
     [ 1.5  2. ]
     [ 2.5  3. ]
     [ 3.5  4. ]
     [ 4.5  5. ]
     [ 5.   6. ]
     [ 6.   6. ]]
    '''
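Since indices 0 and len(bins) mark values outside the edges, they can be used to separate out-of-range values from the rest. The following is a small sketch of that idea; the mask names `below`, `above` and `in_range` are illustrative and not part of NumPy.

```python
import numpy as np

data = np.array([-1, 0.5, 1.5, 2.5, 3.5, 4.5, 5, 6])
bins = np.linspace(0, 5, 6)
idx = np.digitize(data, bins)  # default right=False: bins[i-1] <= x < bins[i]

below = idx == 0            # values smaller than bins[0]
above = idx == len(bins)    # values greater than or equal to bins[-1]
in_range = ~(below | above)

print(data[below])     # values left of the first edge
print(data[above])     # values at or right of the last edge
print(data[in_range])  # values that fall in a regular bin
```

Note that 5 counts as out of range here because, with the default right=False, each bin is half-open and excludes its right edge.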

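To explain method 2, consider the signature numpy.histogram(a, bins=10, range=None, normed=None, weights=None, density=None). (The deprecated normed argument is no longer present in recent NumPy releases.)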

Returns
– hist (array): The values of the histogram. See density and weights for a description of the possible semantics.
– bin_edges (array of dtype float): Return the bin edges (length(hist)+1).

Parameters
– weights (array_like, optional): An array of weights, of the same shape as a. Each value in a only contributes its associated weight towards the bin count (instead of 1).

As an example of how the mean is computed: with weights=data, each value adds itself (rather than 1) to its bin, so the weighted histogram holds the per-bin sums and the unweighted histogram holds the per-bin counts. If a bin contains [1, 2, 3, 4], then

    numpy.histogram(data, bins, weights=data)[0] / numpy.histogram(data, bins)[0] = (1*1 + 2*1 + 3*1 + 4*1) / 4 = 2.5
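The arithmetic above can be reproduced directly. This is a minimal sketch in which the extra values 10 and 20 and the bin edges are made up for illustration, so that the first bin contains exactly [1, 2, 3, 4].

```python
import numpy as np

data = np.array([1, 2, 3, 4, 10, 20])
bins = np.array([0, 5, 15, 25])  # hypothetical edges: [0, 5), [5, 15), [15, 25]

# Weighted histogram: each value contributes itself, giving the sum per bin
sums, _ = np.histogram(data, bins, weights=data)
# Unweighted histogram: each value contributes 1, giving the count per bin
counts, _ = np.histogram(data, bins)

bin_means = sums / counts
print(sums)       # per-bin sums
print(counts)     # per-bin counts
print(bin_means)  # first entry is (1 + 2 + 3 + 4) / 4 = 2.5
```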

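Binning with pandas:

    import numpy as np
    import pandas as pd

    a = pd.DataFrame(np.random.rand(10, 1), columns=['A'])
    # 5 edges from 0 to 1 define 4 intervals, so 4 labels are needed
    a['A_cat'] = pd.cut(a['A'], bins=np.linspace(0, 1, 5), labels=[1, 2, 3, 4])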

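Note that labels must contain one fewer element than bins: n edges define n-1 intervals, so the 5 edges above take 4 labels.

References: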
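To mirror the NumPy per-bin mean in pandas, the labels produced by pd.cut can be used as a groupby key. The sketch below uses a small hand-picked column instead of random data so the result is reproducible; the names `edges`, `df` and `bin_means` are illustrative.

```python
import numpy as np
import pandas as pd

edges = np.linspace(0, 1, 5)   # 0, 0.25, 0.5, 0.75, 1 -> 4 intervals
labels = [1, 2, 3, 4]          # one label per interval

df = pd.DataFrame({"A": [0.1, 0.2, 0.3, 0.6, 0.9, 0.95]})
df["A_cat"] = pd.cut(df["A"], bins=edges, labels=labels)

# Mean of A within each bin; observed=False keeps bins that happen to be empty
bin_means = df.groupby("A_cat", observed=False)["A"].mean()
print(df)
print(bin_means)
```

By default pd.cut uses right-closed intervals such as (0, 0.25], so a value exactly equal to the lowest edge gets NaN unless include_lowest=True is passed.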

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow
https://stackoverflow.com/questions/6163334/binning-data-in-python-with-scipy-numpy

