Answers (8)

2 years ago

percentile() is available in numpy too.

import numpy as np

a = np.array([1, 2, 3, 4, 5])
p = np.percentile(a, 50)  # returns the 50th percentile, i.e. the median
print(p)

3.0

This ticket led me to believe that percentile() would not be integrated into numpy any time soon.
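As a quick check of the numpy call, here is a small sketch (the sample values are chosen only for illustration) showing that the 50th percentile matches np.median and that several percentiles can be requested in a single call:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# The 50th percentile of the data is its median.
p50 = np.percentile(a, 50)
med = np.median(a)

# q may also be a sequence; one value is returned per requested percentile.
quartiles = np.percentile(a, [25, 50, 75])

print(p50, med)   # 3.0 3.0
print(quartiles)  # [2. 3. 4.]
```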

2 years ago

## {{{ http://code.activestate.com/recipes/511478/ (r1)
import math
import functools

def percentile(N, percent, key=lambda x:x):
    """
    Find the percentile of a list of values.

    @parameter N - is a list of values. Note N MUST BE already sorted.
    @parameter percent - a float value from 0.0 to 1.0.
    @parameter key - optional key function to compute value from each element of N.

    @return - the percentile of the values
    """
    if not N:
        return None
    k = (len(N)-1) * percent
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        return key(N[int(k)])
    d0 = key(N[int(f)]) * (c-k)
    d1 = key(N[int(c)]) * (k-f)
    return d0+d1

# median is 50th percentile.
median = functools.partial(percentile, percent=0.5)
## end of http://code.activestate.com/recipes/511478/ }}}
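To make the interpolation step concrete, the sketch below restates the recipe's arithmetic in a standalone function and works one case by hand. The name interpolated_percentile and the sample lists are illustrative assumptions, not part of the recipe.

```python
import math

def interpolated_percentile(sorted_values, fraction):
    """Linear interpolation between the two closest ranks.

    Hypothetical helper for illustration; sorted_values must already be
    sorted and fraction must lie between 0.0 and 1.0.
    """
    if not sorted_values:
        return None
    k = (len(sorted_values) - 1) * fraction  # fractional index into the list
    f = math.floor(k)
    c = math.ceil(k)
    if f == c:
        # k lands exactly on an element, so no interpolation is needed.
        return sorted_values[int(k)]
    # Each neighbour is weighted by how close k is to it.
    return sorted_values[f] * (c - k) + sorted_values[c] * (k - f)

# For [1, 2, 3, 4] and fraction 0.5: k = 1.5, so the result is
# 2 * 0.5 + 3 * 0.5 = 2.5.
print(interpolated_percentile([1, 2, 3, 4], 0.5))      # 2.5
# For five elements and fraction 0.25: k = 1.0, an exact index.
print(interpolated_percentile([1, 2, 3, 4, 5], 0.25))  # 2
```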

2 years ago

import numpy as np

a = [154, 400, 1124, 82, 94, 108]
print(np.percentile(a, 95))  # gives the 95th percentile

2 years ago

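The definition of percentile I usually see expects the result to be the value from the supplied list below which P percent of the values are found. That means the result must come from the set itself, not from an interpolation between set elements. For that, you can use a simpler function.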

def percentile(N, P):
    """
    Find the percentile of a list of values

    @parameter N - A list of values. N must be sorted.
    @parameter P - A float value from 0.0 to 1.0

    @return - The percentile of the values.
    """
    n = int(round(P * len(N) + 0.5))
    return N[n-1]

# A = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# B = (15, 20, 35, 40, 50)
#
# Sample outputs (from Python 2, where round() rounds halves away from zero):
#
# print(percentile(A, P=0.3))
# 4
# print(percentile(A, P=0.8))
# 9
# print(percentile(B, P=0.3))
# 20
# print(percentile(B, P=0.8))
# 50
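On Python 3, round() rounds halves to the nearest even integer, so round(8.5) is 8 and the two P=0.8 cases come out one element lower than listed. A minimal sketch that yields the listed outputs on Python 3, assuming floor-based rank arithmetic is an acceptable substitute; the name percentile_floor and the clamping of the rank are additions for illustration:

```python
import math

def percentile_floor(sorted_values, p):
    """Return the value below which a fraction p of the values lie.

    Hypothetical Python 3 variant; sorted_values must be sorted and
    p is a float from 0.0 to 1.0.
    """
    # floor(x) + 1 equals round-half-up of (x + 0.5) for x >= 0.
    n = int(math.floor(p * len(sorted_values))) + 1
    # Clamp so that p = 1.0 returns the last element instead of raising.
    n = min(n, len(sorted_values))
    return sorted_values[n - 1]

A = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
B = (15, 20, 35, 40, 50)

print(percentile_floor(A, 0.3))  # 4
print(percentile_floor(A, 0.8))  # 9
print(percentile_floor(B, 0.3))  # 20
print(percentile_floor(B, 0.8))  # 50
```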

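If you would rather get the value from the supplied list at or below which P percent of the values are found, use this simple modification:

def percentile(N, P):
    n = int(round(P * len(N) + 0.5))
    if n > 1:
        return N[n-2]
    else:
        return N[0]

Or, with the simplification suggested by @ijustlovemath:

def percentile(N, P):
    n = max(int(round(P * len(N) + 0.5)), 2)
    return N[n-2]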

2 years ago

Check the scipy.stats module:

scipy.stats.scoreatpercentile

2 years ago

Here is how to do it without numpy, using only Python to calculate the percentile.

import math

def percentile(data, percent):
    # percent is a number from 0 to 100; the data is sorted on every call
    size = len(data)
    return sorted(data)[int(math.ceil((size * percent) / 100)) - 1]

# mylist is the list of numbers to analyze
p5 = percentile(mylist, 5)
p25 = percentile(mylist, 25)
p50 = percentile(mylist, 50)
p75 = percentile(mylist, 75)
p95 = percentile(mylist, 95)
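For a concrete run, the sketch below applies the same nearest-rank formula to the sample list used in an earlier answer. The name nearest_rank and the lower clamp for a percentile of 0 are assumptions added here; without the clamp the index becomes -1 and the largest element is returned.

```python
import math

def nearest_rank(data, percent):
    """Nearest-rank percentile; percent is a number from 0 to 100."""
    ordered = sorted(data)
    # 1-based rank of the element that covers percent% of the data.
    rank = math.ceil(len(ordered) * percent / 100)
    rank = max(rank, 1)  # assumption: treat the 0th percentile as the minimum
    return ordered[rank - 1]

mylist = [154, 400, 1124, 82, 94, 108]
for q in (5, 25, 50, 75, 95):
    print(q, nearest_rank(mylist, q))
```

With this list the loop prints 82, 94, 108, 400 and 1124 for the five requested percentiles, each of which is an element of the input.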

2 years ago

To calculate the percentile rank of each value in a series, run:

from scipy.stats import rankdata
import numpy as np

def calc_percentile(a, method='min'):
    if isinstance(a, list):
        a = np.asarray(a)
    return rankdata(a, method=method) / float(len(a))

For example:

a = list(range(20))
# float() keeps the printed values as plain Python floats
print({val: round(float(pct), 3) for val, pct in zip(a, calc_percentile(a))})

>>> {0: 0.05, 1: 0.1, 2: 0.15, 3: 0.2, 4: 0.25, 5: 0.3, 6: 0.35, 7: 0.4, 8: 0.45, 9: 0.5, 10: 0.55, 11: 0.6, 12: 0.65, 13: 0.7, 14: 0.75, 15: 0.8, 16: 0.85, 17: 0.9, 18: 0.95, 19: 1.0}

2 years ago

In case you need the answer to be a member of the input numpy array:

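Just to add that, by default, the percentile function in numpy computes its output as a linear weighted average of the two neighboring entries in the input vector. In some cases you may want the returned percentile to be an actual element of the vector. In that case, from v1.9.0 onwards, you can use the "interpolation" option with "lower", "higher" or "nearest". In NumPy 1.22 and later the same option is spelled "method", and "interpolation" is deprecated.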

import numpy as np

x=np.random.uniform(10,size=(1000))-5.0

np.percentile(x,70) # 70th percentile

2.075966046220879

np.percentile(x,70,interpolation="nearest")

2.0729677997904314

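The latter is an actual entry in the vector, while the former is a linear interpolation of the two vector entries that border the percentile.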
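A deterministic sketch of the same comparison, using a small fixed array instead of random data so the results can be checked by hand. The helper percentile_member is hypothetical; it tries the method keyword (NumPy 1.22 and later) and falls back to interpolation on older releases.

```python
import numpy as np

def percentile_member(x, q, how="nearest"):
    """Percentile that is an actual element of x (hypothetical helper)."""
    try:
        return np.percentile(x, q, method=how)         # NumPy >= 1.22
    except TypeError:
        return np.percentile(x, q, interpolation=how)  # older NumPy

x = np.arange(1.0, 11.0)  # 1.0, 2.0, ..., 10.0

# The 70th percentile sits at fractional index 0.7 * 9 = 6.3.
print(np.percentile(x, 70))                 # interpolated, close to 7.3
print(percentile_member(x, 70, "lower"))    # 7.0
print(percentile_member(x, 70, "higher"))   # 8.0
print(percentile_member(x, 70, "nearest"))  # 7.0
```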
