random


random --- Generate pseudo-random numbers

Source code: Lib/random.py

This module implements pseudo-random number generators for various distributions.

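For integers, there is uniform selection from a range. For sequences, there is uniform selection of a random element, a function to generate a random permutation of a list in place, and a function for random sampling without replacement.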
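As a brief illustration of the integer and sequence functions described above, here is a minimal sketch. The seed value, the variable names, and the sample data are assumptions chosen for the example; they do not come from the module documentation.

```python
import random

# A separate, seeded generator so repeated runs give the same results.
# The seed 42 is arbitrary.
rng = random.Random(42)

# Integers: uniform selection from a range.
roll = rng.randrange(1, 7)    # behaves like choosing from range(1, 7), i.e. 1..6
rank = rng.randint(1, 13)     # both endpoints are included

# Sequences: uniform choice, in-place permutation, sampling without replacement.
suits = ['clubs', 'diamonds', 'hearts', 'spades']
pick = rng.choice(suits)

deck = list(range(10))
rng.shuffle(deck)             # reorders the list in place and returns None

hand = rng.sample(range(52), k=5)   # five distinct values

print(roll, rank, pick, deck, hand)
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the example's state separate from any other code that calls `random`.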

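On the real line, there are functions to compute uniform, normal (Gaussian), lognormal, negative exponential, gamma, and beta distributions. For generating distributions of angles, the von Mises distribution is available.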
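The sketch below draws one value from each of the real-valued distributions listed above. The parameter values and the seed are arbitrary assumptions made only for illustration.

```python
import math
import random

rng = random.Random(2024)   # arbitrary seed for repeatability

samples = {
    'uniform': rng.uniform(2.5, 10.0),            # between the two bounds
    'normal': rng.gauss(0.0, 1.0),                # mean 0, standard deviation 1
    'lognormal': rng.lognormvariate(0.0, 0.25),   # always positive
    'exponential': rng.expovariate(1 / 5),        # rate 1/5, so mean 5
    'gamma': rng.gammavariate(2.0, 3.0),          # shape alpha=2, scale beta=3
    'beta': rng.betavariate(2.0, 5.0),            # between 0 and 1
    'von Mises': rng.vonmisesvariate(0.0, 4.0),   # angle in radians, 0 to 2*pi
}

for name, value in samples.items():
    print(f'{name:12s} {value:.4f}')
```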

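Almost all module functions depend on the basic function random(), which generates a random float uniformly in the half-open range 0.0 <= X < 1.0.

    >>> from random import sample, shuffle
    >>> deck = 'ace two three four'.split()
    >>> shuffle(deck)                            # Shuffle a list
    >>> deck
    ['four', 'two', 'ace', 'three']
    >>> sample([10, 20, 30, 40, 50], k=4)        # Four samples without replacement
    [40, 10, 50, 30]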

Simulations:

    >>> from random import binomialvariate, choices, sample

    >>> # Six roulette wheel spins (weighted sampling with replacement)
    >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
    ['red', 'green', 'black', 'black', 'red', 'black']

    >>> # Deal 20 cards without replacement from a deck
    >>> # of 52 playing cards, and determine the proportion of cards
    >>> # with a ten-value: ten, jack, queen, or king.
    >>> deal = sample(['tens', 'low cards'], counts=[16, 36], k=20)
    >>> deal.count('tens') / 20
    0.15

    >>> # Estimate the probability of getting 5 or more heads from 7 spins
    >>> # of a biased coin that settles on heads 60% of the time.
    >>> sum(binomialvariate(n=7, p=0.6) >= 5 for i in range(10_000)) / 10_000
    0.4169

    >>> # Probability of the median of 5 samples being in middle two quartiles
    >>> def trial():
    ...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
    ...
    >>> sum(trial() for i in range(10_000)) / 10_000
    0.7958

Example of statistical bootstrapping using resampling with replacement to estimate a confidence interval for the mean of a sample:

    # https://www.thoughtco.com/example-of-bootstrapping-3126155
    from statistics import fmean as mean
    from random import choices

    data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
    means = sorted(mean(choices(data, k=len(data))) for i in range(100))
    print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
          f'interval from {means[5]:.1f} to {means[94]:.1f}')

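Example of a resampling permutation test to determine the statistical significance, or p-value, of an observed difference between the effects of a drug and a placebo: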

    # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
    from statistics import fmean as mean
    from random import shuffle

    drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
    placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
    observed_diff = mean(drug) - mean(placebo)

    n = 10_000
    count = 0
    combined = drug + placebo
    for i in range(n):
        shuffle(combined)
        new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
        count += (new_diff >= observed_diff)

    print(f'{n} label reshufflings produced only {count} instances with a difference')
    print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
    print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
    print(f'hypothesis that there is no difference between the drug and the placebo.')

Simulation of arrival times and service deliveries for a multiserver queue:

    from heapq import heapify, heapreplace
    from random import expovariate, gauss
    from statistics import mean, quantiles

    average_arrival_interval = 5.6
    average_service_time = 15.0
    stdev_service_time = 3.5
    num_servers = 3

    waits = []
    arrival_time = 0.0
    servers = [0.0] * num_servers  # time when each server becomes available
    heapify(servers)
    for i in range(1_000_000):
        arrival_time += expovariate(1.0 / average_arrival_interval)
        next_server_available = servers[0]
        wait = max(0.0, next_server_available - arrival_time)
        waits.append(wait)
        service_duration = max(0.0, gauss(average_service_time, stdev_service_time))
        service_completed = arrival_time + wait + service_duration
        heapreplace(servers, service_completed)

    print(f'Mean wait: {mean(waits):.1f}   Max wait: {max(waits):.1f}')
    print('Quartiles:', [round(q, 1) for q in quantiles(waits)])

See also

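Statistics for Hackers, a video tutorial by Jake Vanderplas on statistical analysis using just a few fundamental concepts, including simulation, sampling, shuffling, and cross-validation.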

Economics Simulation, a simulation of a marketplace by Peter Norvig that shows effective use of many of the tools and distributions provided by this module (gauss, uniform, sample, betavariate, choice, triangular, and randrange).

A Concrete Introduction to Probability (using Python), a tutorial by Peter Norvig covering the basics of probability theory, how to write simulations, and how to perform data analysis using Python.

Recipes

These recipes show how to efficiently make random selections from the combinatoric iterators in the itertools module:

    import random

    def random_product(*args, repeat=1):
        "Random selection from itertools.product(*args, repeat=repeat)"
        pools = [tuple(pool) for pool in args] * repeat
        return tuple(map(random.choice, pools))

    def random_permutation(iterable, r=None):
        "Random selection from itertools.permutations(iterable, r)"
        pool = tuple(iterable)
        r = len(pool) if r is None else r
        return tuple(random.sample(pool, r))

    def random_combination(iterable, r):
        "Random selection from itertools.combinations(iterable, r)"
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.sample(range(n), r))
        return tuple(pool[i] for i in indices)

    def random_combination_with_replacement(iterable, r):
        "Choose r elements with replacement.  Order the result to match the iterable."
        # Result will be in set(itertools.combinations_with_replacement(iterable, r)).
        pool = tuple(iterable)
        n = len(pool)
        indices = sorted(random.choices(range(n), k=r))
        return tuple(pool[i] for i in indices)

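The default random() returns multiples of 2⁻⁵³ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced and are exactly representable as Python floats. However, many other representable floats in that interval are not possible selections. For example, 0.05954861408025609 isn't an integer multiple of 2⁻⁵³.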
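One way to check the claim above is to scale a float by 2⁵³ and test whether the result is a whole number. The helper name `is_multiple_of_2_pow_minus_53` is hypothetical and introduced only for this sketch. The check relies on the fact that multiplying a float in this range by a power of two is exact.

```python
import random

def is_multiple_of_2_pow_minus_53(x: float) -> bool:
    """Return True if x is an integer multiple of 2**-53 (hypothetical helper)."""
    # Scaling by 2**53 only changes the exponent, so no rounding occurs
    # for values in the range 0.0 <= x < 1.0.
    return (x * 2.0**53).is_integer()

# Every value from the default generator should pass the check.
default_ok = all(is_multiple_of_2_pow_minus_53(random.random())
                 for _ in range(1_000))
print(default_ok)

# The example value quoted in the text can be tested the same way.
print(is_multiple_of_2_pow_minus_53(0.05954861408025609))
```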

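The following recipe takes a different approach. All floats in the interval are possible selections. The mantissa comes from a uniform distribution of integers in the range 2⁵² ≤ mantissa < 2⁵³. The exponent comes from a geometric distribution in which exponents smaller than -53 occur half as often as the next larger exponent.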

    from random import Random
    from math import ldexp

    class FullRandom(Random):

        def random(self):
            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
            exponent = -53
            x = 0
            while not x:
                x = self.getrandbits(32)
                exponent += x.bit_length() - 32
            return ldexp(mantissa, exponent)

All real-valued distributions in the class will use the new method:

    >>> fr = FullRandom()
    >>> fr.random()
    0.05954861408025609
    >>> fr.expovariate(0.25)
    8.87925541791544
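The following sketch repeats the FullRandom recipe so that it runs on its own, then checks three of its properties empirically: every value stays within 0.0 ≤ x < 1.0, some values are finer than a multiple of 2⁻⁵³, and about half of the values fall below 0.5. The seed and the sample size are arbitrary assumptions.

```python
from math import ldexp
from random import Random

class FullRandom(Random):
    """Same recipe as in the text, repeated so this block is self-contained."""

    def random(self):
        mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
        exponent = -53
        x = 0
        while not x:
            x = self.getrandbits(32)
            exponent += x.bit_length() - 32
        return ldexp(mantissa, exponent)

fr = FullRandom(12345)                      # arbitrary seed
values = [fr.random() for _ in range(10_000)]

# All results stay inside the half-open unit interval.
in_range = all(0.0 <= v < 1.0 for v in values)

# Count values that are NOT integer multiples of 2**-53; the default
# random() could never produce these.
finer = sum(not (v * 2.0**53).is_integer() for v in values)

# The exponent is -53 about half the time, so roughly half of the values
# land in 0.5 <= x < 1.0 and the rest fall below 0.5.
below_half = sum(v < 0.5 for v in values) / len(values)

print(in_range, finer, below_half)
```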

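The recipe is conceptually equivalent to an algorithm that chooses from all the multiples of 2⁻¹⁰⁷⁴ in the range 0.0 ≤ x < 1.0. All such numbers are evenly spaced, but most have to be rounded down to the nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest positive unnormalized float and is equal to math.ulp(0.0).)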
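The relationship between 2⁻¹⁰⁷⁴ and math.ulp(0.0) can be confirmed directly. This is a small sketch that assumes IEEE 754 double-precision floats, which is what CPython uses on common platforms, and Python 3.9 or later for math.ulp().

```python
import math
import sys

smallest = math.ulp(0.0)    # smallest positive (subnormal) float

# 2**-1074 built exactly from its exponent.
same_as_power = smallest == math.ldexp(1.0, -1074)

# Equivalent view: smallest normal float (2**-1022) times machine epsilon (2**-52).
same_as_min_times_eps = smallest == sys.float_info.min * sys.float_info.epsilon

print(smallest, same_as_power, same_as_min_times_eps)
```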

See also

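Generating Pseudo-random Floating-Point Values, a paper by Allen B. Downey describing ways to generate more fine-grained floats than those normally generated by random().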
