Function reference
Bottleneck provides the following functions:
NumPy/SciPy | median, nanmedian, rankdata, ss, nansum, nanmin, nanmax, nanmean, nanstd, nanargmin, nanargmax
Functions | nanrankdata, nanvar, partsort, argpartsort, replace, nn, anynan, allnan
Moving window | move_sum, move_nansum, move_mean, move_nanmean, move_median, move_std, move_nanstd, move_min, move_nanmin, move_max, move_nanmax
NumPy/SciPy
Fast replacements for NumPy and SciPy functions.
bottleneck.median(arr, axis=None)
Median of array elements along given axis.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the median is computed. The default (axis=None) is to compute the median of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, except that the specified axis has been removed. If arr is a 0d array, or if axis is None, a scalar is returned. float64 return values are used for integer inputs.
See also
bottleneck.nanmedian
- Median along specified axis ignoring NaNs.
Notes
This function returns the same output as NumPy’s median except when the input contains NaN.
Examples
>>> a = np.array([[10, 7, 4], [3, 2, 1]])
>>> a
array([[10,  7,  4],
       [ 3,  2,  1]])
>>> bn.median(a)
3.5
>>> bn.median(a, axis=0)
array([ 6.5,  4.5,  2.5])
>>> bn.median(a, axis=1)
array([ 7.,  2.])
bottleneck.nanmedian(arr, axis=None)
Median of array elements along given axis ignoring NaNs.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the median is computed. The default (axis=None) is to compute the median of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, except that the specified axis has been removed. If arr is a 0d array, or if axis is None, a scalar is returned. float64 return values are used for integer inputs.
See also
bottleneck.median
- Median along specified axis.
Examples
>>> a = np.array([[np.nan, 7, 4], [3, 2, 1]])
>>> a
array([[ nan,   7.,   4.],
       [  3.,   2.,   1.]])
>>> bn.nanmedian(a)
3.0
>>> bn.nanmedian(a, axis=0)
array([ 3. ,  4.5,  2.5])
>>> bn.nanmedian(a, axis=1)
array([ 5.5,  2. ])
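As a rough illustration of what "ignoring NaNs" means for a median, the pure-Python sketch below drops NaN entries before taking the median of a flat sequence. The helper name nanmedian_ref is hypothetical, the all-NaN result is an assumption, and the code only mirrors the semantics; it is not Bottleneck's implementation.

```python
import math
import statistics

def nanmedian_ref(values):
    """Median of a flat sequence, skipping NaN entries (illustrative only)."""
    kept = [float(v) for v in values if not math.isnan(v)]
    if not kept:
        # Assumption: an all-NaN (or empty) input yields NaN.
        return math.nan
    return statistics.median(kept)

flat = [math.nan, 7, 4, 3, 2, 1]   # the example array above, flattened
print(nanmedian_ref(flat))         # 3.0
print(nanmedian_ref(flat[:3]))     # 5.5, the first row of the example array
```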
bottleneck.rankdata(arr, axis=None)
Ranks the data, dealing with ties appropriately.
Equal values are assigned a rank that is the average of the ranks that would have been otherwise assigned to all of the values within that set. Ranks begin at 1, not 0.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the elements of the array are ranked. The default (axis=None) is to rank the elements of the flattened array.
Returns: y : ndarray
An array with the same shape as arr. The dtype is ‘float64’.
See also
bottleneck.nanrankdata
- Ranks the data dealing with ties and NaNs.
Examples
>>> bn.rankdata([0, 2, 2, 3])
array([ 1. ,  2.5,  2.5,  4. ])
>>> bn.rankdata([[0, 2], [2, 3]])
array([ 1. ,  2.5,  2.5,  4. ])
>>> bn.rankdata([[0, 2], [2, 3]], axis=0)
array([[ 1.,  1.],
       [ 2.,  2.]])
>>> bn.rankdata([[0, 2], [2, 3]], axis=1)
array([[ 1.,  2.],
       [ 1.,  2.]])
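The tie rule described above (tied values share the average of the ranks they would otherwise occupy) can be sketched in pure Python for a flat sequence. The name rankdata_ref is hypothetical and the code is a reference for the rule only, not the algorithm Bottleneck uses.

```python
def rankdata_ref(values):
    """Average ranks (1-based) of a flat sequence; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value ties with the current one.
        stop = start
        while (stop + 1 < len(order)
               and values[order[stop + 1]] == values[order[start]]):
            stop += 1
        # Sorted positions start..stop hold ranks start+1..stop+1; average them.
        mean_rank = (start + stop + 2) / 2.0
        for pos in range(start, stop + 1):
            ranks[order[pos]] = mean_rank
        start = stop + 1
    return ranks

print(rankdata_ref([0, 2, 2, 3]))  # [1.0, 2.5, 2.5, 4.0]
```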
bottleneck.ss(arr, axis=0)
Sum of the square of each element along specified axis.
Parameters: arr : array_like
Array whose sum of squares is desired. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the sum of squares is computed. The default (axis=0) is to sum the squares along the first dimension.
Returns: y : ndarray
The sum of a**2 along the given axis.
See also
bottleneck.nn
- Nearest neighbor.
Examples
>>> a = np.array([1., 2., 5.])
>>> bn.ss(a)
30.0
And calculating along an axis:
>>> b = np.array([[1., 2., 5.], [2., 5., 6.]])
>>> bn.ss(b, axis=1)
array([ 30.,  65.])
bottleneck.nansum(arr, axis=None)
Sum of array elements along given axis ignoring NaNs.
When the input has an integer type with less precision than the default platform integer, the default platform integer is used for the accumulator and return values.
Parameters: arr : array_like
Array containing numbers whose sum is desired. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the sum is computed. The default (axis=None) is to compute the sum of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned.
Notes
No error is raised on overflow.
If positive or negative infinity is present, the result is positive or negative infinity, respectively. If both positive and negative infinity are present, the result is Not A Number (NaN).
Examples
>>> bn.nansum(1)
1
>>> bn.nansum([1])
1
>>> bn.nansum([1, np.nan])
1.0
>>> a = np.array([[1, 1], [1, np.nan]])
>>> bn.nansum(a)
3.0
>>> bn.nansum(a, axis=0)
array([ 2.,  1.])
When positive infinity and negative infinity are present:
>>> bn.nansum([1, np.nan, np.inf])
inf
>>> bn.nansum([1, np.nan, np.NINF])
-inf
>>> bn.nansum([1, np.nan, np.inf, np.NINF])
nan
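The infinity behavior in the Notes follows from ordinary IEEE floating-point addition once NaNs are skipped. A minimal pure-Python sketch, assuming a flat input and always accumulating in float (unlike the integer return values shown above); nansum_ref is a hypothetical name, not part of Bottleneck:

```python
import math

def nansum_ref(values):
    """Sum of a flat sequence, skipping NaN entries (illustrative only)."""
    total = 0.0
    for v in values:
        if not math.isnan(v):
            total += v
    return total

inf = math.inf
print(nansum_ref([1, math.nan, inf]))        # inf
print(nansum_ref([1, math.nan, -inf]))       # -inf
print(nansum_ref([1, math.nan, inf, -inf]))  # nan, since inf + (-inf) is undefined
```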
bottleneck.nanmin(arr, axis=None)
Minimum values along specified axis, ignoring NaNs.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the minimum is computed. The default (axis=None) is to compute the minimum of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned.
See also
bottleneck.nanmax
- Maximum along specified axis, ignoring NaNs.
bottleneck.nanargmin
- Indices of minimum values along axis, ignoring NaNs.
Examples
>>> bn.nanmin(1)
1
>>> bn.nanmin([1])
1
>>> bn.nanmin([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanmin(a)
1.0
>>> bn.nanmin(a, axis=0)
array([ 1.,  4.])
bottleneck.nanmax(arr, axis=None)
Maximum values along specified axis, ignoring NaNs.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the maximum is computed. The default (axis=None) is to compute the maximum of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned.
See also
bottleneck.nanmin
- Minimum along specified axis, ignoring NaNs.
bottleneck.nanargmax
- Indices of maximum values along axis, ignoring NaNs.
Examples
>>> bn.nanmax(1)
1
>>> bn.nanmax([1])
1
>>> bn.nanmax([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanmax(a)
4.0
>>> bn.nanmax(a, axis=0)
array([ 1.,  4.])
bottleneck.nanmean(arr, axis=None)
Mean of array elements along given axis ignoring NaNs.
float64 intermediate and return values are used for integer inputs.
Parameters: arr : array_like
Array containing numbers whose mean is desired. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the mean is computed. The default (axis=None) is to compute the mean of the flattened array.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned. float64 intermediate and return values are used for integer inputs.
See also
bottleneck.nanmedian
- Median along specified axis, ignoring NaNs.
Notes
No error is raised on overflow. (The sum is computed and then the result is divided by the number of non-NaN elements.)
If positive or negative infinity is present, the result is positive or negative infinity, respectively. If both positive and negative infinity are present, the result is Not A Number (NaN).
Examples
>>> bn.nanmean(1)
1.0
>>> bn.nanmean([1])
1.0
>>> bn.nanmean([1, np.nan])
1.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanmean(a)
2.0
>>> bn.nanmean(a, axis=0)
array([ 1.,  4.])
When positive infinity and negative infinity are present:
>>> bn.nanmean([1, np.nan, np.inf])
inf
>>> bn.nanmean([1, np.nan, np.NINF])
-inf
>>> bn.nanmean([1, np.nan, np.inf, np.NINF])
nan
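The Notes describe the mean as a sum divided by the number of non-NaN elements. That recipe can be sketched in pure Python for a flat sequence; nanmean_ref is a hypothetical name, and returning NaN when no non-NaN element exists is an assumption rather than documented behavior.

```python
import math

def nanmean_ref(values):
    """Mean of a flat sequence: sum of non-NaN entries over their count."""
    total, count = 0.0, 0
    for v in values:
        if not math.isnan(v):
            total += v
            count += 1
    # Assumption: with no non-NaN elements the result is NaN.
    return total / count if count else math.nan

print(nanmean_ref([1, 4, 1, math.nan]))     # 2.0
print(nanmean_ref([1, math.nan, math.inf])) # inf
```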
bottleneck.nanstd(arr, axis=None, int ddof=0)
Standard deviation along the specified axis, ignoring NaNs.
float64 intermediate and return values are used for integer inputs.
Instead of a faster one-pass algorithm, a more stable two-pass algorithm is used.
An example of a one-pass algorithm:
>>> np.sqrt((arr*arr).mean() - arr.mean()**2)
An example of a two-pass algorithm:
>>> np.sqrt(((arr - arr.mean())**2).mean())
Note in the two-pass algorithm the mean must be found (first pass) before the squared deviation (second pass) can be found.
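To see why the two-pass form is preferred, the sketch below compares both formulas (without the final square root) on values with a large common offset, where the one-pass form loses precision to cancellation. The function names and the test data are illustrative assumptions; this is not Bottleneck's code.

```python
def var_one_pass(values):
    """Mean of squares minus square of mean (numerically fragile)."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return mean_sq - mean * mean

def var_two_pass(values):
    """Mean first (pass one), then mean squared deviation (pass two)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

# Exact population variance of these values is 22.5.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(var_two_pass(data))  # 22.5
print(var_one_pass(data))  # noticeably different from 22.5
```

The squares are near 1e18, where adjacent float64 values are more than 100 apart, so subtracting two such numbers cannot recover a result as small as 22.5.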
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the standard deviation is computed. The default (axis=None) is to compute the standard deviation of the flattened array.
ddof : int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned. float64 intermediate and return values are used for integer inputs.
See also
bottleneck.nanvar
- Variance along specified axis ignoring NaNs
Notes
If positive or negative infinity is present, the result is Not A Number (NaN).
Examples
>>> bn.nanstd(1)
0.0
>>> bn.nanstd([1])
0.0
>>> bn.nanstd([1, np.nan])
0.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanstd(a)
1.4142135623730951
>>> bn.nanstd(a, axis=0)
array([ 0.,  0.])
When positive infinity or negative infinity is present, NaN is returned:
>>> bn.nanstd([1, np.nan, np.inf])
nan
bottleneck.nanargmin(arr, axis=None)
Indices of the minimum values along an axis, ignoring NaNs.
Parameters: arr : array_like
Input data.
axis : {int, None}, optional
Axis along which to operate. By default (axis=None) flattened input is used.
Returns: index_array : ndarray
An array of indices or a single index value.
See also
bottleneck.nanargmax
- Indices of the maximum values along an axis.
bottleneck.nanmin
- Minimum values along specified axis, ignoring NaNs.
Examples
>>> a = np.array([[np.nan, 4], [2, 3]])
>>> bn.nanargmin(a)
2
>>> a.flat[2]
2.0
>>> bn.nanargmin(a, axis=0)
array([1, 1])
>>> bn.nanargmin(a, axis=1)
array([1, 0])
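With axis=None the returned index refers to the flattened array. A pure-Python sketch of that lookup, assuming row-major flattening; nanargmin_ref is a hypothetical name, and raising ValueError for all-NaN input is an assumption made for the sketch:

```python
import math

def nanargmin_ref(flat_values):
    """Index of the smallest non-NaN entry in a flat sequence."""
    best = None
    for i, v in enumerate(flat_values):
        if math.isnan(v):
            continue  # NaNs never win the comparison
        if best is None or v < flat_values[best]:
            best = i
    if best is None:
        # Assumption: an all-NaN input is treated as an error.
        raise ValueError("all-NaN input")
    return best

flat = [math.nan, 4, 2, 3]      # [[nan, 4], [2, 3]] flattened row by row
idx = nanargmin_ref(flat)
print(idx, flat[idx])           # 2 2
```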
bottleneck.nanargmax(arr, axis=None)
Indices of the maximum values along an axis, ignoring NaNs.
Parameters: arr : array_like
Input data.
axis : {int, None}, optional
Axis along which to operate. By default (axis=None) flattened input is used.
Returns: index_array : ndarray
An array of indices or a single index value.
See also
bottleneck.nanargmin
- Indices of the minimum values along an axis.
bottleneck.nanmax
- Maximum values along specified axis, ignoring NaNs.
Examples
>>> a = np.array([[np.nan, 4], [2, 3]])
>>> bn.nanargmax(a)
1
>>> a.flat[1]
4.0
>>> bn.nanargmax(a, axis=0)
array([1, 0])
>>> bn.nanargmax(a, axis=1)
array([1, 1])
Functions
Miscellaneous functions.
bottleneck.nanrankdata(arr, axis=None)
Ranks the data, dealing with ties and NaNs appropriately.
Equal values are assigned a rank that is the average of the ranks that would have been otherwise assigned to all of the values within that set. Ranks begin at 1, not 0.
NaNs in the input array are returned as NaNs.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the elements of the array are ranked. The default (axis=None) is to rank the elements of the flattened array.
Returns: y : ndarray
An array with the same shape as arr. The dtype is ‘float64’.
See also
bottleneck.rankdata
- Ranks the data, dealing with ties appropriately.
Examples
>>> bn.nanrankdata([np.nan, 2, 2, 3])
array([ nan,  1.5,  1.5,  3. ])
>>> bn.nanrankdata([[np.nan, 2], [2, 3]])
array([ nan,  1.5,  1.5,  3. ])
>>> bn.nanrankdata([[np.nan, 2], [2, 3]], axis=0)
array([[ nan,   1.],
       [  1.,   2.]])
>>> bn.nanrankdata([[np.nan, 2], [2, 3]], axis=1)
array([[ nan,   1.],
       [  1.,   2.]])
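A pure-Python sketch of the behavior described above for a flat sequence: only non-NaN values are ranked (ties share their average rank) and NaN positions stay NaN. The name nanrankdata_ref is hypothetical, and this is a reference for the semantics, not Bottleneck's implementation.

```python
import math

def nanrankdata_ref(values):
    """Average ranks (1-based) of non-NaN entries; NaN entries remain NaN."""
    ranks = [math.nan] * len(values)
    # Sort only the positions that hold real numbers.
    order = sorted((i for i, v in enumerate(values) if not math.isnan(v)),
                   key=lambda i: values[i])
    start = 0
    while start < len(order):
        stop = start
        while (stop + 1 < len(order)
               and values[order[stop + 1]] == values[order[start]]):
            stop += 1
        # Tied values share the mean of ranks start+1..stop+1.
        mean_rank = (start + stop + 2) / 2.0
        for pos in range(start, stop + 1):
            ranks[order[pos]] = mean_rank
        start = stop + 1
    return ranks

print(nanrankdata_ref([math.nan, 2, 2, 3]))  # [nan, 1.5, 1.5, 3.0]
```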
bottleneck.nanvar(arr, axis=None, int ddof=0)
Variance along the specified axis, ignoring NaNs.
float64 intermediate and return values are used for integer inputs.
Instead of a faster one-pass algorithm, a more stable two-pass algorithm is used.
An example of a one-pass algorithm:
>>> (arr*arr).mean() - arr.mean()**2
An example of a two-pass algorithm:
>>> ((arr - arr.mean())**2).mean()
Note in the two-pass algorithm the mean must be found (first pass) before the squared deviation (second pass) can be found.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which the variance is computed. The default (axis=None) is to compute the variance of the flattened array.
ddof : int, optional
Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.
Returns: y : ndarray
An array with the same shape as arr, with the specified axis removed. If arr is a 0-d array, or if axis is None, a scalar is returned. float64 intermediate and return values are used for integer inputs.
See also
bottleneck.nanstd
- Standard deviation along specified axis ignoring NaNs.
Notes
If positive or negative infinity is present, the result is Not A Number (NaN).
Examples
>>> bn.nanvar(1)
0.0
>>> bn.nanvar([1])
0.0
>>> bn.nanvar([1, np.nan])
0.0
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.nanvar(a)
2.0
>>> bn.nanvar(a, axis=0)
array([ 0.,  0.])
When positive infinity or negative infinity is present, NaN is returned:
>>> bn.nanvar([1, np.nan, np.inf])
nan
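The role of ddof can be sketched with a pure-Python two-pass variance over a flat sequence. The sketch assumes that N counts only the non-NaN elements and that a non-positive divisor yields NaN; both are assumptions, and nanvar_ref is a hypothetical name rather than part of Bottleneck.

```python
import math

def nanvar_ref(values, ddof=0):
    """Two-pass variance of a flat sequence, skipping NaN entries."""
    kept = [float(v) for v in values if not math.isnan(v)]
    n = len(kept)  # assumption: N counts only non-NaN elements
    if n - ddof <= 0:
        return math.nan  # assumption: no valid divisor gives NaN
    mean = sum(kept) / n                                     # first pass
    return sum((v - mean) ** 2 for v in kept) / (n - ddof)   # second pass

flat = [1, 4, 1, math.nan]
print(nanvar_ref(flat))          # 2.0  (6 / 3)
print(nanvar_ref(flat, ddof=1))  # 3.0  (6 / 2)
```

With an infinite input the mean is infinite, so a deviation of inf - inf appears and the result is NaN, which agrees with the Notes above.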
bottleneck.partsort(arr, n, axis=-1)
Partial sorting of array elements along given axis.
A partially sorted array is one in which the n smallest values appear (in any order) in the first n elements. The remaining largest elements are also unordered. Due to the algorithm used (Wirth’s method), the nth smallest element is in its sorted position (at index n-1).
Shuffling the input array may change the output. The only guarantee is that the first n elements will be the n smallest and the remaining elements will appear in the remainder of the output.
This function is not protected against NaN. Therefore, you may get unexpected results if the input contains NaN.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
n : int
The n smallest elements will appear (unordered) in the first n elements of the output array.
axis : {int, None}, optional
Axis along which the partial sort is performed. The default (axis=-1) is to sort along the last axis.
Returns: y : ndarray
A partially sorted copy of the input array where the n smallest elements will appear (unordered) in the first n elements.
See also
bottleneck.argpartsort
- Indices that would partially sort an array
Notes
Unexpected results may occur if the input array contains NaN.
Examples
Create a numpy array:
>>> a = np.array([1, 0, 3, 4, 2])
Partially sort array so that the first 3 elements are the smallest 3 elements (note, as in this example, that the smallest 3 elements may not be sorted):
>>> bn.partsort(a, n=3)
array([1, 0, 2, 4, 3])
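The guarantee stated above can be expressed as a small check: the first n entries hold the n smallest values in any order, and index n-1 holds the nth smallest. The helper is_partsorted is a hypothetical name used only to illustrate the invariant on flat, NaN-free sequences.

```python
def is_partsorted(values, n):
    """True if values satisfies the partial-sort guarantee for the given n."""
    ordered = sorted(values)
    # The first n entries must be exactly the n smallest values (any order).
    head_ok = sorted(values[:n]) == ordered[:n]
    # The nth smallest value must sit in its sorted position, index n - 1.
    pivot_ok = values[n - 1] == ordered[n - 1]
    return head_ok and pivot_ok

print(is_partsorted([1, 0, 2, 4, 3], 3))  # True: the documented output
print(is_partsorted([1, 0, 3, 4, 2], 3))  # False: the unsorted input
```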
bottleneck.argpartsort(arr, n, axis=-1)
Return indices that would partially sort an array.
A partially sorted array is one in which the n smallest values appear (in any order) in the first n elements. The remaining largest elements are also unordered. Due to the algorithm used (Wirth’s method), the nth smallest element is in its sorted position (at index n-1).
Shuffling the input array may change the output. The only guarantee is that the first n elements will be the n smallest and the remaining elements will appear in the remainder of the output.
This function is not protected against NaN. Therefore, you may get unexpected results if the input contains NaN.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
n : int
The indices of the n smallest elements will appear in the first n elements of the output array along the given axis.
axis : {int, None}, optional
Axis along which the partial sort is performed. The default (axis=-1) is to sort along the last axis.
Returns: y : ndarray
An array the same shape as the input array containing the indices that partially sort arr such that the n smallest elements will appear (unordered) in the first n elements.
See also
bottleneck.partsort
- Partial sorting of array elements along given axis.
Notes
Unexpected results may occur if the input array contains NaN.
Examples
Create a numpy array:
>>> a = np.array([1, 0, 3, 4, 2])
Find the indices that partially sort that array so that the first 3 elements are the smallest 3 elements:
>>> index = bn.argpartsort(a, n=3)
>>> index
array([0, 1, 4, 3, 2])
Let’s use the indices to partially sort the array (note, as in this example, that the smallest 3 elements may not be in order):
>>> a[index]
array([1, 0, 2, 4, 3])
bottleneck.replace(arr, old, new)
Replace (in place) given scalar values of an array with new values.
The equivalent NumPy expression:
arr[arr==old] = new
Or in the case where old=np.nan:
arr[np.isnan(arr)] = new
Parameters: arr : numpy.ndarray
The input array, which is also the output array since this function works in place.
old : scalar
All elements in arr with this value will be replaced by new.
new : scalar
All elements in arr with a value of old will be replaced by new.
Returns: None; the operation is performed in place.
Examples
Replace zero with 3 (note that the input array is modified):
>>> a = np.array([1, 2, 0])
>>> bn.replace(a, 0, 3)
>>> a
array([1, 2, 3])
Replace np.nan with 0:
>>> a = np.array([1, 2, np.nan])
>>> bn.replace(a, np.nan, 0)
>>> a
array([ 1.,  2.,  0.])
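NaN needs its own branch because NaN never compares equal to itself, so an equality test alone would find nothing to replace. A pure-Python sketch on a list, with replace_ref as a hypothetical stand-in that only mirrors the in-place semantics:

```python
import math

def replace_ref(values, old, new):
    """Replace entries equal to old with new, modifying the list in place."""
    old_is_nan = isinstance(old, float) and math.isnan(old)
    for i, v in enumerate(values):
        if old_is_nan:
            # NaN != NaN, so NaN entries must be detected with isnan.
            match = isinstance(v, float) and math.isnan(v)
        else:
            match = v == old
        if match:
            values[i] = new
    # Nothing is returned: the caller's list has been modified.

a = [1.0, 2.0, math.nan]
replace_ref(a, math.nan, 0.0)
print(a)  # [1.0, 2.0, 0.0]
```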
bottleneck.nn(arr, arr0, int axis=1)
Distance of nearest neighbor (and its index) along specified axis.
The Euclidean distance between arr0 and its nearest neighbor in arr is returned along with the index of the nearest neighbor in arr.
The squared distance used to determine the nearest neighbor of arr0 is equivalent to np.sum((arr - arr0) ** 2, axis), where arr is 2d and arr0 is 1d; arr0 must be reshaped to a column if axis is 0.
If all distances are NaN then the distance returned is NaN and the index is zero.
Parameters: arr : array_like
A 2d array. If arr is not an array, a conversion is attempted.
arr0 : array_like
A 1d array. If arr0 is not an array, a conversion is attempted.
axis : int, optional
Axis along which the distance is computed. The default (axis=1) is to compute the distance along rows.
Returns: dist : np.float64
The Euclidean distance between arr0 and the nearest neighbor in arr. If all distances are NaN then the distance returned is NaN.
idx : int
Index of nearest neighbor in arr. If all distances are NaN then the index returned is zero.
See also
bottleneck.ss
- Sum of squares along specified axis.
Notes
A brute force algorithm is used to find the nearest neighbor.
Depending on the shapes of arr and arr0, SciPy’s cKDTree may be faster than bn.nn(). So benchmark if speed is important.
The relative speed also depends on how many times you will use the same array arr to find nearest neighbors with different arr0. That is because it takes time to set up SciPy’s cKDTree.
Examples
Create the input arrays:
>>> arr = np.array([[1, 2], [3, 4]])
>>> arr0 = np.array([2, 4])
Find nearest neighbor of arr0 in arr along axis 1:
>>> dist, idx = bn.nn(arr, arr0, axis=1)
>>> dist
1.0
>>> idx
1
Find nearest neighbor of arr0 in arr along axis 0:
>>> dist, idx = bn.nn(arr, arr0, axis=0)
>>> dist
0.0
>>> idx
1
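The brute-force search mentioned in the Notes can be sketched in pure Python on nested lists: compute the squared distance to every candidate, keep the smallest, and take the square root once at the end. The name nn_ref is hypothetical; treating axis=1 as "compare against rows" and axis=0 as "compare against columns" is inferred from the examples above.

```python
import math

def nn_ref(arr, arr0, axis=1):
    """Brute-force nearest neighbor of arr0 among rows (axis=1) or columns (axis=0)."""
    candidates = arr if axis == 1 else list(zip(*arr))
    best_sq, best_idx = math.nan, 0  # all-NaN distances give (nan, 0)
    for i, cand in enumerate(candidates):
        sq = sum((c - t) ** 2 for c, t in zip(cand, arr0))
        if math.isnan(sq):
            continue  # skip candidates whose distance is undefined
        if math.isnan(best_sq) or sq < best_sq:
            best_sq, best_idx = sq, i
    return math.sqrt(best_sq), best_idx

arr = [[1, 2], [3, 4]]
arr0 = [2, 4]
print(nn_ref(arr, arr0, axis=1))  # (1.0, 1)
print(nn_ref(arr, arr0, axis=0))  # (0.0, 1)
```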
bottleneck.anynan(arr, axis=None)
Test whether any array element along a given axis is NaN.
Returns a single boolean unless axis is not None.
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which NaNs are searched. The default (axis=None) is to search for NaNs over a flattened input array. axis may be negative, in which case it counts from the last to the first axis.
Returns: y : bool or ndarray
A new boolean or ndarray is returned.
See also
bottleneck.allnan
- Test if all array elements along given axis are NaN
Examples
>>> bn.anynan(1)
False
>>> bn.anynan(np.nan)
True
>>> bn.anynan([1, np.nan])
True
>>> a = np.array([[1, 4], [1, np.nan]])
>>> bn.anynan(a)
True
>>> bn.anynan(a, axis=0)
array([False,  True], dtype=bool)
bottleneck.allnan(arr, axis=None)
Test whether all array elements along a given axis are NaN.
Returns a single boolean unless axis is not None.
Note that allnan([]) is True to match np.isnan([]).all().
Parameters: arr : array_like
Input array. If arr is not an array, a conversion is attempted.
axis : {int, None}, optional
Axis along which NaNs are searched. The default (axis=None) is to search for NaNs over a flattened input array. axis may be negative, in which case it counts from the last to the first axis.
Returns: y : bool or ndarray
A new boolean or ndarray is returned.
See also
bottleneck.anynan
- Test if any array element along given axis is NaN
Examples
>>> bn.allnan(1)
False
>>> bn.allnan(np.nan)
True
>>> bn.allnan([1, np.nan])
False
>>> a = np.array([[1, np.nan], [1, np.nan]])
>>> bn.allnan(a)
False
>>> bn.allnan(a, axis=0)
array([False,  True], dtype=bool)
An empty array returns True:
>>> bn.allnan([])
True
which is similar to:
>>> all([])
True
>>> np.isnan([]).all()
True
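The empty-input result is the usual vacuous truth of "all", and the matching "any" over an empty input is False. A pure-Python sketch for flat sequences, where anynan_ref and allnan_ref are hypothetical names that only mirror the documented semantics:

```python
import math

def anynan_ref(values):
    """True if at least one entry of a flat sequence is NaN."""
    return any(math.isnan(v) for v in values)

def allnan_ref(values):
    """True if every entry of a flat sequence is NaN (vacuously True when empty)."""
    return all(math.isnan(v) for v in values)

print(allnan_ref([]), anynan_ref([]))                        # True False
print(allnan_ref([1, math.nan]), anynan_ref([1, math.nan]))  # False True
```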