Detail: Frame: Method
- Frame.__array__(dtype=None)
Support the __array__ interface, returning an array of values.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.__array__()
[[0 1]
 [2 3]
 [4 5]]
- Frame.__array_ufunc__(ufunc, method, *args, **kwargs)
Support for NumPy elements or arrays on the left-hand side of binary operators.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> np.array((1, 0)) * f
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       0
q       2       0
r       4       0
<<U1>   <int64> <int64>
- Frame.__bool__()
Raises ValueError to prohibit ambiguous use of truthy evaluation.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> bool(f)
ErrorNotTruthy('The truth value of a container is ambiguous. For a truthy indicator of non-empty status, use the `size` attribute.')
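The pattern can be illustrated with a small stand-in class. `Container` below is hypothetical and is not StaticFrame's implementation; it only sketches why callers are expected to test `size` rather than rely on truthiness.

```python
class Container:
    """Hypothetical stand-in for a container that refuses truthy evaluation."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def size(self):
        # Number of stored elements: an unambiguous emptiness check
        return len(self._values)

    def __bool__(self):
        # Refuse to guess whether "truthy" means non-empty or all-true
        raise ValueError('The truth value of a container is ambiguous.')


c = Container([0, 1, 2])

try:
    bool(c)
    raised = False
except ValueError:
    raised = True

print(raised)
print(c.size > 0)
```

Testing `c.size > 0` states the intent explicitly, which is the alternative the error message recommends.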
- Frame.__dataframe__(nan_as_null=False, allow_copy=True)
Return a data-frame interchange protocol compliant object. See https://data-apis.org/dataframe-protocol/latest for more information.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> dfi = f.__dataframe__()
>>> tuple(dfi.get_columns())
(<DFIColumn: shape=(3,) dtype=<i8>, <DFIColumn: shape=(3,) dtype=<i8>)
- Frame.__deepcopy__(memo)
>>> import copy
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> copy.deepcopy(f)
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
- Frame.__len__()
Number of rows in values.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> len(f)
3
- Frame.__round__(decimals=0)
Return a Frame rounded to the given decimals. Negative decimals round to the left of the decimal point.
- Parameters:
decimals – number of decimals to round to.
- Returns:
Frame
>>> f = sf.Frame((np.arange(6).reshape(3,2) * 4/3), index=('p', 'q', 'r'), columns=('a', 'b'), name='y')
>>> f
<Frame: y>
<Index>    a                  b                  <<U1>
<Index>
p          0.0                1.3333333333333333
q          2.6666666666666665 4.0
r          5.333333333333333  6.666666666666667
<<U1>      <float64>          <float64>
>>> round(f, 1)
<Frame: y>
<Index>    a         b         <<U1>
<Index>
p          0.0       1.3
q          2.7       4.0
r          5.3       6.7
<<U1>      <float64> <float64>
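As a rough illustration of how the sign of `decimals` is interpreted, Python's built-in `round` on a scalar follows the same convention. This sketch uses a plain float rather than a Frame, and the variable names are illustrative only.

```python
# Positive decimals keep digits to the right of the decimal point;
# negative decimals round to tens, hundreds, and so on.
value = 1234.567

to_one_decimal = round(value, 1)
to_hundreds = round(value, -2)

print(to_one_decimal)
print(to_hundreds)
```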
- Frame.all(axis=0, skipna=True, out=None)
Logical and over values along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f.all()
<Series>
<Index>
c        False
d        True
<<U1>    <bool>
- Frame.any(axis=0, skipna=True, out=None)
Logical or over values along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f.any()
<Series>
<Index>
c        False
d        True
<<U1>    <bool>
- Frame.astype[key](dtypes, *, consolidate_blocks)
- astype
Retype one or more columns. When used as a function, can be used to retype the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to retype one or more columns.
- Parameters:
dtype – A value suitable for specifying a NumPy dtype, such as a Python type (float), NumPy array protocol strings (‘f8’), or a dtype instance.
- InterfaceFrameAsType.__getitem__(key)
Selector of columns by label.
- Parameters:
key – A loc selector, either a label, a list of labels, a slice of labels, or a Boolean array.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.astype['c'](object) <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <object>
- Frame.astype(dtype, *, consolidate_blocks)
- astype
Retype one or more columns. When used as a function, can be used to retype the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to retype one or more columns.
- Parameters:
dtype – A value suitable for specifying a NumPy dtype, such as a Python type (float), NumPy array protocol strings (‘f8’), or a dtype instance.
- InterfaceFrameAsType.__call__(dtype, *, consolidate_blocks=False)
Apply a single dtype to all columns.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.astype(float)
<Frame: x>
<Index>    a         b         <<U1>
<Index>
p          0.0       1.0
q          2.0       3.0
r          4.0       5.0
<<U1>      <float64> <float64>
- Frame.clip(*, lower=None, upper=None, axis=None)
Apply a clip operation to this Frame. Note that clip operations can be applied to object types, but cannot be applied to non-numerical objects (e.g., strings, None).
- Parameters:
lower – Lower bound; values below it are replaced by it.
upper – Upper bound; values above it are replaced by it.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.clip(lower=2, upper=4)
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          2       2
q          2       3
r          4       4
<<U1>      <int64> <int64>
- Frame.consolidate[key]
- consolidate
Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.
- InterfaceConsolidate.__getitem__(key)
Return the full Frame, selecting with key a subset of columns for consolidation.
- Parameters:
key – A loc selector, either a label, a list of labels, a slice of labels, or a Boolean array.
>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x') >>> f1 <Frame: x> <Index> a b c d e <<U1> <Index> 0 0 20 0 False True 1 0 18 0 True True 2 10 -3 0 True False 3 2 18 1 False True <int64> <int64> <int64> <int64> <bool> <bool> >>> f1.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 a 0 int64 (4,) 1 True True True 1 b 1 int64 (4,) 1 True True True 2 c 2 int64 (4,) 1 True True True 3 d 3 bool (4,) 1 True True True 4 e 4 bool (4,) 1 True True True <int64> <<U1> <int64> <object> <object> <int64> <bool> <bool> <bool> >>> f2 = f1.consolidate['a':'c'] >>> f2.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 slice('a', 'c', N... slice(0, 3, None) int64 (4, 3) 2 True False True 1 d 3 bool (4,) 1 True True True 2 e 4 bool (4,) 1 True True True <int64> <object> <object> <object> <object> <int64> <bool> <bool> <bool>
- Frame.consolidate
- consolidate
Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.
- InterfaceConsolidate.__call__()
Apply consolidation to all columns.
>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x') >>> f1 <Frame: x> <Index> a b c d e <<U1> <Index> 0 0 20 0 False True 1 0 18 0 True True 2 10 -3 0 True False 3 2 18 1 False True <int64> <int64> <int64> <int64> <bool> <bool> >>> f1.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 a 0 int64 (4,) 1 True True True 1 b 1 int64 (4,) 1 True True True 2 c 2 int64 (4,) 1 True True True 3 d 3 bool (4,) 1 True True True 4 e 4 bool (4,) 1 True True True <int64> <<U1> <int64> <object> <object> <int64> <bool> <bool> <bool> >>> f2 = f1.consolidate() >>> f2.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 slice('a', 'c', N... slice(0, 3, None) int64 (4, 3) 2 True False True 1 slice('d', 'e', N... slice(3, None, None) bool (4, 2) 2 True False True <int64> <object> <object> <object> <object> <int64> <bool> <bool> <bool>
- Frame.consolidate.status
- Frame.consolidate
Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.
- InterfaceConsolidate.status
Display consolidation status of this Frame.
>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x') >>> f1 <Frame: x> <Index> a b c d e <<U1> <Index> 0 0 20 0 False True 1 0 18 0 True True 2 10 -3 0 True False 3 2 18 1 False True <int64> <int64> <int64> <int64> <bool> <bool> >>> f1.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 a 0 int64 (4,) 1 True True True 1 b 1 int64 (4,) 1 True True True 2 c 2 int64 (4,) 1 True True True 3 d 3 bool (4,) 1 True True True 4 e 4 bool (4,) 1 True True True <int64> <<U1> <int64> <object> <object> <int64> <bool> <bool> <bool> >>> f2 = f1.consolidate() >>> f2.consolidate.status <Frame> <Index> loc iloc dtype shape ndim owndata f_contiguous c_contiguous <<U12> <Index> 0 slice('a', 'c', N... slice(0, 3, None) int64 (4, 3) 2 True False True 1 slice('d', 'e', N... slice(3, None, None) bool (4, 2) 2 True False True <int64> <object> <object> <object> <object> <int64> <bool> <bool> <bool>
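The status output above suggests that consolidation merges adjacent columns sharing a dtype into one two-dimensional block. A minimal sketch of that grouping, assuming this reading is right; `consolidate_plan` is a hypothetical helper and not part of StaticFrame.

```python
from itertools import groupby


def consolidate_plan(labels, dtypes):
    """Group adjacent columns of equal dtype into (dtype, labels) blocks."""
    plan = []
    pairs = zip(labels, dtypes)
    # groupby only merges neighbours, so non-adjacent columns of the same
    # dtype stay in separate blocks
    for dtype, group in groupby(pairs, key=lambda pair: pair[1]):
        plan.append((dtype, [label for label, _ in group]))
    return plan


labels = ['a', 'b', 'c', 'd', 'e']
dtypes = ['int64', 'int64', 'int64', 'bool', 'bool']
plan = consolidate_plan(labels, dtypes)

for dtype, block_labels in plan:
    print(dtype, block_labels)
```

The two resulting groups correspond to the (4, 3) int64 block and the (4, 2) bool block reported by `consolidate.status` after full consolidation.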
- Frame.corr(*, axis=1)
Compute a correlation matrix.
- Parameters:
axis – if 0, each row represents a variable, with observations as columns; if 1, each column represents a variable, with observations as rows. Defaults to 1.
>>> f1 = sf.Frame((np.concatenate((np.arange(8) * 2, np.arange(8) ** 2)).reshape(4,4)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c', 'd'), name='x') >>> f1 <Frame: x> <Index> a b c d <<U1> <Index> p 0 2 4 6 q 8 10 12 14 r 0 1 4 9 s 16 25 36 49 <<U1> <int64> <int64> <int64> <int64> >>> f1.corr() <Frame: x> <Index> a b c d <<U1> <Index> a 1.0 0.9888513796308233 0.965581028730576 0.9340437381585037 b 0.9888513796308233 0.9999999999999999 0.9923448088115435 0.972134396307783 c 0.9655810287305759 0.9923448088115435 0.9999999999999999 0.9934089501944108 d 0.9340437381585037 0.9721343963077829 0.9934089501944108 1.0 <<U1> <float64> <float64> <float64> <float64>
- Frame.count(*, skipna=True, skipfalsy=False, unique=False, axis=0)
Return the count of non-NA values along the provided axis, where 0 provides counts per column, 1 provides counts per row.
- Parameters:
axis – Axis along which to count, where 0 counts per column and 1 counts per row.
>>> f = sf.Frame.from_items((('a', (10, 2, np.nan, 3)), ('b', ('qrs ', 'XYZ', None, None))), index=('p', 'q', 'r', 's'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 10.0 qrs q 2.0 XYZ r nan None s 3.0 None <<U1> <float64> <object> >>> f.count(skipna=True) <Series> <Index> a 3 b 2 <<U1> <int64> >>> f.count(unique=True) <Series> <Index> a 3 b 2 <<U1> <int64>
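The interaction of `skipna` and `unique` can be sketched for a single column in plain Python. `is_na` and `count_values` are hypothetical helpers written for this illustration, and treating NA as "None or float NaN" is an assumption.

```python
import math


def is_na(value):
    """True for None or a float NaN (an assumed definition of NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_values(values, *, skipna=True, unique=False):
    """Count values in one column, optionally ignoring NA and repeats."""
    kept = [v for v in values if not (skipna and is_na(v))]
    if unique:
        # dict.fromkeys drops repeated values while preserving order
        kept = list(dict.fromkeys(kept))
    return len(kept)


# Columns 'a' and 'b' from the example above
col_a = (10, 2, float('nan'), 3)
col_b = ('qrs ', 'XYZ', None, None)

print(count_values(col_a), count_values(col_b))
print(count_values(col_a, unique=True), count_values(col_b, unique=True))
```

Because every non-NA value in the example is distinct, the `unique=True` counts equal the plain counts, as in the output above.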
- Frame.cov(*, axis=1, ddof=1)
Compute a covariance matrix.
- Parameters:
axis – if 0, each row represents a variable, with observations as columns; if 1, each column represents a variable, with observations as rows. Defaults to 1.
ddof – Delta degrees of freedom, defaults to 1.
>>> f1 = sf.Frame((np.concatenate((np.arange(8) * 2, np.arange(8) ** 2)).reshape(4,4)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c', 'd'), name='x') >>> f1 <Frame: x> <Index> a b c d <<U1> <Index> p 0 2 4 6 q 8 10 12 14 r 0 1 4 9 s 16 25 36 49 <<U1> <int64> <int64> <int64> <int64> >>> f1.cov() <Frame: x> <Index> a b c d <<U1> <Index> a 58.666666666666664 84.0 112.0 142.66666666666666 b 84.0 123.0 166.66666666666666 215.0 c 112.0 166.66666666666666 229.33333333333331 300.0 d 142.66666666666666 215.0 300.0 397.66666666666663 <<U1> <float64> <float64> <float64> <float64>
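To make the role of `ddof` concrete, the covariance of columns a and b can be recomputed by hand. This is a plain-Python sketch of the standard formulas, not StaticFrame's code; `covariance` and `correlation` are illustrative names.

```python
import math


def covariance(xs, ys, ddof=1):
    """Covariance of two equal-length sequences, dividing by (n - ddof)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - ddof)


def correlation(xs, ys):
    """Pearson correlation; the (n - ddof) divisor cancels out."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# With axis=1, each column is a variable and each row an observation
a = [0, 8, 0, 16]
b = [2, 10, 1, 25]

print(covariance(a, b))          # ddof=1, sample covariance
print(covariance(a, b, ddof=0))  # ddof=0, population covariance
print(correlation(a, b))
```

With the default `ddof=1` the result reproduces the 84.0 shown for the (a, b) cell, and the correlation agrees with the corresponding `corr()` entry to floating-point precision.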
- Frame.cumprod(axis=0, skipna=True)
Return the cumulative product over the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.cumprod()
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       1
q       0       3
r       0       15
<<U1>   <int64> <int64>
- Frame.cumsum(axis=0, skipna=True)
Return the cumulative sum over the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.cumsum()
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       1
q       2       4
r       6       9
<<U1>   <int64> <int64>
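With axis 0, both cumulative operations run down each column independently. The per-column arithmetic can be sketched with the standard library; this illustrates the result, not how StaticFrame computes it.

```python
from itertools import accumulate
import operator

# Columns 'a' and 'b' of the example Frame
col_a = [0, 2, 4]
col_b = [1, 3, 5]

# accumulate defaults to running addition
cumsum_a = list(accumulate(col_a))
cumsum_b = list(accumulate(col_b))

# passing operator.mul gives a running product
cumprod_a = list(accumulate(col_a, operator.mul))
cumprod_b = list(accumulate(col_b, operator.mul))

print(cumsum_a, cumsum_b)
print(cumprod_a, cumprod_b)
```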
- Frame.drop_duplicated(*, axis=0, exclude_first=False, exclude_last=False)
Return a Frame with duplicated rows (axis 0) or columns (axis 1) removed. All values in the row or column are compared to determine duplication.
- Parameters:
axis – Integer specifying axis, where 0 is rows and 1 is columns. Axis 0 is set by default.
exclude_first – Boolean to select if the first duplicated value is excluded.
exclude_last – Boolean to select if the last duplicated value is excluded.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 2 nan None NaT 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]> >>> f.drop_duplicated() <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 2 nan None NaT <int64> <float64> <object> <datetime64[D]>
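- Frame.dropfalsy(axis=0, condition=<function all>)
Return a new Frame after removing rows (axis 0) or columns (axis 1) where any or all values are falsy. The condition is determined by a NumPy ufunc that processes the Boolean array returned by isfalsy(); the default is np.all.
- Parameters:
axis – Axis to drop from, where 0 removes rows and 1 removes columns.
condition – NumPy ufunc, such as np.all or np.any, applied to the Boolean array returned by isfalsy(); defaults to np.all.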
>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 2 0 NaT 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]> >>> f.dropfalsy() <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]>
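- Frame.dropna(axis=0, condition=<function all>)
Return a new Frame after removing rows (axis 0) or columns (axis 1) where any or all values are NA (NaN or None). The condition is determined by a NumPy ufunc that processes the Boolean array returned by isna(); the default is np.all.
- Parameters:
axis – Axis to drop from, where 0 removes rows and 1 removes columns.
condition – NumPy ufunc, such as np.all or np.any, applied to the Boolean array returned by isna(); defaults to np.all.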
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 2 nan None NaT 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]> >>> f.dropna() <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]>
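The difference between an "all" and an "any" condition can be sketched on plain rows. Here the built-ins `all` and `any` stand in for `np.all` and `np.any`, and `is_na` and `drop_rows` are hypothetical helpers, not StaticFrame functions.

```python
import math


def is_na(value):
    """True for None or a float NaN (an assumed definition of NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_rows(rows, condition=all):
    """Keep a row unless condition(...) over its NA flags is true."""
    return [row for row in rows if not condition(is_na(v) for v in row)]


nan = float('nan')
rows = [
    (10.0, False),  # complete
    (nan, True),    # partially missing
    (nan, None),    # entirely missing
]

kept_all = drop_rows(rows)                 # drops only the entirely missing row
kept_any = drop_rows(rows, condition=any)  # drops every row with an NA

print(len(kept_all), len(kept_any))
```

The default is the more conservative choice: a row survives as long as it holds at least one usable value.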
- Frame.duplicated(*, axis=0, exclude_first=False, exclude_last=False)
Return an axis-sized Boolean Series that shows True for all rows (axis 0) or columns (axis 1) duplicated.
- Parameters:
axis – Integer specifying axis, where 0 is rows and 1 is columns. Axis 0 is set by default.
exclude_first – Boolean to select if the first duplicated value is excluded.
exclude_last – Boolean to select if the last duplicated value is excluded.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 2 nan None NaT 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]> >>> f.duplicated() <Series> <Index> 0 False 1 True 2 False 3 True <int64> <bool>
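The effect of `exclude_first` and `exclude_last` can be sketched on a list of row tuples. This is a plain-Python illustration of the flags as described above; the `duplicated` function here is a hypothetical stand-in, and NaN handling is deliberately left out.

```python
from collections import Counter


def duplicated(rows, *, exclude_first=False, exclude_last=False):
    """Flag rows that occur more than once, optionally sparing one occurrence."""
    counts = Counter(rows)
    flags = [counts[row] > 1 for row in rows]
    if exclude_first:
        seen = set()
        for i, row in enumerate(rows):
            if counts[row] > 1 and row not in seen:
                seen.add(row)
                flags[i] = False  # first occurrence is not reported
    if exclude_last:
        seen = set()
        for i in range(len(rows) - 1, -1, -1):
            if counts[rows[i]] > 1 and rows[i] not in seen:
                seen.add(rows[i])
                flags[i] = False  # last occurrence is not reported
    return flags


rows = [(10, False), (2, True), (0, None), (2, True)]

default_flags = duplicated(rows)
spare_first = duplicated(rows, exclude_first=True)
spare_last = duplicated(rows, exclude_last=True)

print(default_flags)
print(spare_first)
print(spare_last)
```

By default every member of a duplicated group is flagged, which is why `drop_duplicated()` in the earlier example removes both matching rows rather than keeping one.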
- Frame.equals(other, *, compare_name=False, compare_dtype=False, compare_class=False, skipna=True)
Return a bool from comparison to any other object.
- Parameters:
compare_name – Include equality of the container’s name (and all composed containers) in the comparison.
compare_dtype – Include equality of the container’s dtype (and all composed containers) in the comparison.
compare_class – Include equality of the container’s class (and all composed containers) in the comparison.
skipna – If True, comparisons between missing values are equal.
>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f1 <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f2 = sf.Frame((np.arange(6).reshape(3,2) * 4/3), index=('p', 'q', 'r'), columns=('a', 'b'), name='y') >>> f2 <Frame: y> <Index> a b <<U1> <Index> p 0.0 1.3333333333333333 q 2.6666666666666665 4.0 r 5.333333333333333 6.666666666666667 <<U1> <float64> <float64> >>> f1.equals(f2) False
- Frame.fillfalsy(value)
Return a new Frame after replacing falsy values with the supplied value.
- Parameters:
value – Value to be used to replace falsy values.
>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 2 0 NaT 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]> >>> f.fillfalsy(dict(a=1, b='abc', c=np.datetime64('2022-01-10'))) <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 2 1 abc 2022-01-10 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]>
- Frame.fillfalsy_backward(limit=0, *, axis=0)
Return a new Frame after filling backward falsy values with the first observed value.
- Parameters:
limit – Set the maximum count of falsy values to be filled per contiguous region of falsy values. A value of 0 is equivalent to no limit.
axis – Axis upon which to evaluate contiguous falsy values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 0 20 0 1 0 18 0 2 10 -3 0 3 2 18 1 <int64> <int64> <int64> <int64> >>> f.fillfalsy_backward() <Frame: x> <Index> a b c <<U1> <Index> 0 10 20 1 1 10 18 1 2 10 -3 1 3 2 18 1 <int64> <int64> <int64> <int64>
- Frame.fillfalsy_forward(limit=0, *, axis=0)
Return a new Frame after filling forward falsy values with the last observed value.
- Parameters:
limit – Set the maximum count of falsy values to be filled per contiguous region of falsy values. A value of 0 is equivalent to no limit.
axis – Axis upon which to evaluate contiguous falsy values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 8 1 1 2 3 0 2 0 8 0 3 0 0 0 <int64> <int64> <int64> <int64> >>> f.fillfalsy_forward() <Frame: x> <Index> a b c <<U1> <Index> 0 10 8 1 1 2 3 1 2 2 8 1 3 2 8 1 <int64> <int64> <int64> <int64>
- Frame.fillfalsy_leading(value, *, axis=0)
Return a new Frame after filling leading (and only leading) falsy values with the provided value.
- Parameters:
value – Value to be used to replace falsy values.
axis – Axis upon which to evaluate contiguous falsy values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 0 20 0 1 0 18 0 2 10 -3 0 3 2 18 1 <int64> <int64> <int64> <int64> >>> f.fillfalsy_leading(-1) <Frame: x> <Index> a b c <<U1> <Index> 0 -1 20 -1 1 -1 18 -1 2 10 -3 -1 3 2 18 1 <int64> <int64> <int64> <int64>
- Frame.fillfalsy_trailing(value, *, axis=0)
Return a new Frame after filling trailing (and only trailing) falsy values with the provided value.
- Parameters:
value – Value to be used to replace falsy values.
axis – Axis upon which to evaluate contiguous falsy values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 8 1 1 2 3 0 2 0 8 0 3 0 0 0 <int64> <int64> <int64> <int64> >>> f.fillfalsy_trailing(-1) <Frame: x> <Index> a b c <<U1> <Index> 0 10 8 1 1 2 3 -1 2 -1 8 -1 3 -1 -1 -1 <int64> <int64> <int64> <int64>
- Frame.fillna(value)
Return a new Frame after replacing null (NaN or None) values with the supplied value.
- Parameters:
value – Value to be used to replace missing values (NaN or None).
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 qrs 1517-01-01 1 2.0 XYZ 1517-04-01 2 nan NaT 3 2.0 123 1517-04-01 <int64> <float64> <<U4> <datetime64[D]> >>> f.fillna(-1) <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 qrs 1517-01-01 1 2.0 XYZ 1517-04-01 2 -1.0 -1 3 2.0 123 1517-04-01 <int64> <float64> <<U4> <object>
- Frame.fillna_backward(limit=0, *, axis=0)
Return a new Frame after filling backward null (NaN or None) values with the first observed value.
- Parameters:
limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.
axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((np.nan, np.nan, 10, 2), (np.nan, 8, 3, 8), (np.nan, np.nan, np.nan, 1)), columns=('a', 'b', 'c'), name='y') >>> f <Frame: y> <Index> a b c <<U1> <Index> 0 nan nan nan 1 nan 8.0 nan 2 10.0 3.0 nan 3 2.0 8.0 1.0 <int64> <float64> <float64> <float64> >>> f.fillna_backward() <Frame: y> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 10.0 8.0 1.0 2 10.0 3.0 1.0 3 2.0 8.0 1.0 <int64> <float64> <float64> <float64>
- Frame.fillna_forward(limit=0, *, axis=0)
Return a new Frame after filling forward null (NaN or None) values with the last observed value.
- Parameters:
limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.
axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.fillna_forward() <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 1.0 2 2.0 8.0 1.0 3 2.0 8.0 1.0 <int64> <float64> <float64> <float64>
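The example above uses the default `limit=0`. How a non-zero limit caps each contiguous run of missing values can be sketched for one column in plain Python; `fill_forward` is a hypothetical helper that follows the parameter description, not StaticFrame's implementation.

```python
import math


def is_na(value):
    """True for None or a float NaN (an assumed definition of NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_forward(values, limit=0):
    """Fill NA with the last observed value, at most `limit` per run (0 = no limit)."""
    out = []
    last = None
    have_last = False
    run = 0  # length of the current contiguous NA region
    for v in values:
        if is_na(v):
            run += 1
            if have_last and (limit == 0 or run <= limit):
                out.append(last)
            else:
                out.append(v)  # nothing observed yet, or limit exhausted
        else:
            last, have_last, run = v, True, 0
            out.append(v)
    return out


nan = float('nan')
col_a = [10.0, 2.0, nan, nan]  # column 'a' from the example above

unlimited = fill_forward(col_a)
limited = fill_forward(col_a, limit=1)

print(unlimited)
print(limited)
```

With `limit=1` only the first missing value in the run is filled and the second remains NaN. A backward fill is the same procedure applied to the reversed sequence.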
- Frame.fillna_leading(value, *, axis=0)
Return a new Frame after filling leading (and only leading) null (NaN or None) values with the provided value.
- Parameters:
value – Value to be used to replace missing values (NaN or None).
axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((np.nan, np.nan, 10, 2), (np.nan, 8, 3, 8), (np.nan, np.nan, np.nan, 1)), columns=('a', 'b', 'c'), name='y') >>> f <Frame: y> <Index> a b c <<U1> <Index> 0 nan nan nan 1 nan 8.0 nan 2 10.0 3.0 nan 3 2.0 8.0 1.0 <int64> <float64> <float64> <float64> >>> f.fillna_leading(-1) <Frame: y> <Index> a b c <<U1> <Index> 0 -1.0 -1.0 -1.0 1 -1.0 8.0 -1.0 2 10.0 3.0 -1.0 3 2.0 8.0 1.0 <int64> <float64> <float64> <float64>
- Frame.fillna_trailing(value, *, axis=0)
Return a new Frame after filling trailing (and only trailing) null (NaN or None) values with the provided value.
- Parameters:
value – Value to be used to replace missing values (NaN or None).
axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.fillna_trailing(-1) <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 -1.0 2 -1.0 8.0 -1.0 3 -1.0 -1.0 -1.0 <int64> <float64> <float64> <float64>
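The phrase "and only trailing" (or "only leading") means that interior gaps are left untouched. A plain-Python sketch for a single column, using hypothetical helpers `fill_leading` and `fill_trailing`:

```python
import math


def is_na(value):
    """True for None or a float NaN (an assumed definition of NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_leading(values, fill):
    """Replace NA values only until the first non-NA value is reached."""
    out = list(values)
    for i, v in enumerate(out):
        if not is_na(v):
            break
        out[i] = fill
    return out


def fill_trailing(values, fill):
    """Trailing fill is leading fill applied to the reversed sequence."""
    return fill_leading(list(values)[::-1], fill)[::-1]


nan = float('nan')
values = [nan, 8.0, nan, 3.0, nan]

lead = fill_leading(values, -1.0)
trail = fill_trailing(values, -1.0)

print(lead)
print(trail)
```

In both results the NaN between 8.0 and 3.0 survives, because it is neither at the start nor at the end of the sequence.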
- Frame.head(count=5)
Return a Frame consisting only of the top elements as specified by count.
- Parameters:
count – Number of elements to be returned from the top of the Frame.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.head(2) <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 <int64> <int64> <bool> <datetime64[D]>
- Frame.iloc_max(*, skipna=True, axis=0)
Return the integer indices corresponding to the maximum values found.
- Parameters:
skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.
axis – Axis along which to find the maximum, where 0 evaluates each column and 1 evaluates each row.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.iloc_max() <Series> <Index> a 0 b 0 c 0 <<U1> <int64>
- Frame.iloc_min(*, skipna=True, axis=0)
Return the integer indices corresponding to the minimum values found.
- Parameters:
skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.
axis – Axis along which to find the minimum, where 0 evaluates each column and 1 evaluates each row.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.iloc_min() <Series> <Index> a 1 b 1 c 0 <<U1> <int64>
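- Frame.iloc_notfalsy_first(*, fill_value=-1, axis=0)
Return the position corresponding to the first non-falsy (including nan) values along the selected axis.
- Parameters:
fill_value – Value returned for a row or column in which no non-falsy value is found; defaults to -1.
axis – Axis along which to search, where 0 returns one position per column and 1 returns one position per row.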
>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> p 10 8 1 q -2 -3 0 r 0 8 9 s 0 0 12 <<U1> <int64> <int64> <int64> >>> f.iloc_notfalsy_first(axis=0) <Series> <Index> a 0 b 0 c 0 <<U1> <int64> >>> f.iloc_notfalsy_first(axis=1) <Series> <Index> p 0 q 0 r 1 s 2 <<U1> <int64>
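- Frame.iloc_notfalsy_last(*, fill_value=-1, axis=0)
Return the position corresponding to the last non-falsy (including nan) values along the selected axis.
- Parameters:
fill_value – Value returned for a row or column in which no non-falsy value is found; defaults to -1.
axis – Axis along which to search, where 0 returns one position per column and 1 returns one position per row.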
>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> p 10 8 1 q -2 -3 0 r 0 8 9 s 0 0 12 <<U1> <int64> <int64> <int64> >>> f.iloc_notfalsy_last(axis=0) <Series> <Index> a 1 b 2 c 3 <<U1> <int64> >>> f.iloc_notfalsy_last(axis=1) <Series> <Index> p 2 q 1 r 2 s 2 <<U1> <int64>
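- Frame.iloc_notna_first(*, fill_value=-1, axis=0)
Return the position corresponding to the first non-missing values along the selected axis.
- Parameters:
fill_value – Value returned for a row or column in which no non-missing value is found; defaults to -1.
axis – Axis along which to search, where 0 returns one position per column and 1 returns one position per row.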
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.iloc_notna_first(axis=0) <Series> <Index> a 0 b 0 c 0 <<U1> <int64> >>> f.iloc_notna_first(axis=1) <Series> <Index> 0 0 1 0 2 1 3 -1 <int64> <int64>
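- Frame.iloc_notna_last(*, fill_value=-1, axis=0)
Return the position corresponding to the last non-missing values along the selected axis.
- Parameters:
fill_value – Value returned for a row or column in which no non-missing value is found; defaults to -1.
axis – Axis along which to search, where 0 returns one position per column and 1 returns one position per row.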
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.iloc_notna_last(axis=0) <Series> <Index> a 1 b 2 c 0 <<U1> <int64> >>> f.iloc_notna_last(axis=1) <Series> <Index> 0 2 1 1 2 1 3 -1 <int64> <int64>
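The `axis=1` results above, including the -1 reported for the all-NaN row, can be reproduced with a short plain-Python sketch. `notna_first` and `notna_last` are hypothetical helpers that mirror the described behaviour, not StaticFrame functions.

```python
import math


def is_na(value):
    """True for None or a float NaN (an assumed definition of NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def notna_first(values, fill_value=-1):
    """Position of the first non-missing value, or fill_value if none exists."""
    for i, v in enumerate(values):
        if not is_na(v):
            return i
    return fill_value


def notna_last(values, fill_value=-1):
    """Position of the last non-missing value, or fill_value if none exists."""
    for i in range(len(values) - 1, -1, -1):
        if not is_na(values[i]):
            return i
    return fill_value


nan = float('nan')
# Rows of the example Frame, read across columns a, b, c
rows = [(10.0, 8.0, 1.0), (2.0, 3.0, nan), (nan, 8.0, nan), (nan, nan, nan)]

firsts = [notna_first(row) for row in rows]
lasts = [notna_last(row) for row in rows]

print(firsts)
print(lasts)
```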
- Frame.insert_after(key, container, *, fill_value=nan)
Create a new Frame by inserting a named Series or Frame at the position after the label specified by key.
- Parameters:
key – Label after which the new container will be inserted.
container – Container to be inserted.
fill_value – A value to be used to fill space after reindexing the new container.
- Returns:
Frame
>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f1 <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f2 = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y') >>> f2 <Frame: y> <Index> c d <<U1> <Index> p False True q False True r False True <<U1> <bool> <bool> >>> f1.insert_after('b', f2) <Frame: x> <Index> a b c d <<U1> <Index> p 0 1 False True q 2 3 False True r 4 5 False True <<U1> <int64> <int64> <bool> <bool>
- Frame.insert_before(key, container, *, fill_value=nan)[source]
Create a new Frame by inserting a named Series or Frame at the position before the label specified by key.
- Parameters:
key – Label before which the new container will be inserted.
container – Container to be inserted.
fill_value – A value to be used to fill space after reindexing the new container.
- Returns:
>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f1 <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f2 = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y') >>> f2 <Frame: y> <Index> c d <<U1> <Index> p False True q False True r False True <<U1> <bool> <bool> >>> f1.insert_before('b', f2) <Frame: x> <Index> a c d b <<U1> <Index> p 0 False True 1 q 2 False True 3 r 4 False True 5 <<U1> <int64> <bool> <bool> <int64>
- Frame.isfalsy()[source]
Return a same-indexed Boolean Frame in which True marks the values that are falsy.
>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 2 0 NaT 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]> >>> f.isfalsy() <Frame> <Index> a b c <<U1> <Index> 0 False False False 1 False False False 2 True True True 3 False False False <int64> <bool> <bool> <bool>
- Frame.isin(other)[source]
Return a same-sized Boolean Frame that shows if the same-positioned element is in the passed iterable.
>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 8 1 1 2 3 0 2 0 8 0 3 0 0 0 <int64> <int64> <int64> <int64> >>> f.isin((0, 8)) <Frame: x> <Index> a b c <<U1> <Index> 0 False True False 1 False False True 2 True True True 3 True True True <int64> <bool> <bool> <bool>
- Frame.isna()[source]
Return a same-indexed Boolean Frame in which True marks the values that are NaN or None.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.isna() <Frame> <Index> a b c <<U1> <Index> 0 False False False 1 False False True 2 True False True 3 True True True <int64> <bool> <bool> <bool>
- Frame.join_inner(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform an inner join.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.join_inner(f2, left_columns='c', right_columns='f') <Frame> <Index> a b c d e f <<U1> <Index> 0 11 0 0 7 8 0 1 4 8 1 2 3 1 2 10 3 0 7 8 0 3 2 8 1 2 3 1 <int64> <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.join_left(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform a left outer join.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.join_left(f2, left_columns='c', right_columns='f') <Frame> <Index> a b c d e f <<U1> <Index> 0 11 0 0 7 8 0 1 4 8 1 2 3 1 2 10 3 0 7 8 0 3 2 8 1 2 3 1 <int64> <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.join_outer(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform an outer join.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.join_outer(f2, left_columns='c', right_columns='f') <Frame> <Index> a b c d e f <<U1> <Index> 0 11 0 0 7 8 0 1 4 8 1 2 3 1 2 10 3 0 7 8 0 3 2 8 1 2 3 1 <int64> <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.join_right(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform a right outer join.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.join_right(f2, left_columns='c', right_columns='f') <Frame> <Index> a b c d e f <<U1> <Index> 0 4 8 1 2 3 1 1 2 8 1 2 3 1 2 11 0 0 7 8 0 3 10 3 0 7 8 0 <int64> <int64> <int64> <int64> <int64> <int64> <int64>
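In the examples above every row finds a match, so the four join types return the same rows. A minimal pure-Python sketch, assuming records stored as dictionaries with disjoint field names, shows where the join types diverge once some rows are unmatched. The function `join` and its `how` argument are hypothetical and are not part of StaticFrame; row ordering is not meant to match the library's.

```python
def join(left, right, left_key, right_key, how="inner", fill_value=None):
    # left and right are non-empty lists of dict records with disjoint field names.
    left_blank = dict.fromkeys(left[0], fill_value)
    right_blank = dict.fromkeys(right[0], fill_value)
    out = []
    matched_right = set()
    for l_rec in left:
        hits = [i for i, r_rec in enumerate(right)
                if r_rec[right_key] == l_rec[left_key]]
        for i in hits:
            matched_right.add(i)
            out.append({**l_rec, **right[i]})
        # Left and outer joins keep unmatched left records.
        if not hits and how in ("left", "outer"):
            out.append({**l_rec, **right_blank})
    # Right and outer joins keep unmatched right records.
    if how in ("right", "outer"):
        for i, r_rec in enumerate(right):
            if i not in matched_right:
                out.append({**left_blank, **r_rec})
    return out

left = [{"a": 11, "c": 0}, {"a": 4, "c": 1}, {"a": 10, "c": 2}]
right = [{"d": 7, "f": 0}, {"d": 2, "f": 1}, {"d": 9, "f": 3}]

inner_rows = join(left, right, "c", "f", how="inner")
left_rows = join(left, right, "c", "f", how="left")
right_rows = join(left, right, "c", "f", how="right")
outer_rows = join(left, right, "c", "f", how="outer")
```

Here the inner join keeps only the two matched pairs, the left and right joins each add one unmatched record padded with the fill value, and the outer join keeps both unmatched records.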
- Frame.loc_max(*, skipna=True, axis=0)[source]
Return the labels corresponding to the maximum values found.
- Parameters:
skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.
axis – Axis along which to locate the maximum, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.loc_max() <Series> <Index> a 0 b 0 c 0 <<U1> <int64>
- Frame.loc_min(*, skipna=True, axis=0)[source]
Return the labels corresponding to the minimum value found.
- Parameters:
skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.
axis – Axis along which to locate the minimum, where 0 is vertically (between row values) and 1 is horizontally (between column values).
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.loc_min() <Series> <Index> a 1 b 1 c 0 <<U1> <int64>
- Frame.loc_notfalsy_first(*, fill_value=nan, axis=0)[source]
Return the labels corresponding to the first non-falsy (including nan) values along the selected axis.
- Parameters:
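fill_value – Value reported where no non-falsy value is found; the default is nan.
axis – Axis along which to search: 0 returns one label per column, 1 returns one label per row.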
>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> p 10 8 1 q -2 -3 0 r 0 8 9 s 0 0 12 <<U1> <int64> <int64> <int64> >>> f.loc_notfalsy_first(axis=0) <Series> <Index> a p b p c p <<U1> <<U1> >>> f.loc_notfalsy_first(axis=1) <Series> <Index> p a q a r b s c <<U1> <<U1>
- Frame.loc_notfalsy_last(*, fill_value=nan, axis=0)[source]
Return the labels corresponding to the last non-falsy (including nan) values along the selected axis.
- Parameters:
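fill_value – Value reported where no non-falsy value is found; the default is nan.
axis – Axis along which to search: 0 returns one label per column, 1 returns one label per row.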
>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> p 10 8 1 q -2 -3 0 r 0 8 9 s 0 0 12 <<U1> <int64> <int64> <int64> >>> f.loc_notfalsy_last(axis=0) <Series> <Index> a q b r c s <<U1> <<U1> >>> f.loc_notfalsy_last(axis=1) <Series> <Index> p c q b r c s c <<U1> <<U1>
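As a rough illustration of the label lookups above, the following pure-Python sketch uses ordinary truthiness to find the first and last non-falsy labels. The helper names `notfalsy_first` and `notfalsy_last` are hypothetical and this is not how StaticFrame implements the methods. Because `bool(float("nan"))` is `True` in Python, NaN counts as non-falsy here, in line with the "(including nan)" wording.

```python
def notfalsy_first(labels, values, fill_value=float("nan")):
    # Label of the first truthy value, or fill_value if every value is falsy.
    for label, value in zip(labels, values):
        if value:
            return label
    return fill_value

def notfalsy_last(labels, values, fill_value=float("nan")):
    # Label of the last truthy value, or fill_value if every value is falsy.
    for label, value in zip(reversed(labels), reversed(values)):
        if value:
            return label
    return fill_value

index = ("p", "q", "r", "s")
column_labels = ("a", "b", "c")
# Columns of the example Frame above.
columns = {"a": [10, -2, 0, 0], "b": [8, -3, 8, 0], "c": [1, 0, 9, 12]}

# axis=0: search down each column and report an index label.
last_by_column = {c: notfalsy_last(index, col) for c, col in columns.items()}

# axis=1: search across each row and report a column label.
rows = list(zip(*columns.values()))
first_by_row = {i: notfalsy_first(column_labels, row) for i, row in zip(index, rows)}
```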
- Frame.loc_notna_first(*, fill_value=nan, axis=0)[source]
Return the labels corresponding to the first non-missing values along the selected axis.
- Parameters:
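fill_value – Value reported where no non-missing value is found; the default is nan, as shown for row 3 in the example below.
axis – Axis along which to search: 0 returns one label per column, 1 returns one label per row.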
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.loc_notna_first(axis=0) <Series> <Index> a 0 b 0 c 0 <<U1> <int64> >>> f.loc_notna_first(axis=1) <Series> <Index> 0 a 1 a 2 b 3 nan <int64> <object>
- Frame.loc_notna_last(*, fill_value=nan, axis=0)[source]
Return the labels corresponding to the last non-missing values along the selected axis.
- Parameters:
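fill_value – Value reported where no non-missing value is found; the default is nan, as shown for row 3 in the example below.
axis – Axis along which to search: 0 returns one label per column, 1 returns one label per row.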
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.loc_notna_last(axis=0) <Series> <Index> a 1 b 2 c 0 <<U1> <int64> >>> f.loc_notna_last(axis=1) <Series> <Index> 0 c 1 b 2 b 3 nan <int64> <object>
- Frame.max(axis=0, skipna=True, out=None)
Return the maximum along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.max() <Series> <Index> a 4 b 5 <<U1> <int64>
- Frame.mean(axis=0, skipna=True, out=None)
Return the mean along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.mean() <Series> <Index> a 2.0 b 3.0 <<U1> <float64>
- Frame.median(axis=0, skipna=True, out=None)
Return the median along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.median() <Series> <Index> a 2.0 b 3.0 <<U1> <float64>
- Frame.merge_inner(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform an inner merge, an inner join where matched columns are coalesced.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.merge_inner(f2, left_columns='c', right_columns='f') <Frame> <Index> c a b d e <<U1> <Index> 0 0 11 0 7 8 1 1 4 8 2 3 2 0 10 3 7 8 3 1 2 8 2 3 <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.merge_left(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform a left merge, a left join where matched columns are coalesced.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.merge_left(f2, left_columns='c', right_columns='f', merge_labels='x') <Frame> <Index> x a b d e <<U1> <Index> 0 0 11 0 7 8 1 1 4 8 2 3 2 0 10 3 7 8 3 1 2 8 2 3 <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.merge_outer(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform an outer merge, an outer join where matched columns are coalesced.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.merge_outer(f2, left_columns='c', right_columns='f', merge_labels='x') <Frame> <Index> x a b d e <<U1> <Index> 0 0 11 0 7 8 1 1 4 8 2 3 2 0 10 3 7 8 3 1 2 8 2 3 <int64> <int64> <int64> <int64> <int64> <int64>
- Frame.merge_right(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]
Perform a right merge, a right join where matched columns are coalesced.
- Parameters:
left_depth_level – Specify one or more left index depths to include in the join predicate.
left_columns – Specify one or more left columns to include in the join predicate.
right_depth_level – Specify one or more right index depths to include in the join predicate.
right_columns – Specify one or more right columns to include in the join predicate.
merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.
left_template – Provide a format string for naming left columns in the joined result.
right_template – Provide a format string for naming right columns in the joined result.
fill_value – A value to be used to fill space created in the join.
include_index – If True, an appropriate index will be returned in the resultant Frame.
- Returns:
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y') >>> f2 <Frame: y> <Index> d e f <<U1> <Index> 0 2 3 1 1 7 8 0 <int64> <int64> <int64> <int64> >>> f1.merge_right(f2, left_columns='c', right_columns='f') <Frame> <Index> f a b d e <<U1> <Index> 0 1 4 8 2 3 1 1 2 8 2 3 2 0 11 0 7 8 3 0 10 3 7 8 <int64> <int64> <int64> <int64> <int64> <int64>
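The difference between a join and a merge is that the matched key columns are coalesced into one. A small pure-Python sketch of that coalescing step, assuming joined records held as dictionaries with `None` marking a missing side, might look as follows. The helper `coalesce_keys` and its parameters are hypothetical and are not StaticFrame API; the default label follows the rule stated above that merge fields are named from the left when no label is given.

```python
def coalesce_keys(record, left_key, right_key, merge_label=None, missing=None):
    # Combine the two join-key fields of one joined record into a single field.
    label = left_key if merge_label is None else merge_label
    left_value = record[left_key]
    # Prefer the left key value; fall back to the right when the left is missing.
    value = left_value if left_value is not missing else record[right_key]
    rest = {k: v for k, v in record.items() if k not in (left_key, right_key)}
    return {label: value, **rest}

# Records as an outer join on c == f might produce them.
joined = [
    {"a": 11, "c": 0, "d": 7, "f": 0},            # matched on both sides
    {"a": 10, "c": 2, "d": None, "f": None},      # left only
    {"a": None, "c": None, "d": 9, "f": 3},       # right only
]

merged = [coalesce_keys(rec, "c", "f") for rec in joined]
relabelled = [coalesce_keys(rec, "c", "f", merge_label="x") for rec in joined]
```

Each merged record carries one key field instead of two, and the key value survives whichever side supplied it.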
- Frame.min(axis=0, skipna=True, out=None)
Return the minimum along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.min() <Series> <Index> a 0 b 1 <<U1> <int64>
- Frame.notfalsy()[source]
Return a same-indexed Boolean Frame in which True marks the values that are not falsy.
>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 qrs 1517-01-01 1 2 XYZ 1517-04-01 2 0 NaT 3 2 123 1517-04-01 <int64> <int64> <<U4> <datetime64[D]> >>> f.notfalsy() <Frame> <Index> a b c <<U1> <Index> 0 True True True 1 True True True 2 False False False 3 True True True <int64> <bool> <bool> <bool>
- Frame.notna()[source]
Return a same-indexed Boolean Frame in which True marks the values that are not NaN or None.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 8.0 1.0 1 2.0 3.0 nan 2 nan 8.0 nan 3 nan nan nan <int64> <float64> <float64> <float64> >>> f.notna() <Frame> <Index> a b c <<U1> <Index> 0 True True True 1 True True False 2 False True False 3 False False False <int64> <bool> <bool> <bool>
- Frame.pivot(index_fields, columns_fields=(), data_fields=(), *, func=<function nansum>, fill_value=nan, index_constructor=None)[source]
Produce a pivot table, where one or more columns are selected for each of index_fields, columns_fields, and data_fields. Unique values from the provided index_fields will be used to create a new index; unique values from the provided columns_fields will be used to create new columns; if one data_fields value is selected, that is the value that will be displayed; if more than one value is given, those values will be presented with a hierarchical index on the columns; if data_fields is not provided, all unused fields will be displayed.
- Parameters:
index_fields – One or more columns whose unique values form the new index.
columns_fields – One or more columns whose unique values form the new columns.
data_fields – One or more columns supplying the displayed values; if not provided, all unused fields are displayed.
fill_value – If the index expansion produces coordinates that have no existing data value, fill that position with this value.
func – Function to apply to data_fields, or a dictionary of labelled functions to apply to data fields, producing an additional hierarchical level.
index_constructor –
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f1.pivot(index_fields='b', columns_fields='c') <Frame> <Index: c> 0 1 <int64> <Index: b> 0 11.0 nan 3 10.0 nan 8 nan 6.0 <int64> <float64> <float64>
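The grouping behind the pivot example above can be sketched in pure Python. This is only an illustration of the idea, not StaticFrame's implementation: the function `pivot` below is hypothetical, takes a single field for each role, uses the built-in `sum` where the library defaults to `nansum`, and fills empty coordinates with `None` rather than nan.

```python
from collections import defaultdict

def pivot(records, index_field, columns_field, data_field, func=sum, fill_value=None):
    # Collect the data values found at each (index value, column value) coordinate.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec[index_field], rec[columns_field])].append(rec[data_field])
    # Unique values become the new index and the new columns.
    index = sorted({key[0] for key in groups})
    columns = sorted({key[1] for key in groups})
    return {
        row: {
            col: func(groups[(row, col)]) if (row, col) in groups else fill_value
            for col in columns
        }
        for row in index
    }

# Rows of the example Frame f1 above.
records = [
    {"a": 11, "b": 0, "c": 0},
    {"a": 4, "b": 8, "c": 1},
    {"a": 10, "b": 3, "c": 0},
    {"a": 2, "b": 8, "c": 1},
]
table = pivot(records, index_field="b", columns_field="c", data_field="a")
```

The two records with `b == 8` and `c == 1` land on the same coordinate, so their `a` values (4 and 2) are aggregated to 6, as in the example output.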
- Frame.pivot_stack(depth_level=-1, *, fill_value=nan)[source]
Move labels from the columns to the index, creating or extending an IndexHierarchy on the index.
- Parameters:
depth_level – Selection of the columns depth or depths to move onto the index.
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f1.pivot_stack() <Frame: x> <Index> 0 <int64> <IndexHierarchy> 0 a 11 0 b 0 0 c 0 1 a 4 1 b 8 1 c 1 2 a 10 2 b 3 2 c 0 3 a 2 3 b 8 3 c 1 <int64> <<U1> <int64>
- Frame.pivot_unstack(depth_level=-1, *, fill_value=nan)[source]
Move labels from the index to the columns, creating or extending an IndexHierarchy on the columns.
- Parameters:
depth_level – Selection of the index depth or depths to move onto the columns.
>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f1 <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f2 = f1.pivot_stack() >>> f2 <Frame: x> <Index> 0 <int64> <IndexHierarchy> 0 a 11 0 b 0 0 c 0 1 a 4 1 b 8 1 c 1 2 a 10 2 b 3 2 c 0 3 a 2 3 b 8 3 c 1 <int64> <<U1> <int64> >>> f2.pivot_unstack() <Frame: x> <IndexHierarchy> 0 0 0 <int64> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64>
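Stacking and unstacking can be thought of as moving a label between the row key and the column key. The pure-Python sketch below is a simplification under stated assumptions: a table is a dictionary of row label to `{column label: value}`, and the hypothetical helpers `stack` and `unstack` do not reproduce the hierarchical labels that StaticFrame builds.

```python
def stack(table):
    # Move column labels into the key: (row label, column label) -> value.
    return {
        (row, col): value
        for row, cols in table.items()
        for col, value in cols.items()
    }

def unstack(stacked, fill_value=None):
    # Recover row and column labels in order of first appearance.
    rows, cols = [], []
    for row, col in stacked:
        if row not in rows:
            rows.append(row)
        if col not in cols:
            cols.append(col)
    # Coordinates with no stored value receive fill_value.
    return {
        row: {col: stacked.get((row, col), fill_value) for col in cols}
        for row in rows
    }

# First two rows of the example Frame f1 above.
table = {0: {"a": 11, "b": 0, "c": 0}, 1: {"a": 4, "b": 8, "c": 1}}
stacked = stack(table)
round_trip = unstack(stacked)

# Dropping one coordinate shows where a fill value would be needed.
partial = {key: value for key, value in stacked.items() if key != (1, "c")}
refilled = unstack(partial)
```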
- Frame.prod(axis=0, skipna=True, allna=1, out=None)
Return the product along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.prod() <Series> <Index> a 0 b 15 <<U1> <int64>
- Frame.rank_dense(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]
Rank values as compactly as possible, where ties get the same value, and ranks are contiguous (potentially non-unique) integers.
- Parameters:
axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)
skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.
fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.
- Returns:
>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f.rank_dense() <Frame: x> <Index> a b c <<U1> <Index> 0 3 0 0 1 1 2 1 2 2 1 0 3 0 2 1 <int64> <int64> <int64> <int64>
- Frame.rank_max(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]
Rank values where tied values are assigned the maximum ordinal rank; ranks are potentially non-contiguous and non-unique integers.
- Parameters:
axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)
skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.
fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.
- Returns:
>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f.rank_max() <Frame: x> <Index> a b c <<U1> <Index> 0 3 0 1 1 1 3 3 2 2 1 1 3 0 3 3 <int64> <int64> <int64> <int64>
- Frame.rank_mean(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]
Rank values where tied values are assigned the mean of the ordinal ranks; ranks are potentially non-contiguous and non-unique floats.
- Parameters:
axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)
skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.
fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.
- Returns:
>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f.rank_mean() <Frame: x> <Index> a b c <<U1> <Index> 0 3.0 0.0 0.5 1 1.0 2.5 2.5 2 2.0 1.0 0.5 3 0.0 2.5 2.5 <int64> <float64> <float64> <float64>
- Frame.rank_min(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]
Rank values where tied values are assigned the minimum ordinal rank; ranks are potentially non-contiguous and non-unique integers.
- Parameters:
axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)
skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.
fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.
- Returns:
>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f.rank_min() <Frame: x> <Index> a b c <<U1> <Index> 0 3 0 0 1 1 2 2 2 2 1 0 3 0 2 2 <int64> <int64> <int64> <int64>
- Frame.rank_ordinal(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]
Rank values distinctly, where ties get distinct values that maintain their ordering, and ranks are contiguous unique integers.
- Parameters:
axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)
skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.
fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.
- Returns:
Frame
>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 11 0 0 1 4 8 1 2 10 3 0 3 2 8 1 <int64> <int64> <int64> <int64> >>> f.rank_ordinal() <Frame: x> <Index> a b c <<U1> <Index> 0 3 0 0 1 1 2 2 2 2 1 1 3 0 3 3 <int64> <int64> <int64> <int64>
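Ordinal ranking amounts to a stable argsort followed by numbering the positions. The following is a minimal sketch in plain Python, not StaticFrame's implementation; the helper name ordinal_rank is hypothetical, and it handles a single column with no NA values.

```python
def ordinal_rank(values, start=0):
    """Rank one column with distinct, contiguous ranks; ties keep input order."""
    # sorted() is stable, so positions of equal values stay in original order.
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for rank, position in enumerate(order, start):
        ranks[position] = rank
    return ranks


# Column 'b' from the example above: the first 8 gets rank 2, the second rank 3.
print(ordinal_rank([0, 8, 3, 8]))  # [0, 2, 1, 3]
```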
- Frame.rehierarch(index=None, columns=None, *, index_constructors=None, columns_constructors=None)[source]
Produce a new Frame with index and/or columns constructed with a transformed hierarchy.
- Parameters:
index – Depth level specifier
columns – Depth level specifier
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.rehierarch((1, 0)) <Frame: x> <Index> a b c <<U1> <IndexHierarchy> p 0 10 False 1517-01-01 p 1 8 True 1517-12-31 q 0 2 True 1517-04-01 q 1 3 False 1517-06-30 <<U1> <int64> <int64> <bool> <datetime64[D]>
- Frame.reindex(index=None, columns=None, *, fill_value=nan, own_index=False, own_columns=False, check_equals=True)[source]
Return a new Frame with labels defined by the provided index. The size and ordering of the data are determined by the newly provided index, where data will continue to be aligned under labels found in both the new and the old index. Labels found only in the new index will be filled with fill_value.
- Parameters:
index – An iterable of unique, hashable values, or another Index or IndexHierarchy, to be used as the labels of the index.
columns – An iterable of unique, hashable values, or another Index or IndexHierarchy, to be used as the labels of the columns.
fill_value – A value to be used to fill space created by a new index that has values not found in the previous index.
own_index – Flag the passed index as ownable by this static_frame.Frame. Primarily used by internal clients.
own_columns – Flag the passed columns as ownable by this static_frame.Frame. Primarily used by internal clients.
check_equals –
>>> f = sf.Frame.from_items((('a', (10, 2, 8, 3)), ('b', ('qrs ', 'XYZ', '123', ' wX '))), index=('p', 'q', 'r', 's'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 10 qrs q 2 XYZ r 8 123 s 3 wX <<U1> <int64> <<U4> >>> f.reindex(('q', 't', 's', 'r'), fill_value=sf.FillValueAuto(i=-1, U='')) <Frame: x> <Index> a b <<U1> <Index> q 2 XYZ t -1 s 3 wX r 8 123 <<U1> <int64> <<U4>
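The alignment rule (keep values for labels present in both indices, fill the rest) can be sketched for a single column with a plain dictionary. This is an illustration only, not the library's code; the helper name reindex_column is hypothetical.

```python
import math


def reindex_column(column, new_labels, fill_value=math.nan):
    """column maps label -> value; return a new mapping ordered by new_labels."""
    # Labels found in both old and new keep their values; labels found only in
    # the new index receive fill_value; labels found only in the old are dropped.
    return {label: column.get(label, fill_value) for label in new_labels}


col_a = {'p': 10, 'q': 2, 'r': 8, 's': 3}
print(reindex_column(col_a, ('q', 't', 's', 'r'), fill_value=-1))
# {'q': 2, 't': -1, 's': 3, 'r': 8}
```

The result matches column a in the example above, where label t is new and label p is dropped.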
- Frame.relabel(index=None, columns=None, *, index_constructor=None, columns_constructor=None)[source]
Return a new Frame with transformed labels on the index. The size and ordering of the data is never changed in a relabeling operation. The resulting index must be unique.
- Parameters:
index – One of the following types, used to create new index labels with the same size as the previous index. (a) A mapping (as a dictionary or Series), used to look up and transform the labels in the previous index. Labels not found in the mapping will be reused. (b) A function, returning a hashable, that is applied to each label in the previous index. (c) The IndexAutoFactory type, to apply auto-incremented integer labels. (d) An Index initializer, i.e., either an iterable of hashables or an Index instance.
columns – One of the following types, used to create new columns labels with the same size as the previous columns. (a) A mapping (as a dictionary or Series), used to look up and transform the labels in the previous columns. Labels not found in the mapping will be reused. (b) A function, returning a hashable, that is applied to each label in the previous columns. (c) The IndexAutoFactory type, to apply auto-incremented integer labels. (d) An Index initializer, i.e., either an iterable of hashables or an Index instance.
>>> f = sf.Frame.from_records(((10, False, '1517-01-01'), (8, True,'1517-04-01')), index=('p', 'q'), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> p 10 False 1517-01-01 q 8 True 1517-04-01 <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel(('y', 'z')) <Frame: x> <Index> a b c <<U1> <Index> y 10 False 1517-01-01 z 8 True 1517-04-01 <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel(dict(q='x', p='y')) <Frame: x> <Index> a b c <<U1> <Index> y 10 False 1517-01-01 x 8 True 1517-04-01 <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel(lambda l: f'+{l.upper()}+') <Frame: x> <Index> a b c <<U1> <Index> +P+ 10 False 1517-01-01 +Q+ 8 True 1517-04-01 <<U3> <int64> <bool> <datetime64[D]>
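The three relabeling styles shown in the example (iterable, mapping, function) can be sketched on a bare list of labels. This is a plain-Python illustration under stated assumptions, not the library's code: the helper name relabel_labels is hypothetical, the IndexAutoFactory option is omitted, and raising ValueError for duplicate labels is a choice of this sketch.

```python
def relabel_labels(labels, relabeler):
    """Return new labels from a function, a mapping, or an iterable of labels."""
    if callable(relabeler):
        new = [relabeler(label) for label in labels]
    elif hasattr(relabeler, 'get'):
        # Mapping: labels missing from the mapping are reused unchanged.
        new = [relabeler.get(label, label) for label in labels]
    else:
        new = list(relabeler)
        if len(new) != len(labels):
            raise ValueError('new labels must match the size of the old labels')
    if len(set(new)) != len(new):
        raise ValueError('resulting labels must be unique')
    return new


print(relabel_labels(['p', 'q'], ('y', 'z')))                   # ['y', 'z']
print(relabel_labels(['p', 'q'], dict(q='x', p='y')))           # ['y', 'x']
print(relabel_labels(['p', 'q'], lambda l: f'+{l.upper()}+'))   # ['+P+', '+Q+']
```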
- Frame.relabel_flat(index=False, columns=False)[source]
Return a new Frame, where an IndexHierarchy (if defined) is replaced with a flat, one-dimension index of tuples.
- Parameters:
index – Boolean to flag flattening on the index.
columns – Boolean to flag flattening on the columns.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel_flat(index=True) <Frame: x> <Index> a b c <<U1> <Index> (0, 'p') 10 False 1517-01-01 (0, 'q') 2 True 1517-04-01 (1, 'p') 8 True 1517-12-31 (1, 'q') 3 False 1517-06-30 <object> <int64> <bool> <datetime64[D]>
- Frame.relabel_level_add(index=None, columns=None, *, index_constructor=None, columns_constructor=None)[source]
Return a new Frame, adding a new root level to an existing IndexHierarchy, or creating an IndexHierarchy if one is not yet defined.
- Parameters:
index – A hashable value to be used as a new root level, extending or creating an IndexHierarchy
columns – A hashable value to be used as a new root level, extending or creating an IndexHierarchy
index_constructor –
columns_constructor –
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel_level_add('I') <Frame: x> <Index> a b c <<U1> <IndexHierarchy> I 0 p 10 False 1517-01-01 I 0 q 2 True 1517-04-01 I 1 p 8 True 1517-12-31 I 1 q 3 False 1517-06-30 <<U1> <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.relabel_level_drop(index=0, columns=0)[source]
Return a new Frame, dropping one or more levels from either the root or the leaves of an IndexHierarchy. The resulting index must be unique.
- Parameters:
index – A positive integer drops that many outer-most (root) levels; a negative integer drops that many inner-most (leaf) levels. Default is zero.
columns – A positive integer drops that many outer-most (root) levels; a negative integer drops that many inner-most (leaf) levels. Default is zero.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.iloc[:2].relabel_level_drop(1) <Frame: x> <Index> a b c <<U1> <Index> p 10 False 1517-01-01 q 2 True 1517-04-01 <<U1> <int64> <bool> <datetime64[D]>
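The sign convention and the uniqueness requirement can be sketched with hierarchical labels represented as tuples. This is a plain-Python illustration, not the library's code; the helper name drop_levels is hypothetical, and the exception raised by the library for non-unique results is not claimed here.

```python
def drop_levels(labels, count):
    """Drop root levels (count > 0) or leaf levels (count < 0) from tuple labels."""
    if count > 0:
        new = [label[count:] for label in labels]
    elif count < 0:
        new = [label[:count] for label in labels]
    else:
        return list(labels)
    if len(set(new)) != len(new):
        # This sketch signals the problem with ValueError.
        raise ValueError('dropping levels produced non-unique labels')
    return new


labels = [(0, 'p'), (0, 'q'), (1, 'p'), (1, 'q')]
print(drop_levels(labels[:2], 1))    # [('p',), ('q',)]
print(drop_levels(labels[::2], -1))  # [(0,), (1,)]
```

Dropping the root level from all four labels would leave p and q twice each, which is why the example above first selects two rows with iloc[:2].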
- Frame.relabel_shift_in(key, *, axis=0, index_constructors=None)[source]
Create, or augment, an IndexHierarchy by providing one or more selections from the Frame (via axis-appropriate loc selections) to move into the Index.
- Parameters:
key – a loc-style selection on the opposite axis.
axis – 0 modifies the index by selecting columns with key; 1 modifies the columns by selecting rows with key.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.relabel_shift_in('a') <Frame: x> <Index> b c <<U1> <IndexHierarchy: ('__index0__', '... 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.relabel_shift_out(depth_level, *, axis=0)[source]
Shift values from an index on an axis to the Frame by providing one or more depth level selections.
- Parameters:
depth_level – an iloc-style selection on the Index of the specified axis.
axis – 0 modifies the index by selecting columns with depth_level; 1 modifies the columns by selecting rows with depth_level.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.rename(index=('d', 'e')).relabel_shift_out([1, 0]) <Frame: x> <Index> e d a b c <<U1> <Index> 0 p 0 10 False 1517-01-01 1 q 0 2 True 1517-04-01 2 p 1 8 True 1517-12-31 3 q 1 3 False 1517-06-30 <int64> <<U1> <int64> <int64> <bool> <datetime64[D]>
- Frame.rename(name=<object object>, *, index=<object object>, columns=<object object>)[source]
Return a new Frame with an updated name attribute. Optionally update the name attribute of index and columns.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.rename('y', index='p', columns='q') <Frame: y> <Index: q> a b c <<U1> <IndexHierarchy: p> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.roll(index=0, columns=0, *, include_index=False, include_columns=False)[source]
Roll columns and/or rows by positive or negative integer counts, where columns and/or rows roll around the axis.
- Parameters:
include_index – Determine if index is included in index-wise rotation.
include_columns – Determine if column index is included in index-wise rotation.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.roll(3) <Frame: x> <Index> a b c <<U1> <Index> 0 2 True 1517-04-01 1 8 True 1517-12-31 2 3 False 1517-06-30 3 10 False 1517-01-01 <int64> <int64> <bool> <datetime64[D]>
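Rolling wraps values around the end of the axis, so no value is lost. The following is a minimal sketch for one column in plain Python, not the library's code; the helper name roll_values is hypothetical.

```python
def roll_values(values, count):
    """Rotate a list by count positions; positive counts move values toward the end."""
    if not values:
        return []
    k = count % len(values)  # normalise negative and oversized counts
    if k == 0:
        return list(values)
    return values[-k:] + values[:-k]


# Column 'a' from the example above.
print(roll_values([10, 2, 8, 3], 3))   # [2, 8, 3, 10]
print(roll_values([10, 2, 8, 3], -1))  # [2, 8, 3, 10]
```

With four rows, rolling by 3 and rolling by -1 give the same result.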
- Frame.sample(index=None, columns=None, *, seed=None)[source]
Randomly (optionally made deterministic with a fixed seed) extract items from the container to return a subset of the container.
- Parameters:
index – Number of labels to select from the index.
columns – Number of labels to select from the columns.
seed – Initial state of random selection.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.sample(2, 2, seed=0) <Frame: x> <Index> b c <<U1> <Index> 2 True 1517-12-31 3 False 1517-06-30 <int64> <bool> <datetime64[D]>
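The effect of a fixed seed can be sketched with the standard library. This is an illustration only: the helper name sample_labels is hypothetical, it uses Python's random module rather than the library's random source, and so it will not reproduce the library's selections for a given seed. Keeping the selected labels in their original order is an assumption, consistent with the example output above.

```python
import random


def sample_labels(labels, count, seed=None):
    """Pick count labels at random, returned in their original order."""
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    chosen = set(rng.sample(range(len(labels)), count))
    return [label for i, label in enumerate(labels) if i in chosen]


first = sample_labels([0, 1, 2, 3], 2, seed=0)
second = sample_labels([0, 1, 2, 3], 2, seed=0)
print(first == second)  # True
```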
- Frame.set_columns(index, *, drop=False, columns_constructor=None)[source]
Return a new Frame produced by setting the given row as the columns, optionally removing that row from the new Frame.
- Parameters:
index –
drop –
columns_constructor –
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.set_columns((1, 'p'), drop=True) <Frame: x> <Index: (1, 'p')> 8 True 1517-12-31 <object> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.set_columns_hierarchy(index, *, drop=False, columns_constructors=None, reorder_for_hierarchy=False)[source]
Given an iterable of index labels, return a new Frame with those rows as an IndexHierarchy on the columns.
- Parameters:
index – Iterable of index labels.
drop – Boolean to determine if selected rows should be removed from the data.
columns_constructors – Optionally provide a sequence of Index constructors, of length equal to depth, to be used in converting row Index components in the IndexHierarchy.
reorder_for_hierarchy – reorder the columns to produce a hierarchable Index from the selected rows.
- Returns:
Frame
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.set_columns_hierarchy([(1, 'p'), (1, 'q')], drop=True) <Frame: x> <IndexHierarchy: ((1, 'p'), (1, '... 8 True 1517-12-31 <object> 3 False 1517-06-30 <object> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.set_index(column, *, drop=False, index_constructor=None)[source]
Return a new Frame produced by setting the given column as the index, optionally removing that column from the new Frame.
- Parameters:
column –
drop –
index_constructor –
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.set_index('c', drop=True, index_constructor=sf.IndexDate) <Frame: x> <Index> a b <<U1> <IndexDate: c> 1517-01-01 10 False 1517-04-01 2 True 1517-12-31 8 True 1517-06-30 3 False <datetime64[D]> <int64> <bool>
- Frame.set_index_hierarchy(columns, *, drop=False, index_constructors=None, reorder_for_hierarchy=False)[source]
Given an iterable of column labels, return a new Frame with those columns as an IndexHierarchy on the index.
- Parameters:
columns – Iterable of column labels.
drop – Boolean to determine if selected columns should be removed from the data.
index_constructors – Optionally provide a sequence of Index constructors, of length equal to depth, to be used in converting columns Index components in the IndexHierarchy.
reorder_for_hierarchy – reorder the rows to produce a hierarchable Index from the selected columns, assuming hierarchability is possible.
- Returns:
Frame
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.set_index_hierarchy(['b', 'c'], drop=True, index_constructors=(sf.Index, sf.IndexDate)) <Frame: x> <Index> a <<U1> <IndexHierarchy: ('b', 'c')> False 1517-01-01 10 True 1517-04-01 2 True 1517-12-31 8 False 1517-06-30 3 <bool> <datetime64[D]> <int64>
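The construction can be sketched with rows as tuples: each hierarchical label is the tuple of values taken from the selected columns, and drop removes those columns from the data. This is a plain-Python illustration, not the library's code; the helper name index_from_columns is hypothetical, and index constructors and reorder_for_hierarchy are not modelled.

```python
def index_from_columns(rows, columns, selected, drop=False):
    """Return (index labels, remaining column labels, remaining rows)."""
    positions = [columns.index(label) for label in selected]
    # One tuple per row, with one element per selected column (one per depth).
    index = [tuple(row[p] for p in positions) for row in rows]
    keep = [i for i in range(len(columns)) if not (drop and i in positions)]
    return (
        index,
        [columns[i] for i in keep],
        [tuple(row[i] for i in keep) for row in rows],
    )


rows = [(10, False, '1517-01-01'), (2, True, '1517-04-01')]
index, columns, data = index_from_columns(rows, ('a', 'b', 'c'), ['b', 'c'], drop=True)
print(index)    # [(False, '1517-01-01'), (True, '1517-04-01')]
print(columns)  # ['a']
print(data)     # [(10,), (2,)]
```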
- Frame.shift(index=0, columns=0, *, fill_value=nan)[source]
Shift columns and/or rows by positive or negative integer counts, where columns and/or rows fall off the axis and introduce missing values, filled by fill_value.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.shift(3, fill_value=sf.FillValueAuto) <Frame: x> <Index> a b c <<U1> <Index> 0 0 False NaT 1 0 False NaT 2 0 False NaT 3 10 False 1517-01-01 <int64> <int64> <bool> <datetime64[D]>
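Unlike a roll, a shift discards the values that move past the end of the axis and fills the vacated positions. The following is a minimal sketch for one column in plain Python, not the library's code; the helper name shift_values is hypothetical.

```python
import math


def shift_values(values, count, fill_value=math.nan):
    """Shift a list by count positions, filling vacated slots with fill_value."""
    n = len(values)
    if count >= 0:
        k = min(count, n)
        return [fill_value] * k + values[:n - k]
    k = min(-count, n)
    return values[k:] + [fill_value] * k


# Column 'a' from the example above, with 0 as the fill value.
print(shift_values([10, 2, 8, 3], 3, fill_value=0))   # [0, 0, 0, 10]
print(shift_values([10, 2, 8, 3], -1, fill_value=0))  # [2, 8, 3, 0]
```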
- Frame.sort_columns(*, ascending=True, kind='mergesort', key=None)[source]
Return a new Frame ordered by the sorted columns.
- Parameters:
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
kind – Name of the sort algorithm as passed to NumPy.
key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.sort_columns(ascending=False) <Frame: x> <Index> c b a <<U1> <IndexHierarchy> 0 p 1517-01-01 False 10 0 q 1517-04-01 True 2 1 p 1517-12-31 True 8 1 q 1517-06-30 False 3 <int64> <<U1> <datetime64[D]> <bool> <int64>
- Frame.sort_index(*, ascending=True, kind='mergesort', key=None)[source]
Return a new Frame ordered by the sorted Index.
- Parameters:
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
kind – Name of the sort algorithm as passed to NumPy.
key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.sort_index(ascending=False) <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 1 q 3 False 1517-06-30 1 p 8 True 1517-12-31 0 q 2 True 1517-04-01 0 p 10 False 1517-01-01 <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.sort_values(label, *, ascending=True, axis=1, kind='mergesort', key=None)[source]
Return a new Frame ordered by the sorted values, where values are given by single column or iterable of columns.
- Parameters:
label – A label or iterable of labels to select the columns (for axis 1) or rows (for axis 0) to sort.
ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.
axis – Axis upon which to sort; 0 orders columns based on one or more rows; 1 orders rows based on one or more columns.
kind – Name of the sort algorithm as passed to NumPy.
key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.sort_values('c') <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 3 3 False 1517-06-30 2 8 True 1517-12-31 <int64> <int64> <bool> <datetime64[D]> >>> f.sort_values(['c', 'b'], ascending=False) <Frame: x> <Index> a b c <<U1> <Index> 2 8 True 1517-12-31 3 3 False 1517-06-30 1 2 True 1517-04-01 0 10 False 1517-01-01 <int64> <int64> <bool> <datetime64[D]>
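Sorting rows by one or more columns can be sketched as a sort on a tuple key. This is a plain-Python illustration, not the library's code: the helper name sort_rows and the COLUMNS lookup are hypothetical, each row carries its index label in position 0, dates are ISO strings (which sort chronologically), and only a single Boolean ascending is supported.

```python
rows = [
    (0, 10, False, '1517-01-01'),
    (1, 2, True, '1517-04-01'),
    (2, 8, True, '1517-12-31'),
    (3, 3, False, '1517-06-30'),
]
COLUMNS = {'a': 1, 'b': 2, 'c': 3}  # column label -> position within a row


def sort_rows(rows, labels, ascending=True):
    """Order rows by the values found under one or more column labels."""
    if isinstance(labels, str):
        labels = [labels]
    positions = [COLUMNS[label] for label in labels]
    # Earlier labels take priority; later labels only break ties.
    return sorted(
        rows,
        key=lambda row: tuple(row[p] for p in positions),
        reverse=not ascending,
    )


print([row[0] for row in sort_rows(rows, 'c')])                          # [0, 1, 3, 2]
print([row[0] for row in sort_rows(rows, ['c', 'b'], ascending=False)])  # [2, 3, 1, 0]
```

The resulting index orders match the two results in the example above.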
- Frame.std(axis=0, skipna=True, ddof=0, out=None)
Return the standard deviation along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.std() <Series> <Index> a 1.632993161855452 b 1.632993161855452 <<U1> <float64>
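The ddof argument in the signature is not described above. Assuming it carries the usual NumPy meaning (the divisor is N - ddof), the default of 0 gives the population standard deviation. The following plain-Python sketch, with hypothetical helper names, reproduces the value shown for column a.

```python
import math


def variance(values, ddof=0):
    """Mean squared deviation from the mean, dividing by N - ddof."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


def std(values, ddof=0):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values, ddof))


print(std([0, 2, 4]))          # approximately 1.633, as in the example above
print(std([0, 2, 4], ddof=1))  # 2.0, the sample standard deviation
```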
- Frame.sum(axis=0, skipna=True, allna=0, out=None)
Sum values along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.sum() <Series> <Index> a 6 b 9 <<U1> <int64>
- Frame.tail(count=5)[source]
Return a Frame consisting only of the bottom elements as specified by count.
- Parameters:
count – Number of elements to be returned from the bottom of the Frame
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10 False 1517-01-01 1 2 True 1517-04-01 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]> >>> f.tail(2) <Frame: x> <Index> a b c <<U1> <Index> 2 8 True 1517-12-31 3 3 False 1517-06-30 <int64> <int64> <bool> <datetime64[D]>
- Frame.transpose()[source]
Transpose. Return a Frame with index as columns and vice versa.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.transpose() <Frame: x> <Index> p q r <<U1> <Index> a 0 2 4 b 1 3 5 <<U1> <int64> <int64> <int64>
- Frame.unique(*, axis=None)[source]
Return a NumPy array of unique values. If the axis argument is provided, uniqueness is determined by column or row.
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 2 nan None NaT 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]> >>> f.unique() [10.0 False datetime.date(1517, 1, 1) 2.0 True datetime.date(1517, 4, 1) nan None]
- Frame.unique_enumerated(*, retain_order=False, func=None)[source]
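Return a pair of arrays: an integer array of indexers with the same shape as the Frame, and an array of the unique values to which those indexers refer. In the example below, elements for which func returns True are excluded from enumeration and receive the indexer -1.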
>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <Index> 0 10.0 False 1517-01-01 1 2.0 True 1517-04-01 2 nan None NaT 3 2.0 True 1517-04-01 <int64> <float64> <object> <datetime64[D]> >>> f.unique_enumerated(retain_order=True, func=sf.isna_element) (array([[ 0, 2, 4], [ 1, 3, 5], [-1, -1, -1], [ 1, 3, 5]]), array([10.0, 2.0, False, True, datetime.date(1517, 1, 1), datetime.date(1517, 4, 1)], dtype=object))
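The output above can be reproduced by assigning consecutive integer codes to distinct values in order of first appearance. The following is a plain-Python sketch, not the library's code: the helper name enumerate_unique is hypothetical, and walking the data column by column is an inference from the example output rather than a documented guarantee.

```python
def enumerate_unique(rows, is_excluded=lambda value: False):
    """Return (indexers, uniques) for a table given as a list of row tuples."""
    # Caveat of this sketch: a dict treats 1 and True (and 0 and False)
    # as the same key, so such values would share a code.
    codes = {}
    indexers = [[-1] * len(row) for row in rows]
    for c in range(len(rows[0])):          # walk column by column
        for r, row in enumerate(rows):
            value = row[c]
            if is_excluded(value):
                continue                   # excluded values keep indexer -1
            indexers[r][c] = codes.setdefault(value, len(codes))
    return indexers, list(codes)


rows = [
    (10.0, False, '1517-01-01'),
    (2.0, True, '1517-04-01'),
    (None, None, None),
    (2.0, True, '1517-04-01'),
]
indexers, uniques = enumerate_unique(rows, is_excluded=lambda v: v is None)
print(indexers)  # [[0, 2, 4], [1, 3, 5], [-1, -1, -1], [1, 3, 5]]
print(uniques)   # [10.0, 2.0, False, True, '1517-01-01', '1517-04-01']
```

Here None stands in for every missing value (NaN, None, NaT) of the example, and dates are kept as strings.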
- Frame.unset_columns(*, names=(), drop=False, index_constructors=None)[source]
Return a new Frame where columns are added to the top of the data, and an IndexAutoFactory is used to populate new columns. This operation potentially forces a complete copy of all data.
- Parameters:
names – A sequence of hashables to be used to name the unset columns. If an Index, a single hashable should be provided; if an IndexHierarchy, as many hashables as the depth must be provided.
index_constructors –
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.rename(columns='o').unset_columns() <Frame: x> <Index> 0 1 <int64> <Index> o a b p 0 1 q 2 3 r 4 5 <<U1> <object> <object>
- Frame.unset_index(*, names=(), drop=False, consolidate_blocks=False, columns_constructors=None)[source]
Return a new Frame where the index is added to the front of the data, and an IndexAutoFactory is used to populate a new index. If the Index has a name, that name will be used for the column name, otherwise a suitable default will be used. As underlying NumPy arrays are immutable, data is not copied.
- Parameters:
names – An iterable of hashables to be used to name the unset index. If an Index, a single hashable should be provided; if an IndexHierarchy, as many hashables as the depth must be provided.
consolidate_blocks –
columns_constructors –
>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x') >>> f <Frame: x> <Index> a b c <<U1> <IndexHierarchy> 0 p 10 False 1517-01-01 0 q 2 True 1517-04-01 1 p 8 True 1517-12-31 1 q 3 False 1517-06-30 <int64> <<U1> <int64> <bool> <datetime64[D]> >>> f.rename(index=(('d', 'e'))).unset_index() <Frame: x> <Index> d e a b c <<U1> <Index> 0 0 p 10 False 1517-01-01 1 0 q 2 True 1517-04-01 2 1 p 8 True 1517-12-31 3 1 q 3 False 1517-06-30 <int64> <int64> <<U1> <int64> <bool> <datetime64[D]>
- Frame.var(axis=0, skipna=True, ddof=0, out=None)
Return the variance along the specified axis.
- Parameters:
axis – Axis, defaulting to axis 0.
skipna – Skip missing (NaN) values, defaulting to True.
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x') >>> f <Frame: x> <Index> a b <<U1> <Index> p 0 1 q 2 3 r 4 5 <<U1> <int64> <int64> >>> f.var() <Series> <Index> a 2.6666666666666665 b 2.6666666666666665 <<U1> <float64>