Detail: Frame: Method

Frame.__array__(dtype=None)

Support the __array__ interface, returning an array of values.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.__array__()
[[0 1]
 [2 3]
 [4 5]]
Frame.__array_ufunc__(ufunc, method, *args, **kwargs)

Support for NumPy elements or arrays on the left-hand side of binary operators.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> np.array((1, 0)) * f
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       0
q       2       0
r       4       0
<<U1>   <int64> <int64>
Frame.__bool__()

Raises ValueError to prohibit ambiguous use of truthy evaluation.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> bool(f)
ErrorNotTruthy('The truth value of a container is ambiguous. For a truthy indicator of non-empty status, use the `size` attribute.')
Frame.__dataframe__(nan_as_null=False, allow_copy=True)

Return a data-frame interchange protocol compliant object. See https://data-apis.org/dataframe-protocol/latest for more information.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> dfi = f.__dataframe__()
>>> tuple(dfi.get_columns())
(<DFIColumn: shape=(3,) dtype=<i8>, <DFIColumn: shape=(3,) dtype=<i8>)
Frame.__deepcopy__(memo)
>>> import copy
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> copy.deepcopy(f)
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
Frame.__len__()

Return the number of rows in the Frame.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> len(f)
3
Frame.__round__(decimals=0)

Return a Frame rounded to the given decimals. Negative decimals round to the left of the decimal point.

Parameters:

decimals – number of decimals to round to.

Returns:

Frame

>>> f = sf.Frame((np.arange(6).reshape(3,2) * 4/3), index=('p', 'q', 'r'), columns=('a', 'b'), name='y')
>>> f
<Frame: y>
<Index>    a                  b                  <<U1>
<Index>
p          0.0                1.3333333333333333
q          2.6666666666666665 4.0
r          5.333333333333333  6.666666666666667
<<U1>      <float64>          <float64>
>>> round(f, 1)
<Frame: y>
<Index>    a         b         <<U1>
<Index>
p          0.0       1.3
q          2.7       4.0
r          5.3       6.7
<<U1>      <float64> <float64>
Frame.all(axis=0, skipna=True, out=None)

Logical and over values along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f.all()
<Series>
<Index>
c        False
d        True
<<U1>    <bool>
Frame.any(axis=0, skipna=True, out=None)

Logical or over values along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f.any()
<Series>
<Index>
c        False
d        True
<<U1>    <bool>
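
As a rough illustration of how axis selects the direction of the reduction, the following plain-Python sketch mirrors the Frame above as a list of rows. It uses the built-in all and any rather than StaticFrame itself, and it assumes that axis 0 yields one result per column and axis 1 one result per row, following the NumPy convention.

```python
# Rows of the example Frame: columns c and d, index p, q, r.
rows = [(False, True), (False, True), (False, True)]

# axis=0: reduce down each column, giving one result per column.
all_axis0 = [all(col) for col in zip(*rows)]
any_axis0 = [any(col) for col in zip(*rows)]

# axis=1 (assumed): reduce across each row, giving one result per row.
all_axis1 = [all(row) for row in rows]
any_axis1 = [any(row) for row in rows]

print(all_axis0)  # [False, True]
print(any_axis0)  # [False, True]
print(all_axis1)  # [False, False, False]
print(any_axis1)  # [True, True, True]
```
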
Frame.astype[key](dtypes, *, consolidate_blocks)
astype

Retype one or more columns. When used as a function, can be used to retype the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to retype one or more columns.

Parameters:

dtype – A value suitable for specifying a NumPy dtype, such as a Python type (float), NumPy array protocol strings (‘f8’), or a dtype instance.

InterfaceFrameAsType.__getitem__(key)

Selector of columns by label.

Parameters:

key – A loc selector, either a label, a list of labels, a slice of labels, or a Boolean array.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.astype['c'](object)
<Frame: x>
<Index>    a       b      c          <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <object>
Frame.astype(dtype, *, consolidate_blocks)
astype

Retype one or more columns. When used as a function, can be used to retype the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to retype one or more columns.

Parameters:

dtype – A value suitable for specifying a NumPy dtype, such as a Python type (float), NumPy array protocol strings (‘f8’), or a dtype instance.

InterfaceFrameAsType.__call__(dtype, *, consolidate_blocks=False)

Apply a single dtype to all columns.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.astype(float)
<Frame: x>
<Index>    a         b         <<U1>
<Index>
p          0.0       1.0
q          2.0       3.0
r          4.0       5.0
<<U1>      <float64> <float64>
Frame.clip(*, lower=None, upper=None, axis=None)

Apply a clip operation to this Frame. Note that clip operations can be applied to object types, but cannot be applied to non-numerical objects (e.g., strings, None).

Parameters:
>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.clip(lower=2, upper=4)
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          2       2
q          2       3
r          4       4
<<U1>      <int64> <int64>
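
The clip operation shown above can be pictured element by element. The sketch below is plain Python, not StaticFrame's implementation; the helper name clip_value is hypothetical and only scalar bounds are handled.

```python
def clip_value(value, lower=None, upper=None):
    """Bound a single value to [lower, upper]; a bound of None is ignored."""
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value

# Same values as the example Frame, stored as rows p, q, r.
rows = [[0, 1], [2, 3], [4, 5]]
clipped = [[clip_value(v, lower=2, upper=4) for v in row] for row in rows]
print(clipped)  # [[2, 2], [2, 3], [4, 4]]
```
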
Frame.consolidate[key]
consolidate

Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.

InterfaceConsolidate.__getitem__(key)

Return the full Frame, selecting with key a subset of columns for consolidation.

Parameters:

key – A loc selector, either a label, a list of labels, a slice of labels, or a Boolean array.

>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       d      e      <<U1>
<Index>
0          0       20      0       False  True
1          0       18      0       True   True
2          10      -3      0       True   False
3          2       18      1       False  True
<int64>    <int64> <int64> <int64> <bool> <bool>
>>> f1.consolidate.status
<Frame>
<Index> loc   iloc    dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       a     0       int64    (4,)     1       True    True         True
1       b     1       int64    (4,)     1       True    True         True
2       c     2       int64    (4,)     1       True    True         True
3       d     3       bool     (4,)     1       True    True         True
4       e     4       bool     (4,)     1       True    True         True
<int64> <<U1> <int64> <object> <object> <int64> <bool>  <bool>       <bool>
>>> f2 = f1.consolidate['a':'c']
>>> f2.consolidate.status
<Frame>
<Index> loc                  iloc              dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       slice('a', 'c', N... slice(0, 3, None) int64    (4, 3)   2       True    False        True
1       d                    3                 bool     (4,)     1       True    True         True
2       e                    4                 bool     (4,)     1       True    True         True
<int64> <object>             <object>          <object> <object> <int64> <bool>  <bool>       <bool>
Frame.consolidate
consolidate

Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.

InterfaceConsolidate.__call__()

Apply consolidation to all columns.

>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       d      e      <<U1>
<Index>
0          0       20      0       False  True
1          0       18      0       True   True
2          10      -3      0       True   False
3          2       18      1       False  True
<int64>    <int64> <int64> <int64> <bool> <bool>
>>> f1.consolidate.status
<Frame>
<Index> loc   iloc    dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       a     0       int64    (4,)     1       True    True         True
1       b     1       int64    (4,)     1       True    True         True
2       c     2       int64    (4,)     1       True    True         True
3       d     3       bool     (4,)     1       True    True         True
4       e     4       bool     (4,)     1       True    True         True
<int64> <<U1> <int64> <object> <object> <int64> <bool>  <bool>       <bool>
>>> f2 = f1.consolidate()
>>> f2.consolidate.status
<Frame>
<Index> loc                  iloc                 dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       slice('a', 'c', N... slice(0, 3, None)    int64    (4, 3)   2       True    False        True
1       slice('d', 'e', N... slice(3, None, None) bool     (4, 2)   2       True    False        True
<int64> <object>             <object>             <object> <object> <int64> <bool>  <bool>       <bool>
Frame.consolidate.status
Frame.consolidate

Consolidate one or more columns. When used as a function, can be used to consolidate the entire Frame. Alternatively, when used as a __getitem__ interface, loc-style column selection can be used to consolidate one or more columns.

InterfaceConsolidate.status

Display consolidation status of this Frame.

>>> f1 = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1), (False, True, True, False), (True, True, False, True)), columns=('a', 'b', 'c', 'd', 'e'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       d      e      <<U1>
<Index>
0          0       20      0       False  True
1          0       18      0       True   True
2          10      -3      0       True   False
3          2       18      1       False  True
<int64>    <int64> <int64> <int64> <bool> <bool>
>>> f1.consolidate.status
<Frame>
<Index> loc   iloc    dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       a     0       int64    (4,)     1       True    True         True
1       b     1       int64    (4,)     1       True    True         True
2       c     2       int64    (4,)     1       True    True         True
3       d     3       bool     (4,)     1       True    True         True
4       e     4       bool     (4,)     1       True    True         True
<int64> <<U1> <int64> <object> <object> <int64> <bool>  <bool>       <bool>
>>> f2 = f1.consolidate()
>>> f2.consolidate.status
<Frame>
<Index> loc                  iloc                 dtype    shape    ndim    owndata f_contiguous c_contiguous <<U12>
<Index>
0       slice('a', 'c', N... slice(0, 3, None)    int64    (4, 3)   2       True    False        True
1       slice('d', 'e', N... slice(3, None, None) bool     (4, 2)   2       True    False        True
<int64> <object>             <object>             <object> <object> <int64> <bool>  <bool>       <bool>
Frame.corr(*, axis=1)

Compute a correlation matrix.

Parameters:

axis – if 0, each row represents a variable, with observations as columns; if 1, each column represents a variable, with observations as rows. Defaults to 1.

>>> f1 = sf.Frame((np.concatenate((np.arange(8) * 2, np.arange(8) ** 2)).reshape(4,4)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c', 'd'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       d       <<U1>
<Index>
p          0       2       4       6
q          8       10      12      14
r          0       1       4       9
s          16      25      36      49
<<U1>      <int64> <int64> <int64> <int64>
>>> f1.corr()
<Frame: x>
<Index>    a                  b                  c                  d                  <<U1>
<Index>
a          1.0                0.9888513796308233 0.965581028730576  0.9340437381585037
b          0.9888513796308233 0.9999999999999999 0.9923448088115435 0.972134396307783
c          0.9655810287305759 0.9923448088115435 0.9999999999999999 0.9934089501944108
d          0.9340437381585037 0.9721343963077829 0.9934089501944108 1.0
<<U1>      <float64>          <float64>          <float64>          <float64>
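
To make one entry of the matrix above concrete, the following plain-Python sketch computes the Pearson correlation between columns a and b directly from their deviations. The function name pearson is hypothetical, and the sketch assumes the matrix holds Pearson coefficients, which is consistent with the values shown.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(p * q for p, q in zip(dx, dy))
    denominator = math.sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))
    return numerator / denominator

# Columns a and b of the example Frame (observations as rows, axis=1).
a = [0, 8, 0, 16]
b = [2, 10, 1, 25]
r_ab = pearson(a, b)
print(round(r_ab, 6))  # 0.988851
```
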
Frame.count(*, skipna=True, skipfalsy=False, unique=False, axis=0)

Return the count of non-NA values along the provided axis, where 0 provides counts per column, 1 provides counts per row.

Parameters:

axis

>>> f = sf.Frame.from_items((('a', (10, 2, np.nan, 3)), ('b', ('qrs ', 'XYZ', None, None))), index=('p', 'q', 'r', 's'), name='x')
>>> f
<Frame: x>
<Index>    a         b        <<U1>
<Index>
p          10.0      qrs
q          2.0       XYZ
r          nan       None
s          3.0       None
<<U1>      <float64> <object>
>>> f.count(skipna=True)
<Series>
<Index>
a        3
b        2
<<U1>    <int64>
>>> f.count(unique=True)
<Series>
<Index>
a        3
b        2
<<U1>    <int64>
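
The per-column counts above can be reproduced with a small plain-Python sketch. The helpers is_na and count are hypothetical stand-ins, and the treatment of skipna and unique is an assumption inferred from the example output, not a statement of StaticFrame's implementation.

```python
import math

def is_na(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def count(values, skipna=True, unique=False):
    """Count values, optionally skipping NA and collapsing duplicates."""
    kept = [v for v in values if not (skipna and is_na(v))]
    return len(set(kept)) if unique else len(kept)

# Columns a and b of the example Frame.
a = [10.0, 2.0, float("nan"), 3.0]
b = ["qrs ", "XYZ", None, None]

print(count(a), count(b))                            # 3 2
print(count(a, unique=True), count(b, unique=True))  # 3 2
print(count([1, 1, 2], unique=True))                 # 2
```
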
Frame.cov(*, axis=1, ddof=1)

Compute a covariance matrix.

Parameters:
  • axis – if 0, each row represents a variable, with observations as columns; if 1, each column represents a variable, with observations as rows. Defaults to 1.

  • ddof – Delta degrees of freedom, defaults to 1.

>>> f1 = sf.Frame((np.concatenate((np.arange(8) * 2, np.arange(8) ** 2)).reshape(4,4)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c', 'd'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       d       <<U1>
<Index>
p          0       2       4       6
q          8       10      12      14
r          0       1       4       9
s          16      25      36      49
<<U1>      <int64> <int64> <int64> <int64>
>>> f1.cov()
<Frame: x>
<Index>    a                  b                  c                  d                  <<U1>
<Index>
a          58.666666666666664 84.0               112.0              142.66666666666666
b          84.0               123.0              166.66666666666666 215.0
c          112.0              166.66666666666666 229.33333333333331 300.0
d          142.66666666666666 215.0              300.0              397.66666666666663
<<U1>      <float64>          <float64>          <float64>          <float64>
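
The effect of ddof is easiest to see on a single pair of columns. The sketch below is plain Python with a hypothetical covariance helper; it divides the summed products of deviations by n - ddof, which reproduces the a/b and a/a entries shown above when ddof is 1.

```python
def covariance(xs, ys, ddof=1):
    """Covariance of two equal-length sequences, dividing by n - ddof."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - ddof)

# Columns a and b of the example Frame (observations as rows, axis=1).
a = [0, 8, 0, 16]
b = [2, 10, 1, 25]

cov_ab = covariance(a, b)                   # sample covariance, ddof=1
cov_ab_population = covariance(a, b, ddof=0)
print(cov_ab)             # 84.0
print(cov_ab_population)  # 63.0
```
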
Frame.cumprod(axis=0, skipna=True)

Return the cumulative product over the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.cumprod()
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       1
q       0       3
r       0       15
<<U1>   <int64> <int64>
Frame.cumsum(axis=0, skipna=True)

Return the cumulative sum over the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.cumsum()
<Frame>
<Index> a       b       <<U1>
<Index>
p       0       1
q       2       4
r       6       9
<<U1>   <int64> <int64>
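
For comparison, the cumulative sum and cumulative product along axis 0 can be sketched per column with itertools.accumulate from the standard library. This is an illustration of the arithmetic only, using the column values from the example Frame, and does not involve StaticFrame.

```python
from itertools import accumulate
import operator

# Columns a and b of the example Frame, read top to bottom (axis 0).
a = [0, 2, 4]
b = [1, 3, 5]

cumsum_a = list(accumulate(a))
cumsum_b = list(accumulate(b))
cumprod_a = list(accumulate(a, operator.mul))
cumprod_b = list(accumulate(b, operator.mul))

print(cumsum_a, cumsum_b)    # [0, 2, 6] [1, 4, 9]
print(cumprod_a, cumprod_b)  # [0, 0, 0] [1, 3, 15]
```
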
Frame.drop_duplicated(*, axis=0, exclude_first=False, exclude_last=False)

Return a Frame with duplicated rows (axis 0) or columns (axis 1) removed. All values in the row or column are compared to determine duplication.

Parameters:
  • axis – Integer specifying axis, where 0 is rows and 1 is columns. Axis 0 is set by default.

  • exclude_first – Boolean to select if the first duplicated value is excluded.

  • exclude_last – Boolean to select if the last duplicated value is excluded.

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
2          nan       None     NaT
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
>>> f.drop_duplicated()
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
2          nan       None     NaT
<int64>    <float64> <object> <datetime64[D]>
Frame.dropfalsy(axis=0, condition=<function all>)

Return a new Frame after removing rows (axis 0) or columns (axis 1) where any or all values are falsy. The condition is determined by a NumPy ufunc that processes the Boolean array returned by isfalsy(); the default is np.all.

Parameters:
  • axis

  • condition

>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
2          0             NaT
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
>>> f.dropfalsy()
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
Frame.dropna(axis=0, condition=<function all>)

Return a new Frame after removing rows (axis 0) or columns (axis 1) where any or all values are NA (NaN or None). The condition is determined by a NumPy ufunc that processes the Boolean array returned by isna(); the default is np.all.

Parameters:
  • axis

  • condition

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
2          nan       None     NaT
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
>>> f.dropna()
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
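
The role of the condition argument can be sketched in plain Python, with the built-in all and any standing in for np.all and np.any. The helpers is_na and dropna_rows are hypothetical and operate on a list of row tuples rather than on a Frame.

```python
import math

def is_na(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def dropna_rows(rows, condition=all):
    """Drop each row for which condition(...) over its NA flags is True."""
    return [row for row in rows if not condition(is_na(v) for v in row)]

rows = [
    (10.0, False),         # no NA values
    (2.0, None),           # partially NA
    (float("nan"), None),  # entirely NA
]

kept_default = dropna_rows(rows)             # drops only the all-NA row
kept_any = dropna_rows(rows, condition=any)  # drops rows with any NA
print(kept_default)  # [(10.0, False), (2.0, None)]
print(kept_any)      # [(10.0, False)]
```

With the default condition, a row that is only partially NA is retained; passing a stand-in for np.any removes it as well.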
Frame.duplicated(*, axis=0, exclude_first=False, exclude_last=False)

Return an axis-sized Boolean Series that shows True for all rows (axis 0) or columns (axis 1) duplicated.

Parameters:
  • axis – Integer specifying axis, where 0 is rows and 1 is columns. Axis 0 is set by default.

  • exclude_first – Boolean to select if the first duplicated value is excluded.

  • exclude_last – Boolean to select if the last duplicated value is excluded.

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
2          nan       None     NaT
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
>>> f.duplicated()
<Series>
<Index>
0        False
1        True
2        False
3        True
<int64>  <bool>
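
The interaction of exclude_first and exclude_last is sketched below in plain Python. The function duplicated here is a hypothetical stand-in working on hashable row tuples without NaN values; it illustrates the documented flags, not StaticFrame's implementation.

```python
def duplicated(rows, exclude_first=False, exclude_last=False):
    """Flag rows that occur more than once, optionally sparing first/last."""
    positions = {}
    for i, row in enumerate(rows):
        positions.setdefault(row, []).append(i)

    flags = [False] * len(rows)
    for indices in positions.values():
        if len(indices) < 2:
            continue  # unique rows are never flagged
        for i in indices:
            flags[i] = True
        if exclude_first:
            flags[indices[0]] = False
        if exclude_last:
            flags[indices[-1]] = False
    return flags

rows = [(10, False), (2, True), (7, None), (2, True)]

flags_default = duplicated(rows)
flags_keep_first = duplicated(rows, exclude_first=True)
flags_keep_last = duplicated(rows, exclude_last=True)
print(flags_default)     # [False, True, False, True]
print(flags_keep_first)  # [False, False, False, True]
print(flags_keep_last)   # [False, True, False, False]

# Dropping duplicates keeps only the rows that are not flagged.
deduplicated = [row for row, flag in zip(rows, flags_default) if not flag]
print(deduplicated)      # [(10, False), (7, None)]
```
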
Frame.equals(other, *, compare_name=False, compare_dtype=False, compare_class=False, skipna=True)

Return a bool from comparison to any other object.

Parameters:
  • compare_name – Include equality of the container’s name (and all composed containers) in the comparison.

  • compare_dtype – Include equality of the container’s dtype (and all composed containers) in the comparison.

  • compare_class – Include equality of the container’s class (and all composed containers) in the comparison.

  • skipna – If True, comparisons between missing values are equal.

>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f2 = sf.Frame((np.arange(6).reshape(3,2) * 4/3), index=('p', 'q', 'r'), columns=('a', 'b'), name='y')
>>> f2
<Frame: y>
<Index>    a                  b                  <<U1>
<Index>
p          0.0                1.3333333333333333
q          2.6666666666666665 4.0
r          5.333333333333333  6.666666666666667
<<U1>      <float64>          <float64>
>>> f1.equals(f2)
False
Frame.fillfalsy(value)

Return a new Frame after replacing falsy values with the supplied value.

Parameters:

value – Value to be used to replace missing values (NaN or None).

>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
2          0             NaT
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
>>> f.fillfalsy(dict(a=1, b='abc', c=np.datetime64('2022-01-10')))
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
2          1       abc   2022-01-10
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
Frame.fillfalsy_backward(limit=0, *, axis=0)

Return a new Frame after filling backward falsy values with the first observed value.

Parameters:
  • limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          0       20      0
1          0       18      0
2          10      -3      0
3          2       18      1
<int64>    <int64> <int64> <int64>
>>> f.fillfalsy_backward()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      20      1
1          10      18      1
2          10      -3      1
3          2       18      1
<int64>    <int64> <int64> <int64>
Frame.fillfalsy_forward(limit=0, *, axis=0)

Return a new Frame after filling forward falsy values with the last observed value.

Parameters:
  • limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      8       1
1          2       3       0
2          0       8       0
3          0       0       0
<int64>    <int64> <int64> <int64>
>>> f.fillfalsy_forward()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      8       1
1          2       3       1
2          2       8       1
3          2       8       1
<int64>    <int64> <int64> <int64>
Frame.fillfalsy_leading(value, *, axis=0)

Return a new Frame after filling leading (and only leading) falsy values with the provided value.

Parameters:
  • value – Value to be used to replace missing values (NaN or None).

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((0, 0, 10, 2), (20, 18, -3, 18), (0, 0, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          0       20      0
1          0       18      0
2          10      -3      0
3          2       18      1
<int64>    <int64> <int64> <int64>
>>> f.fillfalsy_leading(-1)
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          -1      20      -1
1          -1      18      -1
2          10      -3      -1
3          2       18      1
<int64>    <int64> <int64> <int64>
Frame.fillfalsy_trailing(value, *, axis=0)

Return a new Frame after filling trailing (and only trailing) falsy values with the provided value.

Parameters:
  • value – Value to be used to replace missing values (NaN or None).

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      8       1
1          2       3       0
2          0       8       0
3          0       0       0
<int64>    <int64> <int64> <int64>
>>> f.fillfalsy_trailing(-1)
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      8       1
1          2       3       -1
2          -1      8       -1
3          -1      -1      -1
<int64>    <int64> <int64> <int64>
Frame.fillna(value)

Return a new Frame after replacing null (NaN or None) values with the supplied value.

Parameters:

value – Value to be used to replace missing values (NaN or None).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b     c               <<U1>
<Index>
0          10.0      qrs   1517-01-01
1          2.0       XYZ   1517-04-01
2          nan             NaT
3          2.0       123   1517-04-01
<int64>    <float64> <<U4> <datetime64[D]>
>>> f.fillna(-1)
<Frame: x>
<Index>    a         b     c          <<U1>
<Index>
0          10.0      qrs   1517-01-01
1          2.0       XYZ   1517-04-01
2          -1.0            -1
3          2.0       123   1517-04-01
<int64>    <float64> <<U4> <object>
Frame.fillna_backward(limit=0, *, axis=0)

Return a new Frame after filling backward null (NaN or None) values with the first observed value.

Parameters:
  • limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((np.nan, np.nan, 10, 2), (np.nan, 8, 3, 8), (np.nan, np.nan, np.nan, 1)), columns=('a', 'b', 'c'), name='y')
>>> f
<Frame: y>
<Index>    a         b         c         <<U1>
<Index>
0          nan       nan       nan
1          nan       8.0       nan
2          10.0      3.0       nan
3          2.0       8.0       1.0
<int64>    <float64> <float64> <float64>
>>> f.fillna_backward()
<Frame: y>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          10.0      8.0       1.0
2          10.0      3.0       1.0
3          2.0       8.0       1.0
<int64>    <float64> <float64> <float64>
Frame.fillna_forward(limit=0, *, axis=0)

Return a new Frame after filling forward null (NaN or None) values with the last observed value.

Parameters:
  • limit – Set the maximum count of missing values (NaN or None) to be filled per contiguous region of missing values. A value of 0 is equivalent to no limit.

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.fillna_forward()
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       1.0
2          2.0       8.0       1.0
3          2.0       8.0       1.0
<int64>    <float64> <float64> <float64>
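
The limit parameter is not exercised in the example above. The plain-Python sketch below shows one reading of it on a single column: within each contiguous run of missing values, at most limit values are filled, and 0 means no limit. The helpers is_na and fill_forward are hypothetical, and this interpretation is an assumption based on the parameter description.

```python
import math

def is_na(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def fill_forward(values, limit=0):
    """Forward-fill NA values; limit caps fills per contiguous NA run."""
    out = []
    last = None        # most recent non-missing value
    have_last = False  # whether any non-missing value has been seen
    run = 0            # length of the current run of missing values
    for v in values:
        if is_na(v):
            run += 1
            if have_last and (limit == 0 or run <= limit):
                out.append(last)
            else:
                out.append(v)  # leading NA, or beyond the limit
        else:
            last, have_last, run = v, True, 0
            out.append(v)
    return out

# Column a of the example Frame.
a = [10.0, 2.0, float("nan"), float("nan")]

filled_all = fill_forward(a)           # [10.0, 2.0, 2.0, 2.0]
filled_one = fill_forward(a, limit=1)  # last value remains NaN
print(filled_all)
print(filled_one)
```
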
Frame.fillna_leading(value, *, axis=0)

Return a new Frame after filling leading (and only leading) null (NaN or None) values with the provided value.

Parameters:
  • value – Value to be used to replace missing values (NaN or None).

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((np.nan, np.nan, 10, 2), (np.nan, 8, 3, 8), (np.nan, np.nan, np.nan, 1)), columns=('a', 'b', 'c'), name='y')
>>> f
<Frame: y>
<Index>    a         b         c         <<U1>
<Index>
0          nan       nan       nan
1          nan       8.0       nan
2          10.0      3.0       nan
3          2.0       8.0       1.0
<int64>    <float64> <float64> <float64>
>>> f.fillna_leading(-1)
<Frame: y>
<Index>    a         b         c         <<U1>
<Index>
0          -1.0      -1.0      -1.0
1          -1.0      8.0       -1.0
2          10.0      3.0       -1.0
3          2.0       8.0       1.0
<int64>    <float64> <float64> <float64>
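The "leading (and only leading)" rule can be sketched for a single column in plain Python: replacement stops at the first observed value, so later gaps are left alone. The helper names below are hypothetical and the code is an illustration, not the library's implementation.

```python
import math

def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def fill_leading(values, fill):
    """Hypothetical helper: replace only missing values before the first observed value."""
    result = list(values)
    for position, value in enumerate(result):
        if not is_missing(value):
            break  # first observed value reached; later gaps are untouched
        result[position] = fill
    return result

column_b = [float("nan"), 8.0, 3.0, 8.0]
with_gap = [float("nan"), 8.0, float("nan"), 8.0]
print(fill_leading(column_b, -1.0))  # [-1.0, 8.0, 3.0, 8.0]
print(fill_leading(with_gap, -1.0))  # [-1.0, 8.0, nan, 8.0]
```

The first call matches column b of the example. A trailing variant can be sketched the same way by scanning from the end of the column.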
Frame.fillna_trailing(value, *, axis=0)[source]

Return a new Frame after filling trailing (and only trailing) null (NaN or None) with the provided value.

Parameters:
  • value – Value to be used to replace missing values (NaN or None).

  • axis – Axis upon which to evaluate contiguous missing values, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.fillna_trailing(-1)
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       -1.0
2          -1.0      8.0       -1.0
3          -1.0      -1.0      -1.0
<int64>    <float64> <float64> <float64>
Frame.head(count=5)[source]

Return a Frame consisting only of the top elements as specified by count.

Parameters:
  • count – Number of elements to be returned from the top of the Frame.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.head(2)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
<int64>    <int64> <bool> <datetime64[D]>
Frame.iloc_max(*, skipna=True, axis=0)[source]

Return the integer indices corresponding to the maximum values found.

Parameters:
  • skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.

  • axis – Axis along which to find the maximum, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.iloc_max()
<Series>
<Index>
a        0
b        0
c        0
<<U1>    <int64>
Frame.iloc_min(*, skipna=True, axis=0)[source]

Return the integer indices corresponding to the minimum values found.

Parameters:
  • skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.

  • axis – Axis along which to find the minimum, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.iloc_min()
<Series>
<Index>
a        1
b        1
c        0
<<U1>    <int64>
Frame.iloc_notfalsy_first(*, fill_value=-1, axis=0)[source]

Return the position corresponding to the first non-falsy (including nan) values along the selected axis.

Parameters:
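  • fill_value – Value to use where no non-falsy value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).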

>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
p          10      8       1
q          -2      -3      0
r          0       8       9
s          0       0       12
<<U1>      <int64> <int64> <int64>
>>> f.iloc_notfalsy_first(axis=0)
<Series>
<Index>
a        0
b        0
c        0
<<U1>    <int64>
>>> f.iloc_notfalsy_first(axis=1)
<Series>
<Index>
p        0
q        0
r        1
s        2
<<U1>    <int64>
Frame.iloc_notfalsy_last(*, fill_value=-1, axis=0)[source]

Return the position corresponding to the last non-falsy (including nan) values along the selected axis.

Parameters:
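  • fill_value – Value to use where no non-falsy value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).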

>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
p          10      8       1
q          -2      -3      0
r          0       8       9
s          0       0       12
<<U1>      <int64> <int64> <int64>
>>> f.iloc_notfalsy_last(axis=0)
<Series>
<Index>
a        1
b        2
c        3
<<U1>    <int64>
>>> f.iloc_notfalsy_last(axis=1)
<Series>
<Index>
p        2
q        1
r        2
s        2
<<U1>    <int64>
Frame.iloc_notna_first(*, fill_value=-1, axis=0)[source]

Return the position corresponding to the first non-missing values along the selected axis.

Parameters:
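  • fill_value – Value to use where no non-missing value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).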

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.iloc_notna_first(axis=0)
<Series>
<Index>
a        0
b        0
c        0
<<U1>    <int64>
>>> f.iloc_notna_first(axis=1)
<Series>
<Index>
0        0
1        0
2        1
3        -1
<int64>  <int64>
Frame.iloc_notna_last(*, fill_value=-1, axis=0)[source]

Return the position corresponding to the last non-missing values along the selected axis.

Parameters:
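  • fill_value – Value to use where no non-missing value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).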

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.iloc_notna_last(axis=0)
<Series>
<Index>
a        1
b        2
c        0
<<U1>    <int64>
>>> f.iloc_notna_last(axis=1)
<Series>
<Index>
0        2
1        1
2        1
3        -1
<int64>  <int64>
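The role of fill_value shows in the axis=1 results above, where the all-NaN row yields -1. A minimal pure-Python sketch of the first and last non-missing position logic for one row follows. The helper names are hypothetical and this is not the library's implementation.

```python
import math

def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def notna_first(values, fill_value=-1):
    """Hypothetical helper: position of the first non-missing value, else fill_value."""
    for position, value in enumerate(values):
        if not is_missing(value):
            return position
    return fill_value

def notna_last(values, fill_value=-1):
    """Hypothetical helper: position of the last non-missing value, else fill_value."""
    for position in range(len(values) - 1, -1, -1):
        if not is_missing(values[position]):
            return position
    return fill_value

nan = float("nan")
# The rows of the example Frame, read horizontally (axis=1).
rows = [[10.0, 8.0, 1.0], [2.0, 3.0, nan], [nan, 8.0, nan], [nan, nan, nan]]
print([notna_first(row) for row in rows])  # [0, 0, 1, -1]
print([notna_last(row) for row in rows])   # [2, 1, 1, -1]
```

Both printed lists agree with the axis=1 outputs shown for iloc_notna_first and iloc_notna_last.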
Frame.insert_after(key, container, *, fill_value=nan)[source]

Create a new Frame by inserting a named Series or Frame at the position after the label specified by key.

Parameters:
  • key – Label after which the new container will be inserted.

  • container – Container to be inserted.

  • fill_value – A value to be used to fill space after reindexing the new container.

Returns:

Frame

>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f2 = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f2
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f1.insert_after('b', f2)
<Frame: x>
<Index>    a       b       c      d      <<U1>
<Index>
p          0       1       False  True
q          2       3       False  True
r          4       5       False  True
<<U1>      <int64> <int64> <bool> <bool>
Frame.insert_before(key, container, *, fill_value=nan)[source]

Create a new Frame by inserting a named Series or Frame at the position before the label specified by key.

Parameters:
  • key – Label before which the new container will be inserted.

  • container – Container to be inserted.

  • fill_value – A value to be used to fill space after reindexing the new container.

Returns:

Frame

>>> f1 = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f2 = sf.Frame((np.arange(6).reshape(3,2) % 2).astype(bool), index=('p', 'q', 'r'), columns=('c', 'd'), name='y')
>>> f2
<Frame: y>
<Index>    c      d      <<U1>
<Index>
p          False  True
q          False  True
r          False  True
<<U1>      <bool> <bool>
>>> f1.insert_before('b', f2)
<Frame: x>
<Index>    a       c      d      b       <<U1>
<Index>
p          0       False  True   1
q          2       False  True   3
r          4       False  True   5
<<U1>      <int64> <bool> <bool> <int64>
Frame.isfalsy()[source]

Return a same-indexed, Boolean Frame indicating with True which values are falsy.

>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
2          0             NaT
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
>>> f.isfalsy()
<Frame>
<Index> a      b      c      <<U1>
<Index>
0       False  False  False
1       False  False  False
2       True   True   True
3       False  False  False
<int64> <bool> <bool> <bool>
Frame.isin(other)[source]

Return a same-sized Boolean Frame that shows if the same-positioned element is in the passed iterable.

>>> f = sf.Frame.from_fields(((10, 2, 0, 0), (8, 3, 8, 0), (1, 0, 0, 0)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          10      8       1
1          2       3       0
2          0       8       0
3          0       0       0
<int64>    <int64> <int64> <int64>
>>> f.isin((0, 8))
<Frame: x>
<Index>    a      b      c      <<U1>
<Index>
0          False  True   False
1          False  False  True
2          True   True   True
3          True   True   True
<int64>    <bool> <bool> <bool>
Frame.isna()[source]

Return a same-indexed, Boolean Frame indicating with True which values are NaN or None.

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.isna()
<Frame>
<Index> a      b      c      <<U1>
<Index>
0       False  False  False
1       False  False  True
2       True   False  True
3       True   True   True
<int64> <bool> <bool> <bool>
Frame.join_inner(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform an inner join.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.join_inner(f2, left_columns='c', right_columns='f')
<Frame>
<Index> a       b       c       d       e       f       <<U1>
<Index>
0       11      0       0       7       8       0
1       4       8       1       2       3       1
2       10      3       0       7       8       0
3       2       8       1       2       3       1
<int64> <int64> <int64> <int64> <int64> <int64> <int64>
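As a conceptual illustration of the join predicate, the sketch below pairs rows of two small tables, held as lists of dicts, wherever the left column c equals the right column f. The helper join_inner here is a hypothetical stand-in written for this example. It says nothing about how StaticFrame implements joins or orders its results in general.

```python
def join_inner(left_rows, right_rows, left_column, right_column):
    """Hypothetical helper: pair each left row with every right row whose key matches."""
    joined = []
    for left in left_rows:
        for right in right_rows:
            if left[left_column] == right[right_column]:
                joined.append({**left, **right})  # keep all columns from both sides
    return joined

f1_rows = [
    {"a": 11, "b": 0, "c": 0},
    {"a": 4, "b": 8, "c": 1},
    {"a": 10, "b": 3, "c": 0},
    {"a": 2, "b": 8, "c": 1},
]
f2_rows = [{"d": 2, "e": 3, "f": 1}, {"d": 7, "e": 8, "f": 0}]

for row in join_inner(f1_rows, f2_rows, "c", "f"):
    print(row)
```

For this data the sketch yields the same four rows as the example output, with both c and f retained.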
Frame.join_left(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform a left outer join.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.join_left(f2, left_columns='c', right_columns='f')
<Frame>
<Index> a       b       c       d       e       f       <<U1>
<Index>
0       11      0       0       7       8       0
1       4       8       1       2       3       1
2       10      3       0       7       8       0
3       2       8       1       2       3       1
<int64> <int64> <int64> <int64> <int64> <int64> <int64>
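In the example above every left row finds a match, so fill_value never appears in the result. The following pure-Python sketch shows the idea of a left outer join in which unmatched left rows are kept and the right-hand columns are filled. The helper and its right_labels argument are hypothetical, and None is used as the fill only to keep the printed result easy to read.

```python
def join_left(left_rows, right_rows, left_column, right_column,
              right_labels, fill_value=float("nan")):
    """Hypothetical helper: keep every left row; fill right columns when nothing matches."""
    joined = []
    for left in left_rows:
        matches = [r for r in right_rows if r[right_column] == left[left_column]]
        if matches:
            joined.extend({**left, **right} for right in matches)
        else:
            # No partner on the right: fill the space created in the join.
            filler = {label: fill_value for label in right_labels}
            joined.append({**left, **filler})
    return joined

left_rows = [{"a": 11, "c": 0}, {"a": 5, "c": 9}]  # c == 9 has no partner
right_rows = [{"d": 7, "f": 0}, {"d": 2, "f": 1}]
result = join_left(left_rows, right_rows, "c", "f", ("d", "f"), fill_value=None)
print(result)
# [{'a': 11, 'c': 0, 'd': 7, 'f': 0}, {'a': 5, 'c': 9, 'd': None, 'f': None}]
```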
Frame.join_outer(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform an outer join.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.join_outer(f2, left_columns='c', right_columns='f')
<Frame>
<Index> a       b       c       d       e       f       <<U1>
<Index>
0       11      0       0       7       8       0
1       4       8       1       2       3       1
2       10      3       0       7       8       0
3       2       8       1       2       3       1
<int64> <int64> <int64> <int64> <int64> <int64> <int64>
Frame.join_right(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform a right outer join.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.join_right(f2, left_columns='c', right_columns='f')
<Frame>
<Index> a       b       c       d       e       f       <<U1>
<Index>
0       4       8       1       2       3       1
1       2       8       1       2       3       1
2       11      0       0       7       8       0
3       10      3       0       7       8       0
<int64> <int64> <int64> <int64> <int64> <int64> <int64>
Frame.loc_max(*, skipna=True, axis=0)[source]

Return the labels corresponding to the maximum values found.

Parameters:
  • skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.

  • axis – Axis along which to find the maximum, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.loc_max()
<Series>
<Index>
a        0
b        0
c        0
<<U1>    <int64>
Frame.loc_min(*, skipna=True, axis=0)[source]

Return the labels corresponding to the minimum values found.

Parameters:
  • skipna – if True, NaN or None values will be ignored; if False, a found NaN will propagate.

  • axis – Axis along which to find the minimum, where 0 is vertically (between row values) and 1 is horizontally (between column values).

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.loc_min()
<Series>
<Index>
a        1
b        1
c        0
<<U1>    <int64>
Frame.loc_notfalsy_first(*, fill_value=nan, axis=0)[source]

Return the labels corresponding to the first non-falsy (including nan) values along the selected axis.

Parameters:
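  • fill_value – Value to use where no non-falsy value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).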

>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
p          10      8       1
q          -2      -3      0
r          0       8       9
s          0       0       12
<<U1>      <int64> <int64> <int64>
>>> f.loc_notfalsy_first(axis=0)
<Series>
<Index>
a        p
b        p
c        p
<<U1>    <<U1>
>>> f.loc_notfalsy_first(axis=1)
<Series>
<Index>
p        a
q        a
r        b
s        c
<<U1>    <<U1>
Frame.loc_notfalsy_last(*, fill_value=nan, axis=0)[source]

Return the labels corresponding to the last non-falsy (including nan) values along the selected axis.

Parameters:
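  • fill_value – Value to use where no non-falsy value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).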

>>> f = sf.Frame.from_fields(((10, -2, 0, 0), (8, -3, 8, 0), (1, 0, 9, 12)), index=('p', 'q', 'r', 's'), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
p          10      8       1
q          -2      -3      0
r          0       8       9
s          0       0       12
<<U1>      <int64> <int64> <int64>
>>> f.loc_notfalsy_last(axis=0)
<Series>
<Index>
a        q
b        r
c        s
<<U1>    <<U1>
>>> f.loc_notfalsy_last(axis=1)
<Series>
<Index>
p        c
q        b
r        c
s        c
<<U1>    <<U1>
Frame.loc_notna_first(*, fill_value=nan, axis=0)[source]

Return the labels corresponding to the first non-missing values along the selected axis.

Parameters:
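  • fill_value – Value to use where no non-missing value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).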

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.loc_notna_first(axis=0)
<Series>
<Index>
a        0
b        0
c        0
<<U1>    <int64>
>>> f.loc_notna_first(axis=1)
<Series>
<Index>
0        a
1        a
2        b
3        nan
<int64>  <object>
Frame.loc_notna_last(*, fill_value=nan, axis=0)[source]

Return the labels corresponding to the last non-missing values along the selected axis.

Parameters:
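  • fill_value – Value to use where no non-missing value is found along the axis.

  • axis – Axis along which to search, where 0 is vertically (between row values) and 1 is horizontally (between column values).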

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.loc_notna_last(axis=0)
<Series>
<Index>
a        1
b        2
c        0
<<U1>    <int64>
>>> f.loc_notna_last(axis=1)
<Series>
<Index>
0        c
1        b
2        b
3        nan
<int64>  <object>
Frame.max(axis=0, skipna=True, out=None)

Return the maximum along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.max()
<Series>
<Index>
a        4
b        5
<<U1>    <int64>
Frame.mean(axis=0, skipna=True, out=None)

Return the mean along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.mean()
<Series>
<Index>
a        2.0
b        3.0
<<U1>    <float64>
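The example Frame contains no missing values, so skipna has no visible effect there. As a rough single-column sketch in plain Python, assuming the usual convention that NaN propagates through arithmetic when it is not skipped, the two settings differ as follows. The helper column_mean is hypothetical.

```python
import math

def column_mean(values, skipna=True):
    """Hypothetical helper: mean of one column, optionally ignoring NaN."""
    if skipna:
        values = [v for v in values if not math.isnan(v)]
    # An empty selection (for example, all NaN with skipna=True) has no mean.
    return sum(values) / len(values) if values else float("nan")

column = [10.0, 2.0, float("nan"), float("nan")]
print(column_mean(column))                # 6.0
print(column_mean(column, skipna=False))  # nan
```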
Frame.median(axis=0, skipna=True, out=None)

Return the median along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.median()
<Series>
<Index>
a        2.0
b        3.0
<<U1>    <float64>
Frame.merge_inner(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform an inner merge, an inner join where matched columns are coalesced.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.merge_inner(f2, left_columns='c', right_columns='f')
<Frame>
<Index> c       a       b       d       e       <<U1>
<Index>
0       0       11      0       7       8
1       1       4       8       2       3
2       0       10      3       7       8
3       1       2       8       2       3
<int64> <int64> <int64> <int64> <int64> <int64>
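To illustrate what "coalesced" means here, the sketch below performs the same inner pairing as a join but emits the matched key only once, placed first and named from the left column unless a label is supplied. It works on lists of dicts. The helper merge_inner and its merge_label argument are hypothetical stand-ins, not the library's implementation.

```python
def merge_inner(left_rows, right_rows, left_column, right_column, merge_label=None):
    """Hypothetical helper: inner join that keeps one coalesced key column."""
    label = left_column if merge_label is None else merge_label
    merged = []
    for left in left_rows:
        for right in right_rows:
            if left[left_column] == right[right_column]:
                row = {label: left[left_column]}  # coalesced key comes first
                row.update((k, v) for k, v in left.items() if k != left_column)
                row.update((k, v) for k, v in right.items() if k != right_column)
                merged.append(row)
    return merged

f1_rows = [
    {"a": 11, "b": 0, "c": 0},
    {"a": 4, "b": 8, "c": 1},
    {"a": 10, "b": 3, "c": 0},
    {"a": 2, "b": 8, "c": 1},
]
f2_rows = [{"d": 2, "e": 3, "f": 1}, {"d": 7, "e": 8, "f": 0}]

merged = merge_inner(f1_rows, f2_rows, "c", "f")
print(list(merged[0]))  # ['c', 'a', 'b', 'd', 'e']
renamed = merge_inner(f1_rows, f2_rows, "c", "f", merge_label="x")
print(list(renamed[0]))  # ['x', 'a', 'b', 'd', 'e']
```

The column order and values match the example output for this data. The sketch does not guard against a merge label that collides with another column name.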
Frame.merge_left(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform a left merge, a left join where matched columns are coalesced.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.merge_left(f2, left_columns='c', right_columns='f', merge_labels='x')
<Frame>
<Index> x       a       b       d       e       <<U1>
<Index>
0       0       11      0       7       8
1       1       4       8       2       3
2       0       10      3       7       8
3       1       2       8       2       3
<int64> <int64> <int64> <int64> <int64> <int64>
Frame.merge_outer(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform an outer merge, an outer join where matched columns are coalesced.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.merge_outer(f2, left_columns='c', right_columns='f', merge_labels='x')
<Frame>
<Index> x       a       b       d       e       <<U1>
<Index>
0       0       11      0       7       8
1       1       4       8       2       3
2       0       10      3       7       8
3       1       2       8       2       3
<int64> <int64> <int64> <int64> <int64> <int64>
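The coalescing of matched columns can be sketched in plain Python. This is a minimal illustration of outer-merge semantics on lists of row dictionaries, not StaticFrame's implementation: `merge_outer_sketch` and its arguments are hypothetical names, non-empty inputs are assumed, and the placement of unmatched rows is an assumption rather than documented behaviour.

```python
import math


def merge_outer_sketch(left, right, left_key, right_key, label, fill_value=math.nan):
    """Outer-join two lists of row dicts, coalescing the key columns into `label`."""
    left_fields = [k for k in left[0] if k != left_key]
    right_fields = [k for k in right[0] if k != right_key]
    rows = []
    matched_right = set()
    for lrow in left:
        hits = [i for i, rrow in enumerate(right) if rrow[right_key] == lrow[left_key]]
        matched_right.update(hits)
        # A left row without a match is kept once, with right fields filled.
        for i in hits or [None]:
            row = {label: lrow[left_key]}
            row.update({k: lrow[k] for k in left_fields})
            row.update({k: fill_value if i is None else right[i][k] for k in right_fields})
            rows.append(row)
    # Right rows that never matched are appended, with left fields filled.
    for i, rrow in enumerate(right):
        if i not in matched_right:
            row = {label: rrow[right_key]}
            row.update({k: fill_value for k in left_fields})
            row.update({k: rrow[k] for k in right_fields})
            rows.append(row)
    return rows


f1 = [dict(a=11, b=0, c=0), dict(a=4, b=8, c=1), dict(a=10, b=3, c=0), dict(a=2, b=8, c=1)]
f2 = [dict(d=2, e=3, f=1), dict(d=7, e=8, f=0)]
merged = merge_outer_sketch(f1, f2, "c", "f", "x")
```

With the data from the example above, every row matches, so no fill values appear; passing only the first row of `f1` leaves one right row unmatched and its `a` and `b` fields are filled.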
Frame.merge_right(other, *, left_depth_level=None, left_columns=None, right_depth_level=None, right_columns=None, merge_labels=None, left_template='{}', right_template='{}', fill_value=nan, include_index=False)[source]

Perform a right merge, a right join where matched columns are coalesced.

Parameters:
  • left_depth_level – Specify one or more left index depths to include in the join predicate.

  • left_columns – Specify one or more left columns to include in the join predicate.

  • right_depth_level – Specify one or more right index depths to include in the join predicate.

  • right_columns – Specify one or more right columns to include in the join predicate.

  • merge_labels – Provide a sequence of labels to be used for the merge fields. Must have a length equal to left and right selections. If not provided, merge fields will be named from the left.

  • left_template – Provide a format string for naming left columns in the joined result.

  • right_template – Provide a format string for naming right columns in the joined result.

  • fill_value – A value to be used to fill space created in the join.

  • include_index – If True, an appropriate index will be returned in the resultant Frame.

Returns:

Frame

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = sf.Frame.from_fields(((2, 7), (3, 8), (1, 0)), columns=('d', 'e', 'f'), name='y')
>>> f2
<Frame: y>
<Index>    d       e       f       <<U1>
<Index>
0          2       3       1
1          7       8       0
<int64>    <int64> <int64> <int64>
>>> f1.merge_right(f2, left_columns='c', right_columns='f')
<Frame>
<Index> f       a       b       d       e       <<U1>
<Index>
0       1       4       8       2       3
1       1       2       8       2       3
2       0       11      0       7       8
3       0       10      3       7       8
<int64> <int64> <int64> <int64> <int64> <int64>
Frame.min(axis=0, skipna=True, out=None)

Return the minimum along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.min()
<Series>
<Index>
a        0
b        1
<<U1>    <int64>
Frame.notfalsy()[source]

Return a same-indexed, Boolean Frame marking as True the values that are not falsy.

>>> f = sf.Frame.from_fields(((10, 2, 0, 2), ('qrs ', 'XYZ', '', '123'), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b     c               <<U1>
<Index>
0          10      qrs   1517-01-01
1          2       XYZ   1517-04-01
2          0             NaT
3          2       123   1517-04-01
<int64>    <int64> <<U4> <datetime64[D]>
>>> f.notfalsy()
<Frame>
<Index> a      b      c      <<U1>
<Index>
0       True   True   True
1       True   True   True
2       False  False  False
3       True   True   True
<int64> <bool> <bool> <bool>
Frame.notna()[source]

Return a same-indexed, Boolean Frame marking as True the values that are not NaN or None.

>>> f = sf.Frame.from_fields(((10, 2, np.nan, np.nan), (8, 3, 8, np.nan), (1, np.nan, np.nan, np.nan)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          10.0      8.0       1.0
1          2.0       3.0       nan
2          nan       8.0       nan
3          nan       nan       nan
<int64>    <float64> <float64> <float64>
>>> f.notna()
<Frame>
<Index> a      b      c      <<U1>
<Index>
0       True   True   True
1       True   True   False
2       False  True   False
3       False  False  False
<int64> <bool> <bool> <bool>
Frame.pivot(index_fields, columns_fields=(), data_fields=(), *, func=<function nansum>, fill_value=nan, index_constructor=None)[source]

Produce a pivot table, where one or more columns are selected for each of index_fields, columns_fields, and data_fields. Unique values from the provided index_fields are used to create a new index; unique values from the provided columns_fields are used to create new columns. If one data_fields value is selected, that value is displayed; if more than one value is given, those values are presented with a hierarchical index on the columns; if data_fields is not provided, all unused fields are displayed.

Parameters:
  • index_fields

  • columns_fields

  • data_fields

  • *

  • fill_value – If the index expansion produces coordinates that have no existing data value, fill that position with this value.

  • func – function to apply to data_fields, or a dictionary of labelled functions to apply to data fields, producing an additional hierarchical level.

  • index_constructor

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f1.pivot(index_fields='b', columns_fields='c')
<Frame>
<Index: c> 0         1         <int64>
<Index: b>
0          11.0      nan
3          10.0      nan
8          nan       6.0
<int64>    <float64> <float64>
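The grouping and aggregation behind the table above can be sketched in plain Python. This is an illustration of the semantics only, not StaticFrame's implementation: `pivot_sketch` is a hypothetical helper, the built-in `sum` stands in for the default `nansum` (so the data is assumed to contain no NaN), and a single data field is assumed.

```python
import math
from collections import defaultdict


def pivot_sketch(rows, index_field, columns_field, data_field, fill_value=math.nan):
    """Group one data field by (index, columns) values and sum each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[index_field], row[columns_field])].append(row[data_field])
    index = sorted({i for i, _ in groups})
    columns = sorted({c for _, c in groups})
    # Coordinates with no source rows receive fill_value.
    return {
        i: {c: float(sum(groups[(i, c)])) if (i, c) in groups else fill_value for c in columns}
        for i in index
    }


rows = [dict(a=11, b=0, c=0), dict(a=4, b=8, c=1), dict(a=10, b=3, c=0), dict(a=2, b=8, c=1)]
table = pivot_sketch(rows, "b", "c", "a")
```

The two source rows with `b == 8` and `c == 1` are aggregated into a single cell (4 + 2), while coordinates such as `b == 0`, `c == 1` have no source rows and are filled.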
Frame.pivot_stack(depth_level=-1, *, fill_value=nan)[source]

Move labels from the columns to the index, creating or extending an IndexHierarchy on the index.

Parameters:

depth_level – Selection of one or more columns depths to move onto the index.

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f1.pivot_stack()
<Frame: x>
<Index>                0       <int64>
<IndexHierarchy>
0                a     11
0                b     0
0                c     0
1                a     4
1                b     8
1                c     1
2                a     10
2                b     3
2                c     0
3                a     2
3                b     8
3                c     1
<int64>          <<U1> <int64>
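The reshaping shown above can be sketched in plain Python as a mapping from two-level labels to values. This is only an illustration of the stacking semantics for single-depth columns; `pivot_stack_sketch` is a hypothetical helper, not part of StaticFrame.

```python
def pivot_stack_sketch(index, columns, rows):
    """Return {(index_label, column_label): value} in row-major order."""
    return {
        (i, c): value
        for i, row in zip(index, rows)
        for c, value in zip(columns, row)
    }


rows = [(11, 0, 0), (4, 8, 1), (10, 3, 0), (2, 8, 1)]
stacked = pivot_stack_sketch(range(4), ("a", "b", "c"), rows)
```

Each original cell becomes one entry keyed by its row label and its former column label, giving twelve entries for the four-by-three input.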
Frame.pivot_unstack(depth_level=-1, *, fill_value=nan)[source]

Move labels from the index to the columns, creating or extending an IndexHierarchy on the columns.

Parameters:

depth_level – Selection of one or more index depths to move onto the columns.

>>> f1 = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f1
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f2 = f1.pivot_stack()
>>> f2
<Frame: x>
<Index>                0       <int64>
<IndexHierarchy>
0                a     11
0                b     0
0                c     0
1                a     4
1                b     8
1                c     1
2                a     10
2                b     3
2                c     0
3                a     2
3                b     8
3                c     1
<int64>          <<U1> <int64>
>>> f2.pivot_unstack()
<Frame: x>
<IndexHierarchy> 0       0       0       <int64>
                 a       b       c       <<U1>
<Index>
0                11      0       0
1                4       8       1
2                10      3       0
3                2       8       1
<int64>          <int64> <int64> <int64>
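The inverse reshaping can be sketched in plain Python on a mapping keyed by two-level labels. This is an illustration only: `pivot_unstack_sketch` is a hypothetical helper, and unlike the Frame result above it does not retain the original column label as an extra hierarchy level.

```python
import math


def pivot_unstack_sketch(stacked, fill_value=math.nan):
    """Move the innermost label of each two-level key back onto the columns."""
    # dict.fromkeys keeps first-seen order while removing duplicates.
    index = list(dict.fromkeys(outer for outer, _ in stacked))
    columns = list(dict.fromkeys(inner for _, inner in stacked))
    return {
        outer: {inner: stacked.get((outer, inner), fill_value) for inner in columns}
        for outer in index
    }


stacked = {(0, "a"): 11, (0, "b"): 0, (1, "a"): 4, (1, "b"): 8}
wide = pivot_unstack_sketch(stacked)
sparse = pivot_unstack_sketch({(0, "a"): 1, (1, "b"): 2})
```

In `sparse`, the coordinates `(0, "b")` and `(1, "a")` have no source value, which is the situation `fill_value` exists to cover.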
Frame.prod(axis=0, skipna=True, allna=1, out=None)

Return the product along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.prod()
<Series>
<Index>
a        0
b        15
<<U1>    <int64>
Frame.rank_dense(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]

Rank values as compactly as possible, where ties get the same value, and ranks are contiguous (potentially non-unique) integers.

Parameters:
  • axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)

  • skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.

  • fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.

Returns:

Frame

>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f.rank_dense()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          3       0       0
1          1       2       1
2          2       1       0
3          0       2       1
<int64>    <int64> <int64> <int64>
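Dense ranking of a single column can be sketched in plain Python. This is an illustration of the semantics, not StaticFrame's implementation; `rank_dense_sketch` is a hypothetical helper and NA handling is omitted.

```python
def rank_dense_sketch(values, start=0):
    """Rank by position among the sorted unique values, so ties share a rank."""
    order = {v: i for i, v in enumerate(sorted(set(values)), start)}
    return [order[v] for v in values]


ranks_a = rank_dense_sketch((11, 4, 10, 2))
ranks_b = rank_dense_sketch((0, 8, 3, 8))
ranks_c = rank_dense_sketch((0, 1, 0, 1))
```

Because ranks are taken from the unique values, the result has no gaps: column `b` yields ranks 0, 2, 1, 2.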
Frame.rank_max(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]

Rank values where tied values are assigned the maximum ordinal rank; ranks are potentially non-contiguous and non-unique integers.

Parameters:
  • axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)

  • skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.

  • fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.

Returns:

Frame

>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f.rank_max()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          3       0       1
1          1       3       3
2          2       1       1
3          0       3       3
<int64>    <int64> <int64> <int64>
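Maximum-rank tie handling for a single column can be sketched in plain Python. This is an illustration of the semantics, not StaticFrame's implementation; `rank_max_sketch` is a hypothetical helper and NA handling is omitted.

```python
def rank_max_sketch(values, start=0):
    """Assign each value the highest ordinal position it occupies when sorted."""
    # Later duplicates overwrite earlier ones, leaving the last position per value.
    last = {v: i for i, v in enumerate(sorted(values), start)}
    return [last[v] for v in values]


ranks_b = rank_max_sketch((0, 8, 3, 8))
ranks_c = rank_max_sketch((0, 1, 0, 1))
```

For column `c`, the two zeros occupy ordinal positions 0 and 1, so both receive 1; the lowest rank `start` is never returned, as the `start` parameter description notes.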
Frame.rank_mean(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]

Rank values where tied values are assigned the mean of the ordinal ranks; ranks are potentially non-contiguous and non-unique floats.

Parameters:
  • axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)

  • skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.

  • fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.

Returns:

Frame

>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f.rank_mean()
<Frame: x>
<Index>    a         b         c         <<U1>
<Index>
0          3.0       0.0       0.5
1          1.0       2.5       2.5
2          2.0       1.0       0.5
3          0.0       2.5       2.5
<int64>    <float64> <float64> <float64>
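Mean-rank tie handling for a single column can be sketched in plain Python. This is an illustration of the semantics, not StaticFrame's implementation; `rank_mean_sketch` is a hypothetical helper and NA handling is omitted.

```python
from collections import defaultdict


def rank_mean_sketch(values, start=0):
    """Assign each value the mean of the ordinal positions it occupies when sorted."""
    positions = defaultdict(list)
    for i, v in enumerate(sorted(values), start):
        positions[v].append(i)
    return [sum(positions[v]) / len(positions[v]) for v in values]


ranks_b = rank_mean_sketch((0, 8, 3, 8))
ranks_c = rank_mean_sketch((0, 1, 0, 1))
```

The two eights in column `b` occupy ordinal positions 2 and 3, so both receive 2.5, which is why the result is a float.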
Frame.rank_min(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]

Rank values where tied values are assigned the minimum ordinal rank; ranks are potentially non-contiguous and non-unique integers.

Parameters:
  • axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)

  • skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.

  • fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.

Returns:

Frame

>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f.rank_min()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          3       0       0
1          1       2       2
2          2       1       0
3          0       2       2
<int64>    <int64> <int64> <int64>
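Minimum-rank tie handling for a single column can be sketched in plain Python. This is an illustration of the semantics, not StaticFrame's implementation; `rank_min_sketch` is a hypothetical helper, NA handling is omitted, and the quadratic lookup is chosen for brevity rather than speed.

```python
def rank_min_sketch(values, start=0):
    """Assign each value the lowest ordinal position it occupies when sorted."""
    ordered = sorted(values)
    # list.index returns the first (lowest) position of each value.
    return [ordered.index(v) + start for v in values]


ranks_b = rank_min_sketch((0, 8, 3, 8))
ranks_c = rank_min_sketch((0, 1, 0, 1))
```

For column `c`, the ones occupy ordinal positions 2 and 3 and both receive 2, leaving a gap at rank 1.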
Frame.rank_ordinal(*, axis=0, skipna=True, ascending=True, start=0, fill_value=nan)[source]

Rank values distinctly, where ties get distinct values that maintain their ordering, and ranks are contiguous unique integers.

Parameters:
  • axis – Integer specifying axis of ranking, where 0 ranks vertically (within each column) and 1 ranks horizontally (within each row)

  • skipna – If True, exclude NA values (NaN or None) from ranking, replacing those values with fill_value.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • start – The reference value for the lowest rank. Some ranking methodologies (mean, max) may not return this value given some inputs. The default is 0; for ranks that start from 1, provide a value of 1.

  • fill_value – A value to be used to fill NA values ignored in ranking when skipna is True. The default is np.nan but can be set to any value to force NA values to the “bottom” or “top” of a rank as needed.

Returns:

Frame

>>> f = sf.Frame.from_fields(((11, 4, 10, 2), (0, 8, 3, 8), (0, 1, 0, 1)), columns=('a', 'b', 'c'), name='x')
>>> f
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          11      0       0
1          4       8       1
2          10      3       0
3          2       8       1
<int64>    <int64> <int64> <int64>
>>> f.rank_ordinal()
<Frame: x>
<Index>    a       b       c       <<U1>
<Index>
0          3       0       0
1          1       2       2
2          2       1       1
3          0       3       3
<int64>    <int64> <int64> <int64>
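Ordinal ranking of a single column can be sketched in plain Python. This is an illustration of the semantics, not StaticFrame's implementation; `rank_ordinal_sketch` is a hypothetical helper and NA handling is omitted.

```python
def rank_ordinal_sketch(values, start=0):
    """Assign contiguous, unique ranks; ties are ordered by original position."""
    # sorted() is stable, so tied values keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start):
        ranks[i] = rank
    return ranks


ranks_b = rank_ordinal_sketch((0, 8, 3, 8))
ranks_c = rank_ordinal_sketch((0, 1, 0, 1))
```

The two eights in column `b` receive 2 and 3 in order of appearance, so every rank from `start` upward is used exactly once.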
Frame.rehierarch(index=None, columns=None, *, index_constructors=None, columns_constructors=None)[source]

Produce a new Frame with index and/or columns constructed with a transformed hierarchy.

Parameters:
  • index – Depth level specifier

  • columns – Depth level specifier

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.rehierarch((1, 0))
<Frame: x>
<Index>                  a       b      c               <<U1>
<IndexHierarchy>
p                0       10      False  1517-01-01
p                1       8       True   1517-12-31
q                0       2       True   1517-04-01
q                1       3       False  1517-06-30
<<U1>            <int64> <int64> <bool> <datetime64[D]>
Frame.reindex(index=None, columns=None, *, fill_value=nan, own_index=False, own_columns=False, check_equals=True)[source]

Return a new Frame with labels defined by the provided index. The size and ordering of the data is determined by the newly provided index, where data will continue to be aligned under labels found in both the new and the old index. Labels found only in the new index will be filled with fill_value.

Parameters:
  • index – An iterable of unique, hashable values, or another Index or IndexHierarchy, to be used as the labels of the index.

  • columns – An iterable of unique, hashable values, or another Index or IndexHierarchy, to be used as the labels of the columns.

  • fill_value – A value to be used to fill space created by a new index that has values not found in the previous index.

  • own_index – Flag the passed index as ownable by this static_frame.Frame. Primarily used by internal clients.

  • own_columns – Flag the passed columns as ownable by this static_frame.Frame. Primarily used by internal clients.

  • check_equals

>>> f = sf.Frame.from_items((('a', (10, 2, 8, 3)), ('b', ('qrs ', 'XYZ', '123', ' wX '))), index=('p', 'q', 'r', 's'), name='x')
>>> f
<Frame: x>
<Index>    a       b     <<U1>
<Index>
p          10      qrs
q          2       XYZ
r          8       123
s          3        wX
<<U1>      <int64> <<U4>
>>> f.reindex(('q', 't', 's', 'r'), fill_value=sf.FillValueAuto(i=-1, U=''))
<Frame: x>
<Index>    a       b     <<U1>
<Index>
q          2       XYZ
t          -1
s          3        wX
r          8       123
<<U1>      <int64> <<U4>
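The alignment rule described for reindexing can be sketched for one column in plain Python. This is an illustration only: `reindex_sketch` is a hypothetical helper, and it takes a single fill value per column, whereas `sf.FillValueAuto` in the example above selects a fill value by dtype kind.

```python
def reindex_sketch(data, new_index, fill_value=None):
    """Order values by new_index, filling labels absent from the old index."""
    return {label: data.get(label, fill_value) for label in new_index}


col_a = {"p": 10, "q": 2, "r": 8, "s": 3}
col_b = {"p": "qrs ", "q": "XYZ", "r": "123", "s": " wX "}
new_index = ("q", "t", "s", "r")
reindexed_a = reindex_sketch(col_a, new_index, fill_value=-1)
reindexed_b = reindex_sketch(col_b, new_index, fill_value="")
```

The label `p` is dropped because it is absent from the new index, and `t` is filled because it is absent from the old one.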
Frame.relabel(index=None, columns=None, *, index_constructor=None, columns_constructor=None)[source]

Return a new Frame with transformed labels on the index. The size and ordering of the data is never changed in a relabeling operation. The resulting index must be unique.

Parameters:
  • index – One of the following types, used to create new index labels with the same size as the previous index. (a) A mapping (as a dictionary or Series), used to lookup and transform the labels in the previous index. Labels not found in the mapping will be reused. (b) A function, returning a hashable, that is applied to each label in the previous index. (c) The IndexAutoFactory type, to apply auto-incremented integer labels. (d) An Index initializer, i.e., either an iterable of hashables or an Index instance.

  • columns – One of the following types, used to create new columns labels with the same size as the previous columns. (a) A mapping (as a dictionary or Series), used to lookup and transform the labels in the previous columns. Labels not found in the mapping will be reused. (b) A function, returning a hashable, that is applied to each label in the previous columns. (c) The IndexAutoFactory type, to apply auto-incremented integer labels. (d) An Index initializer, i.e., either an iterable of hashables or an Index instance.

>>> f = sf.Frame.from_records(((10, False, '1517-01-01'), (8, True,'1517-04-01')), index=('p', 'q'), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
p          10      False  1517-01-01
q          8       True   1517-04-01
<<U1>      <int64> <bool> <datetime64[D]>
>>> f.relabel(('y', 'z'))
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
y          10      False  1517-01-01
z          8       True   1517-04-01
<<U1>      <int64> <bool> <datetime64[D]>
>>> f.relabel(dict(q='x', p='y'))
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
y          10      False  1517-01-01
x          8       True   1517-04-01
<<U1>      <int64> <bool> <datetime64[D]>
>>> f.relabel(lambda l: f'+{l.upper()}+')
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
+P+        10      False  1517-01-01
+Q+        8       True   1517-04-01
<<U3>      <int64> <bool> <datetime64[D]>
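Three of the relabeling modes listed above (mapping, function, and iterable) can be sketched in plain Python on a list of labels. This is an illustration only: `relabel_sketch` is a hypothetical helper, it does not cover `IndexAutoFactory` or Series mappings, and raising `ValueError` for invalid input is an assumption.

```python
def relabel_sketch(labels, relabeler):
    """Return new labels from a mapping, a function, or an iterable of labels."""
    if isinstance(relabeler, dict):
        # Labels missing from the mapping are reused unchanged.
        new = [relabeler.get(label, label) for label in labels]
    elif callable(relabeler):
        new = [relabeler(label) for label in labels]
    else:
        new = list(relabeler)
        if len(new) != len(labels):
            raise ValueError("new labels must match the size of the previous index")
    if len(set(new)) != len(new):
        raise ValueError("resulting labels must be unique")
    return new


labels = ["p", "q"]
by_iterable = relabel_sketch(labels, ("y", "z"))
by_mapping = relabel_sketch(labels, dict(q="x", p="y"))
by_function = relabel_sketch(labels, lambda l: f"+{l.upper()}+")
```

In every mode the number and order of labels is preserved, which matches the statement that the size and ordering of the data never change.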
Frame.relabel_flat(index=False, columns=False)[source]

Return a new Frame, where an IndexHierarchy (if defined) is replaced with a flat, one-dimension index of tuples.

Parameters:
  • index – Boolean to flag flattening on the index.

  • columns – Boolean to flag flattening on the columns.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.relabel_flat(index=True)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
(0, 'p')   10      False  1517-01-01
(0, 'q')   2       True   1517-04-01
(1, 'p')   8       True   1517-12-31
(1, 'q')   3       False  1517-06-30
<object>   <int64> <bool> <datetime64[D]>
Frame.relabel_level_add(index=None, columns=None, *, index_constructor=None, columns_constructor=None)[source]

Return a new Frame, adding a new root level to an existing IndexHierarchy, or creating an IndexHierarchy if one is not yet defined.

Parameters:
  • index – A hashable value to be used as a new root level, extending or creating an IndexHierarchy

  • columns – A hashable value to be used as a new root level, extending or creating an IndexHierarchy

  • *

  • index_constructor

  • columns_constructor

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.relabel_level_add('I')
<Frame: x>
<Index>                        a       b      c               <<U1>
<IndexHierarchy>
I                0       p     10      False  1517-01-01
I                0       q     2       True   1517-04-01
I                1       p     8       True   1517-12-31
I                1       q     3       False  1517-06-30
<<U1>            <int64> <<U1> <int64> <bool> <datetime64[D]>
Frame.relabel_level_drop(index=0, columns=0)[source]

Return a new Frame, dropping one or more levels from either the root or the leaves of an IndexHierarchy. The resulting index must be unique.

Parameters:
  • index – A positive integer drops that many outer-most (root) levels; a negative integer drops that many inner-most (leaf) levels. Default is zero.

  • columns – A positive integer drops that many outer-most (root) levels; a negative integer drops that many inner-most (leaf) levels. Default is zero.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.iloc[:2].relabel_level_drop(1)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
p          10      False  1517-01-01
q          2       True   1517-04-01
<<U1>      <int64> <bool> <datetime64[D]>
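The sign convention and the uniqueness requirement can be sketched in plain Python on hierarchical labels represented as tuples. This is an illustration only: `level_drop_sketch` is a hypothetical helper, and raising `ValueError` on duplicate labels is an assumption about how the uniqueness requirement surfaces.

```python
def level_drop_sketch(labels, count):
    """Drop `count` root levels (positive) or leaf levels (negative) from tuple labels."""
    if count == 0:
        return list(labels)
    if count > 0:
        trimmed = [label[count:] for label in labels]
    else:
        trimmed = [label[:count] for label in labels]
    # A single remaining level becomes a flat label rather than a 1-tuple.
    flat = [t[0] if len(t) == 1 else t for t in trimmed]
    if len(set(flat)) != len(flat):
        raise ValueError("resulting labels must be unique")
    return flat


full = [(0, "p"), (0, "q"), (1, "p"), (1, "q")]
dropped = level_drop_sketch(full[:2], 1)
```

Dropping the root level of all four labels would leave `p`, `q`, `p`, `q`, which is not unique; this is why the example above first selects two rows with `iloc[:2]`.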
Frame.relabel_shift_in(key, *, axis=0, index_constructors=None)[source]

Create, or augment, an IndexHierarchy by providing one or more selections from the Frame (via axis-appropriate loc selections) to move into the Index.

Parameters:
  • key – a loc-style selection on the opposite axis.

  • axis – 0 modifies the index by selecting columns with key; 1 modifies the columns by selecting rows with key.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.relabel_shift_in('a')
<Frame: x>
<Index>                                            b      c               <<U1>
<IndexHierarchy: ('__index0__', '...
0                                    p     10      False  1517-01-01
0                                    q     2       True   1517-04-01
1                                    p     8       True   1517-12-31
1                                    q     3       False  1517-06-30
<int64>                              <<U1> <int64> <bool> <datetime64[D]>
Frame.relabel_shift_out(depth_level, *, axis=0)[source]

Shift values from an index on an axis to the Frame by providing one or more depth level selections.

Parameters:
  • depth_level – an iloc-style selection on the Index of the specified axis.

  • axis – 0 modifies the index by selecting columns with depth_level; 1 modifies the columns by selecting rows with depth_level.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.rename(index=('d', 'e')).relabel_shift_out([1, 0])
<Frame: x>
<Index>    e     d       a       b      c               <<U1>
<Index>
0          p     0       10      False  1517-01-01
1          q     0       2       True   1517-04-01
2          p     1       8       True   1517-12-31
3          q     1       3       False  1517-06-30
<int64>    <<U1> <int64> <int64> <bool> <datetime64[D]>
Frame.rename(name=<object object>, *, index=<object object>, columns=<object object>)[source]

Return a new Frame with an updated name attribute. Optionally update the name attribute of index and columns.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.rename('y', index='p', columns='q')
<Frame: y>
<Index: q>                a       b      c               <<U1>
<IndexHierarchy: p>
0                   p     10      False  1517-01-01
0                   q     2       True   1517-04-01
1                   p     8       True   1517-12-31
1                   q     3       False  1517-06-30
<int64>             <<U1> <int64> <bool> <datetime64[D]>
Frame.roll(index=0, columns=0, *, include_index=False, include_columns=False)[source]

Roll columns and/or rows by positive or negative integer counts, where columns and/or rows roll around the axis.

Parameters:
  • include_index – Determine if index is included in index-wise rotation.

  • include_columns – Determine if the columns index is included in column-wise rotation.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.roll(3)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          2       True   1517-04-01
1          8       True   1517-12-31
2          3       False  1517-06-30
3          10      False  1517-01-01
<int64>    <int64> <bool> <datetime64[D]>
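The wrap-around behaviour can be sketched in plain Python on a list of row positions. This is an illustration only; `roll_sketch` is a hypothetical helper that rolls values while labels stay in place, corresponding to the default `include_index=False`.

```python
def roll_sketch(values, shift):
    """Rotate a list by `shift` positions; items pushed off one end wrap to the other."""
    if not values:
        return []
    k = shift % len(values)
    return values[-k:] + values[:-k] if k else list(values)


rolled = roll_sketch([0, 1, 2, 3], 3)
```

Here `rolled` lists the original row positions now found under labels 0 through 3, matching the example above where label 0 holds the values of the original row 1. For four rows, a shift of -1 gives the same result as a shift of 3.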
Frame.sample(index=None, columns=None, *, seed=None)[source]

Randomly (optionally made deterministic with a fixed seed) extract items from the container to return a subset of the container.

Parameters:
  • index – Number of labels to select from the index.

  • columns – Number of labels to select from the columns.

  • seed – Initial state of random selection.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.sample(2, 2, seed=0)
<Frame: x>
<Index>    b      c               <<U1>
<Index>
2          True   1517-12-31
3          False  1517-06-30
<int64>    <bool> <datetime64[D]>
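The effect of a fixed seed can be illustrated without StaticFrame. The sketch below uses Python's random module rather than the library's own sampler; the helper name sample_labels is hypothetical, and returning the chosen labels in their original order is an assumption of this sketch, not a documented guarantee.

```python
import random

def sample_labels(labels, count, seed=None):
    """Pick `count` distinct labels; a fixed seed makes the pick repeatable."""
    rng = random.Random(seed)  # private generator, global random state untouched
    chosen = set(rng.sample(range(len(labels)), count))
    # Keep the chosen labels in their original order (an assumption of this sketch).
    return [label for pos, label in enumerate(labels) if pos in chosen]

index = [0, 1, 2, 3]
columns = ['a', 'b', 'c']

first = (sample_labels(index, 2, seed=0), sample_labels(columns, 2, seed=0))
second = (sample_labels(index, 2, seed=0), sample_labels(columns, 2, seed=0))
print(first == second)  # True: the same seed gives the same selection
```

With seed=None the selection will generally differ from one call to the next.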
Frame.set_columns(index, *, drop=False, columns_constructor=None)[source]

Return a new Frame produced by setting the given row as the columns, optionally removing that row from the new Frame.

Parameters:
  • index

  • drop

  • columns_constructor

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.set_columns((1, 'p'), drop=True)
<Frame: x>
<Index: (1, 'p')>       8       True   1517-12-31      <object>
<IndexHierarchy>
0                 p     10      False  1517-01-01
0                 q     2       True   1517-04-01
1                 q     3       False  1517-06-30
<int64>           <<U1> <int64> <bool> <datetime64[D]>
Frame.set_columns_hierarchy(index, *, drop=False, columns_constructors=None, reorder_for_hierarchy=False)[source]

Given an iterable of index labels, return a new Frame with those rows as an IndexHierarchy on the columns.

Parameters:
  • index – Iterable of index labels.

  • drop – Boolean to determine if selected rows should be removed from the data.

  • columns_constructors – Optionally provide a sequence of Index constructors, of length equal to depth, to be used in converting row Index components in the IndexHierarchy.

  • reorder_for_hierarchy – Reorder the columns to produce a hierarchable Index from the selected rows.

Returns:

Frame

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.set_columns_hierarchy([(1, 'p'), (1, 'q')], drop=True)
<Frame: x>
<IndexHierarchy: ((1, 'p'), (1, '...       8       True   1517-12-31      <object>
                                           3       False  1517-06-30      <object>
<IndexHierarchy>
0                                    p     10      False  1517-01-01
0                                    q     2       True   1517-04-01
<int64>                              <<U1> <int64> <bool> <datetime64[D]>
Frame.set_index(column, *, drop=False, index_constructor=None)[source]

Return a new Frame produced by setting the given column as the index, optionally removing that column from the new Frame.

Parameters:
  • column

  • drop

  • index_constructor

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.set_index('c', drop=True, index_constructor=sf.IndexDate)
<Frame: x>
<Index>         a       b      <<U1>
<IndexDate: c>
1517-01-01      10      False
1517-04-01      2       True
1517-12-31      8       True
1517-06-30      3       False
<datetime64[D]> <int64> <bool>
Frame.set_index_hierarchy(columns, *, drop=False, index_constructors=None, reorder_for_hierarchy=False)[source]

Given an iterable of column labels, return a new Frame with those columns as an IndexHierarchy on the index.

Parameters:
  • columns – Iterable of column labels.

  • drop – Boolean to determine if selected columns should be removed from the data.

  • index_constructors – Optionally provide a sequence of Index constructors, of length equal to depth, to be used in converting columns Index components in the IndexHierarchy.

  • reorder_for_hierarchy – Reorder the rows to produce a hierarchable Index from the selected columns, assuming hierarchability is possible.

Returns:

Frame

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.set_index_hierarchy(['b', 'c'], drop=True, index_constructors=(sf.Index, sf.IndexDate))
<Frame: x>
<Index>                                      a       <<U1>
<IndexHierarchy: ('b', 'c')>
False                        1517-01-01      10
True                         1517-04-01      2
True                         1517-12-31      8
False                        1517-06-30      3
<bool>                       <datetime64[D]> <int64>
Frame.shift(index=0, columns=0, *, fill_value=nan)[source]

Shift columns and/or rows by positive or negative integer counts, where columns and/or rows fall off the axis and introduce missing values, filled by fill_value.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.shift(3, fill_value=sf.FillValueAuto)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          0       False  NaT
1          0       False  NaT
2          0       False  NaT
3          10      False  1517-01-01
<int64>    <int64> <bool> <datetime64[D]>
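The difference between rolling and shifting can be sketched on a plain list standing in for one column. The helpers roll and shift below are hypothetical illustrations of the two behaviours, not StaticFrame's implementation.

```python
def roll(values, count):
    """Rotate values by count positions; items leaving one end re-enter at the other."""
    n = len(values)
    if n == 0:
        return []
    k = count % n  # normalise negative and oversized counts
    return values[-k:] + values[:-k] if k else list(values)

def shift(values, count, fill):
    """Move values by count positions; items leaving the list are lost, gaps get fill."""
    n = len(values)
    if count >= 0:
        k = min(count, n)
        return [fill] * k + values[:n - k]
    k = min(-count, n)
    return values[k:] + [fill] * k

column_a = [10, 2, 8, 3]
print(roll(column_a, 3))           # [2, 8, 3, 10]
print(shift(column_a, 3, fill=0))  # [0, 0, 0, 10]
```

Both results match column a in the roll(3) and shift(3, fill_value=sf.FillValueAuto) examples: rolling keeps every value, while shifting discards three values and fills the vacated positions.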
Frame.sort_columns(*, ascending=True, kind='mergesort', key=None)[source]

Return a new Frame ordered by the sorted columns.

Parameters:
  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • kind – Name of the sort algorithm as passed to NumPy.

  • key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.sort_columns(ascending=False)
<Frame: x>
<Index>                c               b      a       <<U1>
<IndexHierarchy>
0                p     1517-01-01      False  10
0                q     1517-04-01      True   2
1                p     1517-12-31      True   8
1                q     1517-06-30      False  3
<int64>          <<U1> <datetime64[D]> <bool> <int64>
Frame.sort_index(*, ascending=True, kind='mergesort', key=None)[source]

Return a new Frame ordered by the sorted Index.

Parameters:
  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • kind – Name of the sort algorithm as passed to NumPy.

  • key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.sort_index(ascending=False)
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
1                q     3       False  1517-06-30
1                p     8       True   1517-12-31
0                q     2       True   1517-04-01
0                p     10      False  1517-01-01
<int64>          <<U1> <int64> <bool> <datetime64[D]>
Frame.sort_values(label, *, ascending=True, axis=1, kind='mergesort', key=None)[source]

Return a new Frame ordered by the sorted values, where values are given by a single column or an iterable of columns.

Parameters:
  • label – A label or iterable of labels to select the columns (for axis 1) or rows (for axis 0) to sort.

  • ascending – Boolean, or iterable of Booleans; if True, the lowest ranks correspond to the lowest values; if an iterable, apply per column or row. The default is True.

  • axis – Axis upon which to sort; 0 orders columns based on one or more rows; 1 orders rows based on one or more columns.

  • kind – Name of the sort algorithm as passed to NumPy.

  • key – A function that is used to pre-process the selected columns or rows and derive new values to sort by.

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.sort_values('c')
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
3          3       False  1517-06-30
2          8       True   1517-12-31
<int64>    <int64> <bool> <datetime64[D]>
>>> f.sort_values(['c', 'b'], ascending=False)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
2          8       True   1517-12-31
3          3       False  1517-06-30
1          2       True   1517-04-01
0          10      False  1517-01-01
<int64>    <int64> <bool> <datetime64[D]>
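Sorting rows by several columns can be sketched with the built-in sorted function. This is a plain-Python analogy, not the library's implementation: rows are tuples of (label, a, b, c), dates are ISO strings so that they compare chronologically, and a single Boolean direction is applied to all keys.

```python
rows = [
    (0, 10, False, '1517-01-01'),
    (1, 2, True, '1517-04-01'),
    (2, 8, True, '1517-12-31'),
    (3, 3, False, '1517-06-30'),
]

# Order by column c first, then column b, both descending.
# sorted is stable, as is the default 'mergesort' kind.
ordered = sorted(rows, key=lambda row: (row[3], row[2]), reverse=True)

print([row[0] for row in ordered])  # [2, 3, 1, 0]
```

The resulting label order matches the index of the sort_values(['c', 'b'], ascending=False) example.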
Frame.std(axis=0, skipna=True, ddof=0, out=None)

Return the standard deviation along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.std()
<Series>
<Index>
a        1.632993161855452
b        1.632993161855452
<<U1>    <float64>
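The default ddof=0 divides by N rather than N - 1, which is the population standard deviation. As a cross-check outside StaticFrame, the sketch below reproduces the value for column a with Python's statistics module and compares it with the sample form that corresponds to ddof=1.

```python
import math
import statistics

column_a = [0, 2, 4]

population = statistics.pstdev(column_a)  # divisor N, as with ddof=0
sample = statistics.stdev(column_a)       # divisor N - 1, as with ddof=1

print(math.isclose(population, math.sqrt(8 / 3)))  # True, about 1.633
print(math.isclose(sample, 2.0))                   # True
```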
Frame.sum(axis=0, skipna=True, allna=0, out=None)

Sum values along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.sum()
<Series>
<Index>
a        6
b        9
<<U1>    <int64>
Frame.tail(count=5)[source]

Return a Frame consisting only of the bottom elements as specified by count.

Parameters:

count – Number of elements to be returned from the bottom of the Frame

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
0          10      False  1517-01-01
1          2       True   1517-04-01
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
>>> f.tail(2)
<Frame: x>
<Index>    a       b      c               <<U1>
<Index>
2          8       True   1517-12-31
3          3       False  1517-06-30
<int64>    <int64> <bool> <datetime64[D]>
Frame.transpose()[source]

Transpose. Return a Frame with index as columns and vice versa.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.transpose()
<Frame: x>
<Index>    p       q       r       <<U1>
<Index>
a          0       2       4
b          1       3       5
<<U1>      <int64> <int64> <int64>
Frame.unique(*, axis=None)[source]

Return a NumPy array of unique values. If the axis argument is provided, uniqueness is determined by column or row.

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
2          nan       None     NaT
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
>>> f.unique()
[10.0 False datetime.date(1517, 1, 1) 2.0 True datetime.date(1517, 4, 1)
 nan None]
Frame.unique_enumerated(*, retain_order=False, func=None)[source]

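Return a tuple of two arrays: an integer array, with the same shape as the Frame, that enumerates each value, and an array of the unique values those integers refer to. As the example below shows, elements for which func returns True are assigned -1 and are omitted from the unique values.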

>>> f = sf.Frame.from_fields(((10, 2, np.nan, 2), (False, True, None, True), ('1517-01-01', '1517-04-01', 'NaT', '1517-04-01')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>    a         b        c               <<U1>
<Index>
0          10.0      False    1517-01-01
1          2.0       True     1517-04-01
2          nan       None     NaT
3          2.0       True     1517-04-01
<int64>    <float64> <object> <datetime64[D]>
>>> f.unique_enumerated(retain_order=True, func=sf.isna_element)
(array([[ 0,  2,  4],
       [ 1,  3,  5],
       [-1, -1, -1],
       [ 1,  3,  5]]), array([10.0, 2.0, False, True, datetime.date(1517, 1, 1),
       datetime.date(1517, 4, 1)], dtype=object))
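The pairing of integer codes and unique values can be sketched in plain Python. The helpers enumerate_unique and is_missing are hypothetical and only mirror what the example output above suggests (first-seen order, -1 for excluded elements); they are not the library's implementation, and codes are listed per column here rather than as a 2D array.

```python
import math

def is_missing(value):
    """True for None and float NaN, the elements to exclude."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def enumerate_unique(columns, exclude=None):
    """Return (codes, uniques): one integer code per element, in first-seen order."""
    uniques = []
    positions = {}  # maps each unique value to its code
    codes = []
    for column in columns:
        column_codes = []
        for value in column:
            if exclude is not None and exclude(value):
                column_codes.append(-1)  # excluded elements are coded -1
                continue
            if value not in positions:
                positions[value] = len(uniques)
                uniques.append(value)
            column_codes.append(positions[value])
        codes.append(column_codes)
    return codes, uniques

columns = [
    [10.0, 2.0, float('nan'), 2.0],
    ['x', 'y', None, 'y'],
]
codes, uniques = enumerate_unique(columns, exclude=is_missing)
print(codes)    # [[0, 1, -1, 1], [2, 3, -1, 3]]
print(uniques)  # [10.0, 2.0, 'x', 'y']
```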
Frame.unset_columns(*, names=(), drop=False, index_constructors=None)[source]

Return a new Frame where columns are added to the top of the data, and an IndexAutoFactory is used to populate new columns. This operation potentially forces a complete copy of all data.

Parameters:
  • names – A sequence of hashables to be used to name the unset columns. If an Index, a single hashable should be provided; if an IndexHierarchy, as many hashables as the depth must be provided.

  • index_constructors

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.rename(columns='o').unset_columns()
<Frame: x>
<Index>    0        1        <int64>
<Index>
o          a        b
p          0        1
q          2        3
r          4        5
<<U1>      <object> <object>
Frame.unset_index(*, names=(), drop=False, consolidate_blocks=False, columns_constructors=None)[source]

Return a new Frame where the index is added to the front of the data, and an IndexAutoFactory is used to populate a new index. If the Index has a name, that name will be used for the column name, otherwise a suitable default will be used. As underlying NumPy arrays are immutable, data is not copied.

Parameters:
  • names – An iterable of hashables to be used to name the unset index. If an Index, a single hashable should be provided; if an IndexHierarchy, as many hashables as the depth must be provided.

  • consolidate_blocks

  • columns_constructors

>>> f = sf.Frame.from_fields(((10, 2, 8, 3), (False, True, True, False), ('1517-01-01', '1517-04-01', '1517-12-31', '1517-06-30')), index=sf.IndexHierarchy.from_product((0, 1), ('p', 'q')), columns=('a', 'b', 'c'), dtypes=dict(c=np.datetime64), name='x')
>>> f
<Frame: x>
<Index>                a       b      c               <<U1>
<IndexHierarchy>
0                p     10      False  1517-01-01
0                q     2       True   1517-04-01
1                p     8       True   1517-12-31
1                q     3       False  1517-06-30
<int64>          <<U1> <int64> <bool> <datetime64[D]>
>>> f.rename(index=(('d', 'e'))).unset_index()
<Frame: x>
<Index>    d       e     a       b      c               <<U1>
<Index>
0          0       p     10      False  1517-01-01
1          0       q     2       True   1517-04-01
2          1       p     8       True   1517-12-31
3          1       q     3       False  1517-06-30
<int64>    <int64> <<U1> <int64> <bool> <datetime64[D]>
Frame.var(axis=0, skipna=True, ddof=0, out=None)

Return the variance along the specified axis.

Parameters:
  • axis – Axis, defaulting to axis 0.

  • skipna – Skip missing (NaN) values, defaulting to True.

>>> f = sf.Frame(np.arange(6).reshape(3,2), index=('p', 'q', 'r'), columns=('a', 'b'), name='x')
>>> f
<Frame: x>
<Index>    a       b       <<U1>
<Index>
p          0       1
q          2       3
r          4       5
<<U1>      <int64> <int64>
>>> f.var()
<Series>
<Index>
a        2.6666666666666665
b        2.6666666666666665
<<U1>    <float64>
