meerkat.sample
- sample(data: Union[meerkat.dataframe.DataFrame, meerkat.columns.abstract.Column], n: int = None, frac: float = None, replace: bool = False, weights: Union[str, numpy.ndarray] = None, random_state: Union[int, numpy.random.mtrand.RandomState] = None) → Union[meerkat.dataframe.DataFrame, meerkat.columns.abstract.Column]
Select a random sample of rows from DataFrame or Column. Roughly equivalent to sample in Pandas (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sample.html).
- Parameters
data (Union[DataFrame, AbstractColumn]) – DataFrame or Column to sample from.
n (int) – Number of samples to draw. If frac is specified, this parameter should not be passed. Defaults to 1 if frac is not passed.
frac (float) – Fraction of rows to sample. If n is specified, this parameter should not be passed.
replace (bool) – Sample with or without replacement. Defaults to False.
weights (Union[str, np.ndarray]) – Weights to use for sampling. If None (default), the rows will be sampled uniformly. If a numpy array, the sample will be weighted accordingly. If a string and data is a DataFrame, the values in the column with that name will be used as the weights. If weights do not sum to 1 they will be normalized to sum to 1.
random_state (Union[int, np.random.RandomState]) – Random state or seed to use for sampling.
- Returns
A random sample of rows from DataFrame or Column.
- Return type
Union[DataFrame, AbstractColumn]
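The parameter rules above (n and frac being mutually exclusive, n defaulting to 1, and weights being normalized to sum to 1) can be illustrated with a small standalone sketch. This is not meerkat's implementation: sample_indices is a hypothetical helper that uses only the Python standard library and returns row indices rather than a DataFrame or Column. Rounding frac * n_rows with round() and raising ValueError when both n and frac are passed are assumptions made for this illustration.

```python
import random
from typing import List, Optional, Sequence


def sample_indices(
    n_rows: int,
    n: Optional[int] = None,
    frac: Optional[float] = None,
    replace: bool = False,
    weights: Optional[Sequence[float]] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: pick row indices following the documented rules."""
    if n is not None and frac is not None:
        # Assumption: passing both is treated as an error.
        raise ValueError("Pass either n or frac, not both.")
    if frac is not None:
        n = round(frac * n_rows)  # assumption: round to the nearest integer
    elif n is None:
        n = 1  # documented default when frac is not passed
    if not replace and n > n_rows:
        raise ValueError("Cannot draw more rows than exist without replacement.")

    rng = random.Random(seed)

    # Uniform weights when none are given; otherwise normalize to sum to 1.
    if weights is None:
        probs = [1.0] * n_rows
    else:
        probs = list(weights)
    total = sum(probs)
    probs = [w / total for w in probs]

    if replace:
        return rng.choices(range(n_rows), weights=probs, k=n)

    # Without replacement: draw one index at a time and remove it from the pool.
    remaining = list(range(n_rows))
    remaining_probs = list(probs)
    chosen = []
    for _ in range(n):
        pos = rng.choices(range(len(remaining)), weights=remaining_probs, k=1)[0]
        chosen.append(remaining.pop(pos))
        remaining_probs.pop(pos)
    return chosen


# Half of ten rows, reproducible through the seed.
print(sample_indices(10, frac=0.5, seed=0))
# Weighted draw: only the last row has non-zero weight.
print(sample_indices(3, n=1, weights=[0, 0, 3], seed=0))
```

Passing the same seed twice yields the same indices, which mirrors the role random_state plays in the signature above.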