meerkat.merge
- merge(left: meerkat.dataframe.DataFrame, right: meerkat.dataframe.DataFrame, how: str = 'inner', on: Union[str, List[str]] = None, left_on: Union[str, List[str]] = None, right_on: Union[str, List[str]] = None, sort: bool = False, suffixes: Sequence[str] = ('_x', '_y'), validate=None) → meerkat.dataframe.DataFrame
Perform a database-style join operation between two DataFrames.
- Parameters
left (DataFrame) – Left DataFrame.
right (DataFrame) – Right DataFrame.
how (str, optional) – The join type. Defaults to “inner”.
on (Union[str, List[str]], optional) – The column(s) to join on. These columns must be ScalarColumn. Defaults to None, in which case the left_on and right_on parameters must be passed.
left_on (Union[str, List[str]], optional) – The column(s) in the left DataFrame to join on. These columns must be ScalarColumn. Defaults to None.
right_on (Union[str, List[str]], optional) – The column(s) in the right DataFrame to join on. These columns must be ScalarColumn. Defaults to None.
sort (bool, optional) – Whether to sort the result DataFrame by the join key(s). Defaults to False.
suffixes (Sequence[str], optional) – Suffixes to use when there are conflicting column names in the result DataFrame. Should be a sequence of length two, with suffixes[0] the suffix for the column from the left DataFrame and suffixes[1] the suffix for the column from the right. Defaults to (“_x”, “_y”).
validate (str, optional) – The check to perform on the result DataFrame. Defaults to None, in which case no check is performed. Valid options are:
“one_to_one” or “1:1”: check if merge keys are unique in both left and right datasets.
“one_to_many” or “1:m”: check if merge keys are unique in left dataset.
“many_to_one” or “m:1”: check if merge keys are unique in right dataset.
“many_to_many” or “m:m”: allowed, but does not result in checks.
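The validate options above amount to uniqueness checks on the join keys. The following is a minimal pure-Python sketch of that logic, not meerkat's implementation: the helper name check_merge_keys is hypothetical, the keys are passed as plain sequences rather than columns, and raising ValueError on a failed check is an assumption.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def check_merge_keys(
    left_keys: Sequence[Hashable],
    right_keys: Sequence[Hashable],
    validate: Optional[str] = None,
) -> None:
    """Hypothetical helper: raise ValueError if the keys violate `validate`."""
    # None and many-to-many both mean that no check is performed.
    if validate is None or validate in ("many_to_many", "m:m"):
        return

    def is_unique(keys: Sequence[Hashable]) -> bool:
        # Keys are unique when every distinct value occurs exactly once.
        return all(count == 1 for count in Counter(keys).values())

    if validate in ("one_to_one", "1:1"):
        ok = is_unique(left_keys) and is_unique(right_keys)
    elif validate in ("one_to_many", "1:m"):
        ok = is_unique(left_keys)
    elif validate in ("many_to_one", "m:1"):
        ok = is_unique(right_keys)
    else:
        raise ValueError(f"Unknown validate option: {validate!r}")

    if not ok:
        # Assumption: the error type and message are illustrative only.
        raise ValueError(f"Merge keys do not satisfy {validate!r}")


# Left keys are unique, right keys repeat: a valid one-to-many relationship.
check_merge_keys([1, 2, 3], [1, 1, 2], validate="1:m")
```

With the same inputs, validate="1:1" would fail in this sketch because the right keys contain a duplicate.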
- Returns
The merged DataFrame.
- Return type