meerkat.merge

merge(left: meerkat.dataframe.DataFrame, right: meerkat.dataframe.DataFrame, how: str = 'inner', on: Union[str, List[str]] = None, left_on: Union[str, List[str]] = None, right_on: Union[str, List[str]] = None, sort: bool = False, suffixes: Sequence[str] = ('_x', '_y'), validate=None) → meerkat.dataframe.DataFrame

Perform a database-style join operation between two DataFrames.

Parameters

left (DataFrame) – Left DataFrame.
right (DataFrame) – Right DataFrame.
how (str, optional) – The join type. Defaults to “inner”.
on (Union[str, List[str]], optional) – The column(s) to join on. These columns must be ScalarColumn. Defaults to None, in which case the left_on and right_on parameters must be passed.
left_on (Union[str, List[str]], optional) – The column(s) in the left DataFrame to join on. These columns must be ScalarColumn. Defaults to None.
right_on (Union[str, List[str]], optional) – The column(s) in the right DataFrame to join on. These columns must be ScalarColumn. Defaults to None.
sort (bool, optional) – Whether to sort the result DataFrame by the join key(s). Defaults to False.
suffixes (Sequence[str], optional) – Suffixes to use when there are conflicting column names in the result DataFrame. Should be a sequence of length two, where suffixes[0] is the suffix for the column from the left DataFrame and suffixes[1] is the suffix for the column from the right DataFrame. Defaults to (“_x”, “_y”).
validate (str, optional) – The check to perform on the merge keys. Defaults to None, in which case no check is performed. Valid options are:
- “one_to_one” or “1:1”: check if merge keys are unique in both left and right datasets.
- “one_to_many” or “1:m”: check if merge keys are unique in left dataset.
- “many_to_one” or “m:1”: check if merge keys are unique in right dataset.
- “many_to_many” or “m:m”: allowed, but does not result in checks.
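
The validate options above amount to key-uniqueness checks on one or both inputs. The following is a minimal plain-Python sketch of that logic over lists of dict rows. It is not meerkat's implementation: the helpers keys_are_unique and check_merge_keys are hypothetical names introduced for illustration, and raising ValueError on failure is an assumption.

```python
from collections import Counter


def keys_are_unique(rows, on):
    """Return True if no combination of values in the `on` columns repeats."""
    counts = Counter(tuple(row[col] for col in on) for row in rows)
    return all(n == 1 for n in counts.values())


def check_merge_keys(left_rows, right_rows, on, validate=None):
    """Hypothetical helper: raise ValueError if the requested check fails."""
    if validate is None or validate in ("many_to_many", "m:m"):
        return  # no uniqueness requirement on either side
    need_left = validate in ("one_to_one", "1:1", "one_to_many", "1:m")
    need_right = validate in ("one_to_one", "1:1", "many_to_one", "m:1")
    if not (need_left or need_right):
        raise ValueError(f"unknown validate option: {validate!r}")
    if need_left and not keys_are_unique(left_rows, on):
        raise ValueError("merge keys are not unique in the left dataset")
    if need_right and not keys_are_unique(right_rows, on):
        raise ValueError("merge keys are not unique in the right dataset")


users = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [{"id": 1, "item": "x"}, {"id": 1, "item": "y"}]

# Keys are unique on the left only, so "1:m" passes and "1:1" fails.
check_merge_keys(users, orders, on=["id"], validate="1:m")
try:
    check_merge_keys(users, orders, on=["id"], validate="1:1")
    one_to_one_ok = True
except ValueError:
    one_to_one_ok = False
```

Here the orders rows repeat id 1, so any option that requires unique keys in the right dataset is rejected.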

Returns

The merged DataFrame.

Return type

DataFrame
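
To make the roles of on and suffixes concrete, here is a small plain-Python sketch of an inner join over lists of dict rows. It illustrates general database-style join semantics only and does not call or reproduce meerkat's code: inner_join_rows is a hypothetical helper, it handles a single key column, and applying suffixes only to conflicting non-key columns is an assumption based on the parameter description above.

```python
def inner_join_rows(left_rows, right_rows, on, suffixes=("_x", "_y")):
    """Hypothetical helper: inner-join two lists of dict rows on one key column."""
    # Non-key column names present on both sides get a suffix.
    left_cols = {c for row in left_rows for c in row} - {on}
    right_cols = {c for row in right_rows for c in row} - {on}
    conflicts = left_cols & right_cols

    # Index the right rows by key value for fast lookup.
    by_key = {}
    for row in right_rows:
        by_key.setdefault(row[on], []).append(row)

    merged = []
    for lrow in left_rows:
        # Inner join: left rows without a matching key are dropped.
        for rrow in by_key.get(lrow[on], []):
            out = {on: lrow[on]}
            for col, val in lrow.items():
                if col != on:
                    out[col + suffixes[0] if col in conflicts else col] = val
            for col, val in rrow.items():
                if col != on:
                    out[col + suffixes[1] if col in conflicts else col] = val
            merged.append(out)
    return merged


left = [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.4}]
right = [
    {"id": 1, "score": 0.7, "label": "cat"},
    {"id": 3, "score": 0.1, "label": "dog"},
]

result = inner_join_rows(left, right, on="id")
# result == [{"id": 1, "score_x": 0.9, "score_y": 0.7, "label": "cat"}]
```

Only id 1 appears on both sides, so the result has a single row, and the shared score column is split into score_x and score_y.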

🔮 v0.4.11
