DataFrame

A DataFrame is a collection of equal-length columns (analogous to a DataFrame in Pandas or a data frame in R). DataFrames in Meerkat are used to manage datasets and per-example artifacts (e.g. model predictions and embeddings).

Below we combine the columns we created above into a single DataFrame and add a column containing labels for the images. Note that we can pass non-Meerkat data structures like list, np.ndarray, pd.Series, and torch.Tensor directly to the DataFrame constructor, and Meerkat will infer the column type. We do not need to convert them to Meerkat columns first.

import meerkat as mk

# img_col and id_col are the columns created earlier in this guide.
df = mk.DataFrame(
    {
        "img": img_col,
        "label": ["boombox", "truck", "dog"],  # plain list; Meerkat infers the column type
        "id": id_col,
    }
)
df
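
To make the "equal-length columns" idea concrete, here is a minimal pure-Python sketch. It is not Meerkat's implementation: `ToyFrame`, its `row` method, and the `id` values are hypothetical names and data invented for illustration. It shows only the invariant that every column shares one length, so that position `i` in each column describes the same example.

```python
from typing import Any, Dict, Sequence


class ToyFrame:
    """Hypothetical stand-in illustrating the equal-length rule; not Meerkat code."""

    def __init__(self, columns: Dict[str, Sequence[Any]]) -> None:
        lengths = {name: len(values) for name, values in columns.items()}
        if len(set(lengths.values())) > 1:
            raise ValueError(f"columns must have equal length, got {lengths}")
        # Copy each column into a list so later changes to the inputs do not leak in.
        self._columns = {name: list(values) for name, values in columns.items()}

    def __len__(self) -> int:
        # Number of rows, i.e. the shared column length (0 for an empty frame).
        return len(next(iter(self._columns.values()), []))

    def row(self, index: int) -> Dict[str, Any]:
        # A row is one value from each column at the same position.
        return {name: values[index] for name, values in self._columns.items()}


# The id values below are made up for this sketch.
frame = ToyFrame({"label": ["boombox", "truck", "dog"], "id": (0, 1, 2)})
print(len(frame))    # 3
print(frame.row(1))  # {'label': 'truck', 'id': 1}
```

Passing columns of different lengths to this toy class raises a `ValueError`, which is the kind of check a column-oriented container has to make before rows can be addressed by position.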

Read on to learn how we access the data in DataFrames.