Unstructured Datasets meet Foundation Models.

Meerkat is an open-source Python library that helps technical teams interactively wrangle images, videos, text documents, and more using foundation models.

Our goal is to make foundation models a more reliable software abstraction for processing unstructured datasets. Read our blog post to learn more.

Install Meerkat

$ pip install meerkat-ml

Meerkat is a research project, so users should expect rapid updates and rough edges. The current API is subject to change.

Data Frames for Images

A Meerkat DataFrame is a heterogeneous data structure with an API backed by foundation models.
  • Structured fields (e.g. numbers and dates) live alongside unstructured objects (e.g. images), and their tensor representations (e.g. embeddings).
  • Functions like mk.embed abstract away boilerplate ML code, keeping the focus on the data.
import meerkat as mk 

df = mk.from_csv("paintings.csv")
df["img"] = mk.files("img_path")
df["embedding"] = mk.embed(
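
The idea of structured fields living next to unstructured objects and their embeddings can be illustrated without Meerkat at all. The sketch below is not Meerkat's implementation: `LazyFile`, `toy_embed`, `fake_files`, and the column values are hypothetical names and placeholder data invented for illustration, and the "embedding" is a toy byte histogram rather than a foundation model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LazyFile:
    """Hypothetical file-backed cell: keeps a path and loads bytes on demand."""
    path: str
    loader: Callable[[str], bytes]

    def load(self) -> bytes:
        return self.loader(self.path)


def toy_embed(data: bytes, dim: int = 4) -> List[float]:
    """Toy stand-in for an embedding model: a normalized byte histogram."""
    buckets = [0.0] * dim
    for b in data:
        buckets[b % dim] += 1.0
    total = sum(buckets) or 1.0
    return [x / total for x in buckets]


# In-memory "files" so the sketch runs without anything on disk (placeholder data).
fake_files = {"a.png": b"\x00\x01\x02\x03", "b.png": b"\x04\x04\x04\x04"}

# One table, three kinds of columns: structured, unstructured, tensor-like.
table = {
    "title": ["painting A", "painting B"],  # structured field
    "img": [LazyFile(p, fake_files.__getitem__) for p in fake_files],  # unstructured object
}
table["embedding"] = [toy_embed(cell.load()) for cell in table["img"]]  # derived representation
```

The design point is that the image column stores only a path plus a loader, so the data is read when a cell is accessed, while the embedding column holds plain numeric vectors that can be sorted or compared like any other field.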

Interactivity in Python

Meerkat provides interactive data frame visualizations that let you control foundation models as they process your data.
  • Meerkat visualizations are implemented in Python, so they can be composed and customized in notebooks or data scripts.
  • Labeling is critical for instructing and validating foundation models. Labeling GUIs are a priority in Meerkat.
match = mk.gui.Match(df, 
sorted_df = mk.sort(df, 
gallery = mk.gui.Gallery(sorted_df)
mk.gui.html.div([match, gallery])
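
The snippet above is cut off, so its exact arguments are unknown. As a rough, library-independent sketch of what a match-then-sort step might look like, the code below assumes that matching means scoring each row by the cosine similarity between its embedding and a query embedding. That assumption, along with the names `cosine`, `rank_by_match`, and the toy rows, is ours and is not taken from Meerkat.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_match(rows: List[Dict], query_embedding: Sequence[float],
                  key: str = "embedding") -> List[Dict]:
    """Attach a 'match' score to each row, then sort best match first."""
    scored = [{**row, "match": cosine(row[key], query_embedding)} for row in rows]
    return sorted(scored, key=lambda r: r["match"], reverse=True)


# Toy rows with 2-d embeddings (placeholder data, not real model output).
rows = [
    {"id": "a", "embedding": [1.0, 0.0]},
    {"id": "b", "embedding": [0.0, 1.0]},
    {"id": "c", "embedding": [1.0, 1.0]},
]
ranked = rank_by_match(rows, query_embedding=[1.0, 0.0])
print([r["id"] for r in ranked])
```

In an interactive setting, the query embedding would change whenever the user edits the query, and the sorted view would be recomputed from the new scores.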

Built for technical teams

🧪 Data Science Teams

Data frames, visualizations and interactive data analysis over unstructured data in Jupyter Notebooks with pure Python.

👨‍💻 Software Engineering Teams

Fully custom applications in SvelteKit that seamlessly connect to unstructured data and model APIs in Python.

🤖 Machine Learning Teams

Graphical user interfaces to prompt and control foundation models, collect feedback and iterate, all with Python scripting.

