{ "cells": [ { "cell_type": "markdown", "id": "fc2f293d", "metadata": {}, "source": [ "# Tutorial 6: Complex Components (Embedding Based Search Engine)\n", "\n", "In this tutorial, we'll show you how you can build a simple search engine over a dataset, using the CLIP model to drive the search. Users will be able to type in a query to search over images, and will see the dataset images ranked by their similarity to the query.\n", "\n", "\n", "To get started, run the tutorial demo script.\n", "\n", "```{code-block} bash\n", "mk demo match\n", "```\n", "\n", "You should see the tutorial app when you open the link in your browser. Let's break down the code in the demo script.\n", "\n", "## Installing dependencies\n", "This tutorial has additional dependencies that you need to install. Run the following command to install them.\n", "\n", "```{code-block} bash\n", "pip install ftfy regex git+https://github.com/openai/CLIP.git\n", "```\n", "\n", "Once you run the script, it will download the CLIP model and cache it in your home directory. This will take a few minutes.\n", "\n", "## Loading in the dataset" ] }, { "cell_type": "code", "execution_count": 1, "id": "4f995290", "metadata": { "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "import meerkat as mk\n", "import rich" ] }, { "cell_type": "markdown", "id": "ac8b9407", "metadata": {}, "source": [ "The first few lines just load in the `imagenette` dataset, a small 10-class subset of ImageNet." 
] }, { "cell_type": "code", "execution_count": 2, "id": "f203f5c4", "metadata": {}, "outputs": [], "source": [ "IMAGE_COLUMN = \"img\"\n", "df = mk.get(\"imagenette\", version=\"160px\")" ] }, { "cell_type": "code", "execution_count": 3, "id": "0e1d21de", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "04212565526b4600b88f7bc476370e21", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Downloading: 0%| | 0.00/114M [00:00, ?B/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting tar archive, this may take a few minutes...\n" ] } ], "source": [ "EMBED_COLUMN = \"img_clip\"\n", "\n", "# Download the precomputed CLIP embeddings for imagenette.\n", "df_clip = mk.DataFrame.read(\n", " \"https://huggingface.co/datasets/meerkat-ml/meerkat-dataframes/resolve/main/embeddings/imagenette_160px.mk.tar.gz\",\n", " overwrite=False,\n", ")" ] }, { "cell_type": "code", "execution_count": 4, "id": "b8cf22fd", "metadata": { "tags": [ "remove-output" ] }, "outputs": [], "source": [ "df = df.merge(df_clip[[\"img_id\", \"img_clip\"]], on=\"img_id\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "076bd8b4", "metadata": { "tags": [ "remove-input" ] }, "outputs": [ { "data": { "text/html": [ "
\n",
 "| | img_id | path | noisy_labels_0 | noisy_labels_1 | noisy_labels_5 | noisy_labels_25 | noisy_labels_50 | is_valid | label_id | label | label_idx | split | img_path | index | img | img_clip |\n",
 "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
 "| 0 | n02979186_9036 | train/n02979186/n02979186_9036.JPEG | n02979186 | n02979186 | n02979186 | n02979186 | n02979186 | False | n02979186 | cassette player | 482 | train | train/n02979186/n02979186_9036.JPEG | 0 | | np.ndarray(shape=(512,)) |\n",
 "| 1 | n02979186_11957 | train/n02979186/n02979186_11957.JPEG | n02979186 | n02979186 | n02979186 | n02979186 | n03000684 | False | n02979186 | cassette player | 482 | train | train/n02979186/n02979186_11957.JPEG | 1 | | np.ndarray(shape=(512,)) |\n",
 "| 2 | n02979186_9715 | train/n02979186/n02979186_9715.JPEG | n02979186 | n02979186 | n02979186 | n03417042 | n03000684 | False | n02979186 | cassette player | 482 | train | train/n02979186/n02979186_9715.JPEG | 2 | | np.ndarray(shape=(512,)) |\n",
 "| 3 | n02979186_21736 | train/n02979186/n02979186_21736.JPEG | n02979186 | n02979186 | n02979186 | n02979186 | n03417042 | False | n02979186 | cassette player | 482 | train | train/n02979186/n02979186_21736.JPEG | 3 | | np.ndarray(shape=(512,)) |\n",
 "| 4 | ILSVRC2012_val_00046953 | train/n02979186/ILSVRC2012_val_00046953.JPEG | n02979186 | n02979186 | n02979186 | n02979186 | n03394916 | False | n02979186 | cassette player | 482 | train | train/n02979186/ILSVRC2012_val_00046953.JPEG | 4 | | np.ndarray(shape=(512,)) |\n",
 "