Create an Index
Define the processing steps for your files to prepare them for search. You use Pipeline Builder to create your indexes.
About This Task
You create indexes just like you create pipelines. Indexes are a series of connected components, each performing a preprocessing step on your files. To learn more, see Indexes.
Indexes are specific to a workspace. A single query pipeline can have multiple indexes.
You can use Pipeline Builder to design your indexes.
Pipeline Builder
About Pipeline Builder
Pipeline Builder is an easy way to build and visualize your pipelines. In Pipeline Builder, you simply drag components from the components library and drop them onto a canvas, where you can customize their parameters and define connections. It helps you visualize your pipeline and offers guidance on component compatibility. You can also switch to the YAML view anytime; everything you do in Pipeline Builder is synchronized with the pipeline YAML configuration.
Using Pipeline Builder
This image shows how to access the basic functionalities in Pipeline Builder. The numbers in the list below correspond to the numbers in the image.

1. Component library. Expand a component group and drag a selected component onto the canvas to add it to your pipeline.
2. A component card. Click the component name to change it.
3. Component connections. Click an input connection on one component and an output connection on another to link them. Hover over a connection point next to an input or output to see a list of popular and compatible components you can connect.
4. Component menu. Click a component card to open the menu for deleting or duplicating the component and for opening its documentation.
5. Export your pipeline as a Python or YAML file you can save on your computer.
6. Switch to the YAML view.
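As a rough mental model of the compatibility hints described for component connections, you can think of each connection point as carrying a type label, with an output offered only to inputs that carry the same label. The sketch below is illustrative only: the `Component` class, the socket names, and the type labels are hypothetical and do not describe how Pipeline Builder is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical stand-in for a component card with typed inputs and outputs."""
    name: str
    inputs: dict[str, str] = field(default_factory=dict)   # socket name -> type label
    outputs: dict[str, str] = field(default_factory=dict)  # socket name -> type label


def compatible_inputs(sender: Component, output: str, candidates: list[Component]) -> list[str]:
    """Return 'component.socket' for every input whose type label matches the sender's output."""
    wanted = sender.outputs[output]
    return [
        f"{candidate.name}.{socket}"
        for candidate in candidates
        if candidate is not sender  # a component is not offered its own inputs
        for socket, type_label in candidate.inputs.items()
        if type_label == wanted
    ]


# Assumed example cards; the labels "files" and "documents" are made up for this sketch.
converter = Component("converter", {"sources": "files"}, {"documents": "documents"})
splitter = Component("splitter", {"documents": "documents"}, {"documents": "documents"})
writer = Component("writer", {"documents": "documents"})

matches = compatible_inputs(converter, "documents", [converter, splitter, writer])
print(matches)  # ['splitter.documents', 'writer.documents']
```

In this model, hovering over the converter's output would suggest the splitter and the writer, because both accept the same kind of data the converter produces.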
After you save an index, you must enable it to start indexing the files in your workspace. Files uploaded after an index is enabled are automatically added to that index. The query pipeline can access all files only after indexing is complete.
Prerequisites
- Understanding indexes. To learn more, see Indexes.
- Understanding pipelines and components. For details, see Pipelines and Pipeline Components.
Create an Index in Pipeline Builder
1. Log in to deepset AI Platform, go to Indexes, and choose Create Index.
2. Choose a template to start with or click Build your own to create an index from scratch.
3. Give your index a name and add a meaningful description.
4. Click Create index. Your index is saved on the Indexes page.
5. Click More Actions next to your index and choose Edit.
6. If you're creating the index from a template, edit the template if needed and save your index.
7. If you're creating an index from scratch:
   - Open the Inputs group and drag the FilesInput component onto the canvas. This is always the first component of an index. It represents the files your index will process.
   - The second component is often FileTypeRouter. It's useful if you're planning to index files of different types. You can set it to identify the file type and route each file to an appropriate converter.
   - Choose Converters for the file types you want to index.
   - If you're using multiple Converters, add a DocumentJoiner to join the converted documents into a single list.
   - Add Preprocessors as needed.
   - Connect the components by clicking one component's input and then another component's output. The connection is automatically created and validated.
     Tip: Hover your mouse over a connection point to see compatible connections.
   - Add DocumentWriter as the last component of your index. It writes the processed documents into the document store, where a query pipeline can access them.
   - Add a DocumentStore that will use this index and connect it to DocumentWriter.
     Tip: OpenSearchDocumentStore is the core document store.
   - Save your index.
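The from-scratch steps above describe a chain that starts at FilesInput, fans out to converters, and ends at DocumentWriter. The sketch below models that chain as plain (sender, receiver) pairs and checks the two ordering rules stated in the steps. It is a minimal illustration, not the platform's configuration format: the names `TextConverter`, `PDFConverter`, and `Preprocessor` are placeholders for whichever converters and preprocessors you choose, and the helper functions are hypothetical.

```python
from graphlib import TopologicalSorter

# Each pair reads "sender passes its output to receiver".
# The document store is left out because it is attached to DocumentWriter
# rather than being a step that documents flow through.
connections = [
    ("FilesInput", "FileTypeRouter"),
    ("FileTypeRouter", "TextConverter"),   # placeholder converter name
    ("FileTypeRouter", "PDFConverter"),    # placeholder converter name
    ("TextConverter", "DocumentJoiner"),
    ("PDFConverter", "DocumentJoiner"),
    ("DocumentJoiner", "Preprocessor"),    # placeholder preprocessor name
    ("Preprocessor", "DocumentWriter"),
]


def processing_order(connections):
    """Return the components in an order where every sender precedes its receivers."""
    sorter = TopologicalSorter()
    for sender, receiver in connections:
        sorter.add(receiver, sender)  # receiver depends on sender
    return list(sorter.static_order())


def check_index(connections):
    """Report violations of the 'FilesInput first, DocumentWriter last' rules."""
    senders = {sender for sender, _ in connections}
    receivers = {receiver for _, receiver in connections}
    problems = []
    if senders - receivers != {"FilesInput"}:
        problems.append("FilesInput must be the only component without an incoming connection")
    if receivers - senders != {"DocumentWriter"}:
        problems.append("DocumentWriter must be the only component without an outgoing connection")
    return problems


print(processing_order(connections))
print(check_index(connections))  # [] when both rules hold
```

A check like this catches a common mistake, such as a converter that was dropped onto the canvas but never connected to DocumentJoiner, because that converter would show up as a second component without an outgoing connection.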
What To Do Next
- Enable the index to start indexing files. Click Enable on the index page.
- Add the index to your query pipelines. For details, see Edit a Pipeline or Create a Pipeline in Pipeline Builder.