Data Ingestion

Vectara offers multiple data ingestion methods to suit different use cases. Choosing the method that matches your data lets you index it efficiently and take advantage of Vectara's search capabilities.

Vectara Ingest: Sample Data Ingestion Framework

You can get data into Vectara through either our REST or gRPC APIs. We also built Vectara Ingest, a ready-to-use sample ingestion framework with preconfigured templates for pulling data from many popular sources, such as websites and RSS feeds.

Standard Indexing API

We recommend the Standard Indexing method for indexing a set of semi-structured documents or content into a corpus. Vectara chunks and indexes the documents for you.

You can also experiment with this REST endpoint in our interactive API Playground.
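As a rough illustration of what a semi-structured document might look like before it is sent for indexing, here is a minimal Python sketch that only builds a request body and sends nothing. The field names (`customerId`, `corpusId`, `documentId`, `title`, `metadataJson`, `section`, `text`) and the helper `build_index_payload` are assumptions made for this example; check the Standard Indexing API reference for the exact schema.

```python
import json


def build_index_payload(customer_id, corpus_id, doc_id, title, sections, metadata=None):
    """Build a request body for a Standard Indexing call.

    All field names below are assumptions for illustration, not a
    confirmed schema. The caller supplies whole sections of text;
    chunking is left to the platform.
    """
    document = {
        "documentId": doc_id,
        "title": title,
        # Assumed convention: metadata travels as a JSON-encoded string.
        "metadataJson": json.dumps(metadata or {}),
        "section": [{"text": text} for text in sections],
    }
    return {"customerId": customer_id, "corpusId": corpus_id, "document": document}


payload = build_index_payload(
    customer_id=123,          # hypothetical account ID
    corpus_id=1,              # hypothetical corpus ID
    doc_id="faq-001",
    title="Returns FAQ",
    sections=[
        "Items can be returned within 30 days.",
        "Refunds take 5 business days.",
    ],
    metadata={"lang": "en"},
)

# Serialize the payload; this string would be the body of the HTTP request.
request_body = json.dumps(payload)
```

Keeping payload construction separate from the HTTP call makes the document structure easy to inspect and test before any data reaches a corpus.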

File Upload API

The File Upload method exposes an HTTP endpoint to upload and index files into a corpus. We recommend this option when you do not need to define additional user-supplied metadata beyond what the Vectara platform extracts. When you upload files such as PDFs and Word documents, Vectara attempts to automatically extract the text and any metadata.

If you want to attach your own metadata to improve searches against your data, you can format your data as JSON.

Our interactive API Playground enables you to experiment with this File Upload REST endpoint.
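To make the upload flow more concrete, the sketch below assembles a multipart/form-data body containing a file and an optional JSON metadata part, using only the Python standard library. It sends nothing over the network. The form field names `file` and `doc_metadata`, and the helper `build_upload_body`, are assumptions for this example; confirm the actual field names and endpoint in the File Upload API reference.

```python
import json
import uuid


def build_upload_body(filename, file_bytes, metadata=None, boundary=None):
    """Return (body, content_type) for a multipart/form-data upload.

    The part names "file" and "doc_metadata" are assumed for illustration.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []

    if metadata is not None:
        # Optional user-supplied metadata, serialized as a JSON object.
        header = (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="doc_metadata"\r\n'
            "Content-Type: application/json\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + json.dumps(metadata).encode("utf-8") + b"\r\n")

    # The file itself, sent as raw bytes.
    header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(header.encode("utf-8") + file_bytes + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = build_upload_body(
    "handbook.pdf",
    b"%PDF-1.7 example bytes",   # placeholder content, not a real PDF
    metadata={"department": "hr"},
)
```

The returned `content_type` would go in the request's `Content-Type` header and `body` in the request payload. When no extra metadata is needed, omit the `metadata` argument and only the file part is sent.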