File Upload
POST /v1/upload
The File Upload API indexes binary files such as PDFs and Word documents. Vectara attempts to automatically extract the text and any embedded metadata, such as author or title, from the document; you can also provide additional metadata yourself.
Some tips for this API:
- This operation authenticates with either the Personal API Key, Index API Key, or OAuth 2.0 (in a JWT "Bearer Token"). You can find details of how to set up an API key or use OAuth 2.0 here.
- You can find a full list of supported file formats here.
- To provide additional metadata, set the doc_metadata field. You can find some additional details here.
- PDFs must contain text: Vectara does not currently support indexing scanned images via OCR.
- There is a known issue with the OpenAPI plugin where the generated Python script for file uploads incorrectly uses placeholder values for the file path and filename. Manually replace '/path/to/file' and 'file' in the files array with the actual file path and filename.
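The tips above can be sketched as a small Python helper that assembles the multipart request with a real file path and filename instead of placeholders. This is a minimal sketch under stated assumptions, not the official client: the base URL https://api.vectara.io, the c (customer ID) and o (corpus ID) query parameters, and the x-api-key header are assumptions not stated on this page, and build_upload_request is a hypothetical helper name. Only the /v1/upload path, the POST method, and the doc_metadata field come from the text above.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import Request

# ASSUMPTION: base URL, query parameter names (c, o), and the x-api-key
# header are not given on this page; adjust them to your account setup.
BASE_URL = "https://api.vectara.io"


def build_upload_request(file_path, api_key, customer_id, corpus_id, doc_metadata=None):
    """Build (but do not send) a multipart POST request to /v1/upload."""
    path = Path(file_path)
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"

    parts = []
    if doc_metadata is not None:
        # doc_metadata is sent as a JSON string in its own form field.
        parts.append(
            (f"--{boundary}\r\n"
             'Content-Disposition: form-data; name="doc_metadata"\r\n\r\n'
             f"{json.dumps(doc_metadata)}\r\n").encode("utf-8")
        )
    # The file part carries the actual filename, not a placeholder.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8")
        + path.read_bytes()
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    query = urlencode({"c": customer_id, "o": corpus_id})
    return Request(
        f"{BASE_URL}/v1/upload?{query}",
        data=b"".join(parts),
        method="POST",
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "x-api-key": api_key,
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Building the request separately from sending it makes it easy to inspect the file part and confirm that the real filename is present before any data leaves your machine.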
Request
Responses
- 200: A successful response.
- 400: An invalid request was sent, e.g. one or more parameters were missing, or the corpus does not exist.
- 401: The request was not authenticated.
- 403: The caller is not authorized to add documents to the corpus.
- 409: A document already exists in the corpus with the same document ID, but the contents of the indexed document differ from the file being uploaded. Because the indexer is idempotent, the same document (identified by the document ID) can be uploaded multiple times. The indexer does not support updates yet, so an error is returned when a different document is uploaded for the same document ID. Note that when a raw file is uploaded, the file name is used as the document ID.
- 507: There is no more indexing quota left for the corpus or customer to index more documents. Upgrade your account, add a credit card, or contact sales.
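A client might translate the status codes listed above into outcomes it can act on. The following is a minimal sketch: the outcome labels and the classify_upload_status name are hypothetical, and the messages only paraphrase the response descriptions on this page.

```python
# Hypothetical outcome labels; the messages paraphrase the documented responses.
UPLOAD_OUTCOMES = {
    200: ("success", "The file was accepted."),
    400: ("invalid_request", "A parameter is missing or the corpus does not exist."),
    401: ("unauthenticated", "The request was not authenticated."),
    403: ("unauthorized", "The caller may not add documents to this corpus."),
    409: ("conflict", "A different document already uses this document ID (the file name)."),
    507: ("quota_exhausted", "No indexing quota is left for the corpus or customer."),
}


def classify_upload_status(status_code):
    """Return an (outcome, message) pair for an upload response status code."""
    return UPLOAD_OUTCOMES.get(
        status_code,
        ("unexpected", f"Status {status_code} is not documented for this endpoint."),
    )


if __name__ == "__main__":
    for code in (200, 409, 507, 500):
        outcome, message = classify_upload_status(code)
        print(code, outcome, message)
```

Keeping the mapping in one table means a 409 can be surfaced distinctly from other failures. That matters here because re-uploading an identical file under the same name is accepted, while uploading different contents under an existing file name is rejected.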