Deleting Documents

Endpoint Address

Vectara exposes a REST endpoint at the following URL to delete content from a corpus:
https://api.vectara.io/v1/delete-doc
This page describes the details of interacting with this endpoint.

A request to delete a document from a corpus consists of three key pieces of information: the customer ID, the corpus ID, and the document ID.

message DeleteDocumentRequest {
  int64 customer_id = 1;
  int64 corpus_id = 2;
  string document_id = 3;
}
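
As a rough illustration of how these three fields might be sent to the REST endpoint, the sketch below builds (but does not send) an HTTP request with the JDK's built-in client classes. Several details are assumptions rather than documented behavior: that the endpoint accepts a POST with a JSON body whose keys mirror the proto field names, and that a bearer token in an Authorization header is sufficient for authentication. The class and method names (DeleteDocRest, jsonBody, buildRequest) are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

final class DeleteDocRest {
    // Endpoint URL as given at the top of this page.
    static final String ENDPOINT = "https://api.vectara.io/v1/delete-doc";

    // Builds a JSON body whose keys mirror DeleteDocumentRequest.
    // Assumption: the REST endpoint accepts this JSON mapping of the message.
    static String jsonBody(long customerId, long corpusId, String documentId) {
        // Escape backslashes and double quotes so the document ID stays valid JSON.
        String escaped = documentId.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"customer_id\": " + customerId
            + ", \"corpus_id\": " + corpusId
            + ", \"document_id\": \"" + escaped + "\"}";
    }

    // Builds the request without sending it.
    // Assumption: POST with a bearer token; the real header set may differ.
    static HttpRequest buildRequest(
            long customerId, long corpusId, String documentId, String bearerToken) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
            .timeout(Duration.ofSeconds(30)) // Mirrors the "always set a deadline" advice.
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + bearerToken)
            .POST(HttpRequest.BodyPublishers.ofString(
                jsonBody(customerId, corpusId, documentId)))
            .build();
    }
}
```

The resulting request could then be passed to java.net.http.HttpClient.send with a string body handler. Check the exact headers and body format against the authentication documentation before relying on this.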

The server's reply currently contains no fields. Note that the operation is asynchronous: after the call returns, the document may still appear in query results. The platform typically takes 5 to 10 minutes to remove the document from serving.

The server returns gRPC status codes. For example:

  • INTERNAL: An internal error code indicates a failure inside the platform, and an immediate retry may not succeed.
  • UNAVAILABLE: The service is temporarily unavailable, and the operation should be retried, preferably with a backoff. Note that the deletion operation is idempotent, so it is fine to re-apply.
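
The retry guidance above can be sketched as a small helper that retries only on UNAVAILABLE and doubles the wait between attempts. This is a minimal sketch under stated assumptions, not the platform's client library: RpcException is a hypothetical stand-in for the exception a gRPC stub would throw (in grpc-java, io.grpc.StatusRuntimeException), and callWithBackoff, maxAttempts, and initialDelayMs are illustrative names.

```java
import java.util.function.Supplier;

final class DeleteRetry {
    // Hypothetical stand-in for a gRPC failure carrying a status code name.
    static final class RpcException extends RuntimeException {
        final String code;
        RpcException(String code) {
            super(code);
            this.code = code;
        }
    }

    // Runs the call, retrying with exponential backoff only on UNAVAILABLE.
    // Any other status (for example INTERNAL) is rethrown immediately.
    static <T> T callWithBackoff(Supplier<T> call, int maxAttempts, long initialDelayMs) {
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RpcException e) {
                boolean retryable = "UNAVAILABLE".equals(e.code);
                if (!retryable || attempt >= maxAttempts) {
                    throw e; // Not retryable, or out of attempts.
                }
            }
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // Preserve the interrupt flag.
                throw new IllegalStateException("Interrupted while backing off", ie);
            }
            delayMs *= 2; // Double the wait before the next attempt.
        }
    }
}
```

Because deletion is idempotent, wrapping the delete call in such a helper is safe even if an earlier attempt actually succeeded on the server. A production version would likely add jitter and a cap on the delay.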

Example

The code snippet below illustrates how to delete a document from a corpus. For information on how to obtain the call credentials and metadata, please consult the OAuth 2.0 documentation.

indexingStub
    .withCallCredentials(credentials(tokenSupplier.get().getOrDie()))
    .withDeadlineAfter(30, TimeUnit.SECONDS) // Always set a deadline.
    .delete(
        DeleteDocumentRequest.newBuilder()
            .setCustomerId(customerId)
            .setCorpusId(corpusId)
            .setDocumentId("en.wikipedia.org/wiki/California")
            .build());