Version: 2.0

Summarize a document

POST /v2/corpora/:corpus_key/documents/:document_id/summarize

Supported API Key Type:

Query Service, Personal

Organizations often struggle with extracting relevant insights from extensive documentation, such as vendor quotes, financial statements, and technical reports. Manually reviewing these documents is both time-consuming and prone to errors.

The tech preview of the Document Summarization API generates concise summaries of single documents, so you can capture the essential insights without reviewing each document manually. Use it to process large documents efficiently, extract key insights, and receive summaries as they are generated.

Enable streaming for large documents to receive summaries incrementally.
Customize prompt_template to fine-tune summary output for specific domains.
Use standard responses for small documents where streaming is unnecessary.
Monitor streaming events to track the progress of real-time summarization.

Note: The document length is limited by the context window of your selected LLM.

Response formats

The API supports two response modes:

Standard: Provides a complete summary in one response.
Streaming: Provides incremental responses using Server-Sent Events (SSE).

Non-streaming response

In standard mode, the API returns a structured response containing the complete summary of the document. The summary field contains the generated text, enabling users to extract essential information quickly.
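
As an illustration of standard mode, the following Python sketch builds the POST request for the endpoint above and reads the summary field from a response body. It is a minimal sketch under stated assumptions: the base URL https://api.vectara.io and the x-api-key header name are not given in this section, and build_summarize_request and extract_summary are hypothetical helper names. The request body fields are left to the caller because this section names only prompt_template.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the base URL is not stated in this section.
BASE_URL = "https://api.vectara.io"


def build_summarize_request(corpus_key, document_id, api_key, body):
    """Build (but do not send) a POST request for the summarize endpoint."""
    # Percent-encode the path parameters so keys with spaces or slashes are safe.
    path = "/v2/corpora/{}/documents/{}/summarize".format(
        urllib.parse.quote(corpus_key, safe=""),
        urllib.parse.quote(document_id, safe=""),
    )
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumption: header name for the API key
        },
        method="POST",
    )


def extract_summary(response_text):
    """Return the summary field from a standard-mode JSON response body."""
    return json.loads(response_text)["summary"]
```

To send the request, pass the returned object to urllib.request.urlopen and hand the decoded response body to extract_summary.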

Streaming response

For streaming responses, the API returns Server-Sent Events (SSE). The first event begins streaming partial results as soon as they are available, while the final event marks the end of the summarization process.

The streamed response can include the following event types:

generation_info: Contains the rendered_prompt, which is the compiled prompt sent to the LLM for document summarization.
generation_chunk: Returns a partial chunk of the generated summary.
generation_end: Marks the completion of the summary generation.
error: Returns an error message if summarization fails.
end: Indicates the end of the streaming session.
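
The event sequence above can be consumed with a small SSE reader. The sketch below parses SSE text lines into (event, data) pairs and joins the generation_chunk payloads into one summary. It rests on two assumptions that this section does not confirm: each data line carries a JSON object, and the chunk text sits under a generation_chunk key. The names iter_sse_events and collect_summary are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event is not None or data:
                yield event or "message", "\n".join(data)
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE allows one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
    # Flush a trailing event that was not followed by a blank line.
    if event is not None or data:
        yield event or "message", "\n".join(data)


def collect_summary(lines):
    """Join generation_chunk payloads until the end event arrives."""
    chunks = []
    for event, data in iter_sse_events(lines):
        if event == "generation_chunk":
            # Assumption: JSON payload with the text under "generation_chunk".
            chunks.append(json.loads(data)["generation_chunk"])
        elif event == "error":
            raise RuntimeError("summarization failed: " + data)
        elif event == "end":
            break
    return "".join(chunks)
```

Because iter_sse_events accepts any iterable of text lines, it can be fed from a decoded HTTP response stream or from a recorded transcript when testing.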

Prompt template example

When crafting a prompt, you can access your document with the $vectaraDocument field. This example shows a simple prompt:

{
  "role": "user",
  "content": "Summarize the document: $vectaraDocument.json()"
}

The document also has the following methods to support custom prompts.

$vectaraDocument.json(): Provides a JSON representation of the whole document.
$vectaraDocument.id(): Returns the unique identifier of the document (document_id).
$vectaraDocument.metadata(): Returns the document's metadata. For example, $vectaraDocument.metadata().get("key") retrieves a specific metadata value by key.
$vectaraDocument.parts(): Returns an array of document parts that you can loop over. For example, #foreach ($part in $vectaraDocument.parts()).
$part.text(): Retrieves the text of the part.
$part.metadata(): Retrieves metadata of a part.
$part.hasTable(): Determines if the part contains a table.
$part.table(): Provides access to the table within the part. For example, use $part.table().json() to retrieve the table in JSON format.
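
To show how these methods combine, the sketch below assembles a user message whose content loops over the document parts and appends any table as JSON. The template text uses only the methods listed above plus the standard Velocity #foreach, #if, and #end directives. The helper name build_user_message is hypothetical. Two points are not covered in this section and are left open here: how the message is packaged into the prompt_template field, and whether rendered part text needs escaping.

```python
import json

# Velocity template text. The $-expressions are not evaluated in Python;
# they are left in place for the API to render.
USER_CONTENT = (
    "Summarize the following document (id: $vectaraDocument.id()).\n"
    "#foreach ($part in $vectaraDocument.parts())\n"
    "$part.text()\n"
    "#if ($part.hasTable())\n"
    "Table: $part.table().json()\n"
    "#end\n"
    "#end"
)


def build_user_message():
    """Return a message in the same shape as the prompt example above."""
    return {"role": "user", "content": USER_CONTENT}


if __name__ == "__main__":
    # Print the message as JSON for inspection.
    print(json.dumps(build_user_message(), indent=2))
```

Serializing with json.dumps escapes the newlines and leaves the dollar signs untouched, so the template expressions reach the API unchanged.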

Request

Responses

Document summarization response on success.
