Version: 2.0

Retrieve a document


Retrieve the content and metadata of a specific document, identified by its unique document_id from a specific corpus.


Path Parameters

    corpus_key CorpusKey required

    Possible values: <= 50 characters. Value must match the regular expression [a-zA-Z0-9_\=\-]+$

    The unique key identifying the corpus containing the document to retrieve.

    document_id string required

    The ID of the document to retrieve. The document_id must be percent-encoded.
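
    The constraints on the two path parameters can be checked on the client before a request is sent. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and two choices are assumptions rather than documented behavior: the whole corpus_key is required to match the pattern, and every reserved character in document_id (including "/") is encoded so the ID stays a single path segment.

```python
import re
from urllib.parse import quote

# Character set and length limit taken from the corpus_key description above.
CORPUS_KEY_PATTERN = re.compile(r"[a-zA-Z0-9_=\-]+")
MAX_CORPUS_KEY_LENGTH = 50


def validate_corpus_key(corpus_key: str) -> str:
    """Return corpus_key unchanged if it satisfies the documented constraints."""
    if len(corpus_key) > MAX_CORPUS_KEY_LENGTH:
        raise ValueError("corpus_key must be at most 50 characters")
    # Assumption: the entire key must match, so fullmatch() is used.
    if not CORPUS_KEY_PATTERN.fullmatch(corpus_key):
        raise ValueError("corpus_key contains characters outside [a-zA-Z0-9_=-]")
    return corpus_key


def encode_document_id(document_id: str) -> str:
    """Percent-encode a document ID for use as one URL path segment."""
    # safe="" makes quote() encode "/" as well, which it would otherwise keep.
    return quote(document_id, safe="")
```

    For example, encode_document_id("reports/2024 Q1.pdf") returns "reports%2F2024%20Q1.pdf".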

Header Parameters

    Request-Timeout integer

    Possible values: >= 1

    The API makes a best effort to complete the request within the specified number of seconds; otherwise the request times out.

    Request-Timeout-Millis integer

    Possible values: >= 1

    The API makes a best effort to complete the request within the specified number of milliseconds; otherwise the request times out.
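
    The two timeout headers can be assembled with a small helper such as the sketch below. The function name is hypothetical. The page does not say which header wins when both are sent, so the helper simply passes through whatever the caller provides and enforces only the documented minimum of 1.

```python
from typing import Dict, Optional


def build_timeout_headers(
    timeout_seconds: Optional[int] = None,
    timeout_millis: Optional[int] = None,
) -> Dict[str, str]:
    """Build the optional Request-Timeout and Request-Timeout-Millis headers."""
    headers: Dict[str, str] = {}
    if timeout_seconds is not None:
        if timeout_seconds < 1:
            raise ValueError("Request-Timeout must be >= 1")
        headers["Request-Timeout"] = str(timeout_seconds)
    if timeout_millis is not None:
        if timeout_millis < 1:
            raise ValueError("Request-Timeout-Millis must be >= 1")
        headers["Request-Timeout-Millis"] = str(timeout_millis)
    return headers
```

    Calling build_timeout_headers(timeout_seconds=30) yields a dictionary with the single entry "Request-Timeout": "30", which can be merged into the other request headers.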


Successfully retrieved the document.

    id string

    The document ID.

    metadata object

    The document metadata.

    property name* any

    The document metadata.

    tables object[]

    The tables that this document contains. Tables are not available when table extraction is not enabled.

    Array [

    id string

    The unique ID of the table within the document.

    title string

    The title of the table.

    data object

    The data of the table.

    headers array[]

    The headers of the table.

    rows array[]

    The rows in the data.

    description string

    The description of the table.

    ]

    parts object[]

    The parts that make up the document. Parts are not available when retrieving a list of documents or when creating a document; this property is returned only when retrieving a document by ID.

    Array [

    text string required

    The text of the document part.

    metadata object

    The metadata for a document part. These may be used in metadata filters at query time if filter attributes are configured on the corpus.

    property name* any

    The metadata for a document part. These may be used in metadata filters at query time if filter attributes are configured on the corpus.

    context string

    The context text for the document part.

    custom_dimensions object

    The custom dimensions as additional weights.

    property name* double

    ]

    storage_usage object

    The amount of storage the document uses. This information is currently returned only when indexing a document, not when retrieving it.

    bytes_used int64

    Number of bytes used by the document that count toward the maximum corpus size and toward any billing plans.

    metadata_bytes_used int64

    Number of metadata bytes used by a document.

    extraction_usage object

    The amount of extraction quota the document uses. This information is currently returned only when indexing a document, not when retrieving it.

    table_extraction_used int64

    The number of pages from the document that consumed the extraction quota.
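
    Because several response fields may be absent (tables when table extraction is not enabled, storage_usage and extraction_usage on retrieval), client code should read them defensively. The sketch below shows one way to do so. The function name and the sample payload are invented for illustration; the payload is hand-written from the field list above and is not real API output.

```python
from typing import Any, Dict, List


def summarize_document(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Collect a few fields from a retrieved document, tolerating missing keys."""
    parts: List[Dict[str, Any]] = doc.get("parts") or []
    tables: List[Dict[str, Any]] = doc.get("tables") or []
    return {
        "id": doc.get("id"),
        "metadata_keys": sorted((doc.get("metadata") or {}).keys()),
        "part_count": len(parts),
        # "text" is the only field marked required on a document part.
        "text": "\n".join(part["text"] for part in parts),
        "table_titles": [table.get("title") for table in tables],
    }


# Hand-written sample for illustration only.
sample = {
    "id": "doc-1",
    "metadata": {"author": "Ada", "year": 2024},
    "parts": [
        {"text": "First part.", "metadata": {"section": "intro"}},
        {"text": "Second part."},
    ],
}
summary = summarize_document(sample)
```

    With this sample, summary["part_count"] is 2 and summary["table_titles"] is an empty list, since the payload has no tables key.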

Authorization: x-api-key

name: x-api-key
type: apiKey
in: header
curl -L -X GET '' \
-H 'Accept: application/json' \
-H 'x-api-key: <API_KEY_VALUE>'
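
A rough Python equivalent of the curl command, using only the standard library, might look like the sketch below. The request URL is not shown in the command above, so it is left as a parameter that the caller must supply; the function names are hypothetical. urlopen follows redirects by default, which roughly corresponds to curl's -L flag.

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the same headers as the curl example."""
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json", "x-api-key": api_key},
        method="GET",
    )


def fetch_document(url: str, api_key: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    # The timeout here is a client-side socket timeout chosen for the sketch;
    # it is separate from the Request-Timeout headers described earlier.
    with urllib.request.urlopen(build_request(url, api_key), timeout=timeout) as response:
        return json.load(response)
```

Non-2xx responses raise urllib.error.HTTPError, so callers that need to distinguish a missing document from other failures can catch that exception and inspect its code attribute.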
