Version: 2.0

List the documents in the corpus

GET /v2/corpora/:corpus_key/documents

Retrieve a list of documents stored in a specific corpus. This endpoint provides an overview of document metadata without returning the full content of each document.

Request

Path Parameters

    corpus_key CorpusKey required

    Possible values: <= 50 characters; value must match the regular expression [a-zA-Z0-9_\=\-]+$

    The unique key identifying the queried corpus.

Query Parameters

    limit int32

    Possible values: >= 1 and <= 100

    Default value: 10

    The maximum number of documents to return at one time.

    metadata_filter string

    Filter documents by metadata. Uses the same expression as a query metadata filter, but only allows filtering on document metadata.

    page_key string

    Used to retrieve the next page of documents after the limit has been reached.

Header Parameters

    Request-Timeout integer

    Possible values: >= 1

    The API makes a best effort to complete the request within the specified number of seconds; otherwise, the request times out.

    Request-Timeout-Millis integer

    Possible values: >= 1

    The API makes a best effort to complete the request within the specified number of milliseconds; otherwise, the request times out.
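
The path, query, and header parameters above can be combined into a single request. The following is a minimal sketch using only the Python standard library, not an official client. The function name `build_list_documents_request`, the `base_url` argument, and the `extra_headers` argument are assumptions made for illustration: this page does not state the API host or how requests are authenticated, so both must be supplied by the caller.

```python
import re
from urllib.parse import quote, urlencode
from urllib.request import Request

# Character set taken from the corpus_key constraint documented above.
CORPUS_KEY_PATTERN = re.compile(r"[a-zA-Z0-9_=\-]+")


def build_list_documents_request(
    base_url,            # assumption: the host is not given on this page
    corpus_key,
    limit=10,            # documented default
    metadata_filter=None,
    page_key=None,
    timeout_seconds=None,
    extra_headers=None,  # assumption: e.g. authentication headers
):
    """Build (but do not send) a GET request for the list-documents endpoint."""
    if len(corpus_key) > 50 or not CORPUS_KEY_PATTERN.fullmatch(corpus_key):
        raise ValueError("corpus_key must be 1-50 characters of [a-zA-Z0-9_=-]")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if timeout_seconds is not None and timeout_seconds < 1:
        raise ValueError("Request-Timeout must be >= 1")

    # Only send the optional query parameters that were actually provided.
    query = {"limit": limit}
    if metadata_filter:
        query["metadata_filter"] = metadata_filter
    if page_key:
        query["page_key"] = page_key

    path = f"/v2/corpora/{quote(corpus_key, safe='')}/documents"
    url = f"{base_url.rstrip('/')}{path}?{urlencode(query)}"

    headers = dict(extra_headers or {})
    if timeout_seconds is not None:
        headers["Request-Timeout"] = str(timeout_seconds)
    return Request(url, headers=headers, method="GET")
```

The returned object can be passed to `urllib.request.urlopen` once a real host and credentials are filled in. Validating `corpus_key` and `limit` on the client side is a design choice of this sketch; it simply mirrors the constraints listed above so that bad input fails before a network call is made.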

Responses

List of documents.

Schema
    documents object[]

    List of documents.

  • Array [
  • id string

    The document ID.

    metadata object

    The document metadata.

    property name* any

    The document metadata.

    tables object[]

    The tables that this document contains. Tables are available only when table extraction is enabled.

  • Array [
  • id string

    The unique ID of the table within the document.

    title string

    The title of the table.

    data object

    The data of the table.

    headers array[]

    The headers of the table.

    rows array[]

    The rows in the data.

    description string

    The description of the table.

  • ]
  • parts object[]

    The parts that make up the document. Parts are not returned when retrieving a list of documents or when creating a document; this property is only available when retrieving a document by ID.

  • Array [
  • text string required

    The text of the document part.

    metadata object

    The metadata for a document part. These may be used in metadata filters at query time if filter attributes are configured on the corpus.

    property name* any

    The metadata for a document part. These may be used in metadata filters at query time if filter attributes are configured on the corpus.

    context string

    The context text for the document part.

    custom_dimensions object

    The custom dimensions as additional weights.

    property name* double
  • ]
  • storage_usage object

    The amount of storage the document uses. This information is currently returned only when indexing a document, not when retrieving it.

    bytes_used int64

    Number of bytes used by the document that count toward the maximum corpus size and toward any billing plans.

    metadata_bytes_used int64

    Number of metadata bytes used by a document.

  • ]
  • metadata object

    The standard metadata in the response of a list operation.

    page_key string

    Pass this value as the page_key query parameter to request the next page of this list.
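
The `page_key` in the response metadata and the `page_key` query parameter together form a paging loop. The sketch below shows one way to walk every page. It is illustrative only: `iter_documents` and the `fetch_page` callable are hypothetical names, and the stopping rule (a missing or empty `page_key` means there are no more pages) is an assumption, since this page does not state how the last page is signaled.

```python
def iter_documents(fetch_page, limit=10):
    """Yield every document in a corpus, one page at a time.

    fetch_page is a hypothetical callable taking (limit, page_key) and
    returning the parsed JSON response body as a dict.
    """
    page_key = None
    while True:
        page = fetch_page(limit=limit, page_key=page_key)
        yield from page.get("documents", [])
        # Assumption: an absent or empty page_key marks the final page.
        page_key = page.get("metadata", {}).get("page_key")
        if not page_key:
            break


# Usage with a stubbed two-page response instead of a real HTTP call.
fake_pages = {
    None: {"documents": [{"id": "a"}, {"id": "b"}], "metadata": {"page_key": "p2"}},
    "p2": {"documents": [{"id": "c"}], "metadata": {}},
}


def fake_fetch(limit, page_key):
    return fake_pages[page_key]


document_ids = [doc["id"] for doc in iter_documents(fake_fetch, limit=2)]
print(document_ids)  # ['a', 'b', 'c']
```

Taking the fetch step as a callable keeps the paging logic separate from transport and authentication, so the same loop works with any HTTP client.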
