Version: 2.0

Create a corpus

POST /v2/corpora

Create a corpus, which is a container to store documents and associated metadata.
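As a rough illustration of the call described above, the following Python sketch builds (but does not send) a `POST /v2/corpora` request with a JSON body. The host `https://api.vectara.io`, the `x-api-key` header name, and the helper name `build_create_corpus_request` are assumptions for this example and are not stated on this page; check them against your account's authentication setup.

```python
import json
import urllib.request

# Assumptions (not stated on this page): the API host and the API-key header name.
BASE_URL = "https://api.vectara.io"
API_KEY_HEADER = "x-api-key"


def build_create_corpus_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/corpora request with a JSON body."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/corpora",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            API_KEY_HEADER: api_key,
        },
        method="POST",
    )


# "key" is the only required body field; "my-corpus" is a placeholder value.
request = build_create_corpus_request("YOUR_API_KEY", {"key": "my-corpus"})
# To send it: urllib.request.urlopen(request), then decode the JSON response.
```

Building the request separately from sending it makes the method, path, and body easy to inspect before any network call is made.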

Request

Body

    key CorpusKey required

    Possible values: <= 50 characters, Value must match regular expression [a-zA-Z0-9_\=\-]+$

    A user-provided key for a corpus.

    name string

    The name for the corpus. This value defaults to the key.

    description string

    Description for the corpus.

    queries_are_answers boolean

    Default value: false

    Queries made to this corpus are considered answers, and not questions.

    documents_are_questions boolean

    Default value: false

    Documents inside this corpus are considered questions, and not answers.

    encoder_id string

    Possible values: Value must match regular expression enc_[0-9]+$

    The encoder used by the corpus. This value defaults to the most recent Vectara encoder.

    filter_attributes object[]

    The new filter attributes of the corpus. If unset, the corpus has no filter attributes.

  • Array [
  • name string required

    The JSON path of the filter attribute in a document or document part metadata.

    level string required

    Possible values: [document, part]

    Indicates whether this is a document or document part metadata filter.

    description string

    Description of the filter. May be omitted.

    indexed boolean

    Default value: true

    Whether an index is created for the filter. Creating an index will improve query latency when using the filter.

    type string required

    Possible values: [integer, real_number, text, boolean, list[integer], list[real_number], list[text]]

    The value type of the filter.

  • ]
  • custom_dimensions object[]

    A custom dimension is an additional numerical field attached to a document part. At query time, this field is multiplied with a query-time custom dimension of the same name, which lets you boost (or deboost) document parts for arbitrary reasons. This feature is only enabled for Scale customers.

  • Array [
  • name string required

    The name of the custom dimension.

    indexing_default double

    Default value of a custom dimension on a document part if the custom dimension value is not specified when the document part is indexed.

    A value of 0 means that the custom dimension is not considered.

    querying_default double

    Default value of a custom dimension for a query if the value of the custom dimension is not specified when querying the corpus.

    A value of 0 means that the custom dimension is not considered.

  • ]
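The constraints listed for the request body can be checked on the client before the request is sent. The sketch below is one possible way to do that in Python; the function name `validate_create_corpus_body` and the sample values are hypothetical. It assumes the regular expressions are meant to match the whole string, and it does not replace server-side validation.

```python
import re

# Constraints copied from the field descriptions above.
KEY_PATTERN = re.compile(r"[a-zA-Z0-9_\=\-]+")
ENCODER_ID_PATTERN = re.compile(r"enc_[0-9]+")
FILTER_LEVELS = {"document", "part"}
FILTER_TYPES = {
    "integer", "real_number", "text", "boolean",
    "list[integer]", "list[real_number]", "list[text]",
}


def validate_create_corpus_body(body: dict) -> list:
    """Return a list of problems found in the body; an empty list means none were found."""
    problems = []

    key = body.get("key")
    if not isinstance(key, str):
        problems.append("key is required and must be a string")
    else:
        if len(key) > 50:
            problems.append("key must be at most 50 characters")
        # Assumption: the pattern must match the entire key.
        if not KEY_PATTERN.fullmatch(key):
            problems.append("key contains characters outside [a-zA-Z0-9_=-]")

    encoder_id = body.get("encoder_id")
    if encoder_id is not None and not ENCODER_ID_PATTERN.fullmatch(str(encoder_id)):
        problems.append("encoder_id must look like enc_<digits>")

    for i, attr in enumerate(body.get("filter_attributes", [])):
        for field in ("name", "level", "type"):
            if field not in attr:
                problems.append(f"filter_attributes[{i}].{field} is required")
        if "level" in attr and attr["level"] not in FILTER_LEVELS:
            problems.append(f"filter_attributes[{i}].level must be document or part")
        if "type" in attr and attr["type"] not in FILTER_TYPES:
            problems.append(f"filter_attributes[{i}].type is not a supported value type")

    for i, dim in enumerate(body.get("custom_dimensions", [])):
        if "name" not in dim:
            problems.append(f"custom_dimensions[{i}].name is required")

    return problems


# Illustrative body; every value here is made up for the example.
body = {
    "key": "support-articles",
    "description": "Help-center articles",
    "filter_attributes": [
        {"name": "lang", "level": "document", "type": "text", "indexed": True},
    ],
    "custom_dimensions": [
        {"name": "recency", "indexing_default": 0.0, "querying_default": 0.0},
    ],
}
print(validate_create_corpus_body(body))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.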

Responses

The corpus has been created.

Schema
    id string

    Possible values: Value must match regular expression crp_[0-9]+$

    Vectara ID of the corpus.

    key CorpusKey

    Possible values: <= 50 characters, Value must match regular expression [a-zA-Z0-9_\=\-]+$

    A user-provided key for a corpus.

    name string

    Name for the corpus. This value defaults to the key.

    description string

    Corpus description.

    enabled boolean

    Specifies whether the corpus is enabled or not.

    chat_history_corpus boolean

    Indicates that this corpus does not store documents and stores chats instead.

    queries_are_answers boolean

    Default value: false

    Queries made to this corpus are considered answers, and not questions. This swaps the semantics of the encoder used at query time.

    documents_are_questions boolean

    Default value: false

    Documents inside this corpus are considered questions, and not answers. This swaps the semantics of the encoder used at indexing.

    encoder_id string

    Possible values: Value must match regular expression enc_[0-9]+$

    The encoder used by the corpus.

    filter_attributes object[]

    The new filter attributes of the corpus.

  • Array [
  • name string required

    The JSON path of the filter attribute in a document or document part metadata.

    level string required

    Possible values: [document, part]

    Indicates whether this is a document or document part metadata filter.

    description string

    Description of the filter. May be omitted.

    indexed boolean

    Default value: true

    Whether an index is created for the filter. Creating an index will improve query latency when using the filter.

    type string required

    Possible values: [integer, real_number, text, boolean, list[integer], list[real_number], list[text]]

    The value type of the filter.

  • ]
  • custom_dimensions object[]

    The custom dimensions of all document parts inside the corpus.

  • Array [
  • name string required

    The name of the custom dimension.

    indexing_default double

    Default value of a custom dimension on a document part if the custom dimension value is not specified when the document part is indexed.

    A value of 0 means that the custom dimension is not considered.

    querying_default double

    Default value of a custom dimension for a query if the value of the custom dimension is not specified when querying the corpus.

    A value of 0 means that the custom dimension is not considered.

  • ]
  • limits object
    used_bytes int64

    The number of bytes contained in the corpus.

    max_bytes int64

    The maximum size of the corpus, in bytes.

    max_metadata_bytes int64

    The maximum size, in bytes, of the metadata on a document.

    index_rate int64

    The maximum number of new documents that can be added to the corpus per second.

    created_at date-time

    Indicates when the corpus was created.
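To show how the response schema above might be consumed, the following Python sketch parses a response body and derives the remaining capacity from `limits`. The sample JSON is made up for illustration; only the field names come from the schema. It also assumes that `created_at` is a top-level field next to `limits` and that it is an ISO 8601 timestamp, neither of which the flattened listing states outright. The helper name `summarize_corpus` is hypothetical.

```python
import json
from datetime import datetime

# Hypothetical response body; the values are invented, the field names follow the schema.
sample_response = """
{
  "id": "crp_1",
  "key": "support-articles",
  "name": "support-articles",
  "enabled": true,
  "limits": {
    "used_bytes": 0,
    "max_bytes": 1000000,
    "max_metadata_bytes": 10000,
    "index_rate": 10
  },
  "created_at": "2024-01-01T00:00:00Z"
}
"""


def summarize_corpus(response_text: str) -> dict:
    """Extract the ID, key, remaining byte capacity, and creation time of a corpus."""
    corpus = json.loads(response_text)
    limits = corpus.get("limits", {})

    remaining = None
    if "max_bytes" in limits and "used_bytes" in limits:
        remaining = limits["max_bytes"] - limits["used_bytes"]

    created_at = None
    if "created_at" in corpus:
        # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalize it.
        created_at = datetime.fromisoformat(corpus["created_at"].replace("Z", "+00:00"))

    return {
        "id": corpus.get("id"),
        "key": corpus.get("key"),
        "remaining_bytes": remaining,
        "created_at": created_at,
    }


summary = summarize_corpus(sample_response)
print(summary["id"], summary["remaining_bytes"], summary["created_at"])
```

Fields are read with `.get` so that a response missing an optional field yields `None` instead of raising.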
