Version: 2.0

Create a new turn in the chat

POST 

/v2/chats/:chat_id/turns

Create a new turn in the chat. Each conversation has a series of turn objects, which are the sequence of message and response pairs that make up the dialog.
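
As a sketch, a minimal request in Python might look like the following. The API host, the x-api-key header, and all placeholder values are assumptions about your setup; only the required query field is sent in the body.

import requests

# Placeholder values for illustration; replace with your own.
API_KEY = "zqt_..."                 # assumed API-key auth via the x-api-key header
CHAT_ID = "cht_abc123"              # chat IDs must match cht_.+$

resp = requests.post(
    f"https://api.vectara.io/v2/chats/{CHAT_ID}/turns",   # assumed API host
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    json={"query": "What does the handbook say about parental leave?"},
)
resp.raise_for_status()
print(resp.json()["answer"])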

Request

Path Parameters

    chat_id stringrequired

    Possible values: Value must match regular expression cht_.+$

    The ID of the chat.

Body

    query stringrequired

    The chat message or question.

    search object

    Search parameters to retrieve knowledge for the query.

    corpora object[]

    Possible values: >= 1

    The corpora that you want to search.

  • Array [
  • custom_dimensions object

    The custom dimensions as additional weights.

    property name* double
    metadata_filter string

    The filter string used to narrow the search according to metadata attributes. The query against this corpus is confined to document parts that match the metadata_filter. Only metadata set as filter_attributes on the corpus can be filtered. The filter syntax is similar to a SQL WHERE clause. See the metadata filters documentation for more information, and the example corpora entry after this list.

    lexical_interpolation float

    Possible values: <= 1

    How much to weigh lexical scores compared to the embedding score. 0 means lexical search is not used at all, and 1 means only lexical search is used.

    semantics SearchSemantics

    Possible values: [default, query, response]

    Default value: default

    Indicates whether to consider a query against this corpus as a query or a response.

    corpus_key CorpusKeyrequired

    Possible values: <= 50 characters, Value must match regular expression [a-zA-Z0-9_\=\-]+$

    A user-provided key for a corpus.

  • ]
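
    As an illustration, a single corpora entry might look like the sketch below (Python dict form). The corpus key, the filtered metadata attribute, and the custom dimension name are placeholders; only attributes configured as filter_attributes on the corpus can appear in metadata_filter.

    corpus_entry = {
        "corpus_key": "employee-handbook",            # placeholder corpus key
        "metadata_filter": "doc.department = 'HR'",   # placeholder filter_attribute
        "lexical_interpolation": 0.025,               # mostly semantic, slightly lexical
        "custom_dimensions": {"priority": 1.0},       # placeholder custom dimension
        "semantics": "default",
    }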
  • offset int32

    Specifies the number of results to skip in the result set. This is useful for pagination.

    limit int32

    Possible values: >= 1

    Default value: 10

    The maximum number of results returned.

    context_configuration object

    Configuration on the presentation of each document part in the result set.

    characters_before int32

    The number of characters before the matching document part that are shown. This is useful to show the context of the document part in the wider document. Ignored if sentences_before is set. Vectara captures the full sentence that contains the captured characters, so that meaning is not lost to a truncated word or sentence.

    characters_after int32

    The number of characters after the matching document part that are shown. This is useful to show the context of the document part in the wider document. Ignored if sentences_after is set. Vectara captures the full sentence that contains the captured characters, so that meaning is not lost to a truncated word or sentence.

    sentences_before int32

    The number of sentences before the matching document part that are shown. This is useful to show the context of the document part in the wider document.

    sentences_after int32

    The number of sentences after the matching document part that are shown. This is useful to show the context of the document part in the wider document.

    start_tag string

    The tag that wraps the start of the document part. This is often a start HTML/XML tag or some other delimiter that an application can use to highlight the document part in your UI and to tell where the preceding context ends and the document part begins.

    end_tag string

    The tag that wraps the end of the document part. This is often an end HTML/XML tag or some other delimiter that an application can use to highlight the document part in your UI and to tell where the document part ends and the following context begins.
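
    For example, a context_configuration that surrounds each matching part with two sentences of context and wraps the part itself in emphasis tags could be sketched as:

    context_configuration = {
        "sentences_before": 2,
        "sentences_after": 2,
        "start_tag": "<em>",
        "end_tag": "</em>",
    }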

    reranker object

    Rerank results of the search. Rerankers are powerful tools for improving the ordering of search results. By default, the search uses the most powerful reranker available to the customer's plan. To disable reranking, set the reranker type to "none".

    oneOf
    type string

    Default value: customer_reranker

    When type is customer_reranker, you can specify the reranker_name of a reranker; reranker_id is deprecated. The retrieval engine then reranks results using that reranker.

    reranker_id stringdeprecated

    Possible values: Value must match regular expression rnk_(?!272725718)\d+

    The ID of the reranker. The current reranker available to Scale customers is rnk_272725719. Do not specify the MMR reranker ID here; use the MMR reranker object type instead. Deprecated: use reranker_name instead.

    reranker_name string

    The name of the reranker. Do not specify the MMR reranker name here. Instead use the MMR reranker object type.
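
    Two illustrative reranker settings based on the fields above: one disables reranking, the other names a specific reranker. The reranker name shown is a placeholder; use one available on your plan.

    reranker_off = {"type": "none"}

    reranker_named = {
        "type": "customer_reranker",
        "reranker_name": "<your-reranker-name>",   # placeholder; reranker_id is deprecated
    }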

    generation object

    The parameters to control generation.

    generation_preset_name string

    Possible values: non-empty

    The preset values to use to feed the query results and other context to the model.

    A generation_preset is an object with a bundle of properties that specifies:

    • The prompt_template that is rendered then sent to the LLM.
    • The LLM used.
    • model_parameters such as temperature.

    All of these properties except the model can be overridden by setting them in this object. Even when a prompt_template is set, the generation_preset_name still determines which model is used.

    If generation_preset_name is not set, the Vectara platform uses the default model and prompt.

    prompt_name stringdeprecated

    Possible values: non-empty

    Use generation_preset_name instead of prompt_name.

    max_used_search_results int32

    Default value: 5

    The maximum number of search results to be available to the prompt.
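
    A minimal generation object that selects a preset and raises the number of results available to the prompt might look like this sketch; the preset name is a placeholder for one configured on your account.

    generation = {
        "generation_preset_name": "<your-generation-preset>",   # placeholder
        "max_used_search_results": 8,                            # default is 5
    }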

    prompt_template string

    Vectara manages both system and user roles and prompts for the generative LLM out of the box by default. However, Scale customers can override the prompt_template via this variable. The prompt_template is in the form of an Apache Velocity template. For more details on how to configure the prompt_template, see the long-form documentation. See pricing for more details on becoming a Scale customer.

    prompt_text stringdeprecated

    This property is deprecated in favor of clearer naming. Use prompt_template instead. This property is ignored if prompt_template is set.

    max_response_characters int32

    Controls the length of the generated output. This is a rough estimate, not a hard limit: the final output can be longer or shorter than this value. It is generally implemented by including max_response_characters in the prompt, and the LLM's instruction-following ability determines how closely the output respects the limit.

    This is currently a Scale-only feature. See pricing for more details on becoming a Scale customer.

    response_language Language

    Possible values: [auto, eng, deu, fra, zho, kor, ara, rus, tha, nld, ita, por, spa, jpn, pol, tur, vie, ind, ces, ukr, ell, heb, fas, hin, urd, swe, ben, msa, ron]

    Default value: auto

    Languages that the Vectara platform supports.

    model_parameters object

    The parameters for the model. These are currently a Scale-only feature. See pricing for more details on becoming a Scale customer. WARNING: This is an experimental feature, and breakable at any point with virtually no notice. It is meant for experimentation to converge on optimal parameters that can then be set in the prompt definitions.

    max_tokens int32

    Possible values: >= 1

    The maximum number of tokens to be returned by the model.

    temperature float

    The sampling temperature to use. Higher values make the output more random, while lower values make it more focused and deterministic.

    frequency_penalty float

    Higher values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

    presence_penalty float

    Higher values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
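
    An illustrative model_parameters override; per the note above, this is a Scale-only, experimental feature, so the values here are only a sketch.

    model_parameters = {
        "max_tokens": 512,
        "temperature": 0.1,         # low temperature for more deterministic output
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
    }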

    citations object

    Style the generator should use when making citations.

    style string

    Possible values: [none, numeric, html, markdown]

    The citation style to be used in the summary. It can be one of:

    • numeric - Citations formatted as simple numerals: [1], [2] ...
    • none - Citations removed from the text.
    • html - Citations formatted as HTML links, like <a href="url_pattern">text_pattern</a>.
    • markdown - Citations formatted as [text_pattern](url_pattern).
    url_pattern string

    The URL pattern used when the citation style is set to html or markdown. The pattern can access metadata attributes in the document or part, e.g. https://my.doc/foo/{doc.id}/{part.id}

    The default url_pattern is an empty string.

    text_pattern string

    The text pattern used when the citation style is set to html or markdown. This pattern sets the text inside the <a> tag for html, or the text within [] for markdown, and defaults to N, the index of the result, if not set.

    The default citation style looks like [N](<url_pattern>) for markdown.

    You can use metadata attributes in the text_pattern. For example, the pattern {doc.title} with citation style markdown would result in final citation output like [Title](<url_pattern>) when the document's metadata includes {"title":"Title"}.
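
    Putting the citation fields together, a markdown citation configuration using the documented patterns might look like the sketch below; the doc.title attribute is assumed to exist in your document metadata.

    citations = {
        "style": "markdown",
        "url_pattern": "https://my.doc/foo/{doc.id}/{part.id}",
        "text_pattern": "{doc.title}",
    }

    With a document whose metadata includes {"title": "Title"}, this would render citations like [Title](https://my.doc/foo/doc-1/part-2), where the IDs shown are placeholders.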

    enable_factual_consistency_score boolean

    Default value: true

    Enable returning the factual consistency score with query results.

    chat object

    Parameters to control chat behavior.

    store boolean

    Default value: true

    Indicates whether to store the chat message and the response message.

    stream_response boolean

    Default value: false

    Indicates whether the response should be streamed or not.
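
    For example, to keep this turn out of stored chat history and request a streamed response, the top-level body would include:

    body_overrides = {
        "chat": {"store": False},
        "stream_response": True,
    }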

Responses

A response to a chat request.

Schema
    chat_id string

    If the chat response was stored, the ID of the chat.

    turn_id string

    If the chat response was stored, the ID of the turn.

    answer string

    The message from the chat model for the chat message.

    response_language Language

    Possible values: [auto, eng, deu, fra, zho, kor, ara, rus, tha, nld, ita, por, spa, jpn, pol, tur, vie, ind, ces, ukr, ell, heb, fas, hin, urd, swe, ben, msa, ron]

    Default value: auto

    Languages that the Vectara platform supports.

    search_results object[]

    The ranked search results that the chat model used.

  • Array [
  • text string

    The document part that matches the query, as altered by the context configuration.

    score double

    The score of the individual result.

    part_metadata object

    The metadata for the document part.

    property name* any

    The metadata for the document part.

    document_metadata object

    The metadata for the document that contains the document part.

    property name* any

    The metadata for the document that contains the document part.

    document_id string

    The ID of the document that contains the document part.

    request_corpora_index int32

    A query request can search over multiple corpora at a time. This property is the index, within the original request's list of corpora, of the corpus that this search result originated from.

    If the query request is only over one corpus, this property is 0.

  • ]
  • factual_consistency_score float

    The probability that the summary is factually consistent with the results.

    rendered_prompt string

    The rendered prompt sent to the LLM. Useful when creating custom prompt_template values. Only available to Scale customers.

    rephrased_query string

    If you are on the Scale plan, you can view the actual query sent to the backend, as rephrased by the LLM from the input query.
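
    A sketch of reading the documented response fields from the non-streamed request in the first example (resp is the requests response object from that sketch):

    data = resp.json()
    print(data["answer"])
    print("chat:", data.get("chat_id"), "turn:", data.get("turn_id"))
    print("factual consistency:", data.get("factual_consistency_score"))
    for result in data.get("search_results", []):
        print(result["document_id"], round(result["score"], 3), result["text"][:80])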
