Version: 2.0

Upload a file to the corpus

POST /v2/corpora/:corpus_key/upload_file

Upload a file, such as a PDF or Word document, to the specified corpus for automatic text extraction and metadata parsing.

This endpoint expects a multipart/form-data request with the following fields:

  • metadata: An optional JSON object containing additional metadata to associate with the document.
    Example: metadata={"key": "value"}
  • chunking_strategy: An optional JSON object that sets the chunking method for text extraction.
    • By default, the platform uses sentence-based chunking (one chunk per sentence).
    • Example for explicit sentence chunking: chunking_strategy={"type":"sentence_chunking_strategy"}
    • Example for max chars chunking: chunking_strategy={"type":"max_chars_chunking_strategy","max_chars_per_chunk":512}
  • table_extraction_config: An optional JSON object to control table extraction from supported file types (e.g., PDF).
    Example: table_extraction_config={"extract_tables": true}
  • file: The file to upload. Attach your file as the value for this field.
  • filename: The desired name for the uploaded file. Specify it as the filename parameter of the file field, not as a separate form field.
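
The fields above can be assembled into a multipart/form-data body as in the following minimal sketch, which uses only the Python standard library and builds the request without sending it. The function name build_upload_request, the base_url argument, and the extra headers argument are hypothetical and not part of this reference. Labeling each JSON part with Content-Type: application/json is also an assumption, so check it against the behavior you observe.

```python
import json
import uuid
import urllib.request


def build_upload_request(base_url, corpus_key, file_name, file_bytes,
                         metadata=None, chunking_strategy=None,
                         table_extraction_config=None, headers=None):
    """Build (but do not send) a multipart/form-data upload request.

    base_url and headers are placeholders: supply your own host and
    whatever authentication headers your account requires.
    """
    boundary = uuid.uuid4().hex
    parts = []

    optional_fields = (
        ("metadata", metadata),
        ("chunking_strategy", chunking_strategy),
        ("table_extraction_config", table_extraction_config),
    )
    for name, value in optional_fields:
        if value is None:
            continue  # optional fields are simply left out of the body
        # Assumption: each JSON field is sent as its own JSON-typed part.
        parts.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "Content-Type: application/json\r\n\r\n"
            f"{json.dumps(value)}\r\n"
        ).encode("utf-8"))

    # The filename travels inside the file part, not as a separate field.
    parts.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8") + file_bytes + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    request = urllib.request.Request(
        url=f"{base_url}/v2/corpora/{corpus_key}/upload_file",
        data=b"".join(parts),
        method="POST",
    )
    request.add_header("Content-Type",
                       f"multipart/form-data; boundary={boundary}")
    for key, value in (headers or {}).items():
        request.add_header(key, value)
    return request
```

To send the request, pass the returned object to urllib.request.urlopen after adding the authentication header your account uses. The sketch does not escape quotes in file_name, so sanitize that value first if it comes from user input.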

Request

Responses

The extracted document has been parsed and added to the corpus.