Version: 2.0

Vectara and Unstructured.io

Unstructured is a well-known Python package for parsing and processing unstructured data. Vectara is integrated into Unstructured’s ingest library, so developers can quickly build ingestion pipelines into Vectara that require complex parsing of input documents such as PDF, PPT, DOC, and many other document types.

Although Vectara itself supports direct import of many document types through our File Upload API, the ingest service from Unstructured provides additional capabilities where they are needed, such as table processing and image extraction.

Integration benefits

  • Facilitates quick and easy data ingestion into Vectara for complex document types.
  • Provides advanced capabilities like table processing and image extraction.
  • Complements Vectara's File Upload API with additional parsing features.

This blog post demonstrates how to use Unstructured’s Ingest capability with Vectara.