# Advanced Document Extraction

> Extract text from PDFs, Word docs, PowerPoint, Keynote, and images using AI vision (OCR) in a Workflow.

Use the **Advanced Document Extraction** action when a previous step in your [Workflow](/workflows/overview) outputs a **file**, such as a PDF attachment or a downloaded document, and you need its content as text for later steps like [Extract Information](/reference/actions/extract-information) or [Generate Text](/reference/actions/generate-text).

<Info>
  Use **Advanced Document Extraction** when your document has complex layouts
  such as tables or multi-column structures, poor scan quality, or non-Latin
  scripts with diacritics, or when extraction accuracy is critical.
</Info>

<Steps>
  <Step title="Add the action">
    In the Workflow builder, click **+** between blocks and search for **Advanced Document Extraction** or **OCR**, then select it from the action library.

    <Frame>
      <img src="https://mintcdn.com/cassidy/1XIhoA23Ch7hxVJV/images/reference/advanced-document-extraction-img-1.png?fit=max&auto=format&n=1XIhoA23Ch7hxVJV&q=85&s=97d8a2e8a52bdede96299ace2fea52cd" alt="Action library with Advanced Document Extraction selected" width="388" height="142" data-path="images/reference/advanced-document-extraction-img-1.png" />
    </Frame>

    <Frame>
      <img src="https://mintcdn.com/cassidy/1XIhoA23Ch7hxVJV/images/reference/advanced-document-extraction-img-2.png?fit=max&auto=format&n=1XIhoA23Ch7hxVJV&q=85&s=ab63c544aadfd56b071c34846cc1dc00" alt="Advanced Document Extraction action added to the Workflow" width="1032" height="636" data-path="images/reference/advanced-document-extraction-img-2.png" />
    </Frame>
  </Step>

  <Step title="Select the document">
    In the **Document** field, reference the file you want to read. Use **#** to [reference variables](/guides/prompt-engineering#workflow-prompts) from a previous step.

    Common sources include:

    * **Image uploads** from a [Manual trigger](/reference/triggers/manual) file input (unlike PDFs and Word docs, images are not parsed automatically)
    * **Email attachments** from an [Email trigger](/reference/triggers/email)
    * **Downloaded files** from **Download File from URL** or integration download actions (Google Drive, SharePoint, Box, etc.)
    * **Files inside a [Loop](/reference/actions/loop)** when processing multiple attachments or documents one at a time
  </Step>

  <Step title="Use the output in later steps">
    The action outputs the extracted text as a single text variable. Reference it in subsequent actions to summarize, categorize, extract fields, or generate a response.
  </Step>
</Steps>

## Supported file types

| Type          | Formats                                                             |
| ------------- | ------------------------------------------------------------------- |
| **Documents** | PDF, Word (`.doc`, `.docx`), PowerPoint (`.pptx`), Keynote (`.key`) |
| **Images**    | PNG, JPEG, WEBP, GIF                                                |

<Warning>
  If the file type isn't supported, the action stops the Workflow with an error.
  Convert the file to a supported format before passing it in.
</Warning>
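
If files reach the Workflow from a script you control, a pre-check against the table above can catch unsupported types before the action fails. The following is a minimal sketch, not part of Cassidy: `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, the check trusts the file extension rather than inspecting the file contents, and treating both `.jpg` and `.jpeg` as JPEG is an assumption.

```python
from pathlib import Path

# Extensions inferred from the supported file types table.
# Assumption: JPEG files may use either ".jpg" or ".jpeg".
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".pptx", ".key",  # documents
    ".png", ".jpg", ".jpeg", ".webp", ".gif",  # images
}


def is_supported(filename: str) -> bool:
    """Return True if the filename's extension is in the supported set.

    The comparison is case-insensitive, so "Invoice.PDF" is accepted.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("slides.ppt")` returns `False` because only `.pptx` is listed for PowerPoint, so that file would need converting first.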

<Accordion title="Advanced: custom instructions">
  Under **Advanced Settings**, add **Custom Instructions** to tell the AI what to focus on or how to format the output. For example:

  * `Extract all table data as markdown tables`
  * `Focus on the invoice number, date, and line items only`
  * `Preserve headings and bullet lists from the original document`

  Custom instructions are useful when you need a specific structure rather than a full verbatim transcription.
</Accordion>

<Tip>
  Pair **Advanced Document Extraction** with [Extract
  Information](/reference/actions/extract-information) to turn unstructured
  document text into structured fields you can send to CRM actions,
  spreadsheets, or email steps.
</Tip>

## Related

* [Extract Information](/reference/actions/extract-information)
* [Get Knowledge Base File](/reference/actions/get-knowledge-base-file)
* [Email trigger](/reference/triggers/email)
* [Create files with Workflows](/guides/create-files)
