Importing websites

Add website content to your knowledge base

Written by Jake Rosenthal
Updated over 4 months ago

How to import websites

Importing websites into your knowledge base is a quick and easy way to provide Cassidy with access to your company's online content, such as blog posts, product pages, and FAQs. By doing so, you can enable the AI to answer questions and provide information based on your website's content.

Here's how to import websites into your knowledge base:

  1. Navigate to the Knowledge Base: Click on the "Knowledge Base" section in the left sidebar of your Cassidy homepage.

  2. Select a folder: Choose the location where you want the imported website to live. This can be the main Knowledge Base area or an existing folder or collection.

  3. Click the "+ New" button: In the top-right corner of the respective folder or main area, click on the "+ New" button to open the upload options.

  4. Choose "Import Website": From the dropdown menu, select "Import Website" to begin the process of adding a website to your knowledge base.

  5. Enter the website URL: In the "Starting URL" field, paste the URL of the website you want to import. This can be your company's main website or a specific page, such as an FAQ or product page.

  6. Select the crawl mode: Choose the appropriate crawl mode for your website:

    • "Domain" mode: This mode will crawl and import all pages within the specified domain. It is recommended for most users who want to import their entire website.

    • "Subdomain" mode: This mode will crawl and import all pages within the specific subdomain.

    • "Page" mode: This mode will only import the specific page you entered in the "Starting URL" field. Use this mode if you want to import a single page or a small subset of your website.

    • "Custom (Advanced)": This mode uses glob patterns to specify a set of URLs that you want to crawl. See below for more information.

  7. Select the Max Pages to Import: Choose the maximum number of pages you want to import from the website. This caps the size of the import and helps keep your knowledge base manageable.

  8. Select Sync Frequency: Choose how often you want Cassidy to automatically sync and update the imported website content. Options include:

    • Weekly (contact us to enable this option)

    • Monthly

    • Disabled (if you don't need the website to be synced automatically)

  9. Click "Save & Begin Import": After entering the URL and choosing the crawl mode, Max Pages, and Sync Frequency, click the "Save & Begin Import" button to start the website import process.

  10. Wait for the import to complete: Cassidy will begin crawling and importing the specified website content. The time required for the import process may vary depending on the size of the website and the number of pages being imported.

  11. Verify the imported website: Once the import is complete, you will see a new entry in your knowledge base representing the imported website (and its pages). Click on the entry to view the imported content and ensure that it has been successfully added to your knowledge base.

After importing a website, you can move it into relevant folders or collections, just like you would with uploaded documents. This helps keep your knowledge base structured and makes it easier for your team to find the information they need.

Remember to regularly update your imported website content to ensure that Cassidy has access to the most up-to-date information. You can do this by setting an appropriate Sync Frequency or manually re-importing the website.

By importing websites into your knowledge base, you can leverage Cassidy's AI capabilities to provide accurate and timely answers based on your company's online content, improving the efficiency and effectiveness of your team.


URL Glob Patterns (for Custom crawl mode)

URL glob patterns are a way to specify a set of URLs that you want to crawl. To match multiple URLs, you can use special characters called wildcards.

Supported Wildcards

  • * - Matches any sequence of characters (except /)

  • ** - Matches any sequence of characters (including /)

Examples

Pattern: https://example.com/docs/**

  • https://example.com/docs/first (matches)

  • https://example.com/docs/first/nested (matches)

  • https://example.com/docs/second (matches)

  • https://example.com/help (does not match)

Pattern: https://example.com/docs/*

  • https://example.com/docs/first (matches)

  • https://example.com/docs/second (matches)

  • https://example.com/docs/first/nested (does not match, because * does not cross a /)

  • https://example.com/help (does not match)
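
If you want to check a pattern before starting an import, the two wildcards can be approximated locally. The sketch below is not Cassidy's crawler code. It is a minimal Python re-implementation that assumes only the two rules listed under Supported Wildcards, that every other character is matched literally, and that a pattern must match the whole URL. The helper names glob_to_regex and url_matches are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a URL glob into a compiled regex.

    Assumption: only * and ** are special; everything else is literal.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # ** may span "/" separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # * stays inside one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def url_matches(pattern: str, url: str) -> bool:
    """Return True if the whole URL matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(url) is not None


if __name__ == "__main__":
    urls = [
        "https://example.com/docs/first",
        "https://example.com/docs/first/nested",
        "https://example.com/docs/second",
        "https://example.com/help",
    ]
    for pattern in ("https://example.com/docs/**", "https://example.com/docs/*"):
        print(pattern)
        for url in urls:
            print("  ", url, url_matches(pattern, url))
```

Running the script prints each example URL with True or False for both patterns, which is a quick way to see whether * or ** is the wildcard you need. Real crawlers may treat query strings, trailing slashes, or letter case differently, so treat the result as a rough check only.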
