# Use web sync to manage a knowledge base

You can use the Zoom Virtual Agent web sync on your existing knowledge base to update your chatbot to deliver accurate and relevant responses. Web sync is a web crawler. Web crawlers search and automatically index website content. This topic describes how to use web sync to create a knowledge base, improve the accuracy of the sync, and write custom scripts to optimize the content.

## Before you begin

For information about Zoom Virtual Agent prerequisites, see [Getting started with Zoom Virtual Agent](https://support.zoom.com/hc/en/article?id=zm_kb&sysparm_article=KB0058248).

## Create a knowledge base with web sync

To provide current and accurate information for a website's knowledge base, web sync enables you to use one of these options:

- Use a sitemap to sync the knowledge base
- Use link discovery to provide a source URL and rules that guide the crawler to sync relevant pages
- Upload specific URLs to sync the knowledge base

For each of these options, you can use content selectors or custom scripts to refine the content you want to extract for your knowledge base. They can target specific parts of the page you want to use, such as titles, URLs, or specific content for knowledge base articles.

Web sync is appropriate when a Zoom integration is unavailable or when only a limited number of web pages exist to create a knowledge base.

### Sitemap

A sitemap lists the pages on your website along with information about them. To sync your knowledge base using a sitemap, you can enter a specific URL in the Zoom crawler to find a sitemap, or you can directly enter a sitemap address. After the crawler finds the sitemap, you can select which URLs to include in the sync.
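
To make the sitemap option more concrete, here is a minimal sketch of what a standard XML sitemap looks like and how its URLs can be listed and narrowed down by path. It illustrates the sitemap format only and does not describe how the Zoom crawler is implemented. The sample sitemap, the `example.com` URLs, and the helper names `extractSitemapUrls` and `selectUrls` are all hypothetical.

```javascript
// Hypothetical sitemap content; a real sitemap is served by your website.
const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/help/reset-password</loc></url>
  <url><loc>https://www.example.com/help/billing</loc></url>
  <url><loc>https://www.example.com/blog/news</loc></url>
</urlset>`;

// Collect the text of every <loc> element in the sitemap.
function extractSitemapUrls(xml) {
  return [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);
}

// Keep only the URLs whose path starts with the given prefix,
// similar in spirit to choosing which sitemap URLs to include in a sync.
function selectUrls(urls, prefix) {
  return urls.filter((url) => new URL(url).pathname.startsWith(prefix));
}

const helpUrls = selectUrls(extractSitemapUrls(sitemapXml), "/help/");
console.log(helpUrls);
```

In this sketch, only the two `/help/` pages remain after filtering, which mirrors the idea of including support articles in the sync while leaving out unrelated pages such as blog posts.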

Using a sitemap to sync your knowledge base is a good option for organizations that:

- Maintain a semantically and hierarchically sound sitemap
- Use a content management system without a Zoom integration

### Link discovery

Link discovery initiates a crawl with the URL of a specific page. Then the crawl follows links on the starting page that share the same directories and subdirectories. You must set the URL conditions that determine which pages to store as knowledge base articles.

Using link discovery to sync your relevant pages is useful for organizations when:

- All articles are linked from one page
- The content management system doesn't have a Zoom integration
- A sitemap doesn't exist or is incomplete

### Manual URL upload

The simplest way to sync your knowledge base is to manually upload the URLs you want to crawl and store as knowledge base articles. You can directly enter the URLs or you can add them to a .csv file. Note that if the URLs change, the sync breaks and you must update the URLs or add new URLs.

Using manual URL upload works best for organizations that offer a single-page FAQ.

For more information about how to use web sync to create a knowledge base, see [Creating a knowledge base through web sync](https://support.zoom.com/hc/en/article?id=zm_kb&sysparm_article=KB0060767).

## Improve the accuracy of web sync

You can use content selectors or custom scripts to refine the content you want for your knowledge base.

### Content selectors

A basic function of Zoom web sync is web crawling for pages that have content for the knowledge base articles. You can use content selectors to precisely target parts of the page you want to use.

General content selectors include:

- **Page title trimmer**: Removes leading or trailing words from the title.
- **Content selector**: CSS selector that identifies the article content. For example, enter `.article-content` as the selector.
- **Ignore selector**: CSS selector that identifies items to remove, such as author and dates.
- **Dismiss click selector**: Only available when JavaScript support is enabled. CSS selector that identifies items to click to dismiss pop-ups, overlays, cookie consent banners, and so on.
- **Dismiss click delay**: Only available when JavaScript support is enabled. The amount of time to wait after the dismiss click.
- **Page load delay**: Only available when JavaScript support is enabled. The amount of time to allow for a web page to fully load and become interactive after it is requested.

Some web pages require JavaScript to load, so JavaScript support must be enabled before you can select their content. Enabling JavaScript support allows the crawler to interpret and execute JavaScript code when it accesses and processes web content. Note that enabling JavaScript slows the sync, so we recommend that you enable JavaScript support only if it's necessary.

#### JavaScript click simulation

You can use JavaScript click simulation for page navigation or vertical accordions. Use JavaScript click simulation when your site uses JavaScript-powered navigation with tabs or categories. A site with tabs doesn't always require JavaScript click simulation. If the tabs trigger full page loads, you don't need it.

##### Navigation

If the page uses tabs, categories, menus, or other secondary navigation, you can use these options to reveal additional content on the same page.

- **Navigation click selector**: Only available when JavaScript support is enabled. CSS selector that identifies items to click, such as tabs, categories, menus, and so on.
- **Navigation click delay**: Only available when JavaScript support is enabled. The amount of time to wait after the navigation click.

##### Accordion

A JavaScript accordion is a container control with vertically collapsible panels (a vertical accordion). It contains stacked headers that expand or collapse one or more panels at a time within the available space.

If the page uses accordions, such as some FAQ formats, you can use these options to reveal additional content on the same page.

- **Accordion content extraction**: Only available when JavaScript support is enabled. Choose how you want the page to be extracted as an article. Each accordion can be extracted as an article or the entire page can be extracted as an article.
- **Accordion click selector**: Only available when JavaScript support is enabled. CSS selector that identifies items to click to expand and close accordions.
- **Accordion click delay**: Only available when JavaScript support is enabled. The amount of time to wait after each accordion click.
- **Accordion behavior**: Specify whether one or multiple items can be expanded at the same time.
- **Accordion title selector**: Only available when JavaScript support is enabled. CSS selector that identifies a custom title to replace the page title.

### Custom scripts

Use custom scripts to improve your knowledge base content. You can enable custom scripts in the advanced knowledge base settings. You can use a custom JavaScript function to customize most article fields, such as title, content, URL, tags, and category.

To edit your custom script:

1. In **AI Management**, navigate to **Knowledge Base**.
2. Choose the knowledge base you would like to update.
3. Select **Settings** > **Advanced** > **Custom Script**. Then select **Edit**.
4. After you complete your edits, choose **Save**.

This section describes how to create and implement custom scripts and provides example scripts.

#### Custom JavaScript

You can use a custom JavaScript function to customize the content or other article details, such as title and URL. The `main` function executes on each article. The argument is an article object with the following properties:

```json
{
  "content": string, // HTML contents
  "url": string, // full URL
  "title": string, // title
  "language": string, // language code (e.g. en, en-US, es)
  "tags": [ // array of tag objects:
    {
      "id": string, // tag ID
      "name": string, // tag name
    },
    {
      // ...
    }
  ],
  "category": { // array of category hierarchy objects (root is first):
    "id": string, // category ID
    "name": string, // category name
  },
  "externalId": string // external ID or URL of the source
}
```

The return value is an object with the updated article properties. You can omit unchanged properties. You can return `null` to prevent the article from being synced. You can also return an array of objects if the source should be split into multiple articles.

#### Examples

##### Debug with the console log

This script enables you to perform basic debugging tasks using the console log.

```javascript
function main({ content, url, title }) {
  console.log("hello");
}
```

##### Modify HTML contents

This script enables you to modify the HTML contents of an element.

```javascript
function main({ content, url, title }) {
  // load content into cheerio (https://cheerio.js.org)
  const $ = cheerio.load(content);
  // replace the inner HTML of the element with this ID
  $("#example-id1").html("<p>example new content</p>");
  return {
    // return modified contents from cheerio
    content: $.html(),
  };
}
```

##### Remove elements

This script enables you to remove specific elements.

```javascript
function main({ content, url, title }) {
  // load content into cheerio (https://cheerio.js.org)
  const $ = cheerio.load(content);
  // remove every element that matches either selector
  $("#example-id2,.example-class").remove();
  return {
    // return modified contents from cheerio
    content: $.html(),
  };
}
```

##### Add new elements

This script enables you to add a new element.

```javascript
function main({ content, url, title }) {
  // load content into cheerio (https://cheerio.js.org)
  const $ = cheerio.load(content);
  // append a new element at the end of the document
  $.root().append("<p>extra example content</p>");
  return {
    // return modified contents from cheerio
    content: $.html(),
  };
}
```

##### Customize titles

We provide a title trimmer feature. However, sometimes its single match-and-remove pattern is not enough. You can use a custom script to create a more advanced title trimmer, as seen in the example below.

```javascript
function main({ title }) {
  return {
    // strip both the leading and the trailing site labels
    title: title.replace("Example Inc | ", "").replace(" | Support", ""),
  };
}
```

When a website supports multiple languages, you might need a different title trimmer based on the article language, as seen in the following example:

```javascript
function main({ title, language }) {
  return {
    title: title.replace(" | Support", "").replace(" | Soporte", ""),
  };

  // Alternative: choose the trimmer based on the article language
  // const trimmers = {
  //   'en': ' | Support',
  //   'es': ' | Soporte'
  // };
  // return {
  //   title: title.replace(trimmers[language], '')
  // };
}
```

##### Replace the hostname in the URL

This script enables you to replace the hostname in the URL.

```javascript
function main({ content, url, title }) {
  return {
    url: url.replace("old.example.com", "new.example.com"),
  };
}
```

### Other custom scripts

Custom scripts are available on any knowledge base type, but they are most useful for web knowledge bases. Note that scripts cannot access sensitive information.

Here is an example of a basic custom script. As you implement this script, be aware of the following:

- The script must define a function named `main`.
- The input article object is passed as the only argument.
- The `console.log` output appears in the web preview or sync log.
- The cheerio library (a jQuery subset) is already imported and available to use.
- The function returns an object with the updated fields.

```javascript
function main(article) {
  console.log("input article", article);
  const { content, url, title } = article;
  const $ = cheerio.load(content);

  // modify some HTML elements
  $("h1").text("My Example Domain");
  $("a").remove();
  $("body").append("<p>extra content</p>");

  return {
    // return modified contents from cheerio
    content: $.html(),
    // content: 'Hello World!',

    // customize other fields
    title: title + "!",
    url: url.replace("www", "foo"),
  };
}
```

#### Extracting table content with a custom script

Zoom Virtual Agent does not maintain table elements after a sync, but you can use a custom script to extract, modify, and present table content. Tables differ in structure: some have column headers, some have row headers, and you can choose whether to include the headers. Here are some examples of different table setups and the custom scripts that can help modify them.

##### Extract table cells individually

```javascript
function main({ content }) {
  const $ = cheerio.load(content);
  $("table").each(function (i, tableEl) {
    const $table = $(tableEl);
    const divs = [];
    // wrap the HTML of each cell in its own div
    $table.find("td").each(function (j, tdEl) {
      const cellHtml = $(tdEl).html();
      divs.push($("<div>").html(cellHtml));
    });
    // swap the table for the list of divs
    $table.replaceWith(divs);
  });
  return {
    content: $.html(),
  };
}
```

##### Extract table rows with headers

```javascript
function main({ content }) {
  const $ = cheerio.load(content);
  $("table").each(function (i, tableEl) {
    const $table = $(tableEl);
    // collect the column header text
    const headers = $table
      .find("thead th")
      .map((j, thEl) => {
        return $(thEl).text();
      })
      .toArray();
    const divs = [];
    $table.find("tbody tr").each(function (j, trEl) {
      // prefix each cell with its column header
      const prefixedCells = $(trEl)
        .find("td")
        .map((k, tdEl) => {
          const cellText = $(tdEl).text();
          return `${headers[k]}: ${cellText}`;
        })
        .toArray();
      divs.push($("