Knowledge Base Management

Knowledge bases are how the openJiuwen platform manages local knowledge. By managing local knowledge bases, users can enhance an agent's retrieval-augmented generation (RAG) capabilities.

Knowledge Base Types

openJiuwen supports multiple knowledge base types:

Type	Description
Document	Build a knowledge base by uploading local files (e.g., PDF, Word, TXT)
Weblink	Build a knowledge base by adding web URLs (e.g., web pages, WeChat articles)

The type must be selected when creating a knowledge base and cannot be changed afterward.

Create Knowledge Base

Prerequisites

A usable model has been configured in the Embedding Model tab of the Model Management module. For how to configure Embedding models, refer to the Model Management sections.

Operation Steps

Log in to the openJiuwen platform.
Navigate to the Knowledge Base Management module in the left sidebar of the platform.
Click the Create Knowledge Base button.
In the create knowledge base dialog:
- Enter the Knowledge Base Name and Description (optional)
- Select the Knowledge Base Type: Document or Weblink
- Select a model from the Embedding Model dropdown (Note: The Embedding model cannot be changed after the knowledge base is created)
- Click Create
For Document knowledge bases: On the created knowledge base card, click the Edit button.
On the edit knowledge base page, click Add Document.
In the add document dialog, select the files you want to upload to the knowledge base by dragging or clicking Select Files (multiple files can be selected), then click Next.

On the document parameters page, configure document parsing and indexing parameters, then click Next.

Document Parameters

The document parameter configuration descriptions are as follows:

Parameter Name	Description	Configuration Instructions
Parsing Strategy	Controls the document parsing method	- Quick Parsing: Uses default parsing strategy to quickly process documents, suitable for most scenarios - Note: Currently only quick parsing mode is supported
Segmentation Strategy	Controls the document text segmentation method	- Auto Segmentation and Cleaning: The system automatically segments and cleans the text, suitable for most scenarios - Custom: Manually configure segmentation parameters for precise control over how text is split - Note: After selecting "Custom", you need to configure the sub-parameters Maximum Tokens and Segmentation Overlap Percentage
Maximum Tokens	Maximum number of tokens per segment (sub-parameter)	- Function: Controls the length of each text segment - Range: 16-1024 - Default Value: 512 - Display Condition: Only displayed when segmentation strategy is set to "Custom" - Recommendation: Set according to document type and retrieval needs. Segments that are too small may lose context; segments that are too large may reduce retrieval accuracy
Segmentation Overlap Percentage	Overlap ratio between adjacent segments (sub-parameter)	- Function: Controls the degree of overlap between segments to maintain context coherence - Range: 0-50 (percent) - Default Value: 10 - Display Condition: Only displayed when segmentation strategy is set to "Custom" - Recommendation: Usually set to 10-20; adjust according to document characteristics
Document Graph Construction	Whether to build a document graph	- Function: When enabled, a document graph index is built to improve retrieval over complex relationships - Note: Enabling the document graph increases index construction time and consumes additional LLM tokens - Note: When enabled, you need to configure the sub-parameter LLM Model
LLM Model	Large language model used for document graph construction (sub-parameter)	- Function: Model used to extract entities and relationships during document graph index construction - Display Condition: Only displayed when document graph construction is enabled, and must be selected - Recommendation: Choose a model with stable performance and support for long text
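
To illustrate how Maximum Tokens and Segmentation Overlap Percentage interact, here is a minimal sketch of overlapping segmentation. It is not the platform's implementation: the function name `segment_tokens` is hypothetical, the input is assumed to be an already tokenized list, and the overlap is assumed to be a percentage of Maximum Tokens.

```python
def segment_tokens(tokens, max_tokens=512, overlap_percent=10):
    """Split a token list into segments of at most max_tokens tokens.

    Assumption: adjacent segments share overlap_percent percent of
    max_tokens. The ranges mirror the ones listed in the table above.
    """
    if not 16 <= max_tokens <= 1024:
        raise ValueError("max_tokens must be in the range 16-1024")
    if not 0 <= overlap_percent <= 50:
        raise ValueError("overlap_percent must be in the range 0-50")

    overlap = max_tokens * overlap_percent // 100
    step = max_tokens - overlap  # at least half of max_tokens, so always > 0

    segments = []
    start = 0
    while start < len(tokens):
        segments.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last segment already reaches the end of the input
        start += step
    return segments
```

With the defaults (512 tokens, 10 percent), each segment would begin 461 tokens after the previous one under these assumptions, so about 51 tokens are repeated at every boundary.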

After that, documents are processed one by one. The page refreshes document status automatically; you can also click Refresh Status to get the latest status immediately, or click Stop Auto-refreshing Document Status to turn automatic refresh off.
Indexed documents display Indexed, and documents with document graph construction enabled also carry a Graph Enhanced label. To upload more documents, click Add Document in the upper right corner.

Document Indexing Complete

Weblink Knowledge Base

Weblink knowledge bases allow you to build a knowledge base by adding web URLs. They are suitable for web pages, WeChat public account articles, and other online content. The system fetches web content, parses it, segments it, and builds indexes for agent retrieval.

Create a Weblink Knowledge Base

Log in to the openJiuwen platform and go to Knowledge Base Management.
Click the Create Knowledge Base button.
In the create knowledge base dialog:
- Enter the Knowledge Base Name and Description (optional)
- Select Weblink as the Knowledge Base Type
- Select an Embedding Model
- Click Create
On the created knowledge base card, click the Edit button to open the editor.

Add Web Links

On the knowledge base editor page, click the Add Link button.
In the "Add Web Links" dialog, enter one URL per line. Both http:// and https:// links are supported (e.g., web pages, WeChat public account articles).
- Format: URLs must start with http:// or https://
- Limit: Up to 50 URLs per batch
- After entering URLs, click Add and Next
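
The input rules above (one URL per line, http:// or https:// only, at most 50 per batch) can be sketched as a small validator. This is an illustration rather than the platform's own check: the names `parse_url_batch` and `is_supported_url` are hypothetical, and skipping blank lines and requiring a host part are assumptions.

```python
from urllib.parse import urlparse

MAX_URLS_PER_BATCH = 50  # batch limit stated in the rules above


def is_supported_url(url):
    """Return True for http:// or https:// URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def parse_url_batch(text):
    """Turn the dialog text (one URL per line) into a validated list."""
    # Assumption: blank lines and surrounding whitespace are ignored.
    urls = [line.strip() for line in text.splitlines() if line.strip()]
    if len(urls) > MAX_URLS_PER_BATCH:
        raise ValueError(
            f"too many URLs: {len(urls)} (limit is {MAX_URLS_PER_BATCH})"
        )
    invalid = [u for u in urls if not is_supported_url(u)]
    if invalid:
        raise ValueError(f"unsupported URLs: {invalid}")
    return urls
```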

On the "Link Parameters" page, configure the parsing and indexing parameters, then finish the dialog to start processing.

Parameter Name	Description	Configuration Instructions
Parsing Strategy	Controls how web pages are parsed	- Quick Parsing: Uses default parsing for fast processing, suitable for most scenarios - Note: Currently only quick parsing is supported
Segmentation Strategy	Controls text segmentation	- Auto Segmentation and Cleaning: Automatic segmentation and cleaning - Custom: Manually configure segmentation; set Maximum Tokens and Overlap %
Maximum Tokens	Max tokens per segment (sub-parameter)	- Range: 16-1024 - Default: 512
Segmentation Overlap %	Overlap ratio between segments (sub-parameter)	- Range: 0-50 - Default: 10
Document Graph Construction	Whether to build document graph	- Function: Enables graph index for better complex-relation retrieval - Note: Increases build time and LLM token usage - Note: Requires selecting an LLM model when enabled
LLM Model	LLM for document graph (sub-parameter)	- Function: Extracts entities and relations during graph index build - Display: Shown only when graph construction is enabled; required when enabled
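
The constraints in the table (value ranges, and the LLM model being required once graph construction is enabled) can be summarized in a short validation sketch. The class `LinkParameters`, its field names, and the strategy values "auto" and "custom" are hypothetical stand-ins and do not reflect the platform's actual schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkParameters:
    # Hypothetical field names; defaults follow the table above.
    segmentation_strategy: str = "auto"  # "auto" or "custom" (assumed values)
    max_tokens: int = 512
    overlap_percent: int = 10
    graph_enabled: bool = False
    llm_model: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the settings are valid."""
        errors = []
        if self.segmentation_strategy not in ("auto", "custom"):
            errors.append("segmentation_strategy must be 'auto' or 'custom'")
        # The sub-parameters only apply when the strategy is custom.
        if self.segmentation_strategy == "custom":
            if not 16 <= self.max_tokens <= 1024:
                errors.append("max_tokens must be in the range 16-1024")
            if not 0 <= self.overlap_percent <= 50:
                errors.append("overlap_percent must be in the range 0-50")
        if self.graph_enabled and not self.llm_model:
            errors.append("llm_model is required when graph construction is enabled")
        return errors
```

For example, `LinkParameters(graph_enabled=True).validate()` reports the missing LLM model, while the default settings produce no errors.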

Links are processed one by one. You can click Refresh Status to get the latest status; the page also auto-refreshes. Use Stop Auto-refreshing Link Status to disable auto-refresh.
Indexed links show Indexed. Links with document graph enabled show a Graph Enhanced label. To add more links, click Add Link.
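
The auto-refresh behavior described above amounts to polling until every link reports Indexed. The sketch below shows that pattern in general terms. The callable `fetch_statuses` is a hypothetical stand-in for however the status is actually retrieved, and any status name other than "Indexed" (such as "Processing") is an assumption.

```python
import time


def poll_until_indexed(fetch_statuses, interval_seconds=2.0, max_polls=30,
                       sleep=time.sleep):
    """Poll link statuses until all are "Indexed" or max_polls is reached.

    fetch_statuses: hypothetical callable returning {link_name: status}.
    sleep: injectable so the loop can be tested without real waiting.
    """
    statuses = {}
    for attempt in range(max_polls):
        statuses = fetch_statuses()
        if all(status == "Indexed" for status in statuses.values()):
            return statuses
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before the next refresh
    return statuses  # may still contain links that are not indexed
```

Capping the number of polls plays the same role as the Stop Auto-refreshing control: the loop cannot run indefinitely if a link never finishes.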

Manage Links

Rename: Click a link name in the list to edit it.
Delete: Select one or more links, then delete them.
Refresh: Click Refresh to update a single link; click Refresh All for batch update. On first load or refresh, the system tries to parse the page title from the URL and update the link name.
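
As an illustration of deriving a link name from a page title, the following sketch extracts the first `<title>` element from fetched HTML using only the Python standard library. It is not the platform's parser: the names `link_name_from_html` and `_TitleParser` are hypothetical, and falling back to the URL when no title is found is an assumption.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def link_name_from_html(html, fallback_url):
    """Return the page title with whitespace collapsed, else the URL."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    return title or fallback_url  # assumption: keep the URL as the name
```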

Notes

The knowledge base type cannot be changed after creation.
Ensure target URLs are publicly accessible; otherwise content may not be fetched.
WeChat public account articles and similar pages must be viewable in a browser; the system parses them as web pages.
Link processing runs asynchronously; processing time depends on page size and complexity.