Knowledge Base Builder

The Knowledge Base Builder lets you build a conversation through an intuitive user interface and then use the resulting JSON in the Knowledge Base Skill. A Knowledge Base contains your library of Q&As: questions the user might ask about a given topic and how your Digital Person would answer. Build your Q&A base with this tool, then train the NLU and test how the conversation would play out. When you're done, export the Knowledge Base as a JSON file and take it back to DDNA Studio.

Accessing Knowledge Base Builder

You can access the Knowledge Base Builder from the left navigation menu of the Knowledge Base Skill, or directly at https://knowledgebase-ui.prod.soulmachines.cloud/.

Use your Soul Machines credentials to log in to the Knowledge Base Builder. The landing page lists all the Knowledge Bases created by users in your organisation. You can either create your conversation from scratch or edit or duplicate an existing Knowledge Base.

Knowledge Base Builder Landing Page

Creating a new Knowledge Base

To create your Knowledge Base from scratch, press the + button and enter a name for the Knowledge Base. Each topic can hold multiple related questions and answers, using the fields below.

Field

Description

Topic Name

Used as a unique identifier for each JSON object (e.g. WhatIsSoulMachines).

User Question/s

List of strings containing examples of different utterances from the end user (e.g. "what is soul machines", "tell me about soul machines", "who is soul machines", "who are soul machines", "what are soul machines", "what is soul machine", "who is soul machine", "who are soul machine").

DP Answer

List of responses the Digital Person should use to answer that question. If more than one value is found for a given topic, the skill randomly picks one of the responses to reply with (e.g. "@ShowCards(pic) Soul Machines is the world leader in humanizing AI to create astonishing Digital People. A little secret, they actually created me.").

Each answer can be associated with visual elements using content cards and can also include gesture markups to enhance the Digital Person’s behavior.
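The topic fields described above can be sketched in code. This is a minimal illustration only: the `Topic` class and its attribute names are hypothetical and do not reflect the exported JSON schema, which this page does not specify. It shows one topic with several user questions and the random choice among multiple DP answers.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    """One Knowledge Base topic (hypothetical shape, not the exported JSON)."""
    name: str                                                # Topic Name, unique per topic
    user_questions: List[str] = field(default_factory=list)  # example utterances
    dp_answers: List[str] = field(default_factory=list)      # candidate responses

    def pick_answer(self, rng=random) -> str:
        # With more than one answer, one is picked at random.
        return rng.choice(self.dp_answers)


topic = Topic(
    name="WhatIsSoulMachines",
    user_questions=["what is soul machines", "tell me about soul machines"],
    dp_answers=[
        "@ShowCards(pic) Soul Machines is the world leader in humanizing AI.",
        "Soul Machines creates Digital People. A little secret, they created me.",
    ],
)
print(topic.pick_answer())
```

Adding more entries to `user_questions` gives the NLU more phrasings to learn from, while extra `dp_answers` only add variety to the reply.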

Adding Gestures Markup

The Digital Person’s behavior can be adjusted using hashtag gesture markup inserted directly into the DP answers.

We offer the following types of gesture hashtags:

  • Hand and body gesture markup - These can trigger unique movements of the hand and body that aren't automatically available in real-time gesturing.

  • Head and face gesture markup - These can trigger specific facial gestures and emphasise emotions in speech.

For a detailed list of supported gesture markup, see https://soulmachines-support.atlassian.net/wiki/spaces/SSAS/pages/1417150475.

Adding Content Cards

To enhance responses with media or links, you can add content cards to the DP answers. The available content card types are Image, Video, Option List, Link, and Markdown. To add content cards:

  1. Create a DP answer

  2. Click the answer to edit it.

  3. In the dropdown, select the content card you would like to use.

Each content card type has different options, as described below:

Content card type

Field

Description

Image

Image URL

The URL must be accessible by the UI. For example, if the UI is accessed via https, image URLs must also be served over https.

Use the prefix // to load the image using the same protocol as the host UI. For this approach to work, the hosted image must be available over both http and https.
e.g. "url": "//www.soulmachines.com/wp-content/uploads/sm-logo-retina.png"

The URL must have a prefix.
e.g. "url": "soulmachines.com/wp-content/uploads/sm-logo-retina.png" will not work.

Image Alt Text

An alternative text describing the image.

Video

Youtube Video Id

The YouTube videoId for the video you wish to embed.
You can find the videoId at the end of the YouTube video URL, e.g.
https://www.youtube.com/watch?v=vxmthHfkoaw

Automatically play video full screen

When autoplay is turned on, the video pops up to full screen and begins playing as soon as @ShowCards is triggered for the card.

Close the card when video ends

When autoclose is turned on for a video card, the video player closes at the end of the video, so you can immediately continue the conversation.

Option List

Label

Label is used to display text within the button / option in the UI.

Value

Value may also be used to match a possible spoken response by the user, so a list option can be selected by speaking it as well as by clicking it.

Link

Link Type

  • Internal - Displays a hyperlink in the form of a clickable image. When clicked, it opens the link in the same browser tab.

  • External - Displays a hyperlink in the form of a clickable button. When clicked, it opens the link in a new browser tab.

Title

Text that is displayed above the external link and below the image.

URL

A URL string representing the link, e.g. “https://www.soulmachines.com”

Description

Text that is displayed below the Title.

Image URL (optional)

A URL for an image displayed as a graphical accompaniment to the link.

E.g. “https://domainname.com.au/picture.jpg”

Markdown

 

Static text that is displayed but not spoken by the Digital Person. Markdown can be used to add formatting to this text.

For example:
“This is **Bold**”

This will display as “This is Bold”, with the word “Bold” in bold.
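The Image URL and YouTube videoId rules from the table above can be checked before a card is saved. The sketch below is illustrative only: the function names are hypothetical, and the https rule is a literal reading of the Image URL description rather than the Builder's actual validation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def image_url_is_usable(url: str, ui_scheme: str = "https") -> bool:
    """Return True if `url` has a prefix the UI could load (assumed rule)."""
    if url.startswith("//"):
        # Protocol-relative prefix: the image is loaded with the UI's protocol.
        return True
    scheme = urlsplit(url).scheme
    if scheme not in ("http", "https"):
        # No prefix, e.g. "soulmachines.com/..." will not work.
        return False
    # Per the description above: an https UI needs https image URLs.
    return not (ui_scheme == "https" and scheme == "http")


def youtube_video_id(watch_url: str) -> Optional[str]:
    """Extract the `v` query parameter from a YouTube watch URL."""
    values = parse_qs(urlsplit(watch_url).query).get("v")
    return values[0] if values else None


print(image_url_is_usable("//www.soulmachines.com/wp-content/uploads/sm-logo-retina.png"))
print(image_url_is_usable("soulmachines.com/wp-content/uploads/sm-logo-retina.png"))
print(youtube_video_id("https://www.youtube.com/watch?v=vxmthHfkoaw"))
```

Only the standard `watch?v=` URL form shown in the table is handled; other YouTube URL shapes would need their own parsing.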

Creating Generative Knowledge

You can now create a Knowledge Base using the Generative Conversation functionality:

Section

Field

Description

Digital Person

Projects From DDNA

Choose a project from Digital DNA Studio to connect the generative knowledge base to.

Name

The name for your DP

Role

The role the DP has.
Examples:
digital influencer (default) for XYZ skincare product expert

Personality

Personality affects how the skill frames the wording of a response.

We recommend you use ChatGPT to build prompts for personality:

  • Try “In 50 words or less describe a character with a name and background who'd be an amazing brand ambassador [or role] for [insert company here]”

  • Then, “how would they describe themselves in 3 words or less”

Creativity

Also known as temperature on the GPT-3 platform, this controls how much randomness is in the output. Balanced is usually a good starting point, and Creative can lead to more interesting conversations.

Welcome message

When the Generative Conversation Skill is used as the conversation base, this message is used to greet the user.

Additional Prompting

This is used to add additional prompts to the GPT pipeline. These prompts are appended to the main prompt and are designed to shape behavior, tone, and what restrictions the chatbot complies with or ignores.

Do not talk about

Anything you don’t want the Digital Person to talk about. This can also be reinforced with additional prompting.

Error message

When an error occurs, this is the message that the DP will deliver.
Example text (with variations):

{My apologies|Sorry}! {Looks like| } {GPT is acting up|I’m having trouble connecting to GPT}, {it’s a busy time for AI right now|}. Try saying that again {in a moment|in a second|shortly}

Knowledge Match Cutoff

Knowledge Match Cutoff is the lower bound that a knowledge snippet match must meet to be used as context for the GPT prompt.

  • Low Cutoff (very inclusive)

  • Balanced (medium) 

  • Strict (low inclusivity): Use when you have a lot of connected knowledge snippets, so the Digital Person stays “on message”.

Enable real-time gesturing

With Real-Time Gesturing enabled, your Digital Person analyzes what they're saying via Natural Language Processing and adds emotionally appropriate gesturing and behavior to their speech in real time.

Knowledge Snippets

Learn from URL

Scrapes information from a URL and tries to create knowledge snippets. The scraping is very basic and its results can be inconsistent.

Add Text Knowledge

Allows you to enter additional information or content to enhance the knowledge base.

LLM Configuration

API Type

Select OpenAI or Azure, depending on the platform you use.

API Key

The API key for OpenAI or Azure OpenAI.

GPT Type

Choose from GPT-3, ChatGPT, or GPT-4.

Azure Resource Name

Found within your Azure platform.

Azure Deployment Name

Found within your Azure platform.

Azure API Version

Found within your Azure platform.

Max tokens

The maximum number of tokens GPT is allowed to generate.

Private LLM Name

Leave blank unless you are sure it is compatible with the selected GPT Type.

Enable OpenAI Safety Filter

Runs the OpenAI safety model against the response.

Rebuild Knowledge On Deploy

This replaces any existing knowledge. Leave off unless needed.
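The Error message example in the table above uses a {option|option} syntax for variations. A minimal sketch of how such a template could be expanded is shown below. It assumes one alternative is picked at random per brace group and that groups are not nested; `expand_variations` is a hypothetical helper, not part of the product.

```python
import random
import re

# Matches one brace group without nested braces, e.g. "{My apologies|Sorry}".
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_variations(template: str, rng=random) -> str:
    """Replace each {a|b|c} group with one randomly chosen alternative."""
    def pick(match):
        # An empty alternative, as in "{shortly|}", yields no text at all.
        return rng.choice(match.group(1).split("|"))
    return _GROUP.sub(pick, template)


template = "{My apologies|Sorry}! Try saying that again {in a moment|in a second|shortly}"
print(expand_variations(template))
```

Under these assumptions, each call can yield a different wording, so the DP does not repeat the same sentence every time an error occurs.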

Crawling your website

You can build your Q&As by crawling your website. This option scans your website and creates an index of all the pages with a list of user questions and DP answers along with a KB JSON file. As the user navigates the site, KnowledgeBase will match on the URL of the page to greet them with a page-specific response. You can then review your knowledge base and add any additional topics, customise DP answers or add content cards.

Each entry in the crawl output contains the following:

  1. The "example" is the full URL (both with and without the www). This is used to match the page the user is visiting.

  2. The "response" contains the first available of the following (in descending order of priority):

    1. meta:description

    2. H1 tag text

    3. Page Title

The website crawl option does not produce production-ready output, but rather serves as a good starting point when building a tailored conversation.
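The crawl output described above can be sketched as follows. The "example" and "response" keys come from this page; everything else is an assumption made for illustration, including the helper names, storing "example" as a list, and the exact fallback logic. It is not the crawler's actual implementation.

```python
from typing import Dict, List, Optional, Union
from urllib.parse import urlsplit, urlunsplit


def url_variants(url: str) -> List[str]:
    """Return the URL without and with a leading "www." on the host."""
    parts = urlsplit(url)
    host = parts.netloc
    bare = host[4:] if host.startswith("www.") else host
    return [urlunsplit(parts._replace(netloc=h)) for h in (bare, "www." + bare)]


def pick_response(meta_description: Optional[str],
                  h1_text: Optional[str],
                  page_title: Optional[str]) -> Optional[str]:
    """Pick the first non-empty value: meta description, then H1, then title."""
    for candidate in (meta_description, h1_text, page_title):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


def crawl_entry(url: str,
                meta_description: Optional[str] = None,
                h1_text: Optional[str] = None,
                page_title: Optional[str] = None) -> Dict[str, Union[List[str], Optional[str]]]:
    """Build one hypothetical index entry for a crawled page."""
    return {
        "example": url_variants(url),
        "response": pick_response(meta_description, h1_text, page_title),
    }


print(crawl_entry("https://www.soulmachines.com/about", h1_text="Our Story",
                  page_title="About | Soul Machines"))
```

Because the response is only as good as the page's metadata, reviewing and rewriting these generated answers is usually the first editing step after a crawl.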

Important:

  1. The web crawler honors the robots.txt of a given site, so if the site disallows crawlers, an index cannot be generated for it.

  2. The crawler is most appropriate for small to medium websites (under 500 pages). Large sites (https://homedepot.com, for example) will take hours to index, and the resulting file will be too large to be usable. The https://soulmachines.com site contains about 450 pages and takes 00:02:15 to index.
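Before starting a crawl, you can check whether a site's robots.txt would block it. The sketch below uses Python's standard `urllib.robotparser` on robots.txt text you have already downloaded. The user agent string the Knowledge Base crawler sends is not documented here, so the wildcard `*` default is an assumption.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt: str, page_url: str, user_agent: str = "*") -> bool:
    """Return True if `robots_txt` permits `user_agent` to fetch `page_url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


robots = "User-agent: *\nDisallow: /private/"
print(crawl_allowed(robots, "https://example.com/private/page"))
print(crawl_allowed(robots, "https://example.com/products"))
```

A site whose robots.txt contains `Disallow: /` for all user agents would return False for every page, which matches the case where no index can be generated.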

Editing an existing Knowledge Base JSON

If you already have an existing Knowledge Base, you can either upload it via a link or paste the JSON directly.

When uploading via a link, the JSON file must be served in its raw form, for example from a static site:

  1. It needs to be publicly accessible (no login or other steps).

  2. It needs to directly serve the .json file as raw text (not formatted to look nice, not served up in a viewer).
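The two hosting requirements above can be verified with a quick script before pasting the link into the Builder. This is a rough sketch using only the Python standard library. The function names are hypothetical, and it only checks that the response parses as JSON, not that it follows the Knowledge Base schema.

```python
import json
import urllib.error
import urllib.request


def is_raw_json(body: str) -> bool:
    """True if `body` parses as JSON; an HTML viewer page would fail here."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return False
    return True


def check_hosted_json(url: str, timeout: float = 10.0) -> bool:
    """Fetch `url` without credentials and confirm it serves raw JSON."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
    except (urllib.error.URLError, UnicodeDecodeError):
        # Covers login walls (401/403), missing files, and non-text payloads.
        return False
    return is_raw_json(body)


print(is_raw_json('{"WhatIsSoulMachines": []}'))
print(is_raw_json("<html><body>JSON viewer</body></html>"))
```

Calling `check_hosted_json` with your own link performs the actual network request; a False result suggests the file is behind a login or wrapped in a viewer page.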

Training and Testing your Conversation

Creating a new topic or modifying an existing one enables the “Train Conversation” button. You must train the conversation before you can test how the DP responds to questions. Training takes between 15 seconds and a minute, depending on the size of your Knowledge Base; during this time the button reads “Training …”. At any time, you can switch from conversation mode to a view of the generated JSON using the toggle at the top right-hand corner.

Once the conversation is trained, you can start testing it by sending in questions in the chat window that would simulate what a user might ask. Any cards that are configured in the response will be shown, and the @ShowCards(xxx) command will be stripped from the response to replicate what a user would actually see.

Note: If you make a change to the knowledge base, the text input will go away and you’ll be required to re-train the conversation before you can continue testing.
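The @ShowCards stripping described above can be approximated with a regular expression. The sketch below is an assumption about the behavior, not the Builder's implementation: it treats whatever appears between the parentheses as an opaque card reference and removes the command from the spoken text.

```python
import re
from typing import List, Tuple

# Matches "@ShowCards(<anything but a closing parenthesis>)" plus trailing spaces.
_SHOW_CARDS = re.compile(r"@ShowCards\(([^)]*)\)\s*")


def strip_show_cards(response: str) -> Tuple[str, List[str]]:
    """Return the text a user would see, plus the referenced card arguments."""
    cards = _SHOW_CARDS.findall(response)
    text = _SHOW_CARDS.sub("", response).strip()
    return text, cards


text, cards = strip_show_cards(
    "@ShowCards(pic) Soul Machines is the world leader in humanizing AI."
)
print(text)   # Soul Machines is the world leader in humanizing AI.
print(cards)  # ['pic']
```

Keeping the extracted card references alongside the cleaned text makes it easy to confirm that each answer triggers the card you expect.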

Using Knowledge Base in DDNA Studio

Every Knowledge Base is saved in the cloud and is automatically updated after every change. This allows the Knowledge Base to be used within DDNA Studio without manually updating the Knowledge Base Skill configuration for every corpus change. To integrate your Knowledge Base with DDNA Studio’s Knowledge Base Skill:

  1. Click Collect Knowledge Base and copy the URL of the Knowledge Base.

     

  2. Enter the URL in the Knowledge Base Skill configuration along with other settings.

     

  3. Click Apply Changes.

Optionally, you can copy the Knowledge Base to the clipboard or download the JSON file locally. This will allow you to share the Knowledge Base externally, if needed.