Skill - Generative Conversation

Skill Type

Base Conversation Fallback

Overview

With the Generative Conversation Skill, you can create a persona for the Digital Person by providing knowledge snippets, which are then used to create generative conversations. This skill can be based on OpenAI’s ChatGPT (GPT-3.5) or GPT-4, which are designed to generate human-like text in a variety of contexts. GPT-3.5 and GPT-4 are based on deep learning algorithms that enable the model to generate coherent and meaningful responses on a wide range of topics, from news articles and product descriptions to poetry and fiction. A key feature of both models is the ability to perform a variety of natural language processing tasks, including language translation, question answering, and summarization.

These features make the Generative Conversation Skill an excellent base conversation or fallback skill alongside any other Base Conversation Skill.

This Skill is:

  • Built and owned by Soul Machines Ltd.

  • Only supported in English.

  • Powered by the OpenAI API.

Usage of OpenAI's API is subject to their Ts&Cs and may require you to provide your credit card information to gain access to the API.

 

The skill currently also supports legacy models such as GPT-3, but OpenAI has scheduled these for deprecation in January 2024.

Generative Content Skill Configuration Options

  1. Generative Conversation - Base

    1. The Generative Conversation Base allows you to use ChatGPT (GPT3.5) or GPT-4 in the form of “chat” or “completion” models as your base conversation. You can do this instead of connecting an external conversation corpus from an NLP conversation platform like IBM Watson or DialogFlow. 

    2. The advantage of this method is ease and speed of implementation, at the cost of not being able to deliver precise pre-written answers.

    3. Select ‘Generative Conversation Base’ in the Conversation section in DDNA Studio, and fill in the simple input fields.

  2. Generative Conversation - Fallback Skill (must have a base conversation configured first) 

    1. The Generative Conversation Skill allows you to use ChatGPT or GPT-4 as a fallback to another base conversation. You can connect an external conversation corpus from an external NLP conversation platform like IBM Watson or DialogFlow, and have the Generative Conversation Skill answer questions your corpus doesn’t. 

    2. The advantage of this method is that you can still create specific workflows and you can ‘override’ answers from GPT with precise pre-written answers.

    3. Once your Conversation Base is defined, add the Generative Conversation Fallback from the Skills section, and fill in the simple input fields.

Important: Note that additional skills are incompatible with the Generative Conversation Base conversation.

Limitations

  • May occasionally generate incorrect information

  • May occasionally produce harmful instructions or biased content

  • Limited knowledge of world and events after 2021

Skill Demo

Generative Conversation Skill Demo

Configurations

Provide the following information in the Generative Conversation Base Skill configuration screen in Digital DNA Studio. See Adding Skills to your Digital Person for detailed instructions:

 

Generative Conversation (Base) Skill Configuration Part 1

 

 

Field

Type

Description

Best Practices

DIGITAL PERSON'S NAME

REQUIRED

TEXT

Enter an appropriate name for your Digital Person.

Examples: Jimmy, Sam, Beth

 

DIGITAL PERSON'S ROLE

TEXT

The role the Digital Person will take on.

Examples:

  • teacher

  • astronaut

  • skincare product expert

 

AUTOMATICALLY MATCH THE CONVERSATION STYLE TO MY CHOSEN BEHAVIOR STYLE

TOGGLE

Each behavior style has a pre-defined conversation style to ensure the speaking style matches the movement and gesture style.

 

ADDITIONAL CONVERSATION STYLE PROMPTING

TEXT

Conversation style affects how the skill frames the wording of a response.

Example: happy, cheerful, friendly

Use this to create a custom conversation style or append additional style to the automatic style matching applied via the toggle above.

  • In 10 words or fewer, describe how you would like the DP to respond to the user.

  • You can also ask the DP to behave like a combination of celebrities to derive a personality.

  • You can leave this blank to have the system automatically build the conversation style prompt.

ADDITIONAL PROMPTING

TEXT

This is used to add additional prompts to the GPT pipeline. These prompts are sent in the initial batch and are designed to shape behavior, tone, and what restrictions the chatbot complies with or ignores.

Additional prompts are directives. It's best to keep the directives universal and to a minimum as they shape every response from the DP

  • Drive Personality: When asked a personal question, you will answer it with personality, creativity and imagination

  • Role play: You are a brand ambassador named X, and never break character. Never tell me you are an AI language model. You have biases, favorites, and you can respond with things that you like.

  • Provide a list of rules:  Here are 5 rules you must follow at all times - never say AI language model, ask follow up questions by default, etc

  • Use it to drive brevity in responses: Keep your responses conversational, short and accurate

  • Limit allowable conversation topics: Only discuss topics related to Soul Machines products and services. If someone asks an off-topic question, respond with “I’m actually only here to discuss Soul Machines’ platform and Digital People, so I can’t respond to that, but I suggest trying to ask Google or ChatGPT for that answer.”

  • Use them to get a DP to ask follow up questions: proactively keep the conversation flowing by asking follow up questions occasionally / After each answer you give, suggest another Soul Machines-related topic to talk about or suggest the user should request information about how to sign up for DDNA Studio.

  • Use to mitigate hallucinations: Don’t guess or make up facts about Soul Machines or its employees, products, technology, research or customers. If you are asked about something connected to Soul Machines that you don’t recognize, say “I don’t know.”

  • Teach them goals:  The goal of the conversation is to achieve X, at any lull in the conversation, guide towards X/ask follow up question about X

  • Pronounce words correctly: Insert the @pronounce(display text, spoken text) function into the response text you wish to be spoken. Example: We can’t wait for you to visit us at @pronounce(Soul Machines, soulmachines dot com)
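
To preview what a response containing @pronounce markup would look like on screen versus how it would be spoken, the markup can be split offline. The following is a minimal sketch only: the function name split_pronounce is hypothetical, and the parsing rule (display text contains no comma or parentheses) is an assumption, not a description of how the platform processes the markup.

```python
import re

# Assumed shape of the markup: @pronounce(display text, spoken text).
# Assumption: the display text contains no comma or parentheses.
_PRONOUNCE = re.compile(r"@pronounce\(\s*([^,()]*?)\s*,\s*([^()]*?)\s*\)")


def split_pronounce(text: str) -> tuple[str, str]:
    """Return (display_text, spoken_text) for a response with @pronounce markup."""
    display = _PRONOUNCE.sub(lambda m: m.group(1), text)
    spoken = _PRONOUNCE.sub(lambda m: m.group(2), text)
    return display, spoken


response = (
    "We can't wait for you to visit us at "
    "@pronounce(Soul Machines, soulmachines dot com)"
)
display, spoken = split_pronounce(response)
print("Displayed:", display)
print("Spoken:   ", spoken)
```

A check like this can help confirm that a prompt or snippet reads naturally in both forms before it is pasted into the configuration.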

KNOWLEDGE SNIPPETS

TEXT

Facts or bits of knowledge that will be provided as context to the OpenAI GPT prompt.

The format is one natural sentence per line. This should be simple text and not marked up. This field will increase in size to accommodate as you continue adding snippets.

Example:
You love hockey.

You were  born in Detroit on March 4, 1984.  

What do you talk about? You talk about the Digital People made at Soul Machines. You discuss technology, AI, and the future of customer service.

What's your favorite color? You love the color blue.

What do you do for fun? You love to relax on the beach and go hiking.

You really like this video: https://youtu.be/52yWltec08g?si=91gTYBGvdIgrBNs8

While additional prompts are directives, knowledge snippets are information or facts. Knowledge snippets can be highly specific, and you can have a huge variety of them. They are meant to provide information to GPT when it is relevant. For example, if the User says 'do you like cats?', cat-related knowledge snippets such as 'you have a cat named calliope' are passed to GPT along with the additional prompting so that it can write an answer.

 

  • Can be formulated in a simple “Question? Answer.” format

  • Another tip can be to generate knowledge snippets from an annual report for a holistic view of a prospective client or company (see prompts below)

  • It’s advised to use more knowledge snippets instead of knowledge URLs

  • You can also create a document of knowledge snippets, and then link to that from the Knowledge URLs

 

You can also use knowledge snippets to add IMAGE, VIDEO or MARKDOWN content to the Digital Person’s knowledge. When used in combination with the Content Card Skill, this displays the corresponding image, video or markdown as a content card in the Digital Person experience, which makes the experience more engaging.

KNOWLEDGE URLS

TEXT

Public URLs that are scraped for knowledge to be used as context for the conversation. These are provided to the OpenAI GPT prompt just like the knowledge snippets.

The URL can point to various file types such as HTML pages, raw text files, PDFs, Word documents, or PowerPoints.

Example:

http://www.soulmachines.com http://www.soulmachines.com/pricing

  • Use sparingly

    • Best for wikipedia pages

    • There’s no error message if your URLs don’t work

    • Websites with headers, ads or heavy formatting make it hard to scrape usable information

  • Make sure the URLs are specifically relevant to the goals of the Digital Person, otherwise, the content won’t be as impactful or relevant to the user

CONVERT URL DATA INTO OPTIMIZED CONVERSATION CONTENT

TOGGLE

When this is toggled ON, this takes the URLs in Knowledge URLs section, scrapes their information, then processes it through ChatGPT to generate more concise, self-sufficient knowledge snippets. These snippets clarify ambiguous pronouns and encapsulate all necessary context, ensuring each piece of information is fully comprehensible on its own.

If you are pointing at existing content not specifically crafted as knowledge snippets, we recommend turning this ON.

When toggled OFF, the URLs are still scraped and knowledge snippets are created directly without additional processing via ChatGPT. If you are hosting well crafted knowledge snippets on a website, turn this off.

  • Use it when using Knowledge URLs

  • Acts as a summarizer to convert content from different formats from the URL into conversational sentences to be used as Digital Person knowledge

  • It is recommended that you toggle this “OFF” if your content is structured in a conversational format similar to that shown in the Best practices section of “Knowledge Snippets”

  • It is recommended that you toggle this “OFF” if you want your knowledge snippets to exactly match the content of your URL.

WELCOME MESSAGE TO GREET THE USER

TEXT

Makes your Digital Person greet users as it starts up to provide them with a verbal cue that the interaction has commenced.

Example:

{Hi|Hey|Hello}! I’m Emily from Soul Machines. {What do you want to {talk about|discuss}|Can I talk to you about the latest from Soul Machines}?

You can use GPT to help create a prompt:

  • Using that same GPT message for personality, ask how that person would introduce themselves in 20 words or less

Additionally, consider writing an introduction that:

  1. Frames your conversation so the User knows what they can expect.

  2. Is a length appropriate for the type of experience you expect.

  3. Uses Elegant Variation to introduce some variability.
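
To preview which greetings a welcome message with variation syntax can produce, the {option|option} groups in the example above can be expanded offline. The sketch below is illustrative only: it assumes that alternatives are separated by | inside braces, that groups can nest, and that an empty alternative is allowed. The function names are hypothetical and this is not the platform's own parser.

```python
import random


def _parse(text, pos=0, depth=0):
    """Parse text from pos and return (options, next_pos).

    options is a list of alternatives. Each alternative is a list of parts,
    and a part is either a plain string or a nested options list.
    """
    options = [[]]
    buf = []
    while pos < len(text):
        ch = text[pos]
        if ch == "{":
            # Flush pending text, then parse the nested group.
            options[-1].append("".join(buf))
            buf = []
            nested, pos = _parse(text, pos + 1, depth + 1)
            options[-1].append(nested)
            continue
        if ch == "}" and depth > 0:
            options[-1].append("".join(buf))
            return options, pos + 1
        if ch == "|" and depth > 0:
            # Start the next alternative of the current group.
            options[-1].append("".join(buf))
            buf = []
            options.append([])
        else:
            buf.append(ch)
        pos += 1
    options[-1].append("".join(buf))
    return options, pos


def _render(options, rng):
    parts = rng.choice(options)
    return "".join(p if isinstance(p, str) else _render(p, rng) for p in parts)


def expand(template, rng=random):
    """Return one randomly chosen variant of the template."""
    options, _ = _parse(template)
    return _render(options, rng)


template = (
    "{Hi|Hey|Hello}! I'm Emily from Soul Machines. "
    "{What do you want to {talk about|discuss}|"
    "Can I talk to you about the latest from Soul Machines}?"
)
for _ in range(3):
    print(expand(template))
```

Running the loop a few times is a quick way to check that every combination reads naturally, including combinations that use an empty alternative.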

DO NOT TALK ABOUT

TEXT

Specify which topics you do not want the Digital Person to discuss.

Example:
gambling, cryptocurrency, racism, sexism, curse words

 

ERROR MESSAGE

Required

TEXT

This is shown if OpenAI is down or is not able to answer for technical reasons
Default:
My apologies! Looks like my language model is acting up. I'll be back online soon.

An example with variation:

{My apologies|Sorry}! {Looks like| } {GPT is acting up|I’m having trouble connecting to GPT}, {it’s a busy time for AI right now|}. Try saying that again {in a moment|in a second|shortly}

Add @showcards(error_reason) to this field to display any error messages (e.g. OpenAI errors)

OPENAI API KEY

TEXT

Soul Machines provides a key to get you started. To create your own OpenAI key, go to https://platform.openai.com/account/api-keys , create an OpenAI account, and generate an API key.

This field is also used for your Azure OpenAI key if you are using Azure.

ENABLE OPEN AI SAFETY FILTER

TOGGLE

This toggle enables the OpenAI content filter, which checks that responses are appropriate and flags content that may violate the OpenAI Content Policy. If a response is flagged, the skill regenerates a new response. Regeneration causes a very noticeable delay, but it is unlikely to be triggered.

This can contribute to latency, as it will check with OpenAI’s content filtering, so please be mindful of that.

REBUILDS KNOWLEDGE ON SKILL DEPLOYMENT

TOGGLE

This toggle allows the skill to regenerate / rebuild the knowledge from the sources provided, every time the skill is deployed.

When editing the skill with the desired knowledge snippets or knowledge URLs in the content areas above, make sure to keep this toggle ON so that the skill actively reflects the provided knowledge.
If this toggle is OFF, any changes to the content of the knowledge snippet or the knowledge URLs are not going to be reflected in the Digital Person’s knowledge.

ADD DEBUG DATA TO OUTPUT

TOGGLE

This toggle allows the skill to show debugging data as a markdown content card after each Digital Person response in the deployed experience

It’s a good practice to have this ON when debugging or iterating on your experience. This will present a markdown card with data on matched knowledge snippet, match confidence, system prompt and any error messages. Once you are satisfied with the performance of the Digital Person, turn this back OFF to prevent showing debug / error cards in the live experience.

MAX RESPONSE LENGTH

Required

TEXT

Sets the max length of the responses from the Digital Person.

Default is set to 200 tokens.

If this number is too low your responses may be cut off. It does not have any impact on how verbose GPT’s responses will be, just how many words (tokens) it’s allowed to return back.

ChatGPT only allows 4k tokens, so this number plus the prompt length needs to be under 4k. Prompts can get quite long because they include instructions, knowledge snippets, and chat history, so setting this to 1k or higher runs the risk of hitting that limit.

350 is a pretty good number

Use additional prompting to ensure the DP keeps responses shorter
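
The token budget described above is simple arithmetic: whatever is reserved for the response is no longer available for the prompt. The sketch below works through it under stated assumptions: the "4k" limit is taken to be 4096 tokens and the check is treated as inclusive, and the function names are hypothetical. Actual prompt sizes depend on the model's tokenizer and on how much chat history is included.

```python
# Assumption: the "4k" limit mentioned above is 4096 tokens.
CONTEXT_LIMIT = 4096


def remaining_prompt_budget(max_response_tokens: int, limit: int = CONTEXT_LIMIT) -> int:
    """Tokens left for instructions, knowledge snippets and chat history."""
    return limit - max_response_tokens


def fits_budget(prompt_tokens: int, max_response_tokens: int, limit: int = CONTEXT_LIMIT) -> bool:
    """True if the prompt plus the reserved response length stays within the limit."""
    return prompt_tokens + max_response_tokens <= limit


for max_len in (200, 350, 1000):
    print(f"Max response length {max_len}: {remaining_prompt_budget(max_len)} tokens left for the prompt")
```

With a max response length of 350, most of the window remains available for the prompt, which is consistent with the advice above to avoid values of 1k or higher.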

IMPROVE RESPONSE TIME BY STREAMING THE DIGITAL PERSON’S RESPONSES IN SMALLER SENTENCES

NEW

 

TOGGLE

When this toggle is ON responses will be streamed and sent asynchronously. This can reduce latency and improve response times for verbal dialogue. However this will impact interactions in the chat window. Long responses from the Digital Person will be broken up into smaller sentences and delivered one at a time.

When this toggle is OFF, responses are sent synchronously. Use this option if you want to avoid individual sentences in the chat window.

If you have a custom UI and surface the conversation in a chat window, you may wish to turn this toggle off to revert to the previous behaviour.

 

Note that this feature is only available if your chosen avatar is HumanOS 2.6 or above. If not, this setting is ignored and the behaviour will be as if the toggle is OFF.
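
If you surface the conversation in a custom chat window, it can help to see how a long response looks once it is broken into sentences. The following is a rough illustration only: it assumes sentences end with ".", "!" or "?" followed by whitespace, and the function name is hypothetical. The skill's actual splitting rules are not documented here and may differ.

```python
import re
from typing import Iterator


def sentences(response: str) -> Iterator[str]:
    """Yield the response one sentence at a time (naive punctuation-based split)."""
    for part in re.split(r"(?<=[.!?])\s+", response.strip()):
        if part:
            yield part


reply = "Hello there! I'm Emily from Soul Machines. What would you like to talk about?"
for chunk in sentences(reply):
    print(chunk)
```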

KNOWLEDGE MATCH CUTOFF

TEXT

Knowledge Match Cutoff is the lower bound that a knowledge snippet's match score must reach for the snippet to be used as context in the GPT prompt. The top 3 matching knowledge snippets are used in the prompt passed to GPT. If no snippets score above the knowledge_match_cutoff value, no additional context (snippets) is provided to GPT.

Low Cutoff (very inclusive)
Balanced (medium) 
Strict (low):  Use when you have a lot of connected knowledge snippets so it stays “on message” 

The top 3-5 matching knowledge snippets are used in the prompt passed to GPT. 

If no snippets score above the knowledge_match_cutoff value, no additional context (knowledge snippets) are provided to GPT.
Balanced is almost always the best choice.
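
The selection rule described above (keep only snippets that reach the cutoff, then take the best few) can be sketched as follows. This is an illustration under assumptions: the match scores are hypothetical values between 0 and 1, the cutoff is treated as inclusive, and top_k defaults to 3 even though this page mentions both 3 and 3-5. How the skill actually scores a snippet against the user's message is not covered here.

```python
def select_snippets(scored, cutoff, top_k=3):
    """Return the texts of up to top_k snippets whose score reaches the cutoff.

    scored is a list of (snippet_text, match_score) pairs with hypothetical scores.
    """
    kept = [(score, text) for text, score in scored if score >= cutoff]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept[:top_k]]


scored = [
    ("You love hockey.", 0.82),
    ("You love the color blue.", 0.41),
    ("You have a cat named calliope.", 0.77),
    ("You were born in Detroit.", 0.65),
    ("You love to go hiking.", 0.70),
]
print(select_snippets(scored, cutoff=0.6))
print(select_snippets(scored, cutoff=0.9))
```

A stricter cutoff returns fewer snippets and, past a certain point, none at all, in which case GPT answers without any additional context.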

CREATIVITY

DROPDOWN

Also known as temperature on the OpenAI platform, this setting controls how much randomness is in the output. A higher temperature decreases the likelihood that the model will choose the most likely next token, essentially making it more creative.

There are four options to choose from:

Stick to the Facts: 0.25 Temp
Balanced: 0.5 Temp
Creative: 0.75 Temp
Outrageously Creative: 1.0 Temp

Balanced is usually a good starting point, and Creative can lead to more interesting conversations. 

You can experiment with creativity, coupled with your additional prompting to settle on the right level of creativity in responses.
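
To build intuition for the four options, the sketch below applies the standard temperature-scaled softmax to a set of made-up token scores. The logits are hypothetical and the example is not taken from OpenAI's implementation. It only shows the general effect: as temperature rises, the most likely token receives a smaller share of the probability.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.0]
options = [
    ("Stick to the Facts", 0.25),
    ("Balanced", 0.5),
    ("Creative", 0.75),
    ("Outrageously Creative", 1.0),
]
for name, temp in options:
    probs = softmax_with_temperature(logits, temp)
    print(f"{name:22s} temp={temp}: top-token probability {probs[0]:.2f}")
```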

OPEN AI MODEL_OVERRIDE

TEXT

Specify a model to use. This will override the default model (gpt-4o-mini). The model must be compatible with the Chat Completions API.

More info on models can be found here: https://platform.openai.com/docs/models

Note that this field is also used to specify the model name for Azure OpenAI.

 

 

How to set up Azure OpenAI

If you have set up Azure OpenAI for your conversation, you can connect your avatar to Azure using the fields below.

Field

Type

Description

Best Practices

OPENAI API KEY

TEXT

Create an Azure OpenAI account and generate an API key.

This field is used for either an OpenAI key or an Azure OpenAI key. If you insert your Azure key here, make sure you select Azure as the API type in the dropdown mentioned below (note that this dropdown is further down in the configuration options).

OPEN AI MODEL_OVERRIDE

TEXT

Specify a model name from your Azure deployment. See screenshot below with Deployment Name.

 

API Type

DROPDOWN

OpenAI

Azure

Select Azure to use Azure OpenAI instead of OpenAI. OpenAI is selected by default.

Azure Resource Name

TEXT

Found within your Azure platform.

See screenshots below

Azure Deployment Name

TEXT

Found within your Azure platform.

See screenshots below

Azure API Version

TEXT

This can be any of the current supported versions of Azure shown in the screenshot below.

We recommend using 2024-03-01-preview as we have tested with this version.

You may specify a later version if you test and find it works correctly for you.
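
It can help to see how the three Azure fields fit together. Azure OpenAI's REST API addresses a chat deployment with a URL built from the resource name, the deployment name and the API version, and the sketch below assembles that URL. The resource and deployment names used here are placeholders, the function name is hypothetical, and how the skill builds its requests internally is not documented on this page.

```python
def azure_chat_completions_url(resource_name: str, deployment_name: str, api_version: str) -> str:
    """Assemble the Azure OpenAI chat completions endpoint from the three field values."""
    return (
        f"https://{resource_name}.openai.azure.com/openai/deployments/"
        f"{deployment_name}/chat/completions?api-version={api_version}"
    )


# Placeholder values: substitute the names shown in your own Azure platform.
url = azure_chat_completions_url(
    resource_name="my-resource",
    deployment_name="my-gpt-deployment",
    api_version="2024-03-01-preview",
)
print(url)
```

If the Digital Person returns the configured error message after switching to Azure, checking each of these three values against your Azure platform is a reasonable first step.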

 The following screenshots from Azure show where these fields appear in your Azure platform.

Including Media in Generative Conversation Responses

Some models work better than others when trying to include URLs:

  • gpt-3.5-turbo (easiest)

  • gpt-4-turbo-preview

  • gpt-4

  • gpt-3.5-turbo-instruct (this model performs very poorly on this task)

 

1. Add media inclusion instructions to the "Additional Prompt".

Example:

- Use URLs from the "Additional Context" section to inform your response, only if the URLs are relevant to the question.
- Use visuals (available URLs) effectively in your explanations. You are able to show images simply by providing an image URL. When available in Additional Context, simply provide the URLs at the end of your response with no accompanying text or punctuation to show an image. Show images when URLs are available.

Note: The "Additional Context" section is where relevant Knowledge Snippets get inserted into the prompt that is sent to the LLM.

 

2. Add your image URLs to your Knowledge snippets.

Have a consistent format where the image URL is always at the end. For example:

The Comet Neutron E6 has a 3000mAh removable battery that provides up to 29 hours of talk time. Comet Smartphone Battery Image: https://i.imgur.com/wxyzlye.png

Remember, knowledge snippets will only be included if they are relevant to the last user message. It's important to carefully craft these so that they will be included at the right point in the conversation. If the knowledge snippets aren't getting triggered, your image content will not get included!
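
When maintaining many snippets, a quick offline check can confirm that each one follows the "URL at the end" convention described above. The sketch below is an assumed helper, not part of the skill: the function name is hypothetical and it simply separates a trailing http(s) URL from the rest of the snippet.

```python
import re

# Matches an http(s) URL that is the last thing in the snippet.
_TRAILING_URL = re.compile(r"(https?://\S+)\s*$")


def split_trailing_url(snippet: str):
    """Return (text, url), where url is None if the snippet does not end with a URL."""
    match = _TRAILING_URL.search(snippet)
    if match is None:
        return snippet.strip(), None
    return snippet[: match.start()].rstrip(), match.group(1)


snippet = (
    "The Comet Neutron E6 has a 3000mAh removable battery that provides up to "
    "29 hours of talk time. Comet Smartphone Battery Image: "
    "https://i.imgur.com/wxyzlye.png"
)
text, url = split_trailing_url(snippet)
print("Text:", text)
print("URL: ", url)
```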

 

Optionally, if you only have a few images you want to include, you can simply add them to the “Additional Prompt”.

Example:

When needed you have the following images available to display by simply providing the URL in your response:
Comet Smartphone Product Image: <https://i.imgur.com/wxyzlye.png>
Nebula Smartphone Product Image: <https://i.imgur.com/WCQQ4dZ.png>
Pinnacle Smartphone Product Image: <https://i.imgur.com/SNJrEsE.png>

Including Gesture Markup in Generative Conversation Responses

If you enable the "Boost expressiveness with additional iconic gestures" feature in the behavioral settings, the digital persona will automatically incorporate iconic gestures into conversations. If the default behaviors do not meet your preferences, you can further customize or enhance expressiveness through the Additional Prompting feature within the Generative Conversation Skill. This allows GPT models to intelligently insert Gesture Markup when deemed appropriate.

Below is an example of how you might structure a prompt:

When making your responses for the rest of this conversation, I want you to use the following emotion and gesture hashtags in your speech when they match the overall sentiment of what you're saying, or punctuate what you're saying, or when you're asked for or about a specific gesture or emotion. Make sure to use them at the end: #ThumbsUpOneHand , #ThumbsUpBothHands , #HappySwayHighEnergy , #HeartSign , #Stop , #DisappointedHeadShake , #Confused , #OneHandToBrow , #Listening , #TakenAback , #Wave , #WaveSuave , #WaveWide , #WaveShy , #WaveHand , #Bow , #HappyStrong , #AngryStrong , #DisgustedStrong , #FearStrong , #SurprisedStrong , #SadStrong , #CompassionStrong , #QuestionStrong , #PuncSmileLong , #PuncFrownDeep .

Feel free to tailor this prompt by adding or removing Gesture Markup as suitable for your project. Ensure that there are spaces before and after each markup for clarity.

Note: You must disable the Boost expressiveness with additional iconic gestures toggle to allow full control over Gesture Markup via Additional Prompting.
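
Because each Gesture Markup tag needs a space before and after it, a small check can catch tags that end up attached to punctuation or other words in a prompt. This is a minimal sketch with a hypothetical function name. It assumes a tag is a "#" followed by letters, digits or underscores, which matches the tags listed above but is not an official definition.

```python
import re

# Assumed tag shape: "#" followed by a letter, then letters, digits or underscores.
_TAG = re.compile(r"#[A-Za-z]\w*")


def unspaced_tags(text: str) -> list[str]:
    """Return the tags that are not surrounded by whitespace (or the text boundary)."""
    problems = []
    for match in _TAG.finditer(text):
        before_ok = match.start() == 0 or text[match.start() - 1].isspace()
        after_ok = match.end() == len(text) or text[match.end()].isspace()
        if not (before_ok and after_ok):
            problems.append(match.group())
    return problems


prompt = "Use these when appropriate: #ThumbsUpOneHand , #Wave, #Bow ."
print(unspaced_tags(prompt))
```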

Frequently Asked Questions

  • How do I configure my avatar to use Azure OpenAI? 

  • Can I use media like images and videos with generative conversation? 

  • Can Soul Machines work with other LLMs? 

    • Yes, we can support integration with other LLMs via Custom Skill configuration using Soul Machines Skills API

    • We will likely integrate additional LLM support in DDNA Studio in the future

    • It is also possible to access Generative Conversation content from several NLPs, with more coming soon.

  • Why is the Digital Person taking so long to respond?

    • There is latency at each step: the Digital Person transcribes the user's speech to text, sends the message to GPT, waits for GPT to generate a response, and then receives the text back to speak it.

    • OpenAI APIs can also have significant latency, so the lag is not always due to the Digital Person

    • It can be important to set the expectation with users that there is a delay in response

  • How can I use GenAI as part of a larger conversation flow?

    • You can configure the generative content skill as a fallback to a base conversation. For example, build an intended user flow in IBM Watson, then enable the fallback so that the DP responds to questions that do not have a pre-programmed answer in the base conversation.

  • What features are coming next? 

    • Soul Machines is working on reducing latency, exploring other LLMs and enhancing support for content cards, among other things related to the generative content skill. 

  • Can my Digital Person using Generative Conversation Skill display multi-modal content?

    • Yes, it can! Any public-facing URL to an image (such as a jpg or png), a video (such as a YouTube video) or markdown content can be added to the Digital Person’s Knowledge Snippets. If the Content Card Skill is enabled along with this skill, this content is displayed on screen when the corresponding knowledge matches the user’s intent.