Skill - Generative Conversation
Skill Type
Base Conversation Fallback
Overview
With the Generative Conversation Skill, you can create a persona for the Digital Person by providing knowledge snippets, which are then used to create generative conversations. This skill can be based on OpenAI’s ChatGPT (GPT-3.5) or GPT-4, which are designed to generate human-like text in a variety of contexts. GPT-3.5/4 is based on deep learning algorithms that enable the model to generate coherent and meaningful responses on a wide range of topics, from news articles and product descriptions to poetry and fiction. One of the key features of GPT-3.5 and GPT-4 is the ability to perform a variety of natural language processing tasks, including language translation, question answering, summarization and more.
These features make the Generative Conversation Skill an excellent base conversation or fallback skill alongside any other Base Conversation Skill.
This Skill is:
Built and owned by Soul Machines Ltd.
Only supported in English.
Powered by the OpenAI API.
Usage of OpenAI's API is subject to their Ts&Cs and may require you to provide your credit card information to gain access to the API.
The skill currently also supports legacy models such as GPT-3, but OpenAI is set to deprecate them in January 2024.
Generative Content Skill Configuration Options
Generative Conversation - Base
The Generative Conversation Base allows you to use ChatGPT (GPT3.5) or GPT-4 in the form of “chat” or “completion” models as your base conversation. You can do this instead of connecting an external conversation corpus from an NLP conversation platform like IBM Watson or DialogFlow.
The advantage of this method is ease and speed of implementation, at the cost of not being able to deliver precise pre-written answers.
Select ‘Generative Conversation Base’ in the Conversation section in DDNA Studio, and fill in the simple input fields.
Generative Conversation - Fallback Skill (must have a base conversation configured first)
The Generative Conversation Skill allows you to use ChatGPT or GPT-4 as a fallback to another base conversation. You can connect an external conversation corpus from an external NLP conversation platform like IBM Watson or DialogFlow, and have the Generative Conversation Skill answer questions your corpus doesn’t.
The advantage of this method is that you can still create specific workflows and you can ‘override’ answers from GPT with precise pre-written answers.
Once your Conversation Base is defined, add the Generative Conversation Fallback from the Skills section, and fill in the simple input fields.
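The fallback arrangement described above can be pictured with a minimal sketch. This is an illustration only, not how the skill is implemented: `PREWRITTEN_ANSWERS`, `base_conversation`, `generative_fallback` and `respond` are hypothetical names, the dictionary stands in for a corpus built in an NLP platform, and the fallback returns a placeholder string instead of calling GPT.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a pre-written corpus (e.g. one built in IBM Watson or DialogFlow).
PREWRITTEN_ANSWERS = {
    "what are your opening hours": "We are open 9am to 5pm, Monday to Friday.",
}


def base_conversation(user_message: str) -> Optional[str]:
    """Return the pre-written answer, or None when the corpus has no match."""
    key = user_message.lower().strip(" ?!.")
    return PREWRITTEN_ANSWERS.get(key)


def generative_fallback(user_message: str) -> str:
    """Placeholder for the generative reply; a real setup would call the LLM here."""
    return f"[generated reply to: {user_message}]"


def respond(
    user_message: str,
    base: Callable[[str], Optional[str]] = base_conversation,
    fallback: Callable[[str], str] = generative_fallback,
) -> str:
    """Prefer the precise pre-written answer; fall back only when there is none."""
    answer = base(user_message)
    return answer if answer is not None else fallback(user_message)


print(respond("What are your opening hours?"))
print(respond("Do you like cats?"))
```

The first call is answered from the pre-written corpus, while the second has no match and is handed to the fallback, which is the 'override' behaviour described above.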
Important: Note that additional skills are incompatible with the Generative Conversation Base conversation.
Limitations
May occasionally generate incorrect information
May occasionally produce harmful instructions or biased content
Limited knowledge of the world and events after 2021
Skill Demo
Configurations
Provide the following information in the Generative Conversation Base Skill configuration screen in Digital DNA Studio. See Adding Skills to your Digital Person for detailed instructions:
Field | Type | Description | Best Practices |
---|---|---|---|
DIGITAL PERSON'S NAME REQUIRED |
| Enter an appropriate name for your Digital Person. Examples: Jimmy, Sam, Beth |
|
DIGITAL PERSON'S ROLE |
| The role the Digital Person will take on. Examples:
|
|
AUTOMATICALLY MATCH THE CONVERSATION STYLE TO MY CHOSEN BEHAVIOR STYLE |
| Each behavior style has a pre-defined conversation style to ensure the speaking style matches the movement and gesture style. |
|
ADDITIONAL CONVERSATION STYLE PROMPTING |
| Conversation style affects how the skill frames the wording of a response. Example: | Use this to create a custom conversation style or append additional style to the automatic style matching applied via the toggle above.
|
TIME ZONE FOR THE DIGITAL PERSON |
| The time zone for the Digital Person. Example:
| This feature allows the Digital Person to respond to time-related questions based on the current time in a specified time zone. The time zone must be provided as a valid TZ Identifier string (e.g., US/Pacific). If you prefer not to use the current time, you can opt out by entering None. This may be useful if you do not want the Digital Person to be aware of the date and time, or if you wish to specify a different date and time separately in the Additional Prompting section. |
| This is used to add additional prompts to the GPT pipeline. These prompts are sent in the initial batch and are designed to shape behavior, tone, and what restrictions the chatbot complies with or ignores. | Additional prompts are directives. It's best to keep the directives universal and to a minimum as they shape every response from the DP
| |
KNOWLEDGE SNIPPETS |
| Facts or bits of knowledge that will be provided as context to the OpenAI GPT prompt. The format is one natural sentence per line. This should be simple text and not marked up. This field will increase in size to accommodate as you continue adding snippets. Example:
| While additional prompts are directives, knowledge snippets are information/facts. Knowledge snippets can be highly specific, you can have a huge variety of them, and they are meant to provide information to GPT when it's relevant e.g. if the User says 'do you like cats?' then cat related knowledge snippets like 'you have a cat named calliope' would be passed to GPT along with the additional prompting for it to write an answer.
You can also use knowledge snippets to add IMAGE, VIDEO or MARKDOWN content to the Digital Person’s knowledge. When used in combination with the Content Card Skill, this displays the corresponding image, video or markdown as a content card in the Digital Person experience, making it more engaging. |
KNOWLEDGE URLS |
| Public URLs that are scraped for knowledge to be used as context for the conversation. These are provided to the OpenAI GPT prompt just like the knowledge snippets. The URL can point to various file types such as HTML pages, raw text files, PDFs, Word documents, or PowerPoints. Example:
|
|
CONVERT URL DATA INTO OPTIMIZED CONVERSATION CONTENT |
| When this is toggled ON, the skill takes the URLs in the Knowledge URLs section, scrapes their information, then processes it through ChatGPT to generate more concise, self-sufficient knowledge snippets. These snippets clarify ambiguous pronouns and encapsulate all necessary context, ensuring each piece of information is fully comprehensible on its own. If you are pointing at existing content not specifically crafted as knowledge snippets, we recommend turning this ON. When toggled OFF, the URLs are still scraped and knowledge snippets are created directly, without additional processing via ChatGPT. If you are hosting well-crafted knowledge snippets on a website, turn this OFF. |
|
WELCOME MESSAGE TO GREET THE USER |
| Makes your Digital Person greet users as it starts up to provide them with a verbal cue that the interaction has commenced. Example:
| You can use GPT to help create a prompt:
Additionally, consider writing an introduction that:
|
DO NOT TALK ABOUT |
| Specify which topics you do not want the Digital Person to discuss. Example: |
|
ERROR MESSAGE Required |
| This is shown if OpenAI is down or is not able to answer for technical reasons. | An example with variation: {My apologies|Sorry}! {Looks like| } {GPT is acting up|I’m having trouble connecting to GPT}, {it’s a busy time for AI right now|}. Try saying that again {in a moment|in a second|shortly}. Add @showcards(error_reason) to this field to display any error messages (e.g. OpenAI errors). |
OPENAI API KEY |
| Soul Machines provides a key to get you started. To create your own OpenAI key, go to https://platform.openai.com/account/api-keys , create an OpenAI account and generate API keys. | This field is also used for your Azure OpenAI key if you are using Azure. |
ENABLE OPEN AI SAFETY FILTER |
| This toggle enables the OpenAI content filter, which checks that responses are appropriate and flags content that may violate the OpenAI Content Policy. If a response is flagged, the skill generates a new response. This causes a very noticeable delay, although it is unlikely to be triggered. | This can contribute to latency, as each response is checked against OpenAI’s content filtering, so please be mindful of that. |
REBUILDS KNOWLEDGE ON SKILL DEPLOYMENT |
| This toggle allows the skill to regenerate / rebuild the knowledge from the sources provided every time the skill is deployed. | When editing the skill with the desired knowledge snippets or knowledge URLs in the content areas above, make sure to keep this toggle ON so that the skill reflects the provided knowledge. |
ADD DEBUG DATA TO OUTPUT |
| This toggle allows the skill to show debugging data as a markdown content card after each Digital Person response in the deployed experience | It’s a good practice to have this ON when debugging or iterating on your experience. This will present a markdown card with data on matched knowledge snippet, match confidence, system prompt and any error messages. Once you are satisfied with the performance of the Digital Person, turn this back OFF to prevent showing debug / error cards in the live experience. |
MAX RESPONSE LENGTH Required |
| Sets the max length of the responses from the Digital Person. Default is set to 200 tokens. | If this number is too low, your responses may be cut off. It does not affect how verbose GPT’s responses are, only how many tokens it is allowed to return. ChatGPT only allows 4k tokens, so this number plus the prompt length needs to be under 4k. Prompts can get quite long because they include instructions, knowledge snippets, and chat history, so setting it to 1k or higher runs the risk of hitting that limit. 350 is a pretty good number. Use additional prompting to ensure the DP keeps responses shorter. |
IMPROVE RESPONSE TIME BY STREAMING THE DIGITAL PERSON’S RESPONSES IN SMALLER SENTENCES NEW
|
| When this toggle is ON, responses are streamed and sent asynchronously. This can reduce latency and improve response times for verbal dialogue. However, it also affects the chat window: long responses from the Digital Person are broken up into smaller sentences and delivered one at a time. When this toggle is OFF, responses are sent synchronously. Use this option if you want to avoid individual sentences in the chat window. | If you have a custom UI and surface the conversation in a chat window, you may wish to turn this toggle off to revert to the previous behaviour.
Note that this feature only applies if your chosen avatar is HumanOS 2.6 or above. If not, this setting is ignored and behaviour will be as if the toggle is OFF. |
KNOWLEDGE MATCH CUTOFF |
| Knowledge Match Cutoff is the lower bound on the match score a knowledge snippet needs in order to be used as context for the GPT-3 prompt. The top 3 matching knowledge snippets are used in the prompt passed to GPT-3. If no snippets score above the knowledge_match_cutoff value, no additional context (snippets) is provided to GPT-3. | Low Cutoff (very inclusive) The top 3-5 matching knowledge snippets are used in the prompt passed to GPT. If no snippets score above the knowledge_match_cutoff value, no additional context (knowledge snippets) is provided to GPT. |
CREATIVITY |
| Also known as temperature in the GPT-3 platform, this controls how much randomness is in the output. A higher temperature decreases the likelihood that the model will choose the most likely next token, essentially making it more creative. There are four options to choose from: Stick to the Facts: 0.25 Temp | Balanced is usually a good starting point, and Creative can lead to more interesting conversations. You can experiment with creativity, coupled with your additional prompting, to settle on the right level of creativity in responses. |
OPEN AI MODEL_OVERRIDE |
| Specify a model to use. This will override the default model ( More info on models can be found here: https://platform.openai.com/docs/models | Note this field is also used to specify the model name for Azure OpenAI. |
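The Max Response Length guidance above comes down to simple arithmetic: the response budget plus the prompt length must stay under the model's context limit. The sketch below illustrates that check. It assumes "4k" means 4096 tokens and uses a rough rule of thumb of about 4 characters per token for English text; both are assumptions, and `estimate_tokens` and `remaining_budget` are hypothetical helper names, not part of the skill.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate only: about 4 characters per token (assumed heuristic)."""
    return -(-len(text) // 4)  # ceiling division


def remaining_budget(prompt_tokens: int, max_response_tokens: int,
                     context_limit: int = 4096) -> int:
    """Tokens left over after the prompt and the response budget.

    A negative result means the combination exceeds the context limit.
    The default of 4096 is an assumed reading of the "4k" limit.
    """
    return context_limit - prompt_tokens - max_response_tokens


# A 3000-token prompt with the suggested 350-token response budget fits.
print(remaining_budget(3000, 350))   # 746

# A 3500-token prompt with a 1000-token response budget does not.
print(remaining_budget(3500, 1000))  # -404
```

Because the prompt includes instructions, knowledge snippets and chat history, its length grows as the conversation goes on, which is why a smaller response budget leaves more headroom.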
How to setup Azure OpenAI
If you have set up Azure OpenAI for your conversation, you can connect your avatar to Azure using the fields below.
Field | Type | Description | Best Practices |
---|---|---|---|
OPENAI API KEY |
| Create an Azure OpenAI account and generate an API key. | This field is used for either an OpenAI key or an Azure OpenAI key. If you insert your Azure key here, make sure you select Azure as the API type in the dropdown mentioned below (note this dropdown is further down in the configuration options). |
OPEN AI MODEL_OVERRIDE |
| Specify a model name from your Azure deployment. See screenshot below with Deployment Name. |
|
API Type |
|
| Select Azure to use Azure OpenAI instead of OpenAI. OpenAI is selected by default. |
Azure Resource Name |
| Found within your Azure platform. | See screenshots below |
Azure Deployment Name |
| Found within your Azure platform. | See screenshots below |
Azure API Version |
| This can be any of the currently supported Azure API versions shown in the screenshot below. | We recommend using 2024-03-01-preview as we have tested with this version. You may specify a later version if you test and find it works correctly for you. |
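To show how the resource name, deployment name and API version relate to one another, the sketch below assembles them into a chat completions endpoint in the format used by the public Azure OpenAI REST API. How the skill itself calls Azure is not documented here, so treat this purely as an illustration. `my-resource` and `my-gpt4-deployment` are hypothetical placeholder values, and the function names are made up for this example.

```python
from urllib.parse import quote, urlencode


def azure_chat_completions_url(resource_name: str, deployment_name: str,
                               api_version: str) -> str:
    """Build the endpoint from the three Azure fields in the table above."""
    base = f"https://{resource_name}.openai.azure.com"
    path = f"/openai/deployments/{quote(deployment_name, safe='')}/chat/completions"
    query = urlencode({"api-version": api_version})
    return f"{base}{path}?{query}"


def azure_headers(api_key: str) -> dict:
    """Azure OpenAI expects the key in an 'api-key' header."""
    return {"api-key": api_key, "Content-Type": "application/json"}


# Placeholder values; substitute the ones found in your own Azure platform.
url = azure_chat_completions_url("my-resource", "my-gpt4-deployment",
                                 "2024-03-01-preview")
print(url)
```

Building the URL by hand like this can be a quick way to confirm that the three values are consistent before entering them in the configuration screen.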
The following screenshots from Azure show where these fields appear in your Azure platform.
Including Media in Generative Conversation Responses
Some models work better than others when trying to include URLs:
gpt-3.5-turbo (easiest)
gpt-4-turbo-preview
gpt-4
gpt-3.5-turbo-instruct (this model performs very poorly on this task)
1. Add media inclusion instructions to the "Additional Prompt".
Example:
- Use URLs from the "Additional Context" section to inform your response, only if the URLs are relevant to the question.
- Use visuals (available URLs) effectively in your explanations.
You are able to show images simply by providing an image URL. When available in Additional Context, simply provide the URLs at the end of your response with no accompanying text or punctuation to show an image. Show images when URLs are available.
Note: The "Additional Context" section is where relevant Knowledge Snippets get inserted into the prompt that is sent to the LLM.
2. Add your image URLs to your Knowledge snippets.
Have a consistent format where the image URL is always at the end. For example:
The Comet Neutron E6 has a 3000mAh removable battery that provides up to 29 hours of talk time. Comet Smartphone Battery Image: https://i.imgur.com/wxyzlye.png
Remember, knowledge snippets will only be included if they are relevant to the last user message. It's important to carefully craft these so that they will be included at the right point in the conversation. If the knowledge snippets aren't getting triggered, your image content will not get included!
Optionally, if you only have a few images you want to include, you can simply add them to the “Additional Prompt”.
Example:
When needed you have the following images available to display by simply providing the URL in your response:
Comet Smartphone Product Image: <https://i.imgur.com/wxyzlye.png>
Nebula Smartphone Product Image: <https://i.imgur.com/WCQQ4dZ.png>
Pinnacle Smartphone Product Image: <https://i.imgur.com/SNJrEsE.png>
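Because knowledge snippets (and any image URLs in them) only reach the model when they match the last user message, it can help to picture the selection step. The sketch below is a toy model of that step under loud assumptions: the real skill's scoring method is not described here, so a simple word-overlap score stands in for it, and `overlap_score` and `select_snippets` are hypothetical names. Only the "top matches above a cutoff" logic mirrors the Knowledge Match Cutoff description.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase word set; a crude stand-in for real text matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, snippet: str) -> float:
    """Word overlap (Jaccard) between 0 and 1. Assumed scoring, for illustration."""
    q, s = tokenize(query), tokenize(snippet)
    if not q or not s:
        return 0.0
    return len(q & s) / len(q | s)


def select_snippets(user_message: str, snippets: list, cutoff: float,
                    top_k: int = 3) -> list:
    """Keep snippets scoring above the cutoff, best first, at most top_k of them."""
    scored = [(overlap_score(user_message, s), s) for s in snippets]
    kept = [pair for pair in scored if pair[0] > cutoff]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept[:top_k]]


snippets = [
    "The Comet Neutron E6 has a 3000mAh removable battery. "
    "Comet Smartphone Battery Image: https://i.imgur.com/wxyzlye.png",
    "You have a cat named Calliope.",
]

# The battery snippet (and its image URL) is selected for a battery question.
print(select_snippets("Tell me about the Comet battery", snippets, cutoff=0.1))

# Nothing scores above the cutoff here, so no snippet and no image is passed on.
print(select_snippets("What's the weather", snippets, cutoff=0.1))
```

The second call shows why an image never appears if its snippet is not triggered: wording the snippet so that it overlaps with likely user questions makes it more likely to be selected.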
Including Gesture Markup in Generative Conversation Responses
If you enable the "Boost expressiveness with additional iconic gestures" feature in the behavioral settings, the digital persona will automatically incorporate iconic gestures into conversations. If the default behaviors do not meet your preferences, you can further customize or enhance expressiveness through the Additional Prompting feature within the Generative Conversation Skill. This allows GPT models to intelligently insert Gesture Markup when deemed appropriate.
Below is an example of how you might structure a prompt:
When making your responses for the rest of this conversation, I want you to use the following emotion and gesture hashtags in your speech when they match the overall sentiment of what you're saying, or punctuate what you're saying, or when you're asked for or about a specific gesture or emotion. Make sure to use them at the end: #ThumbsUpOneHand , #ThumbsUpBothHands , #HappySwayHighEnergy , #HeartSign , #Stop , #DisappointedHeadShake , #Confused , #OneHandToBrow , #Listening , #TakenAback , #Wave , #WaveSuave , #WaveWide , #WaveShy , #WaveHand , #Bow , #HappyStrong , #AngryStrong , #DisgustedStrong , #FearStrong , #SurprisedStrong , #SadStrong , #CompassionStrong , #QuestionStrong , #PuncSmileLong , #PuncFrownDeep .
Feel free to tailor this prompt by adding or removing Gesture Markup as suitable for your project. Ensure that there are spaces before and after each markup for clarity.
Note: You must disable the Boost expressiveness with additional iconic gestures toggle to allow full control over Gesture Markup via Additional Prompting.
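If you post-process model output yourself (for example in a custom skill), the spacing rule above can be checked mechanically. The sketch below is a hypothetical helper, not a feature of the Generative Conversation Skill: `pad_gesture_markup` assumes gesture tags consist of `#` followed by letters only, as in the example prompt, and it also collapses any repeated spaces elsewhere in the text.

```python
import re

# Assumes tags look like "#ThumbsUpOneHand": a hash followed by letters only.
GESTURE_TAG = re.compile(r"\s*(#[A-Za-z]+)\s*")


def pad_gesture_markup(text: str) -> str:
    """Ensure each gesture tag has a space before and after it."""
    padded = GESTURE_TAG.sub(r" \1 ", text)
    # Collapse the double spaces the padding can introduce, then trim the ends.
    return re.sub(r" {2,}", " ", padded).strip()


print(pad_gesture_markup("Great job!#ThumbsUpOneHand"))
print(pad_gesture_markup("Hello #Wave, nice to meet you"))
```

Note that this simple pattern would also match other `#word` text, such as a URL fragment, so it is only suitable where responses contain no other hash-prefixed words.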
Frequently Asked Questions
How do I configure my avatar to use Azure OpenAI?
See How to setup Azure OpenAI above for details.
Can I use media like images and videos with generative conversation?
Yes, we support images and video through the content cards skill.
See Including Media in Generative Conversation Responses above for details on how to configure your generative conversation to use media effectively.
Can Soul Machines work with other LLMs?
Yes, we can support integration with other LLMs via a Custom Skill configuration using the Soul Machines Skills API.
We will likely integrate additional LLM support in DDNA Studio in the future.
It is also possible to access Generative Conversational content from several NLPs, with more coming soon.
Why is the Digital Person taking so long to respond?
There is latency at each step: the Digital Person transcribes the user's speech to text and sends this message to GPT, GPT generates a response, and the text is sent back to the Digital Person to speak.
OpenAI APIs can also have significant latency, so the lag is not always due to the Digital Person.
It can be important to set the expectation with users that there is a delay in response.
How can I use GenAI as part of a larger conversation flow?
You can configure the generative content skill as a fallback to a base conversation. For example, build an intended user flow in IBM Watson, then enable the fallback so the DP will respond to questions that do not have a pre-programmed answer in the base conversation.
What features are coming next?
Soul Machines is working on reducing latency, exploring other LLMs and enhancing support for content cards, among other things related to the generative content skill.
Can my Digital Person using Generative Conversation Skill display multi-modal content?
Yes, it can! Any public-facing URL to an image (public images like jpg, png etc.), video (like a YouTube video) or markdown content can be added to the Digital Person’s Knowledge Snippets. If the Content Card Skill is enabled along with this skill, this content is displayed on screen when the corresponding knowledge matches the user’s intent.