Soul Machines uses a bi-directional WebSocket connection to send messages back and forth between our servers and your Orchestration Server. Each message is JSON encoded and sent as a WebSocket ‘text’ message over a secure (HTTPS) connection. Each JSON encoded message includes a set of standard fields that depend on the kind of message being communicated.
There are three kinds of messages: event, request and response. In this guide, we only cover the messages relevant to connecting to the NLP platform.
Event
Each event includes the category (‘scene’), kind (‘event’), name and body. Events are sent by the Digital Hero to all connected servers and the connected client web browser (if any).
Request
Each request includes the category (‘scene’), kind (‘request’), name, transaction and body. Requests are sent from the Server to the Scene. If the transaction is not null, then a response is sent for the request. The transaction is an integer that is unique across all requests and is included in the response. If no transaction is included, then no response is sent—the request is one way.
Response
Each response includes the category (‘scene’), kind (‘response’), name, transaction (from the matching request), status (the C SDK status integer, where >= 0 is success and < 0 is failure) and body. Responses are sent from the Scene to the Server.
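For illustration, the following is a minimal sketch of how an Orchestration Server might receive these JSON text messages and dispatch on the kind field. It assumes a Python server built on the third-party websockets library; the host, port and handler names are illustrative, not part of the Soul Machines API.

# Minimal sketch of an Orchestration Server message loop.
# Assumes Python and the third-party "websockets" library (illustrative choice only).
import asyncio
import json
import websockets

async def handle_connection(websocket, path=None):
    async for raw in websocket:              # each WebSocket 'text' message
        message = json.loads(raw)            # every message is JSON encoded
        kind = message.get("kind")           # "event", "request" or "response"
        name = message.get("name")
        body = message.get("body", {})
        if kind == "event":
            print(f"event {name}: {body}")
        elif kind == "response":
            # status >= 0 means success, < 0 means failure
            print(f"response {name}, transaction {message.get('transaction')}, "
                  f"status {message.get('status')}")

async def main():
    # Port 8765 is an arbitrary example value.
    async with websockets.serve(handle_connection, "0.0.0.0", 8765):
        await asyncio.Future()               # run forever

if __name__ == "__main__":
    asyncio.run(main())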
The following section describes the new custom conversation messages that are used in new Orchestration Server implementations.
Custom Conversation Messages
For new Orchestration Server implementations (Web SDK 11 and above), Soul Machines introduced the CustomConversation extension, which enables an Orchestration Server to act as a conversation backend, referred to as "Orchestration Conversation". This is useful when you want to implement a custom conversation setup with full control over conversation messages.
The "Orchestration Conversation" setup also enables control over conversation variables, sent and received in conversation messages, and are processed in the same way as for other conversation backend types (e.g., Watson, Dialogflow, etc.).
Old Conversation Messages vs New Conversation Messages
In the old conversation messaging system, Speech-To-Text (STT) results arrived via the recognizeResults message; now you get them from the conversationRequest message (input.text).
Previously, a startSpeaking message instructed the Digital Hero to speak; now you send a conversationResponse message (output.text) instead.
The Orchestration Server can also send conversationResponse messages without a prior request. These "spontaneous" conversation messages are useful when the Orchestration Server wants to command the Digital Hero to speak unprompted.
The Orchestration Server implementation needs to ensure that every conversationRequest message is matched with a corresponding conversationResponse message; the output text can be empty if necessary.
Sample Conversation Messages:
// Example conversationRequest message (sent by the SDK):
{
  "category": "scene",
  "kind": "event",
  "name": "conversationRequest",
  "body": {
    "personaId": 1,
    "input": { "text": "Can I apply for a credit card?" },
    "variables": {
      // conversation variables
    }
  }
}

// Example conversationResponse message (sent by the orch server):
{
  "category": "scene",
  "kind": "request",
  "name": "conversationResponse",
  "transaction": null,
  "body": {
    "personaId": 1,
    "output": {
      "text": "Yes, I can help you apply for your credit card. Please tell me which type of card you are after, Visa or Mastercard?"
    },
    "variables": {
      // conversation variables
    }
  }
}
The output.text property is required in the conversationResponse message, while the input.text, variables, metadata and fallback properties are optional.
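As an illustration, the sketch below shows one way an Orchestration Server might answer a conversationRequest with a matching conversationResponse, echoing the conversation variables. It is a hedged example: get_nlp_reply is a hypothetical helper standing in for your own NLP call, and the websocket object is assumed to come from a receive loop like the one sketched earlier.

import json

async def on_conversation_request(websocket, message):
    # Reply to a conversationRequest event with a conversationResponse request.
    # get_nlp_reply is a hypothetical helper; replace it with your own NLP call.
    body = message.get("body", {})
    user_text = body.get("input", {}).get("text", "")
    variables = body.get("variables", {})

    reply_text = await get_nlp_reply(user_text, variables)   # hypothetical NLP call

    response = {
        "category": "scene",
        "kind": "request",
        "name": "conversationResponse",
        "transaction": None,                      # serialised as null: no response expected back
        "body": {
            "personaId": body.get("personaId", 1),
            "output": {"text": reply_text or ""},  # output.text is required (may be empty)
            "variables": variables,                # optional conversation variables
        },
    }
    await websocket.send(json.dumps(response))

The same message shape, sent without a prior conversationRequest, produces the "spontaneous" speech described above.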
Sample Conversation Messages for Soul Machines’ Digital DNA Studio Content Blocks:
This is an example of a conversationResponse message with a component options Content Block. In this case, conversationOptions is an arbitrary ID that the author can define; it must match between the @showcards() command and the variable definition.
The variable name needs to be prefixed with "public-". Examples of other Content Block types can be found in the Digital DNA User Guide, under the section Displaying Content in the Digital DNA Studio Default UI.
{ "category": "scene", "kind": "request", "name": "conversationResponse", "body": { "personaId": 1, "output": { "text": "You can choose from one of the following options @showcards(conversationOptions)" }, "variables": { "public-conversationOptions": { "component": "options", "data": { "options": [ { "label": "option A" }, { "label": "option B" }, { "label": "option C" } ] } } } } }
Preceding Conversation Messages
In previous Orchestration Server implementations (Web SDK 10 and below), the legacy messaging system is used. Note that this guide only covers the messages relevant to connecting to the NLP platform.
recognizeResults
Each time a user speaks to a Digital Hero, your STT service transcribes their utterance from audio to text. The output from your STT service for each utterance is a series of intermediate recognizeResults messages, as well as one final recognizeResults message. These messages are sent from the Soul Machines servers to your Orchestration Server via the WebSocket connection.
Below is an example of a "final" recognizeResults message. Final messages can be identified by the "final" attribute being set to true. The transcript text from each "final" message must be sent to your NLP for a response; all other recognizeResults messages (i.e. where "final" is false) can be ignored.
{ "category":"scene", "kind":"event", "name":"recognizeResults”, "body":{ "results":[{ "alternatives":[{ "confidence":0.8000, "transcript": "Can I apply for a credit card?"}], "final":true }], "status":0} }
startSpeaking Request
To instruct your Digital Hero to speak a response from your NLP, you need to send a startSpeaking command from your Orchestration Server via the WebSocket connection.
Below is an example of a startSpeaking command message.
{ "category": "scene", "kind": "request", "name": "startSpeaking", "transaction": null | <int>, "body": { "personaId": 1, "text" : ”Yes, I can help you apply for your credit card. Please tell me which type of card you are after, Visa or Mastercard?" } }
The text for the startSpeaking message comes from your NLP response messages. The format of the NLP response message varies between vendors. Please refer to your NLP vendor documentation for more information.