
Overview

Soul Machines uses a bi-directional WebSocket connection to send messages back and forth between our servers and your Orchestration Server. Each message is JSON encoded and sent as a WebSocket ‘text’ message over HTTPS or WSS. Each JSON-encoded message includes some standard fields that identify the kind of message being communicated.

There are three kinds of messages: event, request, and response. In this guide, we only cover the messages relevant to connecting to the NLP platform.

recognizeResults

Each time a user speaks to a Digital Hero, your STT service transcribes their utterance from audio to text. The output from your STT service for each utterance is a series of intermediate recognizeResults messages, as well as one final recognizeResults message. These messages are sent from the Soul Machines Servers to your Orchestration Server via the WebSocket connection.

Below is an example of a “final” recognizeResults message. Final messages are those where the “final” attribute is set to “true”. The transcript text from each “final” message must be sent to your NLP for a response; all other recognizeResults messages (e.g. where “final” = “false”) can be ignored.

Code Block

{
    "category": "scene",
    "kind": "event",
    "name": "recognizeResults",
    "body": {
        "results": [{
            "alternatives": [{
                "confidence": 0.8000,
                "transcript": "tell me a joke"
            }],
            "final": true
        }],
        "status": 0
    }
}
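The filtering rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `final_transcripts` is hypothetical, the message shape is taken from the example in this section, and picking the first entry in `alternatives` is an assumption rather than documented behaviour.

```python
import json


def final_transcripts(raw_message: str) -> list:
    """Return the transcript of every final result in a recognizeResults message."""
    message = json.loads(raw_message)
    # Anything that is not a recognizeResults event is ignored.
    if message.get("kind") != "event" or message.get("name") != "recognizeResults":
        return []
    transcripts = []
    for result in message.get("body", {}).get("results", []):
        # Intermediate results ("final": false) are skipped.
        if not result.get("final"):
            continue
        alternatives = result.get("alternatives", [])
        if alternatives:
            # Assumption: the first alternative is the one to forward to the NLP.
            transcripts.append(alternatives[0]["transcript"])
    return transcripts


example = json.dumps({
    "category": "scene",
    "kind": "event",
    "name": "recognizeResults",
    "body": {
        "results": [{
            "alternatives": [{"confidence": 0.8, "transcript": "tell me a joke"}],
            "final": True,
        }],
        "status": 0,
    },
})
print(final_transcripts(example))  # ['tell me a joke']
```

Each returned transcript would then be passed to your NLP; an empty list means the message can be ignored.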

Event

Each event includes the category (‘scene’), kind (‘event’), name, and body. Events are sent by the Digital Person to all connected servers and the connected client web browser (if any).

Request

Each request includes the category (‘scene’), kind (‘request’), name, transaction, and body. Requests are sent from the Server to the Scene. If the transaction is not null, a response is sent for the request. The transaction is an integer that is unique across all requests and is included in the response. If no transaction is included, no response is sent and the request is one-way.

Response

Each response includes the category (‘scene’), kind (‘response’), name, transaction (from the matching request), status (the C SDK status integer where >= 0 is success, < 0 is failure), and body. Responses are sent from the Scene to the Server.
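As a rough sketch of how the transaction field ties a request to its response, the Python below allocates a unique integer per request and checks the status of the matching response. The class `RequestTracker` and its method names are hypothetical helpers, not part of any Soul Machines SDK; only the field names and the status convention (>= 0 is success) come from the descriptions above.

```python
import itertools
import json


class RequestTracker:
    """Hypothetical helper that pairs responses with the requests that caused them."""

    def __init__(self):
        self._next_id = itertools.count(1)  # unique integer per request
        self._pending = {}                  # transaction -> request name

    def build_request(self, name, body, expect_response=True):
        # A null transaction marks the request as one-way (no response is sent).
        transaction = next(self._next_id) if expect_response else None
        if transaction is not None:
            self._pending[transaction] = name
        return json.dumps({
            "category": "scene",
            "kind": "request",
            "name": name,
            "transaction": transaction,
            "body": body,
        })

    def handle_response(self, raw_message):
        message = json.loads(raw_message)
        if message.get("kind") != "response":
            return None
        # Look up which request this response belongs to.
        request_name = self._pending.pop(message.get("transaction"), None)
        # Status >= 0 is success, < 0 is failure.
        succeeded = message.get("status", -1) >= 0
        return request_name, succeeded


tracker = RequestTracker()
request = json.loads(tracker.build_request(
    "conversationResponse", {"personaId": 1, "output": {"text": "Hello"}}))
reply = json.dumps({
    "category": "scene", "kind": "response", "name": "conversationResponse",
    "transaction": request["transaction"], "status": 0, "body": {},
})
print(tracker.handle_response(reply))  # ('conversationResponse', True)
```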

The following section describes the conversation messages that must be used by any Orchestration Server implementation.

Conversation Messages

The Speech-To-Text (STT) results are sent via the conversationRequest message (see the input.text field).

To instruct the Digital Person to speak, you must send a conversationResponse message (see the output.text field).

The Orchestration Server can also send conversationResponse messages without a prior request. This type of "spontaneous" conversation message can be used when the Orchestration Server wants the Digital Person to speak on its own initiative.

The Orchestration Server implementation needs to ensure that every conversationRequest message is matched with a corresponding conversationResponse message. The output text can be empty if necessary.
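One way to guarantee that every conversationRequest gets a reply is to always build a conversationResponse, falling back to empty output text when the NLP has nothing to say. The Python below is a minimal sketch under that assumption. `handle_message` and `build_conversation_response` are hypothetical names, `nlp` stands in for whatever call your NLP platform provides, and the field names follow the sample messages in this section.

```python
import json
from typing import Callable, Optional


def build_conversation_response(persona_id, text=""):
    """Build a one-way conversationResponse; empty text is allowed."""
    return {
        "category": "scene",
        "kind": "request",
        "name": "conversationResponse",
        "transaction": None,
        "body": {
            "personaId": persona_id,
            "output": {"text": text},
            "variables": {},
        },
    }


def handle_message(raw_message: str, nlp: Callable[[str], str]) -> Optional[str]:
    """Return the JSON reply for a conversationRequest, or None for other messages."""
    message = json.loads(raw_message)
    if message.get("name") != "conversationRequest":
        return None
    body = message.get("body", {})
    user_text = body.get("input", {}).get("text", "")
    try:
        reply_text = nlp(user_text) or ""
    except Exception:
        # Assumption: if the NLP call fails, still answer with empty output text
        # so the request does not go unmatched.
        reply_text = ""
    return json.dumps(build_conversation_response(body.get("personaId"), reply_text))
```

The returned string would be sent back over the same WebSocket connection as a text message.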

Sample Conversation Messages:

Code Block

// Example conversationRequest message (sent by the SDK):
{
    "category": "scene",
    "kind": "event",
    "name": "conversationRequest",
    "body": {
        "personaId": 1,
        "input": {
            "text": "Can I apply for a credit card?"
        },
        "variables": {
            // conversation variables
        }
    }
}

startSpeaking Request

To instruct your Digital Hero to speak a response from your NLP, you need to send a startSpeaking command from your Orchestration Server via the WebSocket connection.

Below is an example of a command message that instructs the Digital Person to speak. It is shown as a conversationResponse, with the text to speak carried in output.text.

Code Block

// Example conversationResponse message (sent by the orch server):
{
    "category": "scene",
    "kind": "request",
    "name": "conversationResponse",
    // null for a one-way request, or a unique integer if a response is wanted
    "transaction": null,
    "body": {
        "personaId": 1,
        "output": {
            "text": "Yes, I can help you apply for your credit card. Please tell me which type of card you are after, Visa or Mastercard?"
        },
        "variables": {
            // conversation variables
        }
    }
}

The text for output.text comes from your NLP response messages. The format of the NLP response message varies between vendors. Please refer to your NLP vendor documentation for more information.

The output.text property is required in the conversationResponse message, while the input.text, variables, metadata, and fallback properties are optional.

Sample Conversation Messages for Soul Machines’ Digital DNA Studio Content Blocks:

This is an example of a conversationResponse message with a Content Block of type options. In this case, conversationOptions is an arbitrary ID defined by the author; the same ID must be used in the “@ShowCards” command and in the variable definition.

The variable name needs to be prefixed with “public-”. Examples of other Content Block types can be found in the section Displaying Content in the Digital DNA Studio Default UI.

Code Block

{
   "category": "scene",
   "kind": "request",
   "name": "conversationResponse",
   "body": {
       "personaId": 1,
       "output": {
           "text": "You can choose from one of the following options @showcards(conversationOptions)"
       },
       "variables": {
           "public-conversationOptions": {
               "type": "options",
               "data": {
                   "options": [
                       { "label": "option A" },
                       { "label": "option B" },
                       { "label": "option C" }
                   ]
               }
           }
       }
   }
}
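For illustration, the message above could be assembled programmatically along these lines. The helper `build_options_response` is a hypothetical name. It simply applies the two rules stated earlier: the same ID appears in the @showcards command and in the variable name, and the variable name carries the “public-” prefix.

```python
import json


def build_options_response(persona_id, text, block_id, labels):
    """Build a conversationResponse that shows an options Content Block."""
    return {
        "category": "scene",
        "kind": "request",
        "name": "conversationResponse",
        "body": {
            "personaId": persona_id,
            # The command in the output text refers to the block by its ID.
            "output": {"text": f"{text} @showcards({block_id})"},
            "variables": {
                # The variable name is the same ID prefixed with "public-".
                f"public-{block_id}": {
                    "type": "options",
                    "data": {"options": [{"label": label} for label in labels]},
                }
            },
        },
    }


message = build_options_response(
    1,
    "You can choose from one of the following options",
    "conversationOptions",
    ["option A", "option B", "option C"],
)
print(json.dumps(message, indent=2))
```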
