Cerence listening win = Seeing win
19 Jan 2021 20:41
Cerence's patent IS EXTREMELY RELEVANT!
Contextual utterance resolution in multimodal systems
https://patents.google.com/patent/WO2020146134A1/en
What is claimed is:
1. A method of disambiguating a vocal utterance in a multimodal input system, the method comprising:
a processor storing context data;
a processor receiving a vocal utterance; and
a processor applying context data to determine the meaning of the utterance.
2. The method of claim 1, wherein the method includes a processor in an electronic vehicle system storing context data, receiving a vocal utterance and applying context data to determine the meaning of the utterance.
3. The method of claim 1, wherein the system is responsive to gaze input to determine the meaning of the utterance.
4. The method of claim 1, wherein the system is responsive to at least one of a stylus input, a haptic input, and a text input to determine the meaning of the utterance.
5. The method of claim 1, wherein the system is responsive to text input.
6. The method of claim 1, wherein the system employs embedded and cloud processing.
7. The method of claim 1, wherein the processor employs natural language processing to determine a word of the vocal utterance.
8. The method of claim 1, wherein the processor employs a recent antecedent interaction with the system as a context factor in applying context to the determination of the meaning of the utterance.
9. The method of claim 1, wherein the processor employs gaze data as a context factor in applying context to the determination of the meaning of the utterance.
10. The method of claim 1, wherein the processor employs current media playing data as a context factor in applying context to the determination of the meaning of the utterance.
11. The method of claim 1, wherein the processor employs the status of an associated system as a context factor in applying context to the determination of the meaning of the utterance.
12. The method of claim 11, wherein the processor employs the status of a vehicle as a context factor in applying context to the determination of the meaning of the utterance.
13. The method of claim 12, wherein the processor employs a sensor reading as an indication of the status of the vehicle.
14. The method of claim 1, wherein the processor employs a speech analysis technique including at least one of: voice activity detection, automatic speech recognition and natural language understanding.
15. A multimodal input system for disambiguating a vocal utterance, comprising:
a processor to store context data;
a processor to receive a vocal utterance; and
a processor to apply context data to determine the meaning of the utterance.
16. The system of claim 15, wherein the system includes a processor in an electronic vehicle system to store context data, to receive a vocal utterance, and to apply context data to determine the meaning of the utterance.
...
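
To make claim 1 a bit more concrete, here is a rough toy sketch of the three steps (store context, receive an utterance, apply context to get the meaning). This is only my own illustration, not anything taken from the patent or from Cerence code: the Context class, its field names, the word list and the priority order (gaze first, then the recent antecedent, then the media playing) are all assumptions, and the "utterance" is plain text standing in for the output of speech recognition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """Hypothetical context store; the field names are illustrative only."""
    gaze_target: Optional[str] = None     # cf. claim 9: gaze data
    last_mentioned: Optional[str] = None  # cf. claim 8: recent antecedent interaction
    now_playing: Optional[str] = None     # cf. claim 10: current media playing data


# Words treated as ambiguous in this toy example (an assumption).
AMBIGUOUS_WORDS = {"that", "this", "it"}


def resolve(utterance: str, ctx: Context) -> str:
    """Replace every ambiguous word with a referent drawn from the stored context."""
    # Assumed priority: gaze, then recent antecedent, then current media.
    referent = ctx.gaze_target or ctx.last_mentioned or ctx.now_playing
    if referent is None:
        # No context available, so the utterance is returned unchanged.
        return utterance
    words = utterance.lower().rstrip("?.!").split()
    return " ".join(referent if w in AMBIGUOUS_WORDS else w for w in words)


if __name__ == "__main__":
    ctx = Context(gaze_target="the cafe on the left", now_playing="the current song")
    print(resolve("What is that?", ctx))
    print(resolve("Who sings this?", Context(now_playing="the current song")))
```

The point of the sketch is just that the same words ("what is that?") resolve differently depending on what the system already knows, which is why the gaze input in claims 3 and 9 matters so much.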