At Ubicomp 2007, there was a book stand by Springer just outside the conference room. On the last day, the volunteer behind the stand told me that I could choose one of the books that were still lying there. I didn’t see anything interesting at first. Since a few people at our institute are working on multimodal systems, I picked the book SmartKom: Foundations of Multimodal Dialogue Systems.


During the holidays, I read the first part of the book and noticed the book was relevant for me after all. SmartKom was a large four-year project about multimodal dialogue systems. They developed a system that provides symmetric multimodality in a mixed-initiative dialogue system with an embodied conversational agent. There is also a follow-up project that should ends in 2007: SmartWeb. SmartWeb goes beyond SmartKom in supporting open-domain question answering using the entire (Semantic) Web as its knowledge base.

Symmetric multimodality means that every input mode (e.g. speech, gesture, facial expression) is also available for output, and vice versa. Multimodal interaction is one way to make interaction between humans and computers more intuitive. Human dialogue is not only based on speech but also on nonverbal communication such as gesture, gaze, facial expression, and body posture. One of the major characteristics of human-human interaction is the coordinated use of different modalities (e.g. allowing all modalities to refer to or depend upon each other). Symmetric multimodality combined with a mixed-initiative conversational agent results in more intuitive interaction. The SmartKom systems reduces recognition errors by modality fusion. By considering multiple input modalities together (e.g. speech, facial expression and gesture), the system can more correctly estimate the user’s intention.

See also  What is the best gaming chair for adults?

SmartKom has been used in several application scenarios: in public telephone booths, home entertainment systems, mobile systems and in a car environment. The last part of the book discusses techniques to evaluate multimodal dialogue systems, which should be an interesting read.

Leave a Comment