[August 09, 2000] MPML (Multimodal Presentation Markup Language) is an XML-based markup language "developed to enable the description of multimodal presentations based on character agents in an easier way. MPML allows users to write attractive multimodal presentations easily." MPML is under development by Yuan Zong at the Ishizuka Lab (Department of Information and Communication Engineering, School of Engineering, University of Tokyo). Functionally, the Multimodal Presentation Markup Language bears several similarities to the Synchronized Multimedia Integration Language (SMIL). A description of MPML is provided in several workshop papers and published articles. In summary: "As a new style of effective information presentation and a new method of multimodal information content production on the WWW, multimodal presentations using interactive lifelike agents with verbal conversation capability appear to be very attractive and important. For this purpose, we have developed MPML (Multimodal Presentation Markup Language), which allows many users to write attractive multimodal presentations easily. MPML is a markup language that conforms to XML (Extensible Markup Language) and which supports functions for controlling verbal presentations and scripting agent behaviors." [IPSJ TRANSACTION 41/4]
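For illustration, the following is a minimal sketch of what an MPML script might look like, based on the description above. The element and attribute names (mpml, head, title, body, scene, agent, play, speak, move) are assumptions chosen for illustration only; the normative syntax is defined in the MPML Ver. 1.030 specification listed in the references below.

    <?xml version="1.0"?>
    <!-- Hypothetical MPML-style script: a character agent greets the viewer,
         speaks a short presentation, and moves across the page. -->
    <mpml>
      <head>
        <title>Sample multimodal presentation</title>
      </head>
      <body>
        <scene agent="peedy" ref="http://www.example.org/slide1.html">
          <play act="Greet"/>
          <speak>Welcome to this presentation about MPML.</speak>
          <move x="120" y="240"/>
          <speak>Character agents can present Web content with speech and gestures.</speak>
        </scene>
      </body>
    </mpml>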
The development Web site provides an MPML Player (ViewMpml): "When running, this MPML Player calls the Internet Explorer ActiveX server, using ActiveX technology. The MPML Player utilizes Microsoft Agent to perform the multimodal presentation."
A broader research topic at the School of Engineering is 'Multimodal Anthropomorphic Agent System and Media Processing': "As a promising new style of human interface beyond the currently dominant GUI (Graphical User Interface), we are working on research and development of a multimodal anthropomorphic interface agent called VSA (Visual Software Agent), which has a realistic moving face, a vision function, speech communication capability, and an access function to information sources on the Internet. The VSA, which is technically realized by integrating several technologies such as real-time image synthesis/recognition, speech recognition/synthesis, dialogue management, and access technologies to the Internet and WWW (World Wide Web), enables a friendly multimodal human interface environment close to daily face-to-face dialogue. We have constructed an experimental guidance system with this VSA system and multimedia presentation. An important technology that has made VSA very practical is its connection with WWW browsers (initially Mosaic, then Netscape). With this connection, we can access vast WWW information sources through speech dialogues with our VSA as well as mouse operations. Since 1998, not only as an interface but also as an attractive multimodal content creation tool, we have been developing MPML (Multimodal Presentation Markup Language) and its related systems using animated character agents." See also the 1998 paper referencing MPML, which is designed to "allow an easy and uniform high-level description for the multimodal presentations employing various lifelike agents": "Figure 27 illustrates a simple MPML script description. As shown in Fig. 28, the MPML script is converted to operate our VSA/VPA, Microsoft Agent, and other agent systems such as ASHOW of ETL and TVML of NHK and Hitachi. With this tool, we expect that everyone will be able to easily write his/her (interactive) multimodal presentations like writing HTML texts, and dispatch them through the WWW." [Multimodal Anthropomorphic Agent Connected with WWW Information Space]
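The conversion idea quoted above (one high-level script driving several character systems) can be pictured with the sketch below. It is illustrative only: the element names and the Microsoft Agent call shown in the comment are assumptions, not the published mapping used by the group's converters.

    <!-- Hypothetical agent-independent MPML fragment. -->
    <scene agent="merlin">
      <speak>Hello, I will guide you through this page.</speak>
      <!-- A converter targeting Microsoft Agent would roughly realize this as
           Characters("Merlin").Speak("Hello, I will guide you through this page."),
           while converters for VSA/VPA, ASHOW, or TVML would emit their own
           script commands for the same high-level description. -->
    </scene>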
References:
Multimodal Presentation Markup Language (MPML) Ver. 1.030 Specification
[August 09, 2000] "A Multimodal Presentation Markup Language MPML with Controlling Functions of Character Agent." By Takayuki Tsutsui (School of Engineering, University of Tokyo ) and Mitsuru Ishizuka (NTT DATA Corporation). Pages 1124-1133 in Transactions of the Information Processing Society of Japan [IPSJ TRANSACTION] Volume 41, Number 4 (April 2000). "As a new style of effective information presentation and a new method of multimodal information content production on the WWW, multimodal presentations using interactive lifelike agents with verbal conversation capability appear to be very attractive and important. For this purpose, we have developed MPML (Multimodal Presentation Markup Language), which allows many users to write attractive multimodal presentations easily. MPML is a markup language that conforms to XML (Extensible Markup Language) and which supports functions for controlling verbal presentations and scripting agent behaviors. In this paper, we present MPML and its related tools for producing multimodal presentations."
"MPML: A Multimodal Presentation Markup Language with Character Agent Control Functions." By Mitsuru Ishizuka, Takayuki Tsutsui, Santi Saeyor, Hiroshi Dohi, Yuan Zong, and Helmut Prendinger. Paper presented at the workshop "Achieving Human-Like Behavior in Interactive Animated Agents" in conjunction with The Fourth International Conference on Autonomous Agents (June 3, 2000, Barcelona, Spain). [cache]
"Emotion Expression Functions attached to Multimodal Presentation Markup Language (MPML)." By Yuan Zong, Hiroshi Dohi, and Mitsuru Ishizuka. Paper presented at the workshop "Achieving Human-Like Behavior in Interactive Animated Agents" in conjunction with The Fourth International Conference on Autonomous Agents (June 3, 2000, Barcelona, Spain). [cache]
"Multimodal Anthropomorphic Agent Connected with WWW Information Space." By Mitsuru Ishizuka and Hiroshi Dohi. "Multimodal anthropomorphic (or lifelike) agent interfaces are emerging as a promising new style of human interface beyond current GUI (graphical user interface). In this paper, we present an outline of our multimodal anthropomorphic agent system called VSA (visual software agent), which has a realistic moving face and functions of eye, ear, mouth and dialogues. A notable feature of VSA is a connection with a WWW browser; thereby, VSA allows us to access the vast information sources in the WWW through its friendly multimodal interface close to daily face-to-face communication. As an extension of VSA, we also introduce our VPA (visual page agent), aiming at a tool for making new multimodal (semi-interactive) Web contents."