Summary

As we move toward a future of ubiquitous computing, it is important for interfaces to incorporate multimodal input such as touch, speech, hand gestures, and body movements. Recent advancements in Generative AI have enabled interfaces to combine natural language understanding with direct manipulation. In this talk, I will present my recent research on advancing multimodal interfaces with AI. This includes Large Language Model (LLM)-based speech input for text composition; Natural Language Processing (NLP)-assisted text selection; and a Large Vision Language Model (LVLM)-based VR scene authoring tool. These solutions build on an empirical understanding of input modalities and challenge existing assumptions while leveraging new technologies. I hope they will spark fruitful discussions with the audience about developing future interfaces that leverage AI.

Vita

Can Liu is an HCI researcher, currently serving as an associate professor at the School of Creative Media, City University of Hong Kong. She leads a research group in Hong Kong named the ERFI lab (Laboratory of Empirical Research for Future Interfaces). Her research interests include speech-based multimodal interfaces, mobile computing, spatial computing, and group collaboration interfaces. Her work has been supported by the Hong Kong Research Grants Council, the National Natural Science Foundation of China, the Google Faculty Award, and the Berkeley Artificial Intelligence Research Lab. Before joining CityU, she earned her PhD in Computer Science from the Université Paris-Saclay and worked as a postdoctoral researcher at University College London. She has published more than 30 papers in top-tier HCI conferences and journals, including ACM CHI, CSCW, IMWUT, and DIS. Additionally, she actively participates in program committees of HCI venues and helps organize HCI activities both in China and internationally.