Abstract: Multimodal intent recognition utilizes heterogeneous modalities such as visual, auditory, and textual cues to infer user intent, serving as a pivotal component in human-machine interaction.
Abstract: Multi-View Stereo (MVS) reconstructs detailed 3D structures from multi-view images by establishing spatial correspondences. While learning-based methods have significantly advanced the MVS ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results