Abstract: The accurate extraction of feature points has a significant impact on the pose estimation of a visual SLAM system. In weakly textured environments such as underground parking lots, robust ...
Abstract: Visual question answering (VQA) is a multimodal task that answers a question related to an image. Existing VQA methods tend to focus on the target object at the visual level and ignore the ...