This is the official implementation of the paper 'Adaptive Keyframe Sampling for Long Video Understanding', which was accepted at CVPR 2025. Multimodal large language models (MLLMs) have enabled open-world ...
Abstract: With the massive growth of video data on the internet, users need a strategy to quickly browse video content. Extracting the key information from the redundant information of video data is a ...
Abstract: To empower mobile robots with usable maps as well as highest state estimation accuracy and robustness, we present OKVIS2-X: a state-of-the-art multisensor simultaneous localization and ...