Abstract: Traditional video compression methods perform well at high bitrates but struggle to preserve fine-grained semantic information at low bitrates. Recently, with the blossoming of Multimodal ...