Abstract: Text-based Visual Question Answering (TextVQA) is a subfield of Visual Question Answering (VQA) that is able to read the text in a given image. Existing work on TextVQA usually improves ...