Abstract: Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks. However, they are sensitive to the choice of input text prompts and ...
install                Install one, more, or all versions from a python-build-standalone release.
update (or upgrade)    Update one, more, or all versions to another release.
remove (or uninstall)  Remove/uninstall one, ...