hezar.models.image2text.vit_roberta package
Submodules
- hezar.models.image2text.vit_roberta.vit_roberta_image2text module
  - ViTRobertaImage2Text
    - ViTRobertaImage2Text.compute_loss()
    - ViTRobertaImage2Text.decoder
    - ViTRobertaImage2Text.encoder
    - ViTRobertaImage2Text.forward()
    - ViTRobertaImage2Text.generate()
    - ViTRobertaImage2Text.image_processor
    - ViTRobertaImage2Text.is_generative
    - ViTRobertaImage2Text.loss_func_name
    - ViTRobertaImage2Text.post_process()
    - ViTRobertaImage2Text.preprocess()
    - ViTRobertaImage2Text.required_backends
- hezar.models.image2text.vit_roberta.vit_roberta_image2text_config module
  - DecoderConfig
    - DecoderConfig.add_cross_attention
    - DecoderConfig.attention_probs_dropout_prob
    - DecoderConfig.bos_token_id
    - DecoderConfig.classifier_dropout
    - DecoderConfig.eos_token_id
    - DecoderConfig.gradient_checkpointing
    - DecoderConfig.hidden_act
    - DecoderConfig.hidden_dropout_prob
    - DecoderConfig.hidden_size
    - DecoderConfig.initializer_range
    - DecoderConfig.intermediate_size
    - DecoderConfig.is_decoder
    - DecoderConfig.layer_norm_eps
    - DecoderConfig.max_position_embeddings
    - DecoderConfig.num_attention_heads
    - DecoderConfig.num_hidden_layers
    - DecoderConfig.pad_token_id
    - DecoderConfig.position_embedding_type
    - DecoderConfig.type_vocab_size
    - DecoderConfig.use_cache
    - DecoderConfig.vocab_size
  - EncoderConfig
    - EncoderConfig.attention_probs_dropout_prob
    - EncoderConfig.encoder_stride
    - EncoderConfig.hidden_act
    - EncoderConfig.hidden_dropout_prob
    - EncoderConfig.hidden_size
    - EncoderConfig.image_size
    - EncoderConfig.initializer_range
    - EncoderConfig.intermediate_size
    - EncoderConfig.layer_norm_eps
    - EncoderConfig.num_attention_heads
    - EncoderConfig.num_channels
    - EncoderConfig.num_hidden_layers
    - EncoderConfig.patch_size
    - EncoderConfig.qkv_bias
  - GenerationConfig
    - GenerationConfig.bos_token_id
    - GenerationConfig.decoder_start_token_id
    - GenerationConfig.early_stopping
    - GenerationConfig.eos_token_id
    - GenerationConfig.length_penalty
    - GenerationConfig.max_length
    - GenerationConfig.no_repeat_ngram_size
    - GenerationConfig.num_beams
    - GenerationConfig.pad_token_id
    - GenerationConfig.return_dict_in_generate
  - ViTRobertaImage2TextConfig