hezar.models.image2text.vit_roberta.vit_roberta_image2text module
- class hezar.models.image2text.vit_roberta.vit_roberta_image2text.ViTRobertaImage2Text(config: ViTRobertaImage2TextConfig, **kwargs)
Bases: Model
Image-to-text model that pairs a ViT encoder with a RoBERTa decoder.
- compute_loss(logits: Tensor, labels: Tensor) → Tensor
Compute the loss of the model outputs against the given labels.
- Parameters:
logits – Logits tensor produced by the model
labels – Target labels tensor
- Returns:
Loss tensor
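This page does not say which loss function compute_loss applies. Purely as an illustration, and assuming a token-level cross-entropy (a common choice for sequence generation), the pure-Python sketch below shows how logits and labels could combine into a single loss value. The names cross_entropy and ignore_index are hypothetical and are not part of the hezar API.

```python
import math


def cross_entropy(logits, labels, ignore_index=-100):
    """Mean token-level cross-entropy over one sequence (illustrative only).

    logits: list of per-position score lists, each of vocabulary length
    labels: list of target token ids, one per position
    """
    total, count = 0.0, 0
    for scores, target in zip(logits, labels):
        if target == ignore_index:  # assumed convention for padded positions
            continue
        # log-sum-exp with the maximum subtracted for numerical stability
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    return total / count if count else 0.0


# Uniform scores over a 4-token vocabulary give a loss of ln(4) per position.
loss = cross_entropy([[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]], [1, 3])
```

In a real model both arguments would be tensors and the reduction would be done by the deep learning framework; the list-based version only makes the arithmetic visible.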
- property decoder
- property encoder
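- forward(pixel_values, decoder_input_ids=None, decoder_attention_mask=None, encoder_outputs=None, past_key_values=None, decoder_inputs_embeds=None, use_cache=None, output_attentions=None, output_hidden_states=None, **kwargs)
Forward the inputs through the model and return the outputs (logits, etc.).
- Parameters:
pixel_values – Preprocessed image tensor passed to the encoder
decoder_input_ids – Token IDs passed to the decoder
decoder_attention_mask – Attention mask for the decoder inputs
encoder_outputs – Precomputed encoder outputs, if available
past_key_values – Cached key/value states from earlier decoding steps
decoder_inputs_embeds – Decoder input embeddings, as an alternative to decoder_input_ids
use_cache – Whether to use the key/value cache
output_attentions – Whether to return attention weights
output_hidden_states – Whether to return hidden states
**kwargs – Extra keyword arguments
- Returns:
A dict of outputs such as logits and loss.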
- generate(pixel_values, generation_config=None, **kwargs)
Generation method for all generative models. Generative models have the is_generative attribute set to True. The behavior of this method is usually controlled by the generation section of the model’s config.
- Parameters:
pixel_values – Preprocessed image tensor, the same input that forward takes
generation_config – Optional generation config
**kwargs – Generation kwargs
- Returns:
Generated output tensor
- image_processor = 'image_processor'
- is_generative: bool = True
- post_process(model_outputs: Tensor, **kwargs)
Process the model outputs and return human-readable results. Called in self.predict().
- Parameters:
model_outputs – Model outputs to process
**kwargs – Extra arguments specific to the derived class
- Returns:
The model outputs converted to human-readable results
- preprocess(inputs: List[str] | List[np.ndarray] | List['Image'] | List[torch.Tensor], **kwargs)
Preprocess raw inputs and prepare them for the model’s forward().
- Parameters:
inputs – Raw model inputs
**kwargs – Extra kwargs specific to the model. See the model’s specific class for more info
- Returns:
A dict of inputs for the model’s forward()
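The methods above are designed to be chained: post_process is called in self.predict(), and preprocess prepares inputs for the model. The toy class below sketches one plausible flow (preprocess, then generate because is_generative is True, then post_process). It is a stand-in written with plain Python lists, not the hezar implementation; the vocabulary, the fixed token IDs, and the {"text": ...} output shape are all assumptions made for illustration.

```python
class ToyImage2Text:
    """Stand-in that mimics the method layout; not the hezar class."""

    is_generative = True
    vocab = {0: "<s>", 1: "a", 2: "cat", 3: "</s>"}  # hypothetical vocabulary

    def preprocess(self, inputs):
        # Real code would load and normalize images; here each "image" is
        # already a list of numbers and is simply wrapped in a dict.
        return {"pixel_values": [list(image) for image in inputs]}

    def generate(self, pixel_values):
        # Fake generation: emit the same token ids for every image.
        return [[0, 1, 2, 3] for _ in pixel_values]

    def post_process(self, model_outputs):
        # Decode ids to text, dropping the special start/end tokens.
        special = {"<s>", "</s>"}
        texts = []
        for ids in model_outputs:
            words = [self.vocab[i] for i in ids if self.vocab[i] not in special]
            texts.append({"text": " ".join(words)})
        return texts

    def predict(self, inputs):
        # Assumed order: preprocess -> generate -> post_process.
        model_inputs = self.preprocess(inputs)
        outputs = self.generate(**model_inputs)
        return self.post_process(outputs)


results = ToyImage2Text().predict([[0.1, 0.2], [0.3, 0.4]])
print(results)
```

Because preprocess returns a dict keyed by pixel_values, its output can be unpacked directly into generate, which is why the two signatures share that argument name.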
- required_backends: List[Backends | str] = [Backends.TRANSFORMERS, Backends.TOKENIZERS, Backends.PILLOW]
- tokenizer_name = 'bpe_tokenizer'
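A minimal usage sketch, assuming the model is loaded through hezar's generic Model.load entry point and run with predict. The helper name caption_images and the hub_path argument are placeholders introduced here; no specific checkpoint name is implied. The backends listed in required_backends must be installed for the call to succeed.

```python
def caption_images(image_paths, hub_path):
    """Load an image-to-text model and caption the given images.

    image_paths: list of image file paths
    hub_path: placeholder for the model's Hub ID or local path
    """
    # Imported lazily so this module can be imported without hezar installed.
    from hezar.models import Model

    model = Model.load(hub_path)
    # predict is expected to run preprocessing, generation and
    # post-processing and return human-readable results.
    return model.predict(image_paths)
```

The exact structure of the returned results is not documented on this page, so the sketch returns them unchanged.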