hezar.models.backbone.vit.vit module
- class hezar.models.backbone.vit.vit.ViT(config: ViTConfig, **kwargs)
Bases: Model
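- forward(pixel_values=None, bool_masked_pos=None, head_mask=None, output_attentions=None, output_hidden_states=None, interpolate_pos_encoding=None)
Run a forward pass on the given inputs and return the model outputs.
- Parameters:
pixel_values, bool_masked_pos, head_mask, output_attentions, output_hidden_states, interpolate_pos_encoding – The model inputs, passed as individual keyword arguments (the signature has no single model_inputs argument); each defaults to None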
- Returns:
A dict of outputs like logits, loss, etc.
- image_processor = 'image_processor'
- post_process(model_outputs: Dict[str, Tensor])
Process model outputs and return human-readable results. Called in self.predict().
- Parameters:
model_outputs – Model outputs to process
**kwargs – Extra arguments specific to the derived class (the ViT signature shown above does not accept any)
- Returns:
Model outputs converted to human-readable results
- preprocess(inputs: List[str | ndarray | Image | Tensor], **kwargs)
Given raw inputs, preprocess them and prepare them for the model’s forward().
- Parameters:
inputs – Raw model inputs
**kwargs – Extra kwargs specific to the model. See the model’s specific class for more info
- Returns:
A dict of inputs for model forward
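
The three methods above fit together as a pipeline: preprocess() builds a dict of inputs, forward() consumes it, and post_process() converts the outputs. The sketch below illustrates that flow only. ToyBackbone, its scale keyword, the "features" key, and the body of predict() are hypothetical stand-ins written for this example; none of them is hezar code, and the real ViT class works on image tensors through its image processor.

```python
from typing import Any, Dict, List


class ToyBackbone:
    """Hypothetical stand-in mirroring the method layout above; not hezar's ViT."""

    def preprocess(self, inputs: List[List[float]], **kwargs) -> Dict[str, Any]:
        # Turn raw inputs into the keyword arguments that forward() expects.
        scale = kwargs.get("scale", 1.0)  # assumed option, for illustration only
        pixel_values = [[value * scale for value in row] for row in inputs]
        return {"pixel_values": pixel_values}

    def forward(self, pixel_values=None, **kwargs) -> Dict[str, Any]:
        # Toy "model": the feature of each input is the mean of its values.
        features = [sum(row) / len(row) for row in pixel_values]
        return {"features": features}

    def post_process(self, model_outputs: Dict[str, Any]) -> List[Dict[str, float]]:
        # Convert raw outputs into human-readable records.
        return [{"mean": round(feature, 3)} for feature in model_outputs["features"]]

    def predict(self, inputs: List[List[float]], **kwargs) -> List[Dict[str, float]]:
        # Assumed chaining order: preprocess -> forward -> post_process.
        model_inputs = self.preprocess(inputs, **kwargs)
        model_outputs = self.forward(**model_inputs)
        return self.post_process(model_outputs)


result = ToyBackbone().predict([[0.0, 1.0], [2.0, 4.0]])
print(result)
```

Because preprocess() returns a dict keyed by forward()'s argument names, predict() can unpack it with `**model_inputs`, which is one plausible reason for the "dict of inputs" return type documented above.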