hezar.configs module¶
Configs are at the core of Hezar. All core modules like Model, Preprocessor, Trainer, etc. take their parameters as a config container, which is an instance of Config or one of its derivatives. A Config is a Python dataclass with auxiliary methods for loading, saving, uploading to the Hub, etc.
Examples
>>> from hezar.configs import ModelConfig
>>> config = ModelConfig.load("hezarai/bert-base-fa")
>>> from hezar.models import BertMaskFillingConfig
>>> bert_config = BertMaskFillingConfig(vocab_size=50000, hidden_size=768)
>>> bert_config.save("saved/bert", filename="model_config.yaml")
>>> bert_config.push_to_hub("hezarai/bert-custom", filename="model_config.yaml")
- class hezar.configs.Config[source]¶
Bases:
object
Base class for all configs in Hezar.
All configs are simple dataclasses with some customized functionality to manage their attributes, plus a few Hezar-specific methods: load, save, and push_to_hub.
- config_type: str = 'base'¶
- dict()[source]¶
Returns the config object as a dictionary (works on nested dataclasses too)
- Returns:
The config object as a dictionary
- classmethod from_dict(dict_config: Dict | DictConfig, **kwargs)[source]¶
Load config from a dict-like object. Nested configs are also recursively converted to their classes if possible.
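As a quick illustration, a config can be round-tripped through a plain dictionary using dict() and from_dict() (a minimal sketch reusing the example config class from above; the field values are illustrative):

>>> from hezar.models import BertMaskFillingConfig
>>> bert_config = BertMaskFillingConfig(vocab_size=50000, hidden_size=768)
>>> config_dict = bert_config.dict()  # nested configs are converted to dicts recursively
>>> restored = BertMaskFillingConfig.from_dict(config_dict)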
- classmethod load(hub_or_local_path: str | PathLike, filename: str | None = None, subfolder: str | None = None, repo_type: str | None = None, cache_dir: str | None = None, **kwargs) Config [source]¶
Load config from Hub or locally if it already exists on disk (handled by HfApi)
- Parameters:
hub_or_local_path – Local or Hub path for the config
filename – Configuration filename
subfolder – Optional subfolder path where the config is in
repo_type – Repo type, e.g., model, dataset, etc.
cache_dir – Path to cache directory
**kwargs – Manual config parameters to override
- Returns:
A Config instance
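For example, loading a config from the Hub or from a previously saved local folder (a minimal sketch; the local path and filename follow the save example at the top of this page):

>>> from hezar.configs import ModelConfig
>>> config = ModelConfig.load("hezarai/bert-base-fa")
>>> local_config = ModelConfig.load("saved/bert", filename="model_config.yaml")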
- name: str = None¶
- push_to_hub(repo_id: str, filename: str, subfolder: str | None = None, repo_type: str | None = 'model', skip_none_fields: bool | None = True, private: bool | None = False, commit_message: str | None = None)[source]¶
Push the config file to the hub
- Parameters:
repo_id (str) – Repo name or id on the Hub
filename (str) – config file name
subfolder (str) – subfolder to save the config
repo_type (str) – Type of the repo, e.g., model, dataset, space
skip_none_fields (bool) – Whether to skip saving None values or not
private (bool) – Whether the repo should be private or not (ignored if the repo already exists)
commit_message (str) – Push commit message
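A minimal usage sketch, reusing bert_config from the examples above (the repo id, subfolder, and commit message are illustrative and assume you have write access to that repo):

>>> bert_config.push_to_hub(
...     "hezarai/bert-custom",
...     filename="model_config.yaml",
...     subfolder="configs",
...     commit_message="Upload custom BERT config",
... )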
- save(save_dir: str | PathLike, filename: str, subfolder: str | None = None, skip_none_fields: bool | None = True)[source]¶
Save the *config.yaml file to a local path
- Parameters:
save_dir – Save directory path
filename – Config file name
subfolder – Subfolder to save the config file
skip_none_fields (bool) – Whether to skip saving None values or not
- update(d: dict, **kwargs)[source]¶
Update config with a given dictionary or keyword arguments. If a key does not exist in the attributes, prints a warning but sets it anyway.
- Parameters:
d – A dictionary
**kwargs – Key/value pairs in the form of keyword arguments
- Returns:
The config object itself (the update is also applied in-place)
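A minimal usage sketch (field names follow the example config above; an unknown key would only trigger a warning before being set):

>>> from hezar.models import BertMaskFillingConfig
>>> config = BertMaskFillingConfig(vocab_size=50000, hidden_size=768)
>>> config.update({"hidden_size": 1024}, vocab_size=42000)  # applied in-place, also returns the config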
- class hezar.configs.DatasetConfig(path: str | None = None, task: TaskType | List[TaskType] | None = None, max_size: int | float | None = None, hf_load_kwargs: dict | None = None)[source]¶
Bases:
Config
Base dataclass for all dataset configs
- Parameters:
path (str) – Path to the dataset, either on the Hub or local. Supported syntax is either <path> or <path>:<name>, where <name> is passed as the name argument to load_dataset()
task (str) – A supported task for the dataset
max_size (int | float) – Maximum number of data samples. Overrides the dataset's full length when calling len(dataset). If set to a float value between 0 and 1, it is interpreted as a fraction, e.g., 0.3 means 30% of the whole length.
hf_load_kwargs (dict) – Keyword arguments to pass to the HF datasets.load_dataset()
- config_type: str = 'dataset'¶
- hf_load_kwargs: dict = None¶
- max_size: int | float = None¶
- name: str = None¶
- path: str = None¶
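Concrete dataset configs subclass this base; as a minimal sketch, the shared fields can be set like below (the dataset path and values are illustrative, and TaskType is assumed to be importable from hezar.constants):

>>> from hezar.configs import DatasetConfig
>>> from hezar.constants import TaskType
>>> dataset_config = DatasetConfig(
...     path="hezarai/sentiment-dksf",
...     task=TaskType.TEXT_CLASSIFICATION,
...     max_size=0.3,  # use 30% of the samples
... )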
- class hezar.configs.EmbeddingConfig(bypass_version_check: bool = False)[source]¶
Bases:
Config
Base dataclass for all embedding configs
- bypass_version_check: bool = False¶
- config_type: str = 'embedding'¶
- name: str = None¶
- class hezar.configs.MetricConfig(objective: Literal['maximize', 'minimize'] | None = None, output_keys: List | Tuple | None = None, n_decimals: int = 4)[source]¶
Bases:
Config
Base dataclass config for all metric configs
- config_type: str = 'metric'¶
- n_decimals: int = 4¶
- name: str = None¶
- objective: Literal['maximize', 'minimize'] = None¶
- output_keys: List | Tuple = None¶
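A minimal sketch with illustrative values (concrete metric configs subclass this base and usually fill these fields themselves):

>>> from hezar.configs import MetricConfig
>>> metric_config = MetricConfig(objective="maximize", output_keys=("f1",), n_decimals=3)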
- class hezar.configs.ModelConfig[source]¶
Bases:
Config
Base dataclass for all model configs
- config_type: str = 'model'¶
- name: str = None¶
- class hezar.configs.PreprocessorConfig[source]¶
Bases:
Config
Base dataclass for all preprocessor configs
- config_type: str = 'preprocessor'¶
- name: str = None¶
- class hezar.configs.TrainerConfig(output_dir: str, task: str | TaskType, device: str = 'cuda', num_epochs: int | None = None, init_weights_from: str | None = None, resume_from_checkpoint: bool | str | PathLike | None = None, max_steps: int | None = None, num_dataloader_workers: int = 0, dataloader_shuffle: bool = True, seed: int = 42, optimizer: str | OptimizerType | None = None, learning_rate: float = 2e-05, weight_decay: float = 0.0, lr_scheduler: str | LRSchedulerType | None = None, lr_scheduler_kwargs: Dict[str, Any] | None = None, lr_scheduling_steps: int | None = None, batch_size: int | None = None, eval_batch_size: int | None = None, gradient_accumulation_steps: int = 1, distributed: bool = False, mixed_precision: PrecisionType | str | None = None, use_cpu: bool = False, do_evaluate: bool = True, evaluate_with_generate: bool = True, metrics: List[str | MetricConfig] | None = None, metric_for_best_model: str = 'loss', save_enabled: bool = True, save_freq: int = 'deprecated', save_steps: int | None = None, log_steps: int | None = None, checkpoints_dir: str = 'checkpoints', logs_dir: str = 'logs')[source]¶
Bases:
Config
Base dataclass for all trainer configs
- Parameters:
task (str, TaskType) – The training task. Must be a valid name from TaskType.
output_dir (str) – Path to the directory to save trainer properties.
device (str) – Hardware device, e.g., cuda:0, cpu, etc.
num_epochs (int) – Number of total epochs to train the model.
init_weights_from (str) – Path to a model on disk or on the Hub to load the initial weights from. Note that this only loads the model weights and ignores other checkpoint-related states if the path is a checkpoint. To resume training from a checkpoint, use the resume_from_checkpoint parameter.
resume_from_checkpoint (bool, str, os.PathLike) – Resume training from a checkpoint. If set to True, the trainer will load the latest checkpoint, otherwise if a path to a checkpoint is given, it will load that checkpoint and all the other states corresponding to that checkpoint.
max_steps (int) – Maximum number of iterations to train. This helps to limit how many batches you want to train in total.
num_dataloader_workers (int) – Number of dataloader workers, defaults to 0.
dataloader_shuffle (bool) – Controls the dataloaders' shuffle argument.
seed (int) – Controls determinism of the run by setting a seed value. Defaults to 42.
optimizer (OptimizerType) – Name of the optimizer; available values are the members of the OptimizerType enum.
learning_rate (float) – Initial learning rate for the optimizer.
weight_decay (float) – Optimizer weight decay value.
lr_scheduler (LRSchedulerType) – Optional learning rate scheduler; available values are the members of the LRSchedulerType enum.
lr_scheduler_kwargs (Dict[str, Any]) – LR scheduler constructor kwargs, depending on the scheduler type
lr_scheduling_steps (int) – Number of steps to perform scheduler stepping. If left as None, will default to the steps in one full epoch.
batch_size (int) – Training batch size.
eval_batch_size (int) – Evaluation batch size, defaults to batch_size if None.
gradient_accumulation_steps (int) – Number of update steps to accumulate before performing a backward/update pass, defaults to 1.
distributed (bool) – Whether to use distributed training (via the accelerate package)
mixed_precision (PrecisionType | str) – Mixed precision type, e.g., fp16, bf16, etc. (disabled by default)
use_cpu (bool) – Whether to train using the CPU only even if CUDA is available.
do_evaluate (bool) – Whether to run evaluation when calling Trainer.train
evaluate_with_generate (bool) – Whether to use generate() in the evaluation step (only applicable to generative models).
metrics (List[str | MetricConfig]) – A list of metrics. Valid values depend on the valid_metrics of the Trainer's specific MetricsHandler.
metric_for_best_model (str) – Reference metric key to watch for determining the best model. It is recommended to use a train. or evaluation. prefix (e.g., evaluation.f1, train.accuracy, etc.); if no prefix is given, it defaults to evaluation.{metric_for_best_model}.
save_freq (int) (DEPRECATED) – Deprecated and renamed to save_steps.
save_enabled (bool) – Whether to save checkpoints at all. Setting this to False also disables the in-between-epoch saves.
save_steps (int) – Save the trainer outputs every save_steps steps. Leave as None to ignore saving in-between training steps. If set to a float value between 0 and 1, it will be interpreted as a fraction of the total steps.
log_steps (int) – Save training metrics every log_steps steps. If set to a float value between 0 and 1, it will be interpreted as a fraction of the total steps.
checkpoints_dir (str) – Path to the checkpoints’ folder. The actual files will be saved under {output_dir}/{checkpoints_dir}.
logs_dir (str) – Path to the logs’ folder. The actual log files will be saved under {output_dir}/{logs_dir}.
- batch_size: int = None¶
- checkpoints_dir: str = 'checkpoints'¶
- config_type: str = 'trainer'¶
- dataloader_shuffle: bool = True¶
- device: str = 'cuda'¶
- distributed: bool = False¶
- do_evaluate: bool = True¶
- eval_batch_size: int = None¶
- evaluate_with_generate: bool = True¶
- gradient_accumulation_steps: int = 1¶
- init_weights_from: str = None¶
- learning_rate: float = 2e-05¶
- log_steps: int = None¶
- logs_dir: str = 'logs'¶
- lr_scheduler: str | LRSchedulerType = None¶
- lr_scheduler_kwargs: Dict[str, Any] = None¶
- lr_scheduling_steps: int = None¶
- max_steps: int = None¶
- metric_for_best_model: str = 'loss'¶
- metrics: List[str | MetricConfig] = None¶
- mixed_precision: PrecisionType | str | None = None¶
- name: str = 'trainer'¶
- num_dataloader_workers: int = 0¶
- num_epochs: int = None¶
- optimizer: str | OptimizerType = None¶
- output_dir: str¶
- resume_from_checkpoint: bool | str | PathLike = None¶
- save_enabled: bool = True¶
- save_freq: int = 'deprecated'¶
- save_steps: int = None¶
- seed: int = 42¶
- use_cpu: bool = False¶
- weight_decay: float = 0.0¶
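Putting it together, a typical trainer config might look like the following (a minimal sketch; the output directory, task string, metric names, and hyperparameter values are illustrative and must match what your dataset, model, and metrics handler support):

>>> from hezar.configs import TrainerConfig
>>> trainer_config = TrainerConfig(
...     output_dir="bert-fa-sentiment",
...     task="text_classification",
...     device="cuda",
...     num_epochs=3,
...     batch_size=8,
...     learning_rate=2e-5,
...     metrics=["f1"],
...     metric_for_best_model="evaluation.f1",
... )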