hezar.configs module¶
Configs are at the core of Hezar. All core modules like Model, Preprocessor, Trainer, etc. take their parameters as a config container which is an instance of Config or its derivatives. A Config is a Python dataclass with auxiliary methods for loading, saving, uploading to the hub, etc.
Examples
>>> from hezar.configs import ModelConfig
>>> config = ModelConfig.load("hezarai/bert-base-fa")
>>> from hezar.models import BertMaskFillingConfig
>>> bert_config = BertMaskFillingConfig(vocab_size=50000, hidden_size=768)
>>> bert_config.save("saved/bert", filename="model_config.yaml")
>>> bert_config.push_to_hub("hezarai/bert-custom", filename="model_config.yaml")
- class hezar.configs.Config[source]¶
- Bases: object
- Base class for all configs in Hezar.
- All configs are simple dataclasses that have some customized functionalities to manage their attributes. There are also some Hezar-specific methods: load, save and push_to_hub.
 - config_type: str = 'base'¶
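In practice, a config is defined as a plain dataclass that subclasses Config (or one of its derivatives below). A minimal sketch, assuming hypothetical field names:
>>> from dataclasses import dataclass
>>> from hezar.configs import Config
>>> @dataclass
... class MyConfig(Config):
...     name: str = "my_config"   # registry name for this config (hypothetical)
...     num_layers: int = 4       # arbitrary custom field
>>> config_dict = MyConfig(num_layers=8).dict()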
 - dict()[source]¶
- Returns the config object as a dictionary (works on nested dataclasses too)
- Returns:
- The config object as a dictionary 
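For example, converting a loaded config to a plain dictionary (reusing the Hub path from the Examples section above):
>>> from hezar.configs import ModelConfig
>>> config = ModelConfig.load("hezarai/bert-base-fa")
>>> config_dict = config.dict()  # nested configs become plain nested dicts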
 
 - classmethod from_dict(dict_config: Dict | DictConfig, **kwargs)[source]¶
- Load config from a dict-like object. Nested configs are also recursively converted to their classes if possible. 
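Conversely, a config can be rebuilt from a dict-like object. A short sketch using the BERT config from the Examples section (field values are illustrative):
>>> from hezar.models import BertMaskFillingConfig
>>> bert_config = BertMaskFillingConfig.from_dict({"vocab_size": 50000, "hidden_size": 768})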
 - classmethod load(hub_or_local_path: str | PathLike, filename: str | None = None, subfolder: str | None = None, repo_type: str | None = None, cache_dir: str | None = None, **kwargs) Config[source]¶
- Load config from Hub or locally if it already exists on disk (handled by HfApi)
- Parameters:
- hub_or_local_path – Local or Hub path for the config 
- filename – Configuration filename 
- subfolder – Optional subfolder path where the config is in 
- repo_type – Repo type e.g., model, dataset, etc.
- cache_dir – Path to cache directory 
- **kwargs – Manual config parameters to override 
 
- Returns:
- A Config instance 
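For instance, loading a config from the Hub while overriding one of its fields (the file name and the overridden field are assumptions for illustration):
>>> from hezar.models import BertMaskFillingConfig
>>> bert_config = BertMaskFillingConfig.load(
...     "hezarai/bert-base-fa",
...     filename="model_config.yaml",  # assumed config file name in the repo
...     vocab_size=60000,              # manual override passed through **kwargs
... )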
 
 - name: str = None¶
 - push_to_hub(repo_id: str, filename: str, subfolder: str | None = None, repo_type: str | None = 'model', skip_none_fields: bool | None = True, private: bool | None = False, commit_message: str | None = None)[source]¶
- Push the config file to the hub
- Parameters:
- repo_id (str) – Repo name or id on the Hub 
- filename (str) – config file name 
- subfolder (str) – subfolder to save the config 
- repo_type (str) – Type of the repo e.g., model, dataset, space
- skip_none_fields (bool) – Whether to skip saving None values or not 
- private (bool) – Whether the repo should be private or not (ignored if the repo already exists)
- commit_message (str) – Push commit message 
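A quick sketch of pushing the config to a Hub repo (the repo id is a placeholder; you need write access to the target repo):
>>> bert_config.push_to_hub(
...     "your-username/bert-custom",   # placeholder repo id
...     filename="model_config.yaml",
...     commit_message="Upload model config",
... )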
 
 
 - save(save_dir: str | PathLike, filename: str, subfolder: str | None = None, skip_none_fields: bool | None = True)[source]¶
- Save the *config.yaml file to a local path
- Parameters:
- save_dir – Save directory path 
- filename – Config file name 
- subfolder – Subfolder to save the config file 
- skip_none_fields (bool) – Whether to skip saving None values or not 
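For example, saving a config to a local directory (paths follow the Examples section above):
>>> bert_config.save("saved/bert", filename="model_config.yaml")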
 
 
 - update(d: dict, **kwargs)[source]¶
- Update config with a given dictionary or keyword arguments. If a key does not exist in the attributes, prints a warning but sets it anyway.
- Parameters:
- d – A dictionary 
- **kwargs – Key/value pairs in the form of keyword arguments 
 
- Returns:
- The config object itself (the update is applied in-place)
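For example, updating a couple of fields in-place (field names follow the BERT config used in the Examples section):
>>> bert_config.update({"hidden_size": 1024}, vocab_size=60000)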
 
 
- class hezar.configs.DatasetConfig(path: str | None = None, task: TaskType | List[TaskType] | None = None, max_size: int | float | None = None, hf_load_kwargs: dict | None = None)[source]¶
- Bases: Config
- Base dataclass for all dataset configs
- Parameters:
- path (str) – Path to the dataset either on the Hub or local. Supported syntax is either <path> or <path>:<name> where <name> is the name parameter passed to load_dataset()
- task (str) – A supported task for the dataset 
- max_size (int | float) – Maximum number of data samples. Overrides the dataset's length when calling len(dataset). If set to a float value between 0 and 1, it will be interpreted as a fraction, e.g., 0.3 means 30% of the whole length.
- hf_load_kwargs (dict) – Keyword arguments to pass to the HF datasets.load_dataset() 
 
 - config_type: str = 'dataset'¶
 - hf_load_kwargs: dict = None¶
 - max_size: int | float = None¶
 - name: str = None¶
 - path: str = None¶
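A minimal sketch of constructing a dataset config directly (the dataset path and task value are illustrative assumptions):
>>> from hezar.configs import DatasetConfig
>>> dataset_config = DatasetConfig(
...     path="hezarai/sentiment-dksf",   # hypothetical Hub dataset path
...     task="text_classification",      # assumed TaskType value
...     max_size=0.3,                    # use 30% of the dataset
... )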
 
- class hezar.configs.EmbeddingConfig(bypass_version_check: bool = False)[source]¶
- Bases: Config
- Base dataclass for all embedding configs
 - bypass_version_check: bool = False¶
 - config_type: str = 'embedding'¶
 - name: str = None¶
 
- class hezar.configs.MetricConfig(objective: Literal['maximize', 'minimize'] | None = None, output_keys: List | Tuple | None = None, n_decimals: int = 4)[source]¶
- Bases: Config
- Base dataclass for all metric configs
 - config_type: str = 'metric'¶
 - n_decimals: int = 4¶
 - name: str = None¶
 - objective: Literal['maximize', 'minimize'] = None¶
 - output_keys: List | Tuple = None¶
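A short sketch of a metric config (in practice each metric defines its own subclass; the values here are illustrative):
>>> from hezar.configs import MetricConfig
>>> metric_config = MetricConfig(objective="maximize", output_keys=("f1",), n_decimals=3)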
 
- class hezar.configs.ModelConfig[source]¶
- Bases: Config
- Base dataclass for all model configs
 - config_type: str = 'model'¶
 - name: str = None¶
 
- class hezar.configs.PreprocessorConfig[source]¶
- Bases: Config
- Base dataclass for all preprocessor configs
 - config_type: str = 'preprocessor'¶
 - name: str = None¶
 
- class hezar.configs.TrainerConfig(output_dir: str, task: str | TaskType, device: str = 'cuda', num_epochs: int | None = None, init_weights_from: str | None = None, resume_from_checkpoint: bool | str | PathLike | None = None, max_steps: int | None = None, num_dataloader_workers: int = 0, dataloader_shuffle: bool = True, seed: int = 42, optimizer: str | OptimizerType | None = None, learning_rate: float = 2e-05, weight_decay: float = 0.0, lr_scheduler: str | LRSchedulerType | None = None, lr_scheduler_kwargs: Dict[str, Any] | None = None, lr_scheduling_steps: int | None = None, batch_size: int | None = None, eval_batch_size: int | None = None, gradient_accumulation_steps: int = 1, distributed: bool = False, mixed_precision: PrecisionType | str | None = None, use_cpu: bool = False, do_evaluate: bool = True, evaluate_with_generate: bool = True, metrics: List[str | MetricConfig] | None = None, metric_for_best_model: str = 'loss', save_enabled: bool = True, save_freq: int = 'deprecated', save_steps: int | None = None, log_steps: int | None = None, checkpoints_dir: str = 'checkpoints', logs_dir: str = 'logs')[source]¶
- Bases: Config
- Base dataclass for all trainer configs
- Parameters:
- task (str, TaskType) – The training task. Must be a valid name from TaskType. 
- output_dir (str) – Path to the directory to save trainer properties. 
- device (str) – Hardware device e.g., cuda:0, cpu, etc.
- num_epochs (int) – Number of total epochs to train the model. 
- init_weights_from (str) – Path to a model from disk or Hub to load the initial weights from. Note that this only loads the model weights and ignores other checkpoint-related states if the path is a checkpoint. To resume training from a checkpoint use the resume_from_checkpoint parameter. 
- resume_from_checkpoint (bool, str, os.PathLike) – Resume training from a checkpoint. If set to True, the trainer will load the latest checkpoint, otherwise if a path to a checkpoint is given, it will load that checkpoint and all the other states corresponding to that checkpoint. 
- max_steps (int) – Maximum number of iterations to train. This helps to limit how many batches you want to train in total. 
- num_dataloader_workers (int) – Number of dataloader workers, defaults to 0. 
- dataloader_shuffle (bool) – Control dataloaders shuffle argument. 
- seed (int) – Control determinism of the run by setting a seed value. Defaults to 42. 
- optimizer (OptimizerType) – Name of the optimizer, available values include properties in OptimizerType enum. 
- learning_rate (float) – Initial learning rate for the optimizer. 
- weight_decay (float) – Optimizer weight decay value. 
- lr_scheduler (LRSchedulerType) – Optional learning rate scheduler among LRSchedulerType enum. 
- lr_scheduler_kwargs (Dict[str, Any]) – LR scheduler constructor kwargs depending on the scheduler type 
- lr_scheduling_steps (int) – Number of steps to perform scheduler stepping. If left as None, will default to the steps in one full epoch. 
- batch_size (int) – Training batch size. 
- eval_batch_size (int) – Evaluation batch size, defaults to batch_size if None. 
- gradient_accumulation_steps (int) – Number of update steps to accumulate before performing a backward/update pass, defaults to 1. 
- distributed (bool) – Whether to use distributed training (via the accelerate package) 
- mixed_precision (PrecisionType | str) – Mixed precision type e.g., fp16, bf16, etc. (disabled by default) 
- use_cpu (bool) – Whether to train using the CPU only even if CUDA is available. 
- do_evaluate (bool) – Whether to run evaluation when calling Trainer.train 
- evaluate_with_generate (bool) – Whether to use generate() in the evaluation step (only applicable to generative models). 
- metrics (List[str | MetricConfig]) – A list of metrics. The available values depend on the valid_metrics of the Trainer's specific MetricsHandler. 
- metric_for_best_model (str) – Reference metric key to watch for determining the best model. Recommended to have a {train. | evaluation.} prefix (e.g., evaluation.f1, train.accuracy, etc.) but if not, defaults to evaluation.{metric_for_best_model}. 
- save_freq (int) (DEPRECATED) – Deprecated and renamed to save_steps. 
- save_enabled (bool) – Whether to save checkpoints at all. Setting this to False disables saving even between epochs. 
- save_steps (int) – Save the trainer outputs every save_steps steps. Leave as None to ignore saving in-between training steps. If set to a float value between 0 and 1, it will be interpreted as a fraction of the total steps. 
- log_steps (int) – Save training metrics every log_steps steps. If set to a float value between 0 and 1, it will be interpreted as a fraction of the total steps. 
- checkpoints_dir (str) – Path to the checkpoints’ folder. The actual files will be saved under {output_dir}/{checkpoints_dir}. 
- logs_dir (str) – Path to the logs’ folder. The actual log files will be saved under {output_dir}/{logs_dir}. 
 
 - batch_size: int = None¶
 - checkpoints_dir: str = 'checkpoints'¶
 - config_type: str = 'trainer'¶
 - dataloader_shuffle: bool = True¶
 - device: str = 'cuda'¶
 - distributed: bool = False¶
 - do_evaluate: bool = True¶
 - eval_batch_size: int = None¶
 - evaluate_with_generate: bool = True¶
 - gradient_accumulation_steps: int = 1¶
 - init_weights_from: str = None¶
 - learning_rate: float = 2e-05¶
 - log_steps: int = None¶
 - logs_dir: str = 'logs'¶
 - lr_scheduler: str | LRSchedulerType = None¶
 - lr_scheduler_kwargs: Dict[str, Any] = None¶
 - lr_scheduling_steps: int = None¶
 - max_steps: int = None¶
 - metric_for_best_model: str = 'loss'¶
 - metrics: List[str | MetricConfig] = None¶
 - mixed_precision: PrecisionType | str | None = None¶
 - name: str = 'trainer'¶
 - num_dataloader_workers: int = 0¶
 - num_epochs: int = None¶
 - optimizer: str | OptimizerType = None¶
 - output_dir: str¶
 - resume_from_checkpoint: bool | str | PathLike = None¶
 - save_enabled: bool = True¶
 - save_freq: int = 'deprecated'¶
 - save_steps: int = None¶
 - seed: int = 42¶
 - use_cpu: bool = False¶
 - weight_decay: float = 0.0¶
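A short sketch of a typical trainer config (the task value and metric names are assumptions for illustration):
>>> from hezar.configs import TrainerConfig
>>> train_config = TrainerConfig(
...     output_dir="bert-sentiment-fa",         # all trainer outputs go under this directory
...     task="text_classification",             # assumed TaskType value
...     num_epochs=3,
...     batch_size=16,
...     learning_rate=2e-5,
...     metrics=["f1"],                         # assumed valid metric name for this task
...     metric_for_best_model="evaluation.f1",
... )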