hezar.registry module

Hezar uses a registry system: every core module type (model, dataset, etc.) has an entry in its own registry. Registries are plain Python dictionaries that map a module’s name to its class and its config class. They are initialized here and filled automatically when you import hezar or a registry itself.

Examples

>>> # read models registry
>>> from hezar.registry import models_registry
>>> print(models_registry)
{'distilbert_mask_filling': {'module_class': <class 'hezar.models.mask_filling.distilbert.distilbert_mask_filling.DistilBertMaskFilling'>,
'config_class': <class 'hezar.models.mask_filling.distilbert.distilbert_mask_filling_config.DistilBertMaskFillingConfig'>,
'description': 'Optional model description here...'}}
>>> # add a model class to models_registry
>>> from hezar.models import Model
>>> from hezar.registry import register_model
>>> # MyAwesomeModelConfig is assumed to be a model config class defined elsewhere
>>> @register_model(model_name="my_awesome_model", config_class=MyAwesomeModelConfig, description="My Awesome Model!")
... class MyAwesomeModel(Model):
...    def __init__(self, config: MyAwesomeModelConfig):
...        ...

Keep in mind that registries usually don’t need to be used directly. The hezar.builders module provides several functions that build modules from their registry names. See the file builders.py for more info.
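To illustrate what building a module from its registry name involves, here is a minimal, self-contained sketch. It does not use Hezar: toy_registry, ToyTokenizer, ToyTokenizerConfig and toy_build are hypothetical names made up for this example, and the real functions in hezar.builders may have different signatures and behavior.

```python
# Toy stand-in for a registry: name -> module class and config class.
toy_registry = {}


class ToyTokenizerConfig:
    """Hypothetical config class holding the module's options."""

    def __init__(self, lowercase: bool = True):
        self.lowercase = lowercase


class ToyTokenizer:
    """Hypothetical module class that is built from a config instance."""

    def __init__(self, config: ToyTokenizerConfig):
        self.config = config

    def __call__(self, text: str) -> list:
        if self.config.lowercase:
            text = text.lower()
        return text.split()


toy_registry["toy_tokenizer"] = {
    "module_class": ToyTokenizer,
    "config_class": ToyTokenizerConfig,
}


def toy_build(name: str, **config_kwargs):
    """Look up a registry entry by name, create its config, then the module."""
    if name not in toy_registry:
        raise KeyError(f"Unknown module name: {name!r}. Available: {sorted(toy_registry)}")
    entry = toy_registry[name]
    config = entry["config_class"](**config_kwargs)
    return entry["module_class"](config)


tokenizer = toy_build("toy_tokenizer", lowercase=True)
print(tokenizer("Hello Registry"))  # ['hello', 'registry']
```

The caller only needs the registry name and any config overrides; the class lookup stays inside the builder, which is why direct registry access is rarely needed.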

Note: When adding a new registry container, make sure to add it to __all__ in this module!

class hezar.registry.Registry(module_class: type, config_class: type = None, description: str | None = None)[source]

Bases: object

config_class: type = None
description: str | None = None
module_class: type

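The fields above suggest a small record type with one required field and two optional ones. The sketch below mirrors that shape with a standard-library dataclass. ToyRegistry and ToyMetric are hypothetical names, and modelling the container as a dataclass is an assumption of this example rather than something stated on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyRegistry:
    """Hypothetical container with the same three fields as listed above."""

    module_class: type                  # required: the registered class
    config_class: Optional[type] = None  # optional: its config class
    description: Optional[str] = None    # optional: free-text description


class ToyMetric:
    """Hypothetical module class used only for this example."""


entry = ToyRegistry(module_class=ToyMetric, description="A toy metric")
print(entry.module_class is ToyMetric)  # True
print(entry.config_class)               # None
```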
hezar.registry.register_dataset(dataset_name: str, config_class: Type[DatasetConfig], description: str = None)[source]

A class decorator that adds the dataset class and the config class to the datasets_registry.

Parameters:
  • dataset_name – Dataset’s registry name, e.g., text_classification.

  • config_class – Dataset’s config class, e.g., TextClassificationDatasetConfig. This parameter must be the config class itself, not a config instance!

  • description – Optional dataset description
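The register_* functions on this page all follow the same pattern: a function that takes a name, a config class and a description, and returns a class decorator. The sketch below shows that pattern without using Hezar. toy_register_dataset, toy_datasets_registry, ToyDataset and ToyDatasetConfig are hypothetical names, and the explicit type check is an assumption added to illustrate the "class, not instance" requirement; Hezar's own decorator may handle this differently.

```python
# Toy stand-in for datasets_registry.
toy_datasets_registry = {}


def toy_register_dataset(dataset_name: str, config_class: type, description: str = None):
    """Return a class decorator that records the decorated class under dataset_name."""

    def decorator(cls):
        # Illustrative check: reject a config instance passed by mistake.
        if not isinstance(config_class, type):
            raise TypeError(
                f"config_class must be a class, got an instance of {type(config_class).__name__}"
            )
        toy_datasets_registry[dataset_name] = {
            "module_class": cls,
            "config_class": config_class,
            "description": description,
        }
        return cls  # the class itself is returned unchanged

    return decorator


class ToyDatasetConfig:
    """Hypothetical config class."""

    path = "data.csv"


@toy_register_dataset("toy_dataset", config_class=ToyDatasetConfig, description="A toy dataset")
class ToyDataset:
    def __init__(self, config: ToyDatasetConfig):
        self.config = config


print(sorted(toy_datasets_registry))  # ['toy_dataset']
```

Because the decorator runs when the class statement is executed, simply importing the module that defines ToyDataset is enough to fill the registry. Passing ToyDatasetConfig() instead of ToyDatasetConfig would raise TypeError in this sketch.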

hezar.registry.register_embedding(embedding_name: str, config_class: Type[EmbeddingConfig], description: str = None)[source]

A class decorator that adds the embedding class and the config class to the embeddings_registry.

Parameters:
  • embedding_name – Embedding’s registry name, e.g., word2vec_cbow.

  • config_class – Embedding’s config class, e.g., Word2VecCBOWConfig. This parameter must be the config class itself, not a config instance!

  • description – Optional embedding description

hezar.registry.register_metric(metric_name: str, config_class: Type[MetricConfig], description: str = None)[source]

A class decorator that adds the metric class and the config class to the metrics_registry.

Parameters:
  • metric_name – Metric’s registry name, e.g., f1.

  • config_class – Metric config class

  • description – Optional metric description

hezar.registry.register_model(model_name: str, config_class: Type[ModelConfig], description: str = None)[source]

A class decorator that adds the model class and the config class to the models_registry.

Parameters:
  • model_name – Model’s registry name, e.g., bert_sequence_labeling.

  • config_class – Model’s config class, e.g., BertSequenceLabelingConfig. This parameter must be the config class itself, not a config instance!

  • description – Optional model description

hezar.registry.register_preprocessor(preprocessor_name: str, config_class: Type[PreprocessorConfig], description: str = None)[source]

A class decorator that adds the preprocessor class and the config class to the preprocessors_registry.

Parameters:
  • preprocessor_name – Preprocessor’s registry name, e.g., bpe_tokenizer.

  • config_class – Preprocessor’s config class, e.g., BPEConfig. This parameter must be the config class itself, not a config instance!

  • description – Optional preprocessor description