hezar.preprocessors.preprocessor module
- class hezar.preprocessors.preprocessor.Preprocessor(config: PreprocessorConfig, **kwargs)
Bases: object
Base class for all data preprocessors.
- Parameters:
config – Preprocessor properties
- classmethod load(hub_or_local_path, subfolder: str | None = None, force_return_dict: bool = False, cache_dir: str | None = None, **kwargs)
Load a preprocessor or a pipeline of preprocessors from a local or Hub path. This method automatically detects every preprocessor in the path. If there is only one preprocessor, it is returned directly; if there are several, a dictionary of preprocessors is returned.
Subclasses must also override this method, because the base implementation internally calls load for every preprocessor it finds in the repo.
- Parameters:
hub_or_local_path – Path to a Hub or local repo
subfolder – Subfolder for the preprocessor.
force_return_dict – Whether to return a dict even if there is only one preprocessor available in the repo
cache_dir – Path to cache directory
**kwargs – Extra kwargs
- Returns:
A Preprocessor subclass instance, or a dict of Preprocessor subclass instances
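Because load can return either a single instance or a dict, calling code usually has to handle both shapes. The following is a minimal sketch of that return rule only, not hezar's actual implementation: ToyPreprocessor, resolve_loaded and as_dict are hypothetical names introduced for illustration, and the assumption that the dict is keyed by preprocessor name is not confirmed by the text above.

```python
class ToyPreprocessor:
    """Hypothetical stand-in for a Preprocessor subclass (not part of hezar)."""

    def __init__(self, name):
        self.name = name


def resolve_loaded(found, force_return_dict=False):
    """Mimic the documented return rule of load.

    `found` maps an assumed preprocessor name to its instance.
    """
    if len(found) == 1 and not force_return_dict:
        # Exactly one preprocessor and no forced dict: return the instance itself.
        return next(iter(found.values()))
    # Several preprocessors, or the caller asked for a dict explicitly.
    return dict(found)


def as_dict(loaded):
    """Normalize either return shape into a dict for uniform downstream code."""
    if isinstance(loaded, dict):
        return loaded
    return {loaded.name: loaded}


single = {"tokenizer": ToyPreprocessor("tokenizer")}
pipeline = {
    "tokenizer": ToyPreprocessor("tokenizer"),
    "normalizer": ToyPreprocessor("normalizer"),
}

one = resolve_loaded(single)                           # a bare instance
forced = resolve_loaded(single, force_return_dict=True)  # a one-item dict
many = resolve_loaded(pipeline)                        # a two-item dict
```

Passing force_return_dict=True avoids the need for a helper like as_dict, since the result then has the same shape regardless of how many preprocessors the repo contains.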
- preprocessor_subfolder = 'preprocessor'