hezar.data.datasets.text_classification_dataset module
- class hezar.data.datasets.text_classification_dataset.TextClassificationDataset(config: TextClassificationDatasetConfig, split=None, preprocessor=None, **kwargs)
- Bases: Dataset
- A text classification dataset class. For now, this class is intended only for datasets that exist on the Hub.
- Parameters:
- config (TextClassificationDatasetConfig) – Dataset config object. 
- split – Which split to use. 
- preprocessor – The dataset’s preprocessor. 
- **kwargs – Extra config parameters to assign to the original config. 
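
The `**kwargs` parameter above is described as extra config parameters assigned to the original config. A minimal, self-contained sketch of that idea follows. It is not Hezar's implementation: `SketchConfig` and `apply_overrides` are hypothetical stand-ins, the path `"some/dataset"` is a placeholder, and copying the config before assigning is an assumption made for the example.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in that mirrors a few of the documented config fields."""
    path: Optional[str] = None
    label_field: Optional[str] = None
    text_field: Optional[str] = None
    max_length: Optional[int] = None


def apply_overrides(config: SketchConfig, **kwargs) -> SketchConfig:
    """Return a shallow copy of `config` with each keyword argument assigned onto it."""
    updated = copy.copy(config)  # assumption: leave the caller's config untouched
    for key, value in kwargs.items():
        setattr(updated, key, value)
    return updated


# Placeholder values, chosen only for illustration.
base = SketchConfig(path="some/dataset", text_field="text", label_field="label")
overridden = apply_overrides(base, max_length=128)
```

In this sketch `overridden` carries the new `max_length` while `base` keeps its original values; whether the real class mutates the config in place is not stated in this reference.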
 
 
- class hezar.data.datasets.text_classification_dataset.TextClassificationDatasetConfig(path: str | None = None, task: TaskType = TaskType.TEXT_CLASSIFICATION, max_size: int | float | None = None, hf_load_kwargs: dict | None = None, label_field: str | None = None, text_field: str | None = None, max_length: int | None = None)
- Bases: DatasetConfig
- Configuration class for text classification datasets.
- Parameters:
- path (str) – Path to the dataset. 
- label_field (str) – Field name for labels in the dataset. 
- text_field (str) – Field name for text in the dataset. 
- max_length (int) – Maximum length of text. 
 
 - label_field: str = None
 - max_length: int = None
 - name: str = 'text_classification'
 - path: str = None
 - text_field: str = None
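
To illustrate what `text_field`, `label_field`, and `max_length` are for, here is a small sketch that picks the text and label out of a dataset row by field name. It is an illustration only: `extract_example` is a hypothetical helper, the sample row is made up, and truncating by whitespace-separated words is an assumption, since this reference does not say how `max_length` is measured or where it is applied.

```python
from typing import Any, Optional, Tuple


def extract_example(
    row: dict,
    text_field: str,
    label_field: str,
    max_length: Optional[int] = None,
) -> Tuple[str, Any]:
    """Select the text and label columns from one row, optionally truncating the text."""
    text = row[text_field]
    if max_length is not None:
        # Assumption: length is counted in whitespace-separated words.
        text = " ".join(text.split()[:max_length])
    return text, row[label_field]


# Made-up row with arbitrary column names.
row = {"review": "the plot was slow but the acting was great", "sentiment": 1}
text, label = extract_example(row, text_field="review", label_field="sentiment", max_length=4)
```

The point of configuring field names is that the same dataset class can read datasets whose columns are named differently, without any change to the loading code.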