hezar.metrics.bleu module

class hezar.metrics.bleu.BLEU(config: BLEUConfig, **kwargs)[source]

Bases: Metric

BLEU metric for evaluating text generation tasks such as translation and summarization.

compute(predictions: Iterable[str] | str = None, targets: Iterable[str] | str = None, weights=(0.25, 0.25, 0.25, 0.25), n_decimals=None, output_keys=None, **kwargs)[source]

Computes the BLEU score for the given predictions against targets.

Parameters:
  • predictions (Iterable[str] | str) – Predicted sentences or tokens.

  • targets (Iterable[str] | str) – Ground truth sentences or tokens.

  • weights (tuple) – Weights for the n-gram precisions, one per n-gram order starting at 1. The default (0.25, 0.25, 0.25, 0.25) weights 1-grams through 4-grams equally.

  • n_decimals (int) – Number of decimals for the final score.

  • output_keys (tuple) – Keys to keep in the returned results.

Returns:

A dictionary of the metric results, with keys specified by output_keys.

Return type:

dict
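To make the role of weights concrete, here is a minimal pure-Python sketch of how BLEU combines clipped n-gram precisions with a brevity penalty. It is not hezar's implementation (the class lists NLTK as its required backend); sketch_bleu and ngram_counts are hypothetical helpers, and the sketch assumes a single reference, whitespace tokenization, and no smoothing.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sketch_bleu(prediction, target, weights=(0.25, 0.25, 0.25, 0.25)):
    """Illustrative single-reference BLEU without smoothing (hypothetical helper)."""
    pred, ref = prediction.split(), target.split()
    if not pred:
        return 0.0

    log_score = 0.0
    for n, weight in enumerate(weights, start=1):
        if weight == 0:
            # An n-gram order with zero weight does not affect the score.
            continue
        pred_counts = ngram_counts(pred, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: a predicted n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(count, ref_counts[gram]) for gram, count in pred_counts.items())
        total = sum(pred_counts.values())
        if matches == 0 or total == 0:
            # Without smoothing, one empty precision zeroes the whole score.
            return 0.0
        log_score += weight * math.log(matches / total)

    # Brevity penalty: predictions no longer than the reference are penalized.
    if len(pred) > len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(pred))
    return penalty * math.exp(log_score)


# Unigram-only BLEU: 5 of the 6 predicted tokens appear in the reference.
score = sketch_bleu("the cat sat on a mat", "the cat sat on the mat", weights=(1, 0, 0, 0))
```

With the default weights the score is the geometric mean of the four precisions, so a prediction that shares no 4-gram with the reference scores zero in this unsmoothed sketch.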

required_backends: List[str | Backends] = [Backends.NLTK]

class hezar.metrics.bleu.BLEUConfig(objective: str = 'maximize', output_keys: tuple = ('bleu',), n_decimals: int = 4)[source]

Bases: MetricConfig

Configuration class for BLEU metric.

Parameters:
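  • name (str) – The name of the metric, 'bleu' in this case.

  • objective (str) – Whether a higher or lower score is better; defaults to 'maximize'.

  • output_keys (tuple) – Keys to filter the metric results for output; defaults to ('bleu',).

  • n_decimals (int) – Number of decimals for the final score; defaults to 4.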

name: str = 'bleu'
objective: str = 'maximize'
output_keys: tuple = ('bleu',)
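As a rough illustration of how output_keys and n_decimals could shape the returned dictionary, the sketch below uses a stand-in dataclass that mirrors the documented fields. SketchBLEUConfig and format_results are hypothetical names, the raw result values are made up for the example, and hezar's actual filtering and rounding logic may differ.

```python
from dataclasses import dataclass


@dataclass
class SketchBLEUConfig:
    """Stand-in mirroring the documented fields; not hezar's actual class."""
    name: str = "bleu"
    objective: str = "maximize"
    output_keys: tuple = ("bleu",)
    n_decimals: int = 4


def format_results(raw_results, config):
    """Hypothetical helper: round scores and keep only the configured keys."""
    return {
        key: round(value, config.n_decimals)
        for key, value in raw_results.items()
        if key in config.output_keys
    }


config = SketchBLEUConfig()
# The raw values here are invented purely to show filtering and rounding.
results = format_results({"bleu": 0.4567891, "brevity_penalty": 0.98}, config)
```

Here only the "bleu" key survives the filter, and its value is rounded to four decimals.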