Dataset

We use torchtext to manage data loading and processing. For more information about torchtext, see https://github.com/pytorch/text

Fields

class seq2seq.dataset.fields.SourceField(**kwargs)

Wrapper class of torchtext.data.Field that forces batch_first and include_lengths to be True.
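To illustrate what the two forced options mean for a batch, here is a minimal pure-Python sketch. It does not use torchtext and is not the library's implementation: the helper name pad_batch and the '<pad>' token are assumptions made for illustration, and a real field would return tensors rather than lists. With batch_first, each row of the batch is one sequence; with include_lengths, the unpadded lengths are returned alongside the padded batch.

```python
def pad_batch(sequences, pad_token="<pad>"):
    """Hypothetical helper: pad token lists to a common length.

    Returns (padded, lengths), where each row of `padded` is one sequence
    (batch-first layout) and `lengths` holds the original lengths.
    """
    lengths = [len(seq) for seq in sequences]
    max_len = max(lengths)
    padded = [list(seq) + [pad_token] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# Two source sequences of different lengths.
batch, lengths = pad_batch([["a", "b", "c"], ["d"]])
```

Here batch has one row per sequence and lengths records how many tokens in each row are real rather than padding.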

class seq2seq.dataset.fields.TargetField(**kwargs)

Wrapper class of torchtext.data.Field that forces batch_first to be True and, in the preprocessing step, prepends <sos> and appends <eos> to each sequence.

Variables:
  • sos_id – index of the start of sentence symbol
  • eos_id – index of the end of sentence symbol
SYM_EOS = '<eos>'
SYM_SOS = '<sos>'
build_vocab(*args, **kwargs)
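
The behaviour described for TargetField can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: wrap_target and build_toy_vocab are hypothetical helpers, the toy vocabulary simply numbers tokens in order of first appearance, and the idea that sos_id and eos_id are looked up from the vocabulary once it has been built is an assumption inferred from the variable descriptions above.

```python
SYM_SOS = "<sos>"
SYM_EOS = "<eos>"


def wrap_target(tokens):
    """Hypothetical preprocessing step: add start and end symbols."""
    return [SYM_SOS] + list(tokens) + [SYM_EOS]


def build_toy_vocab(sequences):
    """Hypothetical vocabulary: map each token to the index of its first appearance."""
    stoi = {}
    for seq in sequences:
        for tok in seq:
            stoi.setdefault(tok, len(stoi))
    return stoi


targets = [wrap_target(["hello", "world"]), wrap_target(["hi"])]
stoi = build_toy_vocab(targets)

# Assumed analogue of the sos_id and eos_id variables: indices of the two symbols.
sos_id = stoi[SYM_SOS]
eos_id = stoi[SYM_EOS]
```

Because every target sequence is wrapped before the vocabulary is built, both symbols are guaranteed to have an index in this sketch.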