Dataset
We use torchtext to manage data loading and processing. For more information about torchtext, see https://github.com/pytorch/text
Fields
class seq2seq.dataset.fields.SourceField(**kwargs)
Wrapper class of torchtext.data.Field that forces batch_first and include_lengths to be True.
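To make the effect of these two options concrete, here is a minimal plain-Python sketch of what a batch looks like when batch_first and include_lengths are both True. It does not call torchtext: the helper name pad_batch_first is hypothetical, and lists stand in for tensors. The only detail taken from torchtext is its default padding token, '<pad>'.

```python
from typing import List, Tuple

PAD = "<pad>"  # torchtext's default pad_token for Field


def pad_batch_first(
    batch: List[List[str]], pad_token: str = PAD
) -> Tuple[List[List[str]], List[int]]:
    """Hypothetical helper: pad every sequence to the longest one.

    Returns (padded, lengths), mirroring the (data, lengths) pair that a
    field with include_lengths=True produces. Each row of `padded` is one
    example, which is the batch_first layout: (batch_size, max_len).
    """
    lengths = [len(seq) for seq in batch]
    max_len = max(lengths, default=0)
    padded = [seq + [pad_token] * (max_len - len(seq)) for seq in batch]
    return padded, lengths


batch = [["how", "are", "you"], ["hello"]]
padded, lengths = pad_batch_first(batch)
print(padded)   # one row per example, padded on the right
print(lengths)  # original lengths before padding
```

The lengths are typically what an encoder needs in order to ignore padding, for example when packing padded sequences before an RNN.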
class seq2seq.dataset.fields.TargetField(**kwargs)
Wrapper class of torchtext.data.Field that forces batch_first to be True and, in the preprocessing step, prepends <sos> and appends <eos> to each sequence.
Variables:
- sos_id – index of the start-of-sentence symbol
- eos_id – index of the end-of-sentence symbol
SYM_EOS = '<eos>'
SYM_SOS = '<sos>'
build_vocab(*args, **kwargs)
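The following plain-Python sketch illustrates the behaviour described for TargetField: each target sequence is wrapped in the <sos> and <eos> symbols, and sos_id and eos_id are the indices of those symbols in the vocabulary. It is not the library's implementation. The helpers wrap_target and build_toy_vocab are hypothetical, and the toy vocabulary simply numbers tokens in order of first appearance, whereas a real torchtext vocabulary orders tokens differently. The sketch also assumes that the two indices are looked up after the vocabulary has been built.

```python
from typing import Dict, List

SYM_SOS = "<sos>"  # same values as TargetField.SYM_SOS / SYM_EOS
SYM_EOS = "<eos>"


def wrap_target(tokens: List[str]) -> List[str]:
    """Hypothetical helper: prepend <sos> and append <eos> to one sequence."""
    return [SYM_SOS] + tokens + [SYM_EOS]


def build_toy_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Hypothetical stand-in for a vocabulary: token -> index.

    Indices are assigned in order of first appearance. This is an
    assumption made for illustration, not torchtext's ordering.
    """
    stoi: Dict[str, int] = {}
    for seq in sequences:
        for tok in seq:
            stoi.setdefault(tok, len(stoi))
    return stoi


targets = [wrap_target(["je", "suis"]), wrap_target(["bonjour"])]
stoi = build_toy_vocab(targets)

# Analogue of the sos_id / eos_id variables: look up the special symbols.
sos_id = stoi[SYM_SOS]
eos_id = stoi[SYM_EOS]
print(targets[0], sos_id, eos_id)
```

A decoder would typically be fed sos_id as its first input and stop generating once it emits eos_id.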