Dataset
We use torchtext to manage data loading and processing. For more information about torchtext, see https://github.com/pytorch/text
Fields
class seq2seq.dataset.fields.SourceField(**kwargs)
    Wrapper class of torchtext.data.Field that forces batch_first and include_lengths to be True.
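To make the effect of the two forced options concrete, here is a minimal pure-Python sketch. It is not the library's code: it only mimics the usual torchtext behaviour, where batch_first puts the batch dimension first and include_lengths returns the unpadded lengths alongside the padded batch. The helper name pad_batch and the "<pad>" token are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: mimics the shape of what a field with
# batch_first=True and include_lengths=True yields, using plain lists.
# `pad_batch` and PAD are hypothetical names, not part of seq2seq or torchtext.
PAD = "<pad>"


def pad_batch(examples):
    """Return (padded, lengths): one row per example plus the original lengths."""
    lengths = [len(tokens) for tokens in examples]
    max_len = max(lengths)
    # Right-pad every example to the length of the longest one.
    padded = [tokens + [PAD] * (max_len - len(tokens)) for tokens in examples]
    return padded, lengths


batch = [["how", "are", "you"], ["hello"]]
padded, lengths = pad_batch(batch)
print(padded)   # each row is one example (batch dimension first)
print(lengths)  # unpadded length of each example
```

In the real field the padded batch would be a tensor of token indices rather than a list of strings; the layout (batch first, lengths returned separately) is the point of the sketch.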
class seq2seq.dataset.fields.TargetField(**kwargs)
    Wrapper class of torchtext.data.Field that forces batch_first to be True and, in the preprocessing step, prepends <sos> and appends <eos> to each sequence.
    Variables:
    - sos_id – index of the start-of-sentence symbol
    - eos_id – index of the end-of-sentence symbol
    SYM_EOS = '<eos>'
    SYM_SOS = '<sos>'
    build_vocab(*args, **kwargs)
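The TargetField behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the reference gives only the signature of build_vocab, so the sketch assumes that sos_id and eos_id are looked up in the vocabulary once it has been built. The names wrap_target and toy_vocab are hypothetical.

```python
# Illustrative sketch only: reproduces the documented behaviour with plain
# Python. `wrap_target` and `toy_vocab` are hypothetical, not library names.
SYM_SOS = "<sos>"
SYM_EOS = "<eos>"


def wrap_target(tokens):
    """Prepend <sos> and append <eos>, as the preprocessing step is described."""
    return [SYM_SOS] + list(tokens) + [SYM_EOS]


def toy_vocab(sequences):
    """Assign an index to each distinct token in first-seen order."""
    stoi = {}
    for seq in sequences:
        for tok in seq:
            stoi.setdefault(tok, len(stoi))
    return stoi


targets = [wrap_target(["merci"]), wrap_target(["au", "revoir"])]
stoi = toy_vocab(targets)

# Assumption: the symbol indices are read from the vocabulary after building it.
sos_id = stoi[SYM_SOS]
eos_id = stoi[SYM_EOS]
print(targets[0], sos_id, eos_id)
```

A real torchtext vocabulary orders tokens differently (special tokens and frequency sorting), so the concrete index values in this toy version carry no meaning beyond the example.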