Datasets

The datasets can be downloaded from the following links.

Note: the Ubuntu data is NOT the same as the previous Ubuntu dataset from Lowe et al. (2015) <https://arxiv.org/abs/1506.08909>. It is a new resource, described in the following paper:

@Article{arxiv18disentangle,
  author    = {Jonathan K. Kummerfeld and Sai R. Gouravajhala and Joseph Peper and Vignesh Athreya and Chulaka Gunasekara and Jatin Ganhotra and Siva Sankalp Patel and Lazaros Polymenakos and Walter S. Lasecki},
  title     = {Analyzing Assumptions in Conversation Disentanglement Research Through the Lens of a New Dataset and Model},
  journal   = {ArXiv e-prints},
  archivePrefix = {arXiv},
  eprint    = {1810.11118},
  primaryClass = {cs.CL},
  year      = {2018},
  month     = {October},
  url       = {https://arxiv.org/pdf/1810.11118.pdf},
}

Training and Validation

Sub-Task  Training            Validation               Other
--------  ------------------  -----------------------  ------------------
1         Ubuntu_st1_train    Ubuntu_st1_validation    None
          Advising_st1_train  Advising_st1_validation
2         Ubuntu_st2_train    Ubuntu_st2_validation    Candidate_pool
3         Advising_st3_train  Advising_st3_validation  None
4         Ubuntu_st4_train    Ubuntu_st4_validation    None
          Advising_st4_train  Advising_st4_validation
5         Same as subtask 1   Same as subtask 1        Linux_manpages
                                                       Course_information

Additionally, for the Advising data, we provide a version containing the original dialogs and their paraphrases before remixing. It can be used for training in any subtask, and can be downloaded here. The global candidate pool for sub-task 2 should be shared across the training, validation, and test datasets for that sub-task.

Test

Sub-Task  Test
--------  -----------------------
1         Ubuntu_st1_test
          Advising_st1_case1_test
          Advising_st1_case2_test
2         Ubuntu_st2_test
3         Advising_st3_case1_test
          Advising_st3_case2_test
4         Ubuntu_st4_test
          Advising_st4_case1_test
          Advising_st4_case2_test
5         Same as subtask 1

Ground truth for test datasets

Sub-Task  Test
--------  -------------------------------
1         Ubuntu_st1_ground_truth
          Advising_st1_case1_ground_truth
          Advising_st1_case2_ground_truth
2         Ubuntu_st2_ground_truth
3         Advising_st3_case1_ground_truth
          Advising_st3_case2_ground_truth
4         Ubuntu_st4_ground_truth
          Advising_st4_case1_ground_truth
          Advising_st4_case2_ground_truth
5         Same as subtask 1
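
For scripts that need to select the right resources per sub-task, the listings on this page could be captured in a small lookup, sketched below. The strings are the link labels shown above, which may differ from the names of the downloaded files. The helper names (`resources`, `extra_resources`) and the split keys are hypothetical and not part of any released tooling. Sub-task 5 is resolved to the sub-task 1 entries, as the listings state.

```python
# Link labels as listed on this page (assumption: the downloaded file
# names may differ from these labels).
DATASETS = {
    1: {
        "train": ["Ubuntu_st1_train", "Advising_st1_train"],
        "validation": ["Ubuntu_st1_validation", "Advising_st1_validation"],
        "test": ["Ubuntu_st1_test", "Advising_st1_case1_test",
                 "Advising_st1_case2_test"],
        "ground_truth": ["Ubuntu_st1_ground_truth",
                         "Advising_st1_case1_ground_truth",
                         "Advising_st1_case2_ground_truth"],
    },
    2: {
        "train": ["Ubuntu_st2_train"],
        "validation": ["Ubuntu_st2_validation"],
        "test": ["Ubuntu_st2_test"],
        "ground_truth": ["Ubuntu_st2_ground_truth"],
    },
    3: {
        "train": ["Advising_st3_train"],
        "validation": ["Advising_st3_validation"],
        "test": ["Advising_st3_case1_test", "Advising_st3_case2_test"],
        "ground_truth": ["Advising_st3_case1_ground_truth",
                         "Advising_st3_case2_ground_truth"],
    },
    4: {
        "train": ["Ubuntu_st4_train", "Advising_st4_train"],
        "validation": ["Ubuntu_st4_validation", "Advising_st4_validation"],
        "test": ["Ubuntu_st4_test", "Advising_st4_case1_test",
                 "Advising_st4_case2_test"],
        "ground_truth": ["Ubuntu_st4_ground_truth",
                         "Advising_st4_case1_ground_truth",
                         "Advising_st4_case2_ground_truth"],
    },
}

# Sub-task 5 reuses the sub-task 1 datasets.
SAME_AS = {5: 1}

# Entries from the "Other" column of the training/validation listing.
OTHER = {
    2: ["Candidate_pool"],
    5: ["Linux_manpages", "Course_information"],
}


def resources(subtask, split):
    """Return the dataset labels for a sub-task and split."""
    base = SAME_AS.get(subtask, subtask)
    if base not in DATASETS:
        raise ValueError(f"unknown sub-task: {subtask}")
    if split not in DATASETS[base]:
        raise ValueError(f"unknown split: {split}")
    return list(DATASETS[base][split])


def extra_resources(subtask):
    """Return the additional resources listed for a sub-task, if any."""
    return list(OTHER.get(subtask, []))
```

With this sketch, `resources(5, "validation")` returns the sub-task 1 validation labels, and `extra_resources(5)` returns the two knowledge sources listed for sub-task 5.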