Goal-oriented document-grounded dialogue often involves complex contexts for identifying the most relevant information, which requires a better understanding of the inter-relations between conversations and documents. Meanwhile, many online user-oriented documents use both semi-structured and unstructured content to guide users to information for different contexts. Thus, we create a new goal-oriented document-grounded dialogue dataset that captures more diverse scenarios derived from various document contents from multiple domains, such as ssa.gov and studentaid.gov. For data collection, we propose a novel pipeline approach for dialogue data construction, which has been adapted and evaluated for several domains.
34 PAPERS • NO BENCHMARKS YET
DSTC7 Task 1 is a dataset and task for goal-oriented dialogue. The data originates from human-human conversations drawn from online resources, specifically the Ubuntu Internet Relay Chat (IRC) channel and an Advising dataset from the University of Michigan.
11 PAPERS • 1 BENCHMARK
HINT3 is a dataset for intent detection. It consists of three datasets, each containing a diverse set of intents in a single domain: SOFMattress (mattress products retail), Curekart (fitness supplements retail), and Powerplay11 (online gaming).
10 PAPERS • NO BENCHMARKS YET
In MutualFriends, two agents, A and B, each have a private knowledge base, which contains a list of friends with multiple attributes (e.g., name, school, major). The agents must chat with each other to find their unique mutual friend.
7 PAPERS • NO BENCHMARKS YET
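The MutualFriends setup above can be illustrated with a minimal sketch. It assumes each knowledge base is a list of attribute dictionaries and uses an oracle that sees both knowledge bases at once, which the agents in the actual task cannot do (they must reach the same answer through conversation). The function name `find_mutual_friend` and the example records are hypothetical and not taken from the dataset.

```python
from typing import Dict, List, Optional, Tuple

Friend = Dict[str, str]


def _key(friend: Friend) -> Tuple[Tuple[str, str], ...]:
    # Hashable, order-independent view of a friend's attributes.
    return tuple(sorted(friend.items()))


def find_mutual_friend(kb_a: List[Friend], kb_b: List[Friend]) -> Optional[Friend]:
    """Oracle view: return the single friend present in both knowledge bases.

    Returns None when there is no shared friend or more than one.
    """
    keys_b = {_key(f) for f in kb_b}
    shared = [f for f in kb_a if _key(f) in keys_b]
    return shared[0] if len(shared) == 1 else None


# Made-up knowledge bases for illustration only.
kb_a = [
    {"name": "Alice", "school": "MIT", "major": "Physics"},
    {"name": "Bob", "school": "Stanford", "major": "History"},
]
kb_b = [
    {"name": "Bob", "school": "Stanford", "major": "History"},
    {"name": "Carol", "school": "Yale", "major": "Biology"},
]

mutual = find_mutual_friend(kb_a, kb_b)
```

An oracle like this is only useful for checking whether a dialogue ended on the correct friend; it says nothing about how the agents should exchange information.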
The Permuted bAbI dialog task is an adaptation of the "Dialog bAbI tasks data" dataset released by Facebook. It is used for evaluating end-to-end dialog systems in the restaurant domain. The dataset adds multiple valid next utterances to the original bAbI dialog tasks, which allows evaluation of end-to-end goal-oriented dialog systems in a more realistic setting.
1 PAPER • NO BENCHMARKS YET
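Evaluation against multiple valid next utterances, as described for Permuted bAbI above, can be sketched as a per-turn accuracy in which a prediction counts as correct if it matches any utterance in that turn's valid set. This is a simplified illustration, not the dataset's official evaluation script; the function name `per_turn_accuracy` and the example utterances are assumptions made for this sketch.

```python
from typing import List, Set


def per_turn_accuracy(predictions: List[str], valid_sets: List[Set[str]]) -> float:
    """Fraction of turns where the prediction is one of the valid next utterances."""
    if len(predictions) != len(valid_sets):
        raise ValueError("predictions and valid_sets must have the same length")
    if not predictions:
        return 0.0
    correct = sum(pred in valid for pred, valid in zip(predictions, valid_sets))
    return correct / len(predictions)


# Made-up turns for illustration only: each turn lists every acceptable reply.
valid_sets = [
    {"what cuisine would you like?", "how many people are in your party?"},
    {"which price range are you looking for?"},
    {"where should it be?", "what cuisine would you like?"},
]
predictions = [
    "how many people are in your party?",      # valid alternative, counted correct
    "where should it be?",                     # not valid at this turn
    "what cuisine would you like?",            # valid alternative, counted correct
]

accuracy = per_turn_accuracy(predictions, valid_sets)
```

With a single reference per turn, the first prediction would be penalized whenever the reference happened to be the other question; accepting the full valid set removes that penalty.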