Quality of Telephone-Based Spoken Dialogue Systems, part 9

Definition of Interaction Parameters
Appendix B
Template Sentences for Synthesis Evaluation, Exp. 5.1 and 5.2
Original German Version
1. Sie möchten also am weekday foodtype essen gehen? (weekday: Montag, Dienstag, Mittwoch,
   Donnerstag, Freitag, Samstag, Sonntag; foodtype: vegetarisch, italienisch, französisch,
   griechisch, spanisch, orientalisch, asiatisch)
2. Das Restaurant location hat am weekday Ruhetag. (location: am Schauspielhaus, in der
   Innenstadt, am Hauptbahnhof, am Stadtpark, am Kunstmuseum, am Stadion, am Opernhaus;
   weekday: Montag, Dienstag, Mittwoch, Donnerstag, Freitag, Samstag, Sonntag)
3. Wann möchten Sie location foodtype essen gehen? (location: am Schauspielhaus, in der
   Innenstadt, am Hauptbahnhof, am Stadtpark, am Kunstmuseum, am Stadion, am Opernhaus;
   foodtype: vegetarisch, italienisch, französisch, griechisch, spanisch, orientalisch, asiatisch)
4. Das Lokal price und öffnet um time Uhr. (price: ist billig, ist preiswert, ist teuer, hat
   gehobene Preise, ist in der unteren Preisklasse, ist in der mittleren Preisklasse, ist in der
   oberen Preisklasse; time: dreizehn, sieben, fünfzehn, achtzehn, zwanzig, vierzehn, siebzehn)
5. Die Gerichte im foodtype Restaurant beginnen bei price Mark. (foodtype: vegetarischen,
   italienischen, französischen, griechischen, spanischen, orientalischen, asiatischen; price:
   fünfzehn, zwanzig, vierzig, achtzehn, dreißig, dreizehn, siebzehn)
English Translation
1. So you would like to eat foodtype food on weekday? (weekday: Monday, Tuesday, Wednesday,
   Thursday, Friday, Saturday, Sunday; foodtype: vegetarian, Italian, French, Greek,
   Spanish, oriental, Asian)
2. The restaurant location is closed on weekday. (location: at the theater, in town center, at the
   main station, at the city park, at the art museum, at the stadium, at the opera house; weekday:
   Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday)
3. When would you like to eat foodtype food location? (foodtype: vegetarian, Italian, French,
   Greek, Spanish, oriental, Asian; location: at the theater, in town center, at the main station,
   at the city park, at the art museum, at the stadium, at the opera house)
4. The restaurant price and opens at time. (price: is cheap, is good value, is expensive, has
   high prices, is in the lower price category, is in the middle price category, is in the upper
   price category; time: one p.m., seven o’clock, three p.m., six p.m., eight p.m., two p.m.,
   five p.m.)
5. The menu in the foodtype restaurant starts at price DM. (foodtype: vegetarian, Italian,
   French, Greek, Spanish, oriental, Asian; price: fifteen, twenty, forty, eighteen, thirty,
   thirteen, seventeen)
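As an illustration of how such template sentences expand into concrete test sentences, the following minimal Python sketch fills template 2 of the English translation with every combination of its slot values. The function name fill_template and the curly-brace slot notation are assumptions made for this sketch; the appendix does not state whether the experiments used all combinations or only a selection.

```python
from itertools import product

# Template 2 of the English translation; {location} and {weekday} mark the slots.
TEMPLATE = "The restaurant {location} is closed on {weekday}."

SLOT_VALUES = {
    "location": [
        "at the theater", "in town center", "at the main station",
        "at the city park", "at the art museum", "at the stadium",
        "at the opera house",
    ],
    "weekday": [
        "Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday",
    ],
}


def fill_template(template, slot_values):
    """Return one sentence per combination of slot values (hypothetical helper)."""
    names = list(slot_values)
    sentences = []
    for combination in product(*(slot_values[name] for name in names)):
        sentences.append(template.format(**dict(zip(names, combination))))
    return sentences


sentences = fill_template(TEMPLATE, SLOT_VALUES)
```

With seven values per slot, the full expansion of this template yields 7 × 7 = 49 sentences.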
Appendix C
BoRIS Dialogue Structure
Figure C.1. Dialogue flow in the BoRIS restaurant information system of experiment 6.3, part 1.
Figure C.2. Dialogue flow in the BoRIS restaurant information system of experiment 6.3, part 2.
For a legend see Figure C.1.
Figure C.3. Dialogue flow in the BoRIS restaurant information system of experiment 6.3, part 3.
For a legend see Figure C.1.
Appendix D
Instructions and Scenarios
D.1 Instructions to Test Subjects
BoRIS
Dear participant!
Thank you for taking the time to do this experiment!
During the next hour you will get to know BoRIS via the telephone: The Bochum
restaurant information system.
This test will show how you experience a telephone call with BoRIS. To this end,
we ask you to call BoRIS five times. Before each call you will get a small task. At
the end of each telephone call, we ask you to write down what you think about the
system. You can do this easily by filling out a questionnaire.
Before the test starts, we would like to ask you to answer the questions given on
the following pages. For the test evaluation, we need some personal information
from you, which will of course be treated anonymously.
At the end of the whole experiment, we ask you to give an overall judgment about
all the calls you had with BoRIS.
For some assessments you will find the following scale:
[Scale with the labels: extremely bad, bad, poor, fair, good, excellent, ideal]
Usually, your judgment should be in the range between bad and excellent. In case
of an unexpectedly extreme judgment, you can use the thinly drawn ends of the
scale as well. Please also use the spaces between the grid marks, as depicted
above.
Assess the system with confidence and remember during the whole test session:
You are not being tested; you are testing our system!
And now: Have a lot of fun!
D.2 Scenarios
Dialogue no.
You would like to know where you can eat duck. Please ask BoRIS.
Restaurant name(s):
Dialogue no.
You plan to go out for a Greek dinner on Tuesday night in Grumme.

Price:
Restaurant name(s):
If BoRIS is unable to indicate a restaurant, please change the following
specification:
You want to have the dinner in Weitmar.
Restaurant name(s):
Dialogue no.
You plan to have your lunch break in a Chinese restaurant downtown.
Price:
Restaurant name(s):
If BoRIS is unable to indicate a restaurant, please change the following
specification:
The price.
Restaurant name(s):
Dialogue no.
Please gather your information from the following hints:
Price:
Type of food:
Location:
Restaurant name(s):
If BoRIS doesn’t find a matching restaurant, please change the following:
Price:
Restaurant name(s):
Dialogue no.
You plan to eat out in Bochum. Because your favorite restaurant is closed
for holidays, ask BoRIS for a restaurant.
Please write down first which specifications you want to give to BoRIS.
If BoRIS is unable to find a matching restaurant, please search for an
alternative until BoRIS indicates at least one restaurant.
Restaurant name(s):
D.3 Instructions for Expert Evaluation
The following guidelines describe the steps an evaluation expert has to perform in order to
analyze and annotate an interaction with the BoRIS restaurant information system (see
experiments 6.1 to 6.3). For each step, a number of criteria are given which have to be judged,
and it is illustrated how these criteria should be interpreted in the context of the restaurant
information task. Note that the criteria and recommendations are not strict rules; the
evaluation expert often has a certain degree of freedom for interpretation. To decide an
individual case, the expert should consider the objective of the criteria and the course of the
interaction up to that point. Once a certain interpretation has been chosen, the expert should
adhere to it in order to reach consistent results for all dialogues in the analysis set.
The analysis and annotation procedure consists of the following steps:
1. Scenario definition
2. Dialogue execution
3. Transcription
4. Barge-in labelling
5. Task AVM analysis
6. Task success labelling
7. Contextual appropriateness labelling
8. System correction turn labelling
9. User correction turn labelling
10. Cancel attempt labelling
11. Help request labelling
12. User question labelling
13. System answer labelling
14. Speech understanding labelling
15. Automatic calculation of speech-recognition-related measures
16. Automatic calculation of further interaction parameters
The following guidelines focus on the steps where the expert has to make a judgment on a specific
interaction aspect (Steps 1 and 3 to 14). Practical information on the operation of the CSLU-based
WoZ workbench and of the expert evaluation tool is given in Skowronek (2002).
Step 1: Scenario Definition
The scenario for the restaurant information task consists of six slots for attribute-value pairs,
namely Task, Foodtype, Date, Time, Price and Location. The field Task can take two different
values: “Get information” where the aim of the dialogue is to obtain information about restau-
rants, or “unknown” where the user asks for a task which is not supported by the system, e.g. a
reservation. The expert has to interpret relative date specifications like today, tomorrow, etc. as
follows:
Today, tomorrow, the day after tomorrow etc. → the corresponding weekday.
Now, in a little while etc. → the corresponding day and time.
This interpretation corresponds to the canonical values used by the language understanding
component of the system. The following expressions should not be changed in this way because
they are beyond the understanding capability of the system:
During the week, weekdays, weekend, etc. → leave unchanged.
If the scenario definition gives no specification for a slot, the corresponding slot
should be left undetermined. The same principle applies to the free scenario.
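The date interpretation rule of Step 1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name canonical_date, the English expression strings and the offset table are hypothetical, and the sketch covers the weekday rule only, not the "day and time" rule for expressions such as "now".

```python
from datetime import date, timedelta

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Relative expressions that are mapped to a weekday (assumed English wording).
RELATIVE_OFFSETS = {
    "today": 0,
    "tomorrow": 1,
    "the day after tomorrow": 2,
}


def canonical_date(expression, reference):
    """Map a relative date expression to a weekday name.

    Expressions outside the table (e.g. "weekend") are returned unchanged,
    following the rule that they are beyond the system's understanding.
    """
    key = expression.strip().lower()
    if key in RELATIVE_OFFSETS:
        day = reference + timedelta(days=RELATIVE_OFFSETS[key])
        return WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
    return expression


# Reference day of the call; 1 January 2024 was a Monday.
call_day = date(2024, 1, 1)
tomorrow_value = canonical_date("tomorrow", call_day)
weekend_value = canonical_date("weekend", call_day)
```

The reference date is passed in explicitly so that the interpretation depends on the day of the call rather than on the day of the analysis.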
Step 3: Transcription
The system utterances are automatically logged during the interaction. Thus, only the user
utterances have to be transcribed by the expert, in the case that no transcription has been
produced during the interaction (which is the case for simulated recognition). The user utterance
transcription has to include literally everything that has been articulated during a user’s turn,
including laughing, talking to himself/herself, etc. In this way, it will reflect the input to
the system in a real-life environment.
The expert has to type the transcription into the corresponding field of the evaluation tool. All
letters (including German “Umlaute”) and punctuation marks are allowed. Line breaks are
generated automatically, but they can also be enforced by pressing the return key. However, it
has to be ensured that no empty lines are transcribed, except when the whole user utterance is
empty. Scrolling over several lines is possible.
Step 4: Barge-In Labelling
This step refers to the user utterances only. A barge-in attempt is counted when the user
intentionally addresses the system while the system is still speaking. Under this definition,
user utterances which are not intended to influence the course of the dialogue (laughing,
speaking to himself/herself) are not counted as barge-ins; they are treated as spontaneous
reactions.
All barge-in attempts are labelled by setting the corresponding radio button in the expert
evaluation tool. The barge-in utterance will not be transcribed until the user repeats it when
it is the user’s turn again.
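The barge-in definition combines two conditions: the utterance overlaps system speech, and it is intentionally addressed to the system. A minimal Python sketch of counting barge-in attempts from expert labels follows; the class UserUtterance and its two fields are hypothetical names chosen for this illustration, not part of the evaluation tool.

```python
from dataclasses import dataclass


@dataclass
class UserUtterance:
    """Expert labels for one user utterance (hypothetical structure)."""
    overlaps_system_speech: bool  # user spoke while the system was speaking
    addressed_to_system: bool     # user intended to influence the dialogue


def is_barge_in(utterance):
    """A barge-in attempt requires both conditions of the definition."""
    return utterance.overlaps_system_speech and utterance.addressed_to_system


def count_barge_ins(utterances):
    return sum(1 for utterance in utterances if is_barge_in(utterance))


labelled = [
    UserUtterance(overlaps_system_speech=True, addressed_to_system=True),    # barge-in
    UserUtterance(overlaps_system_speech=True, addressed_to_system=False),   # laughing
    UserUtterance(overlaps_system_speech=False, addressed_to_system=True),   # normal turn
]
barge_in_count = count_barge_ins(labelled)
```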
Step 5: Task AVM Analysis
The “Scenario AVM” is specified by the scenario and consists of six attribute-value pairs for
the slots Task, Date, Foodtype, Time, Price and Location.
During the course of the interaction, it may happen that the user changes one or several of
the specifications given in the scenario, either by adding further constraints, by omitting to give
constraints, or by changing the constraint values. Such a change may happen either on the user’s
own authority, or because the system requested to do so. In both cases the “Scenario AVM” has
to be amended, resulting in a “Modified Scenario AVM” and in a “Changed AVM”.
- In a first step, the attributes of the user query which differ from the specification given in
  the scenario have to be identified. These attributes and the corresponding values are written
  down in the corresponding “Modified Scenario AVM”.
- If the user voluntarily sets the value of an attribute to a neutral value (e.g. by saying “don’t
  know”, “doesn’t matter”, etc.), the value “neutral” has to be set in the AVM. However, in the
  case that the user has no possibility to specify the value (e.g. because the system did not ask
  him/her to do so), the AVM remains unchanged at this point. This guideline assumes that
  the user would have provided the missing information but the system prematurely directed
  the dialogue in a different way.
- In the case that the user specifies a value for an attribute that is not indicated in the scenario,
  this value has to be included in the “Modified Scenario AVM”, independently of whether
  the system asked for it or not.
- In the case that the system asks the user to modify attribute values during the interaction
  (e.g. because it did not find a matching restaurant), such modifications should be included
  in the “Changed AVM”.
- In the case that the user changes an attribute value which was previously specified without
  being asked to do so by the system, two situations have to be distinguished:
  - If the user changes a specification spontaneously, by intuition, this modification should
    be handled in the “Modified Scenario AVM”.
  - If the user changes a specification because the system obviously did not process his/her
    first specification attempt, this modification should be handled by the “Changed AVM”.
  This principle is in accordance with the definition of user correction turns, see below. When
  the expert would rate such an utterance as a user correction turn, then the modification should
  be handled by the “Changed AVM”.
- When the modification occurs in an explicit confirmation situation (e.g. as a response to a
  system confirmation utterance like “Do you really want to eat out in ?”), then it should
  be handled by the “Modified Scenario AVM”.
The expert is only allowed to provide values which are in the system’s vocabulary (so-called
“canonical values”). Other values, although they might be specified by the user, should not be
introduced in the AVMs. This rule corresponds to a system-orientated point of view.
All AVMs are extended by an additional slot, namely the one listing the restaurants which match
the specified attribute values. This slot is automatically calculated from the system database.
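The rules of Step 5 decide in which AVM a modification is recorded and which values may be entered. The following Python sketch illustrates one possible reading of them. All names (Cause, target_avm, with_value, VOCABULARY) are hypothetical, the vocabulary is a made-up excerpt rather than the system's actual canonical values, and the restaurant slot derived from the database is left out.

```python
from enum import Enum

SLOTS = ("Task", "Date", "Foodtype", "Time", "Price", "Location")
MODIFIED = "Modified Scenario AVM"
CHANGED = "Changed AVM"


class Cause(Enum):
    """Why an attribute value deviates from the scenario (assumed categories)."""
    USER_ADDED_VALUE = "user gives a value not indicated in the scenario"
    USER_SPONTANEOUS_CHANGE = "user changes a value by intuition"
    EXPLICIT_CONFIRMATION = "user changes a value when asked to confirm"
    SYSTEM_REQUESTED_CHANGE = "system asks the user to modify a value"
    USER_CORRECTION = "system did not process the first attempt"


def target_avm(cause):
    """Return the name of the AVM in which the modification is recorded."""
    if cause in (Cause.SYSTEM_REQUESTED_CHANGE, Cause.USER_CORRECTION):
        return CHANGED
    return MODIFIED


def with_value(avm, slot, value, vocabulary):
    """Return a copy of the AVM with the slot set, if the value is allowed.

    Only "neutral" and canonical values (those in the vocabulary) are entered;
    any other value leaves the AVM unchanged.
    """
    if slot not in SLOTS:
        raise KeyError(slot)
    if value != "neutral" and value not in vocabulary.get(slot, ()):
        return dict(avm)
    updated = dict(avm)
    updated[slot] = value
    return updated


# Hypothetical vocabulary excerpt and a scenario with three specified slots.
VOCABULARY = {"Foodtype": {"Greek", "Italian"}}
scenario_avm = {"Task": "Get information", "Date": None, "Foodtype": "Greek",
                "Time": None, "Price": None, "Location": "Center"}

# The system found no match and asked for a change; the user chose Italian.
recorded_in = target_avm(Cause.SYSTEM_REQUESTED_CHANGE)
changed_avm = with_value(scenario_avm, "Foodtype", "Italian", VOCABULARY)
```

Returning a copy instead of editing the dictionary in place keeps the "Scenario AVM" intact, so the three AVMs can be compared afterwards.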
Example:
Scenario AVM:
Attribute   Value
Task        Get information
Date        –
Foodtype    Greek
Time        –
Price       –
Location    Center
Dialogue:
U: “Today, I’d like to eat out in a Greek restaurant downtown.”
S: “I’m sorry. There’s no restaurant that matches your query.
   Would you like to change your query?”
U: “Yes, please.”
S: “You can change the type of food, the preferred price range, [...]
   What is your modification?”
U: “Well, I want Italian food.”

×