SLS TECHNOLOGIES
The conversational systems developed by SLS require a number of
specialized core technologies to perform the tasks of speech
recognition, natural language understanding, discourse and dialogue
modeling, language generation, and speech synthesis. SLS has developed
its own components for each of these tasks, as well as an architecture
that integrates them into systems for specific applications.
An Example
To illustrate these SLS technology components in action, let's consider
the following request posed to the MERCURY
air travel planning server:
"Is there a flight from Boston to San Francisco Friday?"
GALAXY: An Architecture for Conversational Speech Systems
Conversational speech systems require the integration of a variety
of different specialized components. GALAXY provides a client/server based
architecture for performing this system integration task. In our example,
GALAXY's hub is responsible for moving the user's request through the various
stages of processing described below.
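As a rough illustration of the hub idea, the Python sketch below passes one request through a fixed sequence of stages. The Hub class, the stage names, and the toy stage functions are hypothetical and are not GALAXY's actual interface; in a client/server design the stages would be separate servers rather than local functions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a hub: each "server" is a local function that
# takes the current message (a dict) and returns an updated copy.
Stage = Tuple[str, Callable[[Dict], Dict]]


class Hub:
    def __init__(self, stages: List[Stage]):
        self.stages = stages
        self.log: List[str] = []

    def route(self, message: Dict) -> Dict:
        # Pass the message through every stage in order, recording the path.
        for name, server in self.stages:
            message = server(message)
            self.log.append(name)
        return message


# Toy servers: each adds one key so the flow through the hub is visible.
def recognize(msg: Dict) -> Dict:
    return {**msg, "text": "Is there a flight from Boston to San Francisco Friday?"}


def understand(msg: Dict) -> Dict:
    return {**msg, "frame": {"clause": "EXIST", "topic": "FLIGHT"}}


def manage(msg: Dict) -> Dict:
    return {**msg, "reply_frame": {"clause": "AVAILABILITY"}}


def generate(msg: Dict) -> Dict:
    return {**msg, "reply_text": "I have 3 nonstop flights."}


hub = Hub([("recognition", recognize), ("understanding", understand),
           ("dialogue", manage), ("generation", generate)])
result = hub.route({"audio": b"..."})
print(hub.log)
```

Keeping the routing order in one place means a stage can be swapped or added without the other stages knowing about it.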
SUMMIT: Speech Recognition
Spoken language is conveyed by measurable acoustic signals. SUMMIT converts
these signals into a sentence of distinct words by matching segments of
the incoming signals with a stored library of phonemes -- irreducible units
of sound (such as the "B" in "Boston") that make up a word. Relying on
internal language models, SUMMIT then generates a ranked list of candidate
sentences. In our example, SUMMIT produces the following list:
- Is there a flight from Boston to San Francisco Friday?
- Is there a flight from Austin to San Francisco Friday?
- Is there flight from Boston to San Francisco Friday?
- ...
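One common way to produce such a ranked list is to combine an acoustic score with a language-model score for each hypothesis and sort by the total. The sketch below shows only that general idea; it is not a description of SUMMIT's internals, and every score in it is invented for illustration.

```python
from typing import List, Tuple

# Each hypothesis: (sentence, acoustic log-score, language-model log-score).
# All numbers are invented for illustration only.
hypotheses: List[Tuple[str, float, float]] = [
    ("Is there flight from Boston to San Francisco Friday?", -41.0, -19.5),
    ("Is there a flight from Austin to San Francisco Friday?", -43.5, -15.2),
    ("Is there a flight from Boston to San Francisco Friday?", -42.0, -15.0),
]


def rank(hyps: List[Tuple[str, float, float]],
         lm_weight: float = 1.0) -> List[Tuple[str, float, float]]:
    # A higher combined log-score is better, so sort in descending order.
    return sorted(hyps, key=lambda h: h[1] + lm_weight * h[2], reverse=True)


n_best = [sentence for sentence, _, _ in rank(hypotheses)]
for sentence in n_best:
    print("-", sentence)
```

With these made-up scores, the ungrammatical candidate scores best acoustically but is pushed to the bottom by its poor language-model score.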
TINA: Natural Language Understanding
Beneath the surface representation of words, sentences carry a deeper
semantic meaning. In order to determine what a user actually wants, the
user's utterance must be represented in a logical, meaningful structure.
Based on stored rules, TINA parses each sentence into grammatical components,
such as subject, verb, object, and predicate. TINA then augments the syntactic
components with semantic information and converts the sentences into a
semantic frame, a command-like structure consisting of clauses, topics,
and predicates. In our example, the semantic frame for
"Is there a flight from Boston to San Francisco Friday?"
would be
Clause: EXIST
  Topic: FLIGHT
    Quantifier: INDEF
    Predicate: SOURCE
      Topic: CITY
        Name: Boston
    Predicate: DESTINATION
      Topic: CITY
        Name: San Francisco
    Predicate: TIME
      Topic: DATE
        Day: Friday
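To make the shape of this frame concrete, here is a minimal sketch that encodes it as nested Python dictionaries and extracts the three constraints. The dictionary layout and the slots helper are assumptions made for illustration; they are not TINA's internal representation.

```python
from typing import Dict

# Hypothetical nested-dict encoding of the semantic frame for
# "Is there a flight from Boston to San Francisco Friday?"
frame = {
    "clause": "EXIST",
    "topic": {
        "name": "FLIGHT",
        "quantifier": "INDEF",
        "predicates": [
            {"predicate": "SOURCE", "topic": "CITY", "name": "Boston"},
            {"predicate": "DESTINATION", "topic": "CITY", "name": "San Francisco"},
            {"predicate": "TIME", "topic": "DATE", "day": "Friday"},
        ],
    },
}


def slots(frame: Dict) -> Dict[str, str]:
    """Flatten the predicates into a {predicate: value} mapping."""
    out = {}
    for pred in frame["topic"]["predicates"]:
        # The value is stored under whichever key is not "predicate" or "topic".
        value = next(v for k, v in pred.items() if k not in ("predicate", "topic"))
        out[pred["predicate"]] = value
    return out


print(slots(frame))
```

A flat mapping like this is convenient for later stages that only need to know the source, destination, and time of the request.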
Dialogue Management
To carry out a user's request, the dialogue manager evaluates the
relevance and completeness of the request, retrieves the requested
information from the database, and formats an appropriate reply as a
semantic frame.
In our example, the dialogue manager might return the following semantic
frame representing the retrieved information from the database:
Clause: AVAILABILITY
  Flights found: 3
  List:
    Topic: FLIGHT
      Date: October 19
      Airline: United
      Flight number: 163
      Departure Airport: BOS
      Departure Time: 7:00 AM
      Arrival Airport: SFO
      Arrival Time: 10:23 AM
      Stops: 0
    Topic: FLIGHT
      Date: October 19
      Airline: United
      Flight number: 161
      Departure Airport: BOS
      Departure Time: 9:00 AM
      Arrival Airport: SFO
      Arrival Time: 12:22 PM
      Stops: 0
    Topic: FLIGHT
      Date: October 19
      Airline: American
      Flight number: 195
      Departure Airport: BOS
      Departure Time: 9:00 AM
      Arrival Airport: SFO
      Arrival Time: 12:37 PM
      Stops: 0
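The completeness check and database lookup described in this section can be sketched as follows. This is a toy under stated assumptions: the in-memory flight list stands in for a real database, the REQUEST clause name and the manage function are hypothetical, and the date constraint is not applied because the toy table holds a single day.

```python
from typing import Dict, List

# Toy "database" holding the three flights from the example reply.
FLIGHTS: List[Dict] = [
    {"airline": "United", "number": 163, "from": "BOS", "to": "SFO",
     "departs": "7:00 AM", "arrives": "10:23 AM", "stops": 0},
    {"airline": "United", "number": 161, "from": "BOS", "to": "SFO",
     "departs": "9:00 AM", "arrives": "12:22 PM", "stops": 0},
    {"airline": "American", "number": 195, "from": "BOS", "to": "SFO",
     "departs": "9:00 AM", "arrives": "12:37 PM", "stops": 0},
]
AIRPORTS = {"Boston": "BOS", "San Francisco": "SFO"}
REQUIRED = ("SOURCE", "DESTINATION", "TIME")


def manage(request: Dict[str, str]) -> Dict:
    # Completeness check: report any constraint the user has not given yet.
    missing = [slot for slot in REQUIRED if slot not in request]
    if missing:
        return {"clause": "REQUEST", "missing": missing}  # hypothetical clause
    src = AIRPORTS[request["SOURCE"]]
    dst = AIRPORTS[request["DESTINATION"]]
    # The TIME slot is required but not used to filter in this toy table.
    found = [f for f in FLIGHTS if f["from"] == src and f["to"] == dst]
    return {"clause": "AVAILABILITY", "flights_found": len(found), "list": found}


reply = manage({"SOURCE": "Boston", "DESTINATION": "San Francisco", "TIME": "Friday"})
print(reply["clause"], reply["flights_found"])
```

An incomplete request such as manage({"SOURCE": "Boston"}) returns a frame listing the missing slots, which a later stage could turn into a follow-up question.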
GENESIS: Language Generation
GENESIS processes components of a semantic frame and generates a text
representation of the semantics in the requested language. GENESIS can
generate text in natural languages such as English or Chinese or in formal
languages such as the Structured Query Language (SQL). For the MERCURY example,
GENESIS takes a frame containing the tabular data returned from the database,
and converts this frame into a standard English response. For example:
I have 3 nonstop flights:
A United flight arriving at 10:23 AM,
a United flight arriving at 12:22 PM,
and an American flight arriving at 12:37 PM.
Please select one of these flights or change
any constraint you have already specified.
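A simple template-based generator is enough to reproduce the first sentence of this response from a reply frame. The sketch below is illustrative only and does not reflect how GENESIS is implemented; the frame layout, the ARTICLE lexicon, and the function names are assumptions, and the closing prompt is left out.

```python
from typing import Dict, List

# Hypothetical lexicon: the article is stored per word because a plain
# vowel-letter rule would wrongly produce "an United".
ARTICLE = {"United": "a", "American": "an"}

reply_frame = {
    "clause": "AVAILABILITY",
    "list": [
        {"airline": "United", "arrives": "10:23 AM", "stops": 0},
        {"airline": "United", "arrives": "12:22 PM", "stops": 0},
        {"airline": "American", "arrives": "12:37 PM", "stops": 0},
    ],
}


def describe(flight: Dict) -> str:
    airline = flight["airline"]
    return f"{ARTICLE[airline]} {airline} flight arriving at {flight['arrives']}"


def generate(frame: Dict) -> str:
    flights: List[Dict] = frame["list"]
    kind = "nonstop " if all(f["stops"] == 0 for f in flights) else ""
    noun = "flight" if len(flights) == 1 else "flights"
    parts = [describe(f) for f in flights]
    # Capitalize the first item, as in the example response.
    parts[0] = parts[0][0].upper() + parts[0][1:]
    if len(parts) > 1:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    else:
        listing = parts[0]
    return f"I have {len(flights)} {kind}{noun}: {listing}."


print(generate(reply_frame))
```

Because the wording lives in templates and a lexicon rather than in the frame, the same frame could in principle be rendered in another language by swapping those two pieces.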
ENVOICE: Speech Synthesis
ENVOICE is a concatenative speech synthesis system which creates synthetic
speech by concatenating segments of speech from a pre-recorded speech corpus.
The concatenation of segments can occur at the phrase, word, or sub-word
levels.
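The idea of concatenating at the phrase or word level can be sketched with a greedy longest-match lookup. This is not ENVOICE's algorithm: the CORPUS table and synthesize function are hypothetical, short integer lists stand in for recorded audio, and sub-word units and smoothing at the joins are omitted.

```python
from typing import Dict, List

# Hypothetical corpus: text units mapped to pre-recorded waveform samples.
# Short integer lists stand in for real audio.
CORPUS: Dict[str, List[int]] = {
    "i have": [1, 2, 3],   # phrase-level segment
    "three": [4, 5],
    "nonstop": [6],
    "flights": [7, 8],
    "i": [9],
    "have": [10],
}


def synthesize(text: str) -> List[int]:
    words = text.lower().split()
    samples: List[int] = []
    i = 0
    while i < len(words):
        # Prefer the longest recorded unit that starts at word i.
        for j in range(len(words), i, -1):
            unit = " ".join(words[i:j])
            if unit in CORPUS:
                samples.extend(CORPUS[unit])
                i = j
                break
        else:
            raise KeyError(f"no recorded segment for {words[i]!r}")
    return samples


print(synthesize("I have three nonstop flights"))
```

Preferring the longest available unit keeps the number of joins low: here "I have" comes from a single phrase-level segment rather than from two word-level ones.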