Bavarian Archive for Speech Signals - Services Schiel
|
BASSS Speech Corpora Validation
|
Technical Validation of Language Resources (LRs) is a core
competence of BAS Services Schiel (BASSS).
|
|
What is validation?
|
In the context of LRs, validation comprises all quality checks
of the LR against either its specification or its
documentation.
For example, the validation of a speech corpus will include
roughly the following quality checks:
- Documentation: completeness, correctness
- Speech signals: quality checks, formal checks, completeness
- Annotations: sample checks, formal checks, correctness, completeness
- Metadata: formal checks, correctness, completeness
The results of the validation
are summarized in a validation report
that enables the customer to evaluate the current value of
the LR.
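
As an illustration of the formal and completeness checks on speech signals listed above, here is a minimal Python sketch. It is not the tooling BASSS uses. It assumes, purely for the example, that signals are stored as WAV files in a single directory and that the specification calls for 16 kHz, mono, 16-bit audio. The function name `check_signal_files` and all parameter names are hypothetical.

```python
import wave
from pathlib import Path


def check_signal_files(corpus_dir, expected_names, sample_rate=16000,
                       channels=1, sample_width=2):
    """Return a list of (file name, problem) tuples; an empty list means no findings.

    The default values stand in for a corpus specification and are assumptions.
    """
    problems = []
    corpus_dir = Path(corpus_dir)
    for name in expected_names:
        path = corpus_dir / name
        # Completeness: every file named in the specification must exist.
        if not path.is_file():
            problems.append((name, "missing"))
            continue
        # Formal checks: the header must be readable and match the specification.
        try:
            with wave.open(str(path), "rb") as wav:
                if wav.getframerate() != sample_rate:
                    problems.append((name, "unexpected sample rate"))
                if wav.getnchannels() != channels:
                    problems.append((name, "unexpected channel count"))
                if wav.getsampwidth() != sample_width:
                    problems.append((name, "unexpected sample width"))
                if wav.getnframes() == 0:
                    problems.append((name, "empty signal"))
        except (wave.Error, EOFError):
            problems.append((name, "unreadable header"))
    # Files on disk that the specification does not mention are reported as well.
    expected = set(expected_names)
    for path in sorted(corpus_dir.glob("*.wav")):
        if path.name not in expected:
            problems.append((path.name, "not in specification"))
    return problems
```

The returned list could feed directly into the signal section of a validation report. Checks of perceptual signal quality (noise, clipping, and so on) are outside the scope of this sketch.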
|
|
Why validate an LR?
|
There are two main reasons to commission BASSS with
an LR validation.
The first and most obvious is independent quality
assurance. This is best ensured by
pre-validation checks during the production process
and a final validation after completion.
The second is to obtain an independent evaluation
of the quality of an existing LR.
|
|
Validation Guidelines
|
Although each LR validation depends on the nature of
the LR and on the intentions of the
customer, BASSS has compiled a vademecum
of best practice in LR validation,
which can serve as a basis for any speech corpus
validation. BASSS also offers
1-day or 3-day tutorials on
this topic.
|
|
Validation Techniques
|
BASSS has developed WebTranscribe, a web-based
tool for the manual validation of large speech
samples. It enables us to minimize logistical effort and errors
during an LR validation. Please refer to our
scientific publications for a closer look at WebTranscribe.
|
Copyright 2005 BAS Services Schiel
Legal notice: Florian Schiel, Moltkestr. 1, D-80803 München,
Germany, schiel@bas-services.de
|