Persistent Identifierauto

http://hdl.handle.net/21.11114/COLL-0000-000B-CAA7-5

Description0-1

The LUCEA corpus (Longitudinal University College Utrecht Corpus of English Accents) was collected to study phonetic convergence in a multilingual environment. Students and teachers at University College Utrecht (UCU) come from various countries and have various native languages, yet they all use English as the lingua franca on campus. Hence, phonetic convergence may result in a unique international version of English, influenced by the speakers’ native languages and accents. The corpus now contains data from about 850 interviews from 282 unique students. Each interview contains about 20 minutes of speech. The speech corpus is augmented with participants’ responses from entry and exit questionnaires, and supplementary data about the participants and about each recording. When finished in 2016, the total corpus will contain about 3 TB (about 3000 GB) of audio data.

LandingPage1

https://hdl.handle.net/1839/00-C3AD1CEE-985D-42E5-8528-730774C187C1@view

Title(s)1-n

[1]: D-LUCEA,

[2]: the LUCEA corpus,

[3]: the LUCEA database,

[4]: Database of the Longitudinal Utrecht Collection of English Accents

Owner(s)0-n

University College Utrecht

Genre(s)0-n

interviews , conversation , academic-nonfiction , prompted speech , fiction

Language disorder(s)0-n

none

Domain(s)0-n

The database is of interest for research and development in linguistics, language education (pronunciation training), speech technology (foreign accent detection, language recognition, speech recognition), and sociophonetics.

Language(s)1-n

English [eng]

CLARIN centre0-1

MPI for Psycholinguistics

Persistent identifier(s)0-n

https://hdl.handle.net/1839/00-58F6586A-55F4-4B45-8341-6E2F7FF0668C

Version0-1

Size(s)0-n

282 stud. , 850 intrv , 3000 GB

Creator(s)0-n

Dr Hugo Quené (Max Planck Institute for Psycholinguistics, Nijmegen)

Project(s)0-n

UCU Accent site (Funder: University College Utrecht-Utrecht Institute of Linguistics OTS-CLARIN-NL)

Resource(s)1-n

Resource 1

Description0-1

The LUCEA database is a database of existing speech recordings of L1 and L2 speakers of English. The recorded speakers are students from an international student community where English is used as lingua franca. These students are being recorded longitudinally throughout their 3-year period on campus, using read and spontaneous speech in L1 and in L2 English (or in L1 English only). The database is of interest for research and development in linguistics, language education (pronunciation training), speech technology (foreign accent detection, language recognition, speech recognition), and sociophonetics. The corpus now contains data from about 850 interviews from 282 unique students. Each interview contains about 20 minutes of speech. The speech corpus is augmented with participants’ responses from entry and exit questionnaires, and supplementary data about the participants and about each recording. When finished in 2016, the total corpus will contain about 3 TB (about 3000 GB) of audio data.

Dublin-Core Type1

Sound

subtype0-1

speech

Modality1-n

speech

Recording environment0-n

home/office

Recording condition0-n

8 microphones were used , Microphone 1 is a close-talking headset microphone, 30 cm in front of speaker

Channel0-n

experimental-setting , face-to-face

Social context0-n

controlled-environment

Planning type0-n

spontaneous , planned

Interactivity0-n

interactive

Involvement0-n

elicited

Audience0-n

small

SC duration speech0-1

20 mins per recording

SC duration full0-1

unknown

SC speakers0-1

282

SC sp. demogr0-1

- 70% female, 30% male

Size0-n

282 stud.

Size0-n

850 intrv

Size0-n

3000 GB

Annotation0-n

[orthographicTranscription] [unknown] [other]

Media0-n

audio/x-wav, text/xml

Provenance(s)0-n

Provenance 1

Temporal0-1

2011-2016

Cities0-n

Utrecht

Country0-1

Netherlands (the) NL

Linguality0-1

Linguality

Type0-n

monolingual

Nativeness0-n

non-native , native

AgeGroup0-n

adult

Status0-n

normal

Variant0-n

standard , dialect

Accessibility0-1

Accessibility

Name1

D-LUCEA

Availability0-n

academic , restricted

Non-commercial usage0-1

yes

Website(s)0-n

http://lucea.wp.hum.uu.nl/summary/

ISBN0-1

ISLRN0-1

Contact(s)0-n

Dr Hugo Quené: Utrecht inst of Linguistics OT, (H.Quene@uu.nl) , Dr Rosemary Orr: University College Utrecht and, (r.orr@uu.nl)

Medium(s)0-n

internet

Documentation0-1

Documentation

Language(s)1-n

English [eng]

Type(s)0-n

website

URL(s)0-n

https://portal.clarin.nl/node/4183 , http://lucea.wp.hum.uu.nl/summary/

Validation0-1

Validation

Type0-1

unknown

Method(s)0-n

unknown
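
The size figures quoted in this record (about 850 interviews, 282 students, about 20 minutes of speech per interview) can be combined into rough totals. The sketch below is only an illustration of that arithmetic: the function name and dictionary keys are hypothetical, and the inputs are the approximate values stated in the record, not measured durations.

```python
# Approximate figures quoted in the record; each is stated as "about".
N_INTERVIEWS = 850
N_STUDENTS = 282
MINUTES_PER_INTERVIEW = 20


def corpus_estimates(interviews, students, minutes_each):
    """Derive rough totals from the counts given in the record.

    This helper is hypothetical and is not part of any corpus tooling.
    """
    return {
        # Total speech, converted from minutes to hours.
        "speech_hours": interviews * minutes_each / 60,
        # Mean number of recording sessions per participant.
        "interviews_per_student": interviews / students,
    }


estimates = corpus_estimates(N_INTERVIEWS, N_STUDENTS, MINUTES_PER_INTERVIEW)
for name, value in estimates.items():
    print(f"{name}: {value:.1f}")
```

With these inputs the estimate comes to roughly 283 hours of speech and about three interviews per student, which is consistent with the longitudinal design described above.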