| LDC0000 |  | To be filled ♦ [Hua Yu] [Laura Mayfield] |
| | 1993 |
| LDC93S10 |  | TIDIGITS ♦ |
| LDC93S11 |  | Road Rally ♦ |
| LDC93S12 |  | HCRC Map Task Corpus ♦ |
| LDC93S1 |  | TIMIT Acoustic-Phonetic Continuous Speech Corpus ♦ |
| LDC93S2 |  | NTIMIT ♦ |
| LDC93S3A |  | Resource Management Complete Set 2.0 ♦ |
| LDC93S3B |  | Resource Management RM1 2.0 |
| LDC93S3C |  | Resource Management RM2 2.0 |
| LDC93S4A |  | ATIS0 Complete ♦ |
| LDC93S4B |  | ATIS0 Pilot |
| LDC93S4B-2 |  | ATIS0 Read |
| LDC93S4B-3 |  | ATIS0 SD Read ♦ |
| LDC93S5 |  | ATIS2 ♦ |
| LDC93S6A |  | CSR-I (WSJ0) Complete |
| LDC93S6B |  | CSR-I (WSJ0) Sennheiser ♦ |
| LDC93S6C |  | CSR-I (WSJ0) Other ♦ |
| LDC93S7 |  | Switchboard ♦ |
| LDC93S8 |  | Switchboard Credit Card ♦ |
| LDC93S9 |  | TI 46-Word ♦ |
| LDC93T1 |  | ACL/DCI ♦ [Larry Zitnick] |
| LDC93T2 |  | Penn Treebank 0.5 ♦ |
| LDC93T3A |  | TIPSTER Complete |
| LDC93T3B |  | TIPSTER Volume 1 ♦ |
| LDC93T3C |  | TIPSTER Volume 2 ♦ |
| LDC93T3D |  | TIPSTER Volume 3 ♦ |
| LDC93T4 |  | Switchboard-1 Transcripts |
| | 1994 |
| LDC94L1 |  | CELEX Lexical Database ♦ |
| LDC94L2 |  | COMLEX English Syntax Lexicon |
| LDC94L3 |  | COMLEX Pronouncing Dictionary |
| LDC94S13A |  | CSR-II (WSJ1) Complete |
| LDC94S13B |  | CSR-II (WSJ1) Sennheiser |
| LDC94S13C |  | CSR-II (WSJ1) Other ♦ |
| LDC94S14A |  | Air Traffic Control Complete |
| LDC94S14B |  | Air Traffic Control BOS |
| LDC94S14C |  | Air Traffic Control DCA |
| LDC94S14D |  | Air Traffic Control DFW |
| LDC94S15 |  | SPIDRE |
| LDC94S16 |  | YOHO Speaker Verification ♦ |
| LDC94S17 |  | OGI Multilanguage Corpus ♦ |
| LDC94S18 |  | OGI Spelled and Spoken Word ♦ |
| LDC94S19 |  | ATIS3 Training Data ♦ |
| LDC94S20 |  | BRAMSHILL ♦ |
| LDC94S21 |  | MACROPHONE ♦ |
| LDC94T4A |  | UN Parallel Text (Complete) ♦ |
| LDC94T4B-1 |  | UN Parallel Text (English) ♦ |
| LDC94T4B-2 |  | UN Parallel Text (French) ♦ |
| LDC94T4B-3 |  | UN Parallel Text (Spanish) ♦ |
| LDC94T5 |  | ECI Multilingual Text ♦ [Dan Dewey] |
| | 1995 |
| LDC95L4 |  | COMLEX English Syntax Lexicon ♦ |
| LDC95L5 |  | COMLEX Pronouncing Dictionary ♦ |
| LDC95S22 |  | KING Speaker Verification |
| LDC95S23 |  | CSR-III Speech ♦ |
| LDC95S24 |  | WSJCAM0 Cambridge Read News ♦ |
| LDC95S25 |  | TRAINS Spoken Dialog Corpus ♦ |
| LDC95S26 |  | ATIS3 Test Data ♦ |
| LDC95S27 |  | PhoneBook: NYNEX Isolated Words |
| LDC95S28 |  | LATINO-40 Spanish Read News ♦ |
| LDC95T11 |  | European Language Newspaper Text ♦ |
| LDC95T13 |  | Mandarin Chinese News Text ♦ |
| LDC95T20 |  | Hansard French/English ♦ |
| LDC95T21 |  | North American News Text Corpus ♦ |
| LDC95T6 |  | CSR-III Text |
| LDC95T7 |  | Treebank-2 ♦ [Brian MacWhinney] |
| LDC95T8 |  | Japanese Business News Text ♦ |
| LDC95T9 |  | Spanish News Text ♦ |
| | 1996 |
| LDC96A02 |  | Switchboard Speaker ID Evaluation Test |
| LDC96A12 |  | Raw Data: 1996 Language Recognition Evaluation |
| LDC96L14 |  | CELEX2 ♦ |
| LDC96L15 |  | CALLHOME Mandarin Chinese Lexicon ♦ |
| LDC96L16 |  | CALLHOME Spanish Lexicon ♦ |
| LDC96L17 |  | CALLHOME Japanese Lexicon ♦ |
| LDC96L6 |  | COMLEX English Syntax Lexicon |
| LDC96L7 |  | COMLEX Pronouncing Dictionary |
| LDC96S29 |  | Frontiers in Speech Processing 93 |
| LDC96S30 |  | CTIMIT ♦ |
| LDC96S31 |  | CSR-IV HUB4 |
| LDC96S32 |  | FFMTIMIT ♦ |
| LDC96S33 |  | CSR-IV HUB3 |
| LDC96S34 |  | CALLHOME Mandarin Chinese Speech ♦ |
| LDC96S35 |  | CALLHOME Spanish Speech [Alex Waibel] ♦ [Laura Mayfield] |
| LDC96S36 |  | Boston University Radio Speech Corpus ♦ |
| LDC96S37 |  | CALLHOME Japanese Speech [Alex Waibel] ♦ |
| LDC96S38 |  | DCIEM/HCRC ♦ |
| LDC96S39 |  | RM Isolated and Spelled Word Data |
| LDC96S40 |  | Frontiers in Speech Processing 94 |
| LDC96S41 |  | VAHA (POLYPHONE II) ♦ |
| LDC96S46 |  | CALLFRIEND American English-Non-Southern Dialect ♦ |
| LDC96S47 |  | CALLFRIEND American English-Southern Dialect ♦ |
| LDC96S48 |  | CALLFRIEND Canadian French ♦ [Dan Dewey] |
| LDC96S49 |  | CALLFRIEND Egyptian Arabic ♦ |
| LDC96S50 |  | CALLFRIEND Farsi ♦ |
| LDC96S51 |  | CALLFRIEND German ♦ |
| LDC96S52 |  | CALLFRIEND Hindi ♦ |
| LDC96S53 |  | CALLFRIEND Japanese ♦ |
| LDC96S54 |  | CALLFRIEND Korean ♦ |
| LDC96S55 |  | CALLFRIEND Mandarin Chinese-Mainland Dialect ♦ |
| LDC96S56 |  | CALLFRIEND Mandarin Chinese-Taiwan Dialect ♦ |
| LDC96S57 |  | CALLFRIEND Spanish-Caribbean Dialect ♦ |
| LDC96S58 |  | CALLFRIEND Spanish-Non-Caribbean Dialect ♦ |
| LDC96S59 |  | CALLFRIEND Tamil ♦ |
| LDC96S60 |  | CALLFRIEND Vietnamese ♦ |
| LDC96S61 |  | 1996 Speaker Recognition Benchmark ♦ |
| LDC96S64 |  | JEIDA/JCSD-Channel 0 Complete ♦ |
| LDC96S64-1 |  | JEIDA/JCSD-Channel 0 City Names |
| LDC96S64-2 |  | JEIDA/JCSD-Channel 0 Control Words |
| LDC96S64-3 |  | JEIDA/JCSD-Channel 0 Isolated Digits |
| LDC96S64-4 |  | JEIDA/JCSD-Channel 0 Four Digit Sequences |
| LDC96S64-5 |  | JEIDA/JCSD-Channel 0 Mono Syllables |
| LDC96S65 |  | JEIDA/JCSD-Channel 1 Complete ♦ |
| LDC96S65-1 |  | JEIDA/JCSD-Channel 1 City Names |
| LDC96S65-2 |  | JEIDA/JCSD-Channel 1 Control Words |
| LDC96S65-3 |  | JEIDA/JCSD-Channel 1 Isolated Digits |
| LDC96S65-4 |  | JEIDA/JCSD-Channel 1 Four Digit Sequences |
| LDC96S65-5 |  | JEIDA/JCSD-Channel 1 Mono Syllables |
| LDC96T10 |  | Message Understanding Conference (MUC) 6 Additional News Text ♦ |
| LDC96T11 |  | COMLEX Syntax Text Corpus Version 2.0 ♦ |
| LDC96T16 |  | CALLHOME Mandarin Chinese Transcripts ♦ |
| LDC96T17 |  | CALLHOME Spanish Transcripts |
| LDC96T18 |  | CALLHOME Japanese Transcripts |
| | 1997 |
| LDC97E1 |  | Spoken Document Retrieval Training |
| LDC97E2 |  | Preliminary CALLHOME Egyptian Arabic Transcripts |
| LDC97E3 |  | Spoken Document Retrieval Speech Recognizer |
| LDC97E4 |  | MUC-VII |
| LDC97E5 |  | 1997 TREC SDR Test Set Text Data |
| LDC97E6 |  | CSR-VI Hub 4 Spanish News Lexicon [Maxine Eskenazi] |
| LDC97E7 |  | CSR-VI Hub 4 Mandarin Chinese News Lexicon |
| LDC97L18 |  | CALLHOME German Lexicon ♦ |
| LDC97L19 |  | CALLHOME Egyptian Arabic Lexicon ♦ |
| LDC97L20 |  | CALLHOME American English Lexicon (PRONLEX) ♦ |
| LDC97S42 |  | CALLHOME American English Speech ♦ |
| LDC97S43 |  | CALLHOME German Speech [Alex Waibel] |
| LDC97S44 |  | 1996 English Broadcast News Speech (HUB4) ♦ |
| LDC97S45 |  | CALLHOME Egyptian Arabic Speech [Alex Waibel] ♦ |
| LDC97S62 |  | Switchboard-1 Release 2 ♦ |
| LDC97S63 |  | The CMU Kids Corpus [Jack Mostow] [Maxine Eskenazi] |
| LDC97S66 |  | 1996 English Broadcast News Dev and Eval (HUB4) [Hua Yu] |
| LDC97T12 |  | DSO Corpus of Sense-Tagged English ♦ |
| LDC97T14 |  | CALLHOME American English Transcripts |
| LDC97T15 |  | CALLHOME German Transcripts |
| LDC97T19 |  | CALLHOME Egyptian Arabic Transcripts ♦ |
| LDC97T22 |  | 1996 English Broadcast News Transcripts (HUB4) ♦ |
| | 1998 |
| LDC98E10 |  | 1998 HUB4 English UTF Transcript Compendium ♦ |
| LDC98E11 |  | BBN IE/NE-tagged HUB4 Training Transcripts ♦ |
| LDC98E8 |  | 1998 SDR Textual Training Corpus ♦ |
| LDC98E9 |  | 1998 TREC-7 SDR Evaluation Material ♦ |
| LDC98L21 |  | COMLEX English Syntax Lexicon ♦ |
| LDC98S67 |  | HTIMIT |
| LDC98S68 |  | LLHDB |
| LDC98S69 |  | HUB5 Mandarin Telephone Speech Corpus ♦ |
| LDC98S70 |  | HUB5 Spanish Telephone Speech Corpus [Laura Mayfield] |
| LDC98S71 |  | 1997 English Broadcast News Speech (HUB4) ♦ |
| LDC98S72 |  | Taiwanese Putonghua Speech and Transcripts ♦ |
| LDC98S73 |  | 1997 Mandarin Broadcast News Speech (HUB4-NE) ♦ |
| LDC98S74 |  | 1997 Spanish Broadcast News Speech (HUB4-NE) ♦ |
| LDC98S75 |  | Switchboard-2 Phase I ♦ |
| LDC98S76 |  | 1998 Speaker Recognition Benchmark ♦ |
| LDC98S77 |  | Voicemail Corpus Part I ♦ |
| LDC98T10 |  | *UNKNOWN* |
| LDC98T23 |  | CSR-VI Hub 4 Spanish News Transcripts |
| LDC98T24 |  | 1997 Mandarin Broadcast News Transcripts (HUB4-NE) ♦ |
| LDC98T25 |  | TDT Pilot Study Corpus ♦ |
| LDC98T26 |  | HUB5 Mandarin Transcripts ♦ |
| LDC98T27 |  | HUB5 Spanish Transcripts |
| LDC98T28 |  | 1997 English Broadcast News Transcripts (HUB4) ♦ |
| LDC98T29 |  | 1997 Spanish Broadcast News Transcripts (HUB4-NE) ♦ |
| LDC98T30 |  | North American News Text Supplement ♦ |
| LDC98T31 |  | 1996 CSR HUB4 Language Model ♦ |
| LDC98T32 |  | JURIS ♦ |
| | 1999 |
| LDC99E12 |  | BBN_IENE_TRN99 |
| LDC99L22 |  | Egyptian Colloquial Arabic Lexicon ♦ |
| LDC99L23 |  | American English Spoken Lexicon ♦ |
| LDC99S78 |  | SUSAS ♦ |
| LDC99S79 |  | Switchboard-2 Phase II ♦ |
| LDC99S80 |  | 1997 Speaker Recognition Benchmark [Nadine Reaves] |
| LDC99S81 |  | 1999 Speaker Recognition Benchmark [Nadine Reaves] |
| LDC99S82 |  | USC Marketplace Broadcast News Speech |
| LDC99S83 |  | Tactical Speaker Identification Speech Corpus (TSID) ♦ |
| LDC99S84 |  | TDT2 English Audio ♦ |
| LDC99T33 |  | SUSAS Transcripts ♦ |
| LDC99T34 |  | Japanese Business News Text Supplement ♦ |
| LDC99T35 |  | TDT2 English Text ♦ |
| LDC99T36 |  | USC Marketplace Broadcast News Transcripts |
| LDC99T37 |  | TDT2 English Text, Version 2 |
| LDC99T38 |  | TDT2 Mandarin Text ♦ |
| LDC99T39 |  | TDT2 Multilanguage Text Version 3.0 ♦ |
| LDC99T40 |  | Portuguese Newswire Text ♦ |
| LDC99T41 |  | Spanish Newswire Text, Volume 2 ♦ |
| LDC99T42 |  | Treebank-3 ♦ |
| | 2000 |
| LDC2000S85 |  | Santa Barbara Corpus of Spoken American English Part I ♦ [Brian MacWhinney] |
| LDC2000S86 |  | 1998 HUB4 Broadcast News Evaluation English Test Material ♦ |
| LDC2000S87 |  | Speech in Noisy Environments (SPINE) Training Audio ♦ |
| LDC2000S88 |  | 1999 HUB4 Broadcast News Evaluation English Test Material ♦ |
| LDC2000S89 |  | Voice of America (VOA) Czech Broadcast News Audio ♦ |
| LDC2000S92 |  | TDT2 Careful Transcription Audio ♦ |
| LDC2000S96 |  | Speech in Noisy Environments (SPINE) Evaluation Audio ♦ |
| LDC2000T43 |  | BLLIP 1987-89 WSJ Corpus Release 1 ♦ |
| LDC2000T44 |  | TDT2 Careful Transcription Text ♦ |
| LDC2000T45 |  | Korean Newswire ♦ |
| LDC2000T46 |  | Hong Kong News Parallel Text ♦ |
| LDC2000T47 |  | Hong Kong Laws Parallel Text ♦ |
| LDC2000T48 |  | Chinese Treebank Final Release ♦ |
| LDC2000T49 |  | Speech in Noisy Environments (SPINE) Training Transcripts ♦ |
| LDC2000T50 |  | Hong Kong Hansards Parallel Text ♦ |
| LDC2000T51 |  | TREC Spanish ♦ |
| LDC2000T52 |  | TREC Mandarin ♦ |
| LDC2000T53 |  | Voice of America (VOA) Broadcast News Czech Transcript Corpus ♦ |
| LDC2000T54 |  | Speech in Noisy Environments (SPINE) Evaluation Transcripts ♦ |
| | 2001 |
| LDC2001S04 |  | Speech in Noisy Environments (SPINE2) Part 1 Audio ♦ |
| LDC2001S06 |  | Speech in Noisy Environments (SPINE2) Part 2 Audio ♦ |
| LDC2001S08 |  | Speech in Noisy Environments (SPINE2) Part 3 Audio ♦ |
| LDC2001S13 |  | Switchboard Cellular Part 1 Audio ♦ |
| LDC2001S15 |  | Switchboard Cellular Part 1 Transcribed Audio ♦ |
| LDC2001S16 |  | Grassfields Bantu Fieldwork: Ngomba Tone Paradigms ♦ |
| LDC2001S91 |  | 1997 HUB4 Broadcast News Evaluation Non-English Test Material ♦ |
| LDC2001S93 |  | TDT2 Mandarin Audio Corpus ♦ |
| LDC2001S94 |  | TDT3 English Audio ♦ |
| LDC2001S95 |  | TDT3 Mandarin Audio ♦ |
| LDC2001S97 |  | 2000 NIST Speaker Recognition Evaluation ♦ |
| LDC2001S99 |  | Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio ♦ |
| LDC2001T02 |  | Message Understanding Conference (MUC) 7 ♦ |
| LDC2001T05 |  | Speech in Noisy Environments (SPINE2) Part 1 Transcripts ♦ |
| LDC2001T07 |  | Speech in Noisy Environments (SPINE2) Part 2 Transcripts ♦ |
| LDC2001T09 |  | Speech in Noisy Environments (SPINE2) Part 3 Transcripts ♦ |
| LDC2001T10 |  | Prague Dependency Treebank 1.0 ♦ |
| LDC2001T11 |  | Chinese Treebank 2.0 ♦ |
| LDC2001T14 |  | Switchboard Cellular Part 1 Transcription ♦ |
| LDC2001T55 |  | Arabic Newswire Part 1 ♦ [NianLi Ma] |
| LDC2001T57 |  | TDT2 Multilanguage Text Version 4.0 ♦ |
| LDC2001T58 |  | TDT3 Multilanguage Text Version 2.0 ♦ |
| LDC2001T60 |  | Syllable-Final /s/ Lenition ♦ |
| LDC2001T61 |  | CALLHOME Spanish Dialogue Act Annotation |
| LDC2001T62 |  | CETEMpublico ♦ |
| | 2002 |
| LDC2002E14 |  | Chinese English Translation Lexicon Version 3-beta ♦ |
| LDC2002E15 |  | UN Arabic English Parallel Text Version 1 beta ♦ |
| LDC2002E16 |  | Hong Kong News Parallel Text Version 2 beta ♦ |
| LDC2002E17 |  | English Translation of Chinese Treebank Version 1 beta ♦ |
| LDC2002E18 |  | Xinhua Chinese English Parallel News Text Version 1 beta ♦ |
| LDC2002E19 |  | Hong Kong Hansard Parallel Text Version 2 beta ♦ |
| LDC2002E27 |  | Chinese English Translation Dictionary v3.0 ♦ |
| LDC2002E32 |  | TDT3 Arabic Text Version 0.1 |
| LDC2002E33 |  | ACE Phase 2 Training Data Version 6 |
| LDC2002E36 |  | 2002 DUC Evaluation Version 0.1 |
| LDC2002E48 |  | Ummah Arabic English Parallel News Text |
| LDC2002E49 |  | Buckwalter Arabic Morphological Analyzer |
| LDC2002E50 |  | Name-Annotated TDT Corpus Supplement for ACE ♦ |
| LDC2002E52 |  | TDT4 Multilanguage Text Corpus |
| LDC2002E53 |  | Multiple-Translation Chinese Corpus 2.0 ♦ |
| LDC2002E54 |  | Multiple-Translation Arabic Corpus ♦ |
| LDC2002E55 |  | Arabic Treebank: Part 1 v 1.0 |
| LDC2002E58 |  | Sinorama Chinese English Parallel Text ♦ |
| LDC2002L27 |  | Chinese-English Translation Lexicon Version 3.0 ♦ |
| LDC2002L49 |  | Buckwalter Arabic Morphological Analyzer Version 1.0 |
| LDC2002S02 |  | West Point Arabic Speech Corpus ♦ |
| LDC2002S04 |  | Translanguage English Database (TED) Speech ♦ |
| LDC2002S06 |  | Switchboard-2 Phase III Audio ♦ |
| LDC2002S10 |  | 1998 HUB5 English Evaluation ♦ |
| LDC2002S11 |  | 1997 HUB4 English Evaluation Speech and Transcripts ♦ |
| LDC2002S12 |  | 2001 HUB5 Mandarin Evaluation ♦ |
| LDC2002S13 |  | 2001 HUB5 English Evaluation ♦ |
| LDC2002S22 |  | 1997 HUB5 Arabic Evaluation ♦ |
| LDC2002S24 |  | 1997 HUB5 German Evaluation ♦ |
| LDC2002S25 |  | 1997 HUB5 Spanish Evaluation ♦ |
| LDC2002S28 |  | Emotional Prosody Speech and Transcripts ♦ |
| LDC2002S34 |  | 2001 NIST Speaker Recognition Evaluation Corpus ♦ |
| LDC2002S35 |  | Voicemail Corpus Part II ♦ |
| LDC2002S37 |  | CALLHOME Egyptian Arabic Speech Supplement ♦ |
| LDC2002S56 |  | 2000 Communicator Evaluation ♦ |
| LDC2002T01 |  | Multiple-Translation Chinese Corpus ♦ |
| LDC2002T03 |  | Translanguage English Database (TED) Transcripts ♦ |
| LDC2002T07 |  | RST Discourse Treebank ♦ |
| LDC2002T26 |  | Korean English Treebank Annotations ♦ |
| LDC2002T31 |  | The AQUAINT Corpus of English News Text ♦ |
| LDC2002T38 |  | CALLHOME Egyptian Arabic Transcripts Supplement ♦ |
| LDC2002T39 |  | 1997 HUB5 Arabic Transcripts |
| LDC2002T42 |  | 1997 HUB5 Spanish Transcripts |
| | 2003 |
| LDC2003E01 |  | Chinese <-> English Named Entity Lists Version 1.0 beta ♦ |
| LDC2003E02 |  | TDT4 Multilanguage Speech [Hua Yu] |
| LDC2003E03 |  | TDT4 Multilanguage Transcripts |
| LDC2003E04 |  | Multiple Translation Chinese Corpus Part 3 |
| LDC2003E05 |  | Arabic Translation Corpus Part 1 ♦ |
| LDC2003E06 |  | Chinese Treebank 3.0 ♦ |
| LDC2003E07 |  | Chinese Treebank English Parallel Corpus ♦ |
| LDC2003E08 |  | Chinese News Translation Corpus Part 1 ♦ |
| LDC2003E09 |  | Arabic News Translation Corpus Part 2 ♦ |
| LDC2003E10 |  | Aquaint Xinhua for NTCIR Evaluation |
| LDC2003E11 |  | UN Chinese English Parallel Text Version 1.0 beta |
| LDC2003E12 |  | Fisher Training Speech Part 1 [Hua Yu] |
| LDC2003E12B |  | Fisher Training Speech Part 2 [Hua Yu] |
| LDC2003E12C |  | Fisher Training Speech Part 3 [Hua Yu] |
| LDC2003E12D |  | Fisher Training Speech Data, Part 4 [Hua Yu] |
| LDC2003E12E |  | Fisher Training Speech Data, Part 5 [Hua Yu] |
| LDC2003E12F |  | Fisher Training Speech Data, Part 6 [Hua Yu] |
| LDC2003E13 |  | Fisher Quick Transcription Part 1 Version 1.0 |
| LDC2003E13B |  | Fisher Quick Transcription Part 2 Version 1.0 |
| LDC2003E13C |  | Fisher Quick Transcription Part 3 Version 1.0 |
| LDC2003E13D |  | Fisher Training Transcripts Part 4, v1.0 |
| LDC2003E13E |  | Fisher Training Transcripts, Part 5, v1.0 |
| LDC2003E14 |  | FBIS Multilanguage Texts ♦ |
| LDC2003E15 |  | HARD GovDocs ♦ |
| LDC2003E16 |  | SIGHAN Bakeoff ♦ |
| LDC2003E17 |  | Arabic Treebank: Part 2 v 1.0 |
| LDC2003E18 |  | ACE3-V1.3 |
| LDC2003E19 |  | EARS MDE RT-03F Training Corpus |
| LDC2003E20 |  | TDT4 Multilanguage Text Subset for TIDES Extraction 2003 |
| LDC2003E21 |  | TDT4 Multilanguage Text Version 1.1 [Hua Yu] [Jamie Callan] [Jian Zhang] |
| LDC2003E22 |  | The SLX Corpus of Classic Sociolinguistic Interviews |
| LDC2003E24 |  | Arabic Treebank: Part 2 v 1.1 |
| LDC2003E25 |  | Hong Kong News Parallel Text ♦ |
| LDC2003E26 |  | ACE 2004 Pilot Corpus V1.0 |
| LDC2003E27 |  | EARS MDE RT-03 DevTest and Evaluation Corpus |
| LDC2003L01 |  | Grassfields Bantu Fieldwork: Dschang Lexicon ♦ |
| LDC2003L02 |  | Korean Telephone Conversations Lexicon ♦ |
| LDC2003S01 |  | 2001 Communicator Evaluation ♦ |
| LDC2003S02 |  | Grassfields Bantu Fieldwork: Dschang Tone Paradigms ♦ |
| LDC2003S03 |  | Korean Telephone Conversations Speech ♦ |
| LDC2003S04 |  | Cross-Channel Forensic Speech for Automatic Speaker Recognition |
| LDC2003S05 |  | West Point Russian Speech ♦ |
| LDC2003S06 |  | Santa Barbara Corpus of Spoken American English Part II ♦ |
| LDC2003S07 |  | Korean Telephone Conversations Complete Set |
| LDC2003T01 |  | 2001 HUB5 Mandarin Transcripts ♦ |
| LDC2003T02 |  | 1998 HUB5 English Transcripts ♦ |
| LDC2003T03 |  | 1997 HUB5 German Transcripts ♦ |
| LDC2003T04 |  | 1997 HUB5 Spanish Transcripts ♦ |
| LDC2003T05 |  | English Gigaword ♦ |
| LDC2003T06 |  | Arabic Treebank: Part 1 v 2.0 ♦ |
| LDC2003T07 |  | Arabic Treebank: Part 1 - 10K-word English Translation ♦ |
| LDC2003T08 |  | Korean Telephone Conversations Transcripts ♦ |
| LDC2003T09 |  | Chinese Gigaword ♦ |
| LDC2003T10 |  | SAID ♦ |
| LDC2003T11 |  | ACE-2 Version 1.0 ♦ |
| LDC2003T12 |  | Arabic Gigaword ♦ |
| LDC2003T13 |  | Message Understanding Conference (MUC) 6 ♦ |
| LDC2003T15 |  | SLX Corpus of Classic Sociolinguistic Interviews ♦ |
| LDC2003T16 |  | SummBank 1.0 ♦ |
| LDC2003T17 |  | Multiple-Translation Chinese (MTC) Part 2 ♦ |
| LDC2003T18 |  | Multiple-Translation Arabic (MTA) Part 1 ♦ |
| LDC2003T20 |  | American National Corpus (ANC) First Release ♦ |
| LDC2003V01 |  | FORM2 Kinematic Gesture ♦ |
| | 2004 |
| LDC2004E01 | ![CD available from librarian]() |