Native Speaker Data for Comparison


English essays written by 20 native English speakers in 30 minutes
Same prompts as the EFL learners
All essays written on the same day

91,184 words
160 files
20 speakers
8 prompts (ETS corpus)


Complex nominals

Linguistic Indices Used in the Lexical Complexity Analyzer


Measures of Lexical Density and Sophistication

 ld lexical density
 ls1 lexical sophistication-I
 ls2 lexical sophistication-II
 vs1 verb sophistication-I
 vs2 verb sophistication-II
 cvs1 corrected VS1
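
As a rough illustration, the six measures above can be computed from a handful of counts. The sketch below assumes the definitions commonly given for these measures in the lexical richness literature (for example, ld as lexical tokens over all tokens, and cvs1 as sophisticated verb types over the square root of twice the verb tokens). The function name and parameter names are hypothetical, and the counts are assumed to come from a POS-tagged, lemmatized text together with a word-frequency list that decides which items count as "sophisticated".

```python
import math

def density_and_sophistication(n_words, n_lex, n_slex,
                               t_words, t_swords,
                               n_verbs, t_sverbs):
    """Hypothetical helper: compute ld, ls1, ls2, vs1, vs2, cvs1 from counts.

    n_* are token counts, t_* are type counts (assumed inputs):
      n_words  - all word tokens
      n_lex    - lexical (content) word tokens
      n_slex   - sophisticated lexical word tokens
      t_words  - all word types
      t_swords - sophisticated word types
      n_verbs  - verb tokens
      t_sverbs - sophisticated verb types
    """
    return {
        "ld": n_lex / n_words,                      # lexical density
        "ls1": n_slex / n_lex,                      # lexical sophistication-I
        "ls2": t_swords / t_words,                  # lexical sophistication-II
        "vs1": t_sverbs / n_verbs,                  # verb sophistication-I
        "vs2": t_sverbs ** 2 / n_verbs,             # verb sophistication-II
        "cvs1": t_sverbs / math.sqrt(2 * n_verbs),  # corrected VS1
    }
```

For a text with 100 word tokens of which 50 are lexical, the call density_and_sophistication(100, 50, 10, 60, 12, 20, 4) gives ld = 0.5 and ls1 = 0.2.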

Measures of Lexical Variation

 ndw number of different words
 ndwz NDW (first 50 words)
 ndwerz NDW (expected random 50)
 ndwesz NDW (expected sequence 50)
 ttr type/token ratio
 msttr mean segmental TTR
 cttr corrected TTR
 rttr root TTR
 logttr bilogarithmic TTR
 uber uber index
 lv lexical word variation
 vv1 verb variation-I
 svv1 squared VV1
 cvv1 corrected VV1
 vv2 verb variation-II
 nv noun variation
 adjv adjective variation
 advv adverb variation
 modv modifier variation
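
The TTR-based measures in the list above differ only in how they correct the type/token ratio for text length. The sketch below shows one way to compute several of them from a flat list of tokens. It is an illustration, not the analyzer's own code: the function name is hypothetical, the tokens are assumed to be already lowercased or lemmatized, the 50-word segment size for msttr and the base-10 logarithm in the uber index are assumptions, and the POS-specific measures (vv1, nv, adjv, and so on) are left out because they require a tagger.

```python
import math

def lexical_variation(tokens, segment=50):
    """Hypothetical helper: TTR-family measures for a list of word tokens."""
    n = len(tokens)        # number of word tokens (N)
    t = len(set(tokens))   # number of word types (T)
    if n < 2:
        raise ValueError("need at least two tokens")

    # Mean segmental TTR: average TTR over consecutive full segments;
    # a trailing partial segment is discarded.
    segments = [tokens[i:i + segment]
                for i in range(0, n - segment + 1, segment)]
    msttr = (sum(len(set(s)) / segment for s in segments) / len(segments)
             if segments else None)

    # The uber index is undefined when every token is a distinct type.
    uber = math.log10(n) ** 2 / math.log10(n / t) if t < n else None

    return {
        "ndw": t,                               # number of different words
        "ttr": t / n,                           # type/token ratio
        "msttr": msttr,                         # mean segmental TTR
        "cttr": t / math.sqrt(2 * n),           # corrected TTR
        "rttr": t / math.sqrt(n),               # root TTR
        "logttr": math.log(t) / math.log(n),    # bilogarithmic TTR
        "uber": uber,                           # uber index
    }
```

Returning None for msttr on texts shorter than one segment, and for uber when all tokens are unique, keeps the undefined cases explicit instead of raising a division error.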

Linguistic Indices Used in the L2 Syntactic Complexity Analyzer

9 structures in the text:

  1. words (W)
  2. sentences (S)
  3. verb phrases (VP)
  4. clauses (C)
  5. T-units (T)
  6. dependent clauses (DC)
  7. complex T-units (CT)
  8. coordinate phrases (CP)
  9. complex nominals (CN)

14 syntactic complexity indices of the text:

  1. mean length of sentence (MLS)
  2. mean length of T-unit (MLT)
  3. mean length of clause (MLC)
  4. clauses per sentence (C/S)
  5. verb phrases per T-unit (VP/T)
  6. clauses per T-unit (C/T)
  7. dependent clauses per clause (DC/C)
  8. dependent clauses per T-unit (DC/T)
  9. T-units per sentence (T/S)
  10. complex T-unit ratio (CT/T)
  11. coordinate phrases per T-unit (CP/T)
  12. coordinate phrases per clause (CP/C)
  13. complex nominals per T-unit (CN/T)
  14. complex nominals per clause (CN/C)
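
Each of the 14 indices is a simple ratio of two of the nine structure counts, as the abbreviations indicate. The sketch below makes the mapping explicit. The function name and the sample counts in the usage note are hypothetical; obtaining the counts themselves requires a syntactic parser and is assumed to have been done already.

```python
def syntactic_indices(W, S, VP, C, T, DC, CT, CP, CN):
    """Hypothetical helper: derive the 14 indices from the 9 structure counts.

    W words, S sentences, VP verb phrases, C clauses, T T-units,
    DC dependent clauses, CT complex T-units, CP coordinate phrases,
    CN complex nominals. S, C, and T must be nonzero.
    """
    return {
        "MLS": W / S,     # mean length of sentence
        "MLT": W / T,     # mean length of T-unit
        "MLC": W / C,     # mean length of clause
        "C/S": C / S,     # clauses per sentence
        "VP/T": VP / T,   # verb phrases per T-unit
        "C/T": C / T,     # clauses per T-unit
        "DC/C": DC / C,   # dependent clauses per clause
        "DC/T": DC / T,   # dependent clauses per T-unit
        "T/S": T / S,     # T-units per sentence
        "CT/T": CT / T,   # complex T-unit ratio
        "CP/T": CP / T,   # coordinate phrases per T-unit
        "CP/C": CP / C,   # coordinate phrases per clause
        "CN/T": CN / T,   # complex nominals per T-unit
        "CN/C": CN / C,   # complex nominals per clause
    }
```

For example, a text with W=200, S=10, C=25, and T=12 has MLS = 20.0, MLC = 8.0, and C/S = 2.5.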