Channel: Jeff Kaufman's Writing

Possible one-syllable words

How many distinct one-syllable words are possible in English? How many of these are used?

A syllable is going to have some optional series of consonants (onset), then a vowel (nucleus, which in English is often a diphthong, so kind of like two vowels), then another optional series of consonants (coda). I believe all three of these can vary independently and we don't have any restrictions like "syllables starting with /f/ must end in /f/" or "can't end in /f/". Similarly, I don't think we have restrictions like "the vowel /aw/ can't come before/after these specific phonemes". Which means a way to figure this out would be:

  1. Get a list of all existing English syllables.
  2. Divide each syllable into onset / nucleus / coda.
  3. Make lists of valid values for each of the three.
  4. Hand filter these lists looking for mistakes.
  5. Number of possible words is the cross product of these three lists.
  6. Unused words is this list minus a list of one-syllable words.
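
Steps 3, 5 and 6 can be sketched roughly as follows. This is a minimal illustration on toy data, not the code actually used: it assumes the syllables have already been split into (onset, nucleus, coda) triples, and the names `inventories` and `possible_words` are made up for this example.

```python
from itertools import product

# Toy input (an assumption for illustration): syllables already divided into
# (onset, nucleus, coda), with "" standing for an empty onset or coda.
syllables = [
    ("s t r", "eh", "ng th"),  # strength
    ("k", "ae", "t"),          # cat
    ("", "ae", "t"),           # at
    ("s t r", "iy", "t"),      # street
]


def inventories(split_syllables):
    """Step 3: collect the distinct onsets, nuclei and codas."""
    onsets = sorted({onset for onset, _, _ in split_syllables})
    nuclei = sorted({nucleus for _, nucleus, _ in split_syllables})
    codas = sorted({coda for _, _, coda in split_syllables})
    return onsets, nuclei, codas


def possible_words(onsets, nuclei, codas):
    """Step 5: every onset/nucleus/coda combination."""
    return set(product(onsets, nuclei, codas))


onsets, nuclei, codas = inventories(syllables)
possible = possible_words(onsets, nuclei, codas)

# Step 6: here every toy syllable is treated as an existing one-syllable word.
existing = set(syllables)
unused = possible - existing

print(len(onsets), len(nuclei), len(codas))  # 3 3 2
print(len(possible), len(unused))            # 18 14
```

The hand-filtering in step 4 would happen between building the inventories and taking the cross product.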

Summary:

  • Onsets: 80
  • Nuclei: 20
  • Codas: 198
  • Possible one-syllable words: 316,800
  • Existing one-syllable words: 9,268 (2.9%) present.txt
  • Available one-syllable words: 307,532 (97%) absent.txt
  • Worst existing words: strengthed, glimpsed, twelfths
  • Worst possible words: hweangksts, spyeaksths, styeampf
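
The totals in this summary follow directly from the three inventory sizes. A quick arithmetic check, using only the numbers listed above (the variable names are just for illustration):

```python
onsets, nuclei, codas = 80, 20, 198

# Size of the cross product of the three lists.
possible = onsets * nuclei * codas

existing = 9268
available = possible - existing

existing_pct = 100 * existing / possible
available_pct = 100 * available / possible

print(f"possible:  {possible:,}")             # possible:  316,800
print(f"available: {available:,}")            # available: 307,532
print(f"existing share:  {existing_pct:.1f}%")
print(f"available share: {available_pct:.1f}%")
```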

Update 2018-01-23: removed some codas voicing prohibits; thanks Jim!

Details:

I found a list of existing syllables from Chris Barker at UCSD. The syllabification isn't ideal, as they note, which meant I needed to weed out a lot of bad onsets. [1] It's also British English, but oh well.

They're using the following phoneme transcriptions:

aa     odd
ae     at
ah     hut
ao     ought
aw     cow
ax     abaft  (first vowel; schwa)
ay     hide
ea     wear   (remember, this is British pronunciation)
eh     Ed
er     hurt
ey     ate
ia     fortieth  (second vowel)
ih     it
iy     teen
oh     mob
ow     lobe
oy     toy
ua     intellectual  (last vowel)
uh     nook
uw     two

p      pick
b      be
t      tip
d      dee
f      fee
v      vise
th     thick
dh     thee   (eth)
s      sick
z      zip
sh     ship
zh     seizure
ch     cheese
jh     jeep
k      key
ng     rang   (engma)
g      green
m      me
n      new
l      lee
r      ream
w      win
y      you
hh     he
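
With these symbols, dividing a one-syllable transcription mostly comes down to finding its single vowel. Here is a rough sketch, assuming space-separated phonemes as in the transcriptions in this post; `VOWELS` and `split_syllable` are names invented for this example, not taken from the actual code.

```python
# The 20 vowel symbols from the list above; everything else is a consonant.
VOWELS = {
    "aa", "ae", "ah", "ao", "aw", "ax", "ay", "ea", "eh", "er",
    "ey", "ia", "ih", "iy", "oh", "ow", "oy", "ua", "uh", "uw",
}


def split_syllable(transcription):
    """Split a space-separated syllable into (onset, nucleus, coda).

    Raises ValueError unless there is exactly one vowel, so syllabic
    consonants (no vowel at all) are rejected rather than guessed at.
    """
    phonemes = transcription.split()
    vowel_positions = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if len(vowel_positions) != 1:
        raise ValueError(f"expected exactly one vowel in {transcription!r}")
    i = vowel_positions[0]
    onset = " ".join(phonemes[:i])
    coda = " ".join(phonemes[i + 1:])
    return onset, phonemes[i], coda


print(split_syllable("s t r eh ng th t"))  # ('s t r', 'eh', 'ng th t')
print(split_syllable("ow t s"))            # ('', 'ow', 't s')
```

An empty string plays the role of the "(null)" onset or coda here.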

Here are the examples I found for each. In general, don't trust the counts too much: the list consistently mis-divides syllables in some cases, and when I culled I only removed onsets and codas for which I couldn't find a valid example. I didn't try to update the counts.
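
For reference, a tally of this kind could be produced along the following lines. This is only a sketch on toy data: it assumes a list of (onset, word) pairs, and `tally` is a hypothetical helper rather than anything from the actual code.

```python
from collections import Counter


def tally(parts_with_words):
    """Count how often each part occurs and remember one example word for it."""
    counts = Counter()
    example = {}
    for part, word in parts_with_words:
        counts[part] += 1
        # Keep the first word seen as the example for this part.
        example.setdefault(part, word)
    return counts, example


# Toy data for illustration only.
pairs = [("l", "LARCH"), ("b", "BESTS"), ("l", "LOPE"), ("s t", "STOATS")]
counts, example = tally(pairs)

for part, n in counts.most_common():
    print(n, part, example[part])
```

Dropping an entry by hand afterwards leaves the other counts untouched, which is one reason the numbers are only approximate.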

Onsets (80):

count  onset   example     syllable
548    l       LARCH       l aa ch
547    b       BESTS       b eh s t s
541    r       RUNE        r uw n
509    k       COVE        k ow v
484    t       TOOL        t uw l
478    m       MOOT        m uw t
476    s       SOON        s uw n
475    (null)  OATS        ow t s
468    p       POMP        p oh m p
455    d       DEBTS       d eh t s
425    hh      HOOF        hh uw f
410    n       NOPE        n ow p
399    f       FONT        f oh n t
368    w       WOVE        w ow v
340    s t     STOATS      s t ow t s
332    g       GOSH        g oh sh
282    v       VOID        v oy d
277    ch      CHOICE      ch oy s
273    jh      JUNE        jh uw n
269    sh      SHONE       sh oh n
267    t r     TRUCE       t r uw s
259    k r     CRUISE      k r uw z
250    s k     SCOFF       s k oh f
243    k l     CLUES       k l uw z
241    b r     BROACHED    b r ow ch t
228    s p     SPOOL       s p uw l
225    z       ZOOM        z uw m
214    d r     DROVES      d r ow v z
214    g r     GROOVE      g r uw v
203    s l     SLOPE       s l ow p
201    b l     BLOOMS      b l uw m z
197    p r     PRUNE       p r uw n
191    f l     FLUTE       f l uw t
188    s t r   STROLLED    s t r ow l d
169    p l     PLUME       p l uw m
160    k w     QUAD        k w oh d
149    f r     FRILLED     f r ih l d
149    y       YEWS        y uw z
146    s w     SWOOPED     s w uw p t
145    th      THEMED      th iy m d
120    s n     SNOB        s n oh b
118    g l     GLOOM       g l uw m
111    t w     TWIG        t w ih g
108    s m     SMOCKED     s m oh k t
97     s k r   SCRIMPS     s k r ih m p s
82     th r    THREES      th r iy z
77     d w     DWARFS      d w ao f t
71     n y     NEWTS       n y uw t s
65     dh      THIS        dh ih s
63     s p r   SPRANG      s p r ae ng
62     sh r    SHREWS      sh r uw z
60     s k w   SQUIRM      s k w er m
58     s p l   SPLINTS     s p l ih n t s
55     zh      GENRES      zh aa n
50     d y     DUNES       d y uw n z
47     g w     GWAM        g w aa m
45     z l     ZLOTYS      z l oh t
41     s r     SREBRENICA  s r eh
40     t y     TUNES       t y uw n z
39     k y     QUEUED      k y uw d
39     p y     PEWS        p y uw z
31     th w    THWACKS     th w ae k s
30     v l     VLAD        v l ae d
29     f y     FUEDS       f y uw d z
27     s y     SUES        s y uw z
27     sh m    SCHMALTZ    sh m ao l t s
25     z w     ZWIEBACKS   z w iy
23     b y     BJORN       b y ao n
22     g y     ARGUES      g y uw z
20     s k l   SCLEROSIS   s k l ax
20     sh n    SCHNITZELS  sh n ih t
19     s t y   STEWS       s t y uw z
19     v y     VIEWS       v y uw z
17     sh w    SCHWAS      sh w aa z
17     z y     ZURICH      z y ua
15     s k y   SKEWERING   s k y ua
11     s p y   SPUED       s p y uw d
11     sh y    SIAN        sh y aa n
9      hh y    HEWN        hh y uw n
4      hh w    JUAN        hh w aa n

Nuclei (20):

count  nucleus  example   syllable
1834   ih       WHISKS    w ih s k s
1282   ae       TWANG     t w ae ng
1224   eh       YELP      y eh l p
1102   ax       OF        ax v
1099   ah       WRUNG     r ah ng
1060   oh       YACHTS    y oh t s
955    ey       WRAITHS   r ey th
934    iy       BEEPED    b iy p t
803    aa       YARN      y aa n
803    ow       DOGE      d ow jh
797    uw       GROOMS    g r uw m z
775    ay       FLIGHTS   f l ay t s
721    ao       BRAWLS    b r ao l z
604    er       WORDS     w er d z
438    ia       UNIONS    n ia n z
396    aw       VOWS      v aw z
257    uh       WOOLVES   w uh l v z
215    oy       TOILS     t oy l z
212    ua       CRUEL     k r ua l
206    ea       WHERE     w ea r

Codas (198):

count  coda        example       syllable
1520   (null)      WHEY          w ey
719    n           YAWN          y ao n
699    z           ZOOS          z uw z
668    d           BLED          b l eh d
535    t           SPATE         s p ey t
522    l           SQUEAL        s k w iy l
471    n z         VANS          v ae n z
451    t s         ARTS          aa t s
425    k           DIRK          d er k
416    m           QUALM         k w aa m
403    s           TWICE         t w ay s
399    l z         WALLS         w ao l z
391    d z         CREEDS        k r iy d z
367    n d         AND           ae n d
362    k s         LURKS         l er k s
297    m z         BLAMES        b l ey m z
277    s t         USED          y uw s t
274    r           SHARE         sh ea r
272    l d         BALED         b ey l d
248    p           HYPE          hh ay p
224    k t         LEAKED        l iy k t
212    ng          WRUNG         r ah ng
203    p s         WHOOPS        hh uw p s
193    f           SKIFF         s k ih f
188    n t         SCANT         s k ae n t
176    v           VERVE         v er v
171    r z         JEERS         jh ia r z
168    p t         SLIPPED       s l ih p t
162    m d         SUMMED        s ah m d
161    g           TWIG          t w ih g
153    n t s       AUNTS         aa n t s
141    s t s       ENTHUSIASTS   ae s t s
139    b           VERB          v er b
133    f s         RUFFS         r ah f s
132    sh          TRASH         t r ae sh
128    ch          CROUCH        k r aw ch
127    v z         LOAVES        l ow v z
126    n d z       BLENDS        b l eh n d z
121    g z         RUGS          r ah g z
118    th          BROTH         b r oh th
115    b z         TABS          t ae b z
114    z d         CAUSED        k ao z d
113    jh          VERGE         v er jh
105    ng z        WINGS         w ih ng z
101    n s         TROUNCE       t r aw n s
100    f t         AFT           aa f t
88     ng k        SLUNK         s l ah ng k
83     ch t        ARCHED        aa ch t
81     jh d        BARGED        b aa jh d
80     r d         SHEARED       sh ia r d
78     m p         SWAMP         s w oh m p
76     sh t        PLUMP         p l ah m p
75     v d         BRAVED        b r ey v d
74     l t         ALT           ae l t
70     ng k s      BANKS         b ae ng k s
68     th s        SLEUTHS       s l uw th s
67     g d         BAGGED        b ae g d
63     b d         ORBED         ao b d
60     m p s       AMPS          ae m p s
52     l t s       COBALTS       b ao l t s
52     n s t       NUANCED       aa n s t
51     ng k t      BANKED        b ae ng k t
49     m p t       BUMPED        b ah m p t
48     zh          BEIGE         b ey zh
46     s k         ASK           aa s k
45     n ch        BUNCH         b ah n ch
42     l d z       SCALDS        s k ao l d z
41     ng d        BANGED        b ae ng d
39     k t s       FACTS         f ae k t s
38     n jh        FLANGE        f l ae n jh
35     n ch t      BUNCHED       b ah n ch t
34     f t s       GIFTS         dg ih f t s
33     dh          BATHE         b ey dh
32     n jh d      BINGED        b ih n jh d
32     s k s       ASKS          aa s k s
28     k s t       AXED          ae k s t
27     dh z        TEETHES       t iy dh z
21     dh d        SWATHED       s w ey dh d
21     l k         BULK          b ah l k
20     th t        BATHED        b aa th t
18     s k t       ASKED         aa s k t
15     l f         ALF           ae l f
15     n th        NINTH         n ay n th
14     p t s       ADAPTS        d ae p t s
13     l k s       BULKS         b ah l k s
13     l p         KELP          k eh l p
13     s p         ASP           ae s p
12     l s         BELLS         b eh l s
12     l v         SHELVE        sh eh l v
12     l v z       DELVES        d eh l v z
11     l m         REALM         r eh l m
11     n th s      TRILLIONTHS   l ia n th s
11     s p s       ASPS          ae s p s
10     l f s       RUDOLPH'S     d ao l f s
10     l k t       BULKED        b ah l k t
10     s p t       GRASPED       g r aa s p t
9      l ch        BELCH         b eh l ch
9      l m z       REALMS        r eh l m z
9      zh d        CAMOUFLAGED   f l aa zh d
8      l p s       ALPS          ae l p s
8      l p t       GULPED        g ah l p t
8      l v d       DELVED        d eh l v d
7      m f         TRIUMPH       ax m f
7      n k         BANK          b ae n k
7      ng th       LENGTH        l eh ng th
7      r d z       VILLARD'S     ax r d z
7      r t         GERHARDT      hh aa r t
7      r t s       GEPHARDTS     aa r t s
6      l th        FILTH         f ih l th
6      l th s      FILTHS        f ih l th s
6      n z d       BRONZED       b r oh n z d
6      p s t       CLAPSED       k l ae p s t
5      l b         BULB          b ah l b
5      l ch t      BELCHED       b eh l ch t
5      l f t       WOLFED        w uh l f t
5      l jh        BULGE         b ah l jh
5      l jh d      BULGED        b ah l jh d
5      l n         KILN          k ih l n
5      l s t       REPULSED      p ah l s t
5      m f s       TRIUMPHS      ax m f s
5      ng th s     LENGTHS       l eh ng th s
4      d s t       MIDST         m ih d s t
4      k s t s     TEXTS         t eh k s t s
4      l m d       FILMED        f ih l m d
4      l n z       HELENS        hh eh l n z
4      l sh        WELSH         eh l sh
4      m p t s     PROMPTS       p r oh m p t s
4      n s k       CHELYABINSK   b ih n s k
4      n sh        AVALANCHE     l aa n sh
4      ng k t s    CONJUNCTS     jh ah ng k t s
4      z d z       ACCUSED'S     k y uw z d z
3      d s         HUNDREDS      d r ax d s
3      n k s       BANKS         b ae n k s
3      t th        EIGTH         ey t th
3      t th s      EIGTHS        ey t th s
2      d th        BREADTH       b r eh d th
2      d th s      BREADTHS      b r eh d th s
2      dh d z      BETROTHEDS    t r ow dh d z
2      f s k       SVERDLOVSK    d l oh f s k
2      k d         BIVOUACKED    ae k d
2      l b z       BULBS         b ah l b z
2      l n d       KILNED        k ih l n d
2      m f t       GALUMPHED     l ah m f t
2      m p f       HUMPH         hh ah m p f
2      m p f s     HUMPHS        hh ah m p f s
2      n s k s     GDANSK'S      d ae n s k s
2      n s t s     ERNST'S       er n s t s
2      n zh        M\'ELANGE     l aa n zh
2      ng k s t    JINXED        jh ih ng k s t
2      t s k       DONETSK       n eh t s k
1      d z t s     MIDSTS        m ih d s t s
1      d z d       SUDSED        s ah d z d
1      f th        FIFTH         f ih f th
1      f th s      FIFTHS        f ih f th s
1      k s th      SIXTH         s ih k s th
1      k s th s    SIXTHS        s ih k s th s
1      l b d       BULBED        b ah l b d
1      l f t s     DELFTS        d eh l f t s
1      l f th      TWELFTH       t w eh l f th
1      l f th s    TWELFTHS      t w eh l f th s
1      l k t s     MULCTS        m ah l k t s
1      l p t s     SCULPTS       s k ah l p t s
1      l sh t      WELSHED       w eh l sh t
1      m n d       ILL-OMENED    l ow m n d
1      m p f t     HUMPHED       hh ah m p f t
1      m p s t     GLIMPSED      g l ih m p s t
1      m s k       OMSK          oh m s k
1      m s t       ZEMSTVOS      z eh m s t
1      m t         UNDREAMT      d r eh m t
1      m th        WARMTH        w ao m th
1      m th s      WARMTHS       w ao m th s
1      ng k s t s  ANGSTS        ae ng k s t s
1      n k t       SHANKED       sh ae n k t
1      n sh t      AVALANCHED    l aa n sh t
1      ng t s      EXTINCTS      s t ih ng t s
1      ng th t     STRENGTHED    s t r eh ng th t
1      p th        DEPTH         d eh p th
1      p th s      DEPTHS        d eh p th s
1      r b         ORB           oh r b
1      r ch        TORCH         t oh r ch
1      r ch t      TORCHED       t oh r ch t
1      r g         BERG          b er r g
1      r g z       BERGS         b er r g z
1      r k         MARK          m aa r k
1      r k s       MARK'S        m aa r k s
1      r l z       KARL'S        k aa r l z
1      r m         PERM          p y eh r m
1      r n s       CLARENCE      k l ae r n s
1      r n t       AREN'T        aa r n t
1      r n z       BARRONS       b ae r n z
1      r s k       MAGNITOGORSK  g oh r s k
1      r th        BARTH         b aa r th
1      r th s      BARTHES       b aa r th s
1      s d         DRESDNER'S    d r eh s d
1      s th        ISTHMIAN      ih s th
1      sh d        PUBLISHED     b l ih sh d
1      t s k s     DONETSK'S     n eh t s k s
1      t s t       BLITZED       b l ih t s t

Putting these together and checking against BEEP gets us the lists.

Code, such as it is, is on GitHub.


[1] Tables of invalid onsets and codas that I hand-removed from the above. They mostly demonstrate syllabification errors, but also that I'm not handling syllabic consonants (l/n/etc). Since those mostly apply to multisyllable words, though, I'm not too worried.

Culled onsets:

count  onset     example             syllable
91     n hh      UNHARNESSING        n hh aa
82     t l       WYTHALL             t l
79     n w       STONEWALLING        n w aa
74     d l       YODEL               d l
65     s b       SPACE-BAR           s b aa
48     l y       VOLUME              l y uw m
48     t hh      SWEETHEARTING       t hh aa
47     s hh      DISHARMONY          s hh aa
39     l r       KOHLRABIS           l r aa
39     s ch      UNDISCHARGEABLE     s ch aa
36     d hh      KIND-HEARTED        d hh aa
32     k hh      BACKHANDING         k hh ae n
32     l hh      WHOLEHEARTEDNESS    l hh aa
31     l w       SEYCHELLOIS         l w aa
26     s g       PRESS-GALLERY       s g ae
25     p hh      GEPHARDT            p hh aa r t
25     s d       MORRIS_DANCES       s d aa n
23     sh l      UNSOCIAL            sh l
23     th l      LETHAL              th l
21     b jh      SUBJUNCTIVES        b jh ah ng k
19     s t l     WASTELL             s t l
18     k ch      MISAFFECTION        k ch
18     s s t     PUMICE-STONES       s s t ow n z
17     ch w      SHIHKIACHWANG       ch w ae ng
17     s t hh    FIRSTHAND           s t hh ae n d
17     z r       ARMS-RUNNERS        z r ah
15     d l w     NEEDLEWORKER        d l w er
15     l l       TAILLESSNESS        l l ax
15     p ch      UNSCRIPTURALNESS    p ch ax
15     s s       DISSATISFACTORY     s s ae
14     s jh      SPORTS-JACKETS      s jh ae
14     s t w     WESTWARDLY          s t w ax
13     ch l      SATCHEL             ch l
13     f hh      HALF-HEARTEDLY      f hh aa
13     g hh      BUG-HUNTERS         g hh ah n
12     ch hh     SLOUCH-HAT          ch hh ae t
11     r l       ORLANDO'S           r l ae n
11     v r       NERVE-RACKING       v r ae
10     b hh      ABHORRING           b hh ao
10     z f       IGNES_FATUI         z f ae
10     z hh      LUDWIGSHAFEN        z hh aa f
9      f th      PHENOLPHTHALEIN     f th ae
9      jh hh     SLEDGEHAMMERS       jh hh ae
9      s b r     GAS-BRACKETS        s b r ae
9      s g r     PAMPAS-GRASS        s g r aa s
9      s s p     MISSPELLINGS        s s p eh
9      sh hh     WASH-HAND-STANDS    sh hh ae n d
9      th f      MOUTHFUL            th f ah l
8      ch f      PITCHFORKING        ch f ao
8      ch r      UNNATURALLY         ch r ax
8      f sh      OFFSHORE            f sh ao
8      jh w      BRIDGWATER          jh w ao
8      jh y      SWEDENBORGIANISM    jh y ax
8      t jh      NIGHTJAR            t jh aa
7      d ch      SIDE-CHAPELS        d ch ae
7      jh l      NIGEL               jh l
7      jh r      LUGGAGE-RACK        jh r ae k
7      k jh      BLACKJACKING        k jh ae
7      sh f      WISHFULLY           sh f ax
7      t ch      CHIT-CHAT           t ch ae t
7      z v       TRANSVESTITES       z v eh
6      jh f      SEWAGE-FARM         jh f aa m
6      jh sh     CAMBRIDGESHIRE      jh sh ax
6      p jh      FLAPJACK            p jh ae k
6      th y      THEWS               th y uw z
6      v f       MATTER-OF-FACT      v f ae k t
5      b l r     TABLE-RAPPING       b l r ae
5      t l w     METALWORKINGS       t l w er
4      b l l     CABLE-LENGTH        b l l eh ng th
4      b l w     TABLEWARE           b l w ea
4      ch y      CHURCHYARD          ch y aa d
4      dh hh     WITHHELD            dh hh eh l d
4      g l hh    SINGLEHEARTEDNESS   g l hh aa
4      n hh y    UNHUMILIATED        n hh y uw
4      r y       SEVERIANO           r y aa
4      s hh y    EXHUMING            s hh y uw
4      sh v      NASHVILLE           sh v ih l
4      v hh      SHOVE-HA'PENNY      v hh ey p
4      z f l     NEWSFLASHES         z f l ae
3      b l y     Y._W._C._A.         b l y uw
3      d l hh    MUDDLEHEADEDNESS    d l hh eh
3      d l l     CANDLELIGHTING      d l l ay
3      dh z hh   CLOTHES-HANGERS     dh z hh ae ng
3      jh ch     LOUNGE-CHAIR        jh ch ea
3      k l w     CLOISONN\'E         k l w aa
3      s b l     OFFICE-BLOCK        s b l oh k
3      s s k     ICE-SKATING         s s k ey
3      s s k r   POSTSCRIBE          s s k r ay b
3      s s t r   BREASTSTROKER       s s t r ow
3      sh n w    STATION-WAGGONS     sh n w ae
3      t l r     NETTLERASH          t l r ae sh
3      th hh     SLEUTH-HOUND        th hh aw n d
3      v l r     UNCHIVALROUSNESS    v l r ax
2      ch ch     WATCH-CHAIN         ch ch ey n
2      d jh      WIND-JAMMERS        d jh ae
2      d l r     BRIDLE-ROAD         d l r ow d
2      dh r      SOUTHRON            dh r ax n
2      dh v      EISTEDDFOD          dh v oh d
2      dh z l    CLOTHESLINE         dh z l ay n
2      f l r     RIFLE-RANGES        f l r ey n
2      g jh      LOG-JAM             g jh ae m
2      jh f r    WAGE-FREEZES        jh f r iy
2      jh v      LUGGAGE-VAN         jh v ae n
2      p l l     MAPLE-LEAF          p l l iy f
2      r hh      MOREHOUSE'S         r hh aw
2      s g l     FOXGLOVE            s g l ah v
2      s t l r   HOSTELRY            s t l r ih
2      sh sh     HORSESHOERS         sh sh uw
2      t r w     OCTROI              t r w aa
2      v z       GRAVESEND           v z eh n d
2      z sh      NEWSSHEET           z sh iy t
2      zh w      PETIT_BOURGEOIS     zh w aa
2      zh y      ETESIAN             zh y ax n
1      b ch      DABCHICK            b ch ih k
1      b hh y    SUBHUMID            b hh y uw
1      b l d     WOBBLED             b l d
1      b l z     WORKTABLES          b l z
1      ch l d    SATCHELED           ch l d
1      ch l z    SATCHELS            ch l z
1      ch t      CHUT                ch t
1      ch v      H._V._O.            ch v iy
1      d hh y    GOOD-HUMOURED       d hh y uw
1      d l d     YODELLED            d l d
1      d l z     YODELS              d l z
1      d ng      UNPARDONABLY        d ng
1      dh f      SMOOTH-FACED        dh f ey s t
1      dh l      SMOOTHLY            dh l ih
1      dh y      MIDLOTHIAN          dh y ax n
1      f hh y    ROUGH-HEWN          f hh y uw n
1      f l d     WAFFLED             f l d
1      f l z     WAFFLES             f l z
1      f r w     SANG_FROID          f r w aa
1      f th l    FIFTHLY             f th l ih
1      f v       HALF-VOLLEYS        f v oh
1      g l d     WRIGGLED            g l d
1      g l w     MANGEL-WURZELS      g l w er
1      g l z     WRIGGLES            g l z
1      hh m      H'M                 hh m
1      hh m d    H'MMED              hh m d
1      hh m z    H'MS                hh m z
1      jh l z    RANGEL'S            jh l z
1      jh ng     ORIGINALLY          jh ng
1      jh z      MANGES              jh z
1      k l d     WRINKLED            k l d
1      k l z     YOKELS              k l z
1      k ng      SICKENERS           k ng
1      l d       UNSENTINELLED       l d
1      l d z     RONALDS             l d z
1      l z       VIRGINALS           l z
1      m k       MCMANUS'S           m k
1      m m       HEM                 m m
1      m z       LYRICISMS           m z
1      n d       WIZENED             n d
1      n d z     THOUSANDS           n d z
1      n hh w    SAN_JUAN            n hh w aa n
1      n s       SUBSIDENCE          n s
1      n s t     PRESENCED           n s t
1      n t       WOULDN'T            n t
1      n t s     TRIDENTS            n t s
1      n th      THOUSANDTH          n th
1      n th s    THOUSANDTHS         n th s
1      n z       WIZENS              n z
1      p l d     WIMPLED             p l d
1      p l hh    SIMPLE-HEARTED      p l hh aa
1      p l t     APPLETON'S          p l t
1      p l z     WIMPLES             p l z
1      p ng      TUPENNY             p ng
1      r w       NORWALK             r w oh l k
1      s d r     ICEDROME            s d r ow m
1      s k l z   RASCALS             s k l z
1      s l d     WRESTLED            s l d
1      s l z     WRESTLES            s l z
1      s n d     UNLOOSENED          s n d
1      s n s     VITRESCENCE         s n s
1      s n s t   UNLICENSED          s n s t
1      s n t     VITRESCENT          s n t
1      s n t s   VINCENT'S           s n t s
1      s n z     WILSONS             s n z
1      s ng      JANSENISTS          s ng
1      s p l z   GOSPELS             s p l z
1      s s m     CONSCIENCE-SMITTEN  s s m ih t
1      s s y     MISSUIT             s s y uw t
1      s t jh    DUST-JACKETS        s t jh ae
1      s t l d   PISTOLED            s t l d
1      s t l z   ROCK-CRYSTALS       s t l z
1      sh l d    PARTIALED           sh l d
1      sh l s    NUPTIALS            sh l s
1      sh l w    SOCIAL-WORK         sh l w er k
1      sh l z    SPECIALS            sh l z
1      sh n d    WELL-INTENTIONED    sh n d
1      sh n s    PATIENCE            sh n s
1      sh n t    UNIMPATIENT         sh n t
1      sh n t s  SENTIENTS           sh n t s
1      sh n z    WEATHER-STATIONS    sh n z
1      sh ng     UNOBJECTIONABLE     sh ng
1      t l d     WHITTLED            t l d
1      t l hh    LITTLEHAMPTON       t l hh ae m
1      t l s     SKITTLES            t l s
1      t l z     WHITTLES            t l z
1      t ng      TIGHTENERS          t ng
1      th dh     THUMMIM             th dh
1      th f l    FAITHFULNESS        th f l
1      th l z    LETHALS             th l z
1      th ng     STRENGTHENERS       th ng
1      th sh     SOUTH_SHIELDS       th sh iy l z
1      v dh      RUN-OF-THE-MILL     v dh ax
1      v l d     UNSHRIVELLED        v l d
1      v l hh    LEVEL-HEADED        v l hh eh
1      v l w     TRAVEL-WORN         v l w ao n
1      v l z     WATER-LEVELS        v l z
1      v ng      SEVENISH            v ng
1      v sh      EVESHAM             v sh ax m
1      z l d     WEASELED            z l d
1      z l z     WITCH-HAZELS        z l z
1      z ng      UNSEASONABLE        z ng
1      zh ng     OCCASIONALIST       zh ng

Culled codas:

count  coda      example        syllable
6      k z       BIVOUACS       ae k z
4      ng t      WALKINGTON     k ih ng t
2      g s       BIGGS          b ih g s
2      l d s     GOLD'S         g ow l d s
2      m b       CHORIAMB       ae m b
2      m d z     LAMEDS         l ey m d z
2      n p       BURNUP         b er n p
2      ng s      FINANCINGS     s ih ng s
2      ng s t    ANGST          ae ng s t
2      r l       KARL           k aa r l
2      r n       BARRON         b ae r n
1      b s       ARABS'         r ax b s
1      ch s      BACHE'S        b ae ch s
1      d t       BRADT          b r ae d t
1      d t s     BRADT'S        b r ae d t s
1      f t z     CRAFTS         k r aa f t z
1      f z       NIXDORF'S      ax f z
1      g t       WIGTON'S       w ih g t
1      g z d     HOGSED         hh oh g z d
1      k t s s   DISTRICT'S     s t r ih k t s s
1      l m n     COLEMAN        k ow l m n
1      l m n z   COLEMAN'S      k ow l m n z
1      l n s     MALEVOLENCE    v ax l n s
1      l n t     MALEVOLENT     v ax l n t
1      n m       UNMUSICALNESS  ah n m
1      ng m      HINGHAM        hh ih ng m
1      m s       CONSORTIUMS    sh ia m s
1      sh k      MESHKOV'S      m eh sh k
1      y         WOJCIECH       v oy y
1      n d s     ASHLANDS       sh l ae n d s
1      ng k z    BANKS          b ae ng k z
1      p z       EUROPE'S       r ax p z
1      r b s     FORBES         f oh r b s


