Discussion:
given Dict=(act, eat, sat, ...) make a long chain (no repeats) with 2-letter overlaps
(too old to reply)
HenHanna
2024-06-17 19:01:27 UTC
Permalink
given (a list of 3-letter words)
Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)

The object is to make a long chain (no repeats) with 2-letter overlaps.

e.g. -- [cat, ate, tea, eat, ATT, ...]



What's a good approach (in Python)?


in Mathematica, it's easy to find THE Longest chain?

is this a typical NP-complete problem?

________________

-- Martha has aspirin in industrial allotments.

-- Two women enter erotic icehouse, seduce celibate teacher.

-- Rush showed editorial alarmism, smeared educational alliance ceaselessly.
James Waldby
2024-06-17 23:24:15 UTC
Permalink
Post by HenHanna
given (a list of 3-letter words)
Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)
The object is to make a long chain (no repeats) with 2-letter overlaps.
e.g. -- [cat, ate, tea, eat, ATT, ...]
What's a good approach (in Python)?
According to ref 1, longest-path problems are NP-complete. At the
moment there's no method known that is "good" in general for the
problem. However, if all of the dictionary words are chosen from a
natural-language, then we have a special (not general) case. I think
in this special case a method like finding pairs, then combining pairs
to triple, then triples to fives, fives to nines, etc, might work
well, given obvious fallbacks to combining different-length sequences
when at some length same-length combinations don't exist.

*Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
*Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>
Post by HenHanna
in Mathematica, it's easy to find THE Longest chain?
is this a typical NP-complete problem?
As noted in ref 2, "A problem p in NP is NP-complete if every other
problem in NP can be transformed (or reduced) into p in polynomial
time", so there is a sense in which every NP-complete problem is a
typical NP-complete problem.
Post by HenHanna
________________
-- Martha has aspirin in industrial allotments.
-- Two women enter erotic icehouse, seduce celibate teacher.
-- Rush showed editorial alarmism, smeared educational alliance ceaselessly.
HenHanna
2024-06-18 02:32:58 UTC
Permalink
Post by James Waldby
Post by HenHanna
given (a list of 3-letter words)
Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)
The object is to make a long chain (no repeats) with 2-letter overlaps.
e.g. -- [cat, ate, tea, eat, ATT, ...]
What's a good approach (in Python)?
According to ref 1, longest-path problems are NP-complete. At the
moment there's no method known that is "good" in general for the
problem. However, if all of the dictionary words are chosen from a
natural-language, then we have a special (not general) case. I think
in this special case a method like finding pairs, then combining pairs
to triple, then triples to fives, fives to nines, etc, might work
well, given obvious fallbacks to combining different-length sequences
when at some length same-length combinations don't exist.
*Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
*Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>
Post by HenHanna
in Mathematica, it's easy to find THE Longest chain?
is this a typical NP-complete problem?
As noted in ref 2, "A problem p in NP is NP-complete if every other
problem in NP can be transformed (or reduced) into p in polynomial
time", so there is a sense in which every NP-complete problem is a
typical NP-complete problem.
Post by HenHanna
________________
-- Martha has aspirin in industrial allotments.
-- Two women enter erotic icehouse, seduce celibate teacher.
-- Rush showed editorial alarmism, smeared educational alliance ceaselessly.
thanks!




word Chain with 3-letter overlaps (between successive words)

10 (shtoomest, estrayed, yedo, edomitish,
ish, ishime, imelle, ller, lerps, rps)
James Waldby
2024-07-06 22:08:53 UTC
Permalink
Post by HenHanna
Post by James Waldby
Post by HenHanna
given (a list of 3-letter words)
Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)
The object is to make a long chain (no repeats) with 2-letter overlaps.
e.g. -- [cat, ate, tea, eat, ATT, ...]
What's a good approach (in Python)?
According to ref 1, longest-path problems are NP-complete. At the
moment there's no method known that is "good" in general for the
problem. However, if all of the dictionary words are chosen from a
natural-language, then we have a special (not general) case. I think
in this special case a method like finding pairs, then combining pairs
to triple, then triples to fives, fives to nines, etc, might work
well, given obvious fallbacks to combining different-length sequences
when at some length same-length combinations don't exist.
*Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
*Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>
Post by HenHanna
in Mathematica, it's easy to find THE Longest chain?
is this a typical NP-complete problem?
As noted in ref 2, "A problem p in NP is NP-complete if every other
problem in NP can be transformed (or reduced) into p in polynomial
time", so there is a sense in which every NP-complete problem is a
typical NP-complete problem.
...
Post by HenHanna
word Chain with 3-letter overlaps (between successive words)
10 (shtoomest, estrayed, yedo, edomitish,
ish, ishime, imelle, ller, lerps, rps)
I wrote some programs to look for word chains, first on the plan
mentioned earlier -- finding pairs, then combining pairs to triples,
triples to fives, etc which worked ok to find chains of up to 30 or 40
words, but this turned out to be fairly slow and of course memory
intensive, running out of memory after a few hours on my 64GB-ram
system. Then I tried depth-first recursion, which in .1 second can
find chains of 150+ words, using 6-letter dictionary words with
3-letter overlaps, but taking 1.5 hours to find its first 171-word
chain, 11 hours to find a 174-chain, 30 hours for a 177-chain, then no
progress in the next 142 hours of computation. With a couple of minor
program changes I got the 174 time below 6 minutes, and time to 177
around 14 minutes, but no progress beyond that in the next 18 hours.

Here's an 8-letter/4-overlap 10-chain example, based on an
88-word dictionary of 8-letter words, all containing `over`:
10 : discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts at 0.201194 seconds

Example 6-letter/3-overlap 177-chain:
177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
bonbon bongos gospel pellet letups upshot hotbed bedbug bugles lessee
seeped pedant anthem hempen pennon noncom combat bather herbal balder
dermis misfit fitful fulcra craven vendor dormer merman manful fulfil
filthy thymus muscat catnap napped pedlar largos gossip siphon honcho
choice icecap captor torpor portal talent entrap rapper permit mitten
tenpin pintos tosses sesame amebas basins instil tildes despot potash
ashcan cancan cancel cellar larvas vassal salver verbal ballad ladles
lessen sensor sorbet betcha chapel pelvis viscus custom tombed bedlam
lambed bedpan pantry tryout outlay layout outran rancor cornea neared
redden denser server vermin minute uterus russet settee teepee peewee
weevil villas lassie siesta stamen mentor torque queasy asylum lumbar
barfed fedora orator torrid ridden dentin tinsel selves vessel selfie
fiesta stadia dialog logout outrun runway waylay layman mantra trains
insult ultras rascal callus lustre tremor mortar tartar tarpon poncho
chores resend endear earwig wigwam wampum pumper person sonnet nether
hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds

Example 8-letter/4-overlap 79-chain:
79: abjuring ringside sidekick kickback backbone bonehead headland
landlady ladyship shipload loadstar starfish fishtail tailgate
gatepost postdate dateline linefeed feedback backfire fireside
sidelong longhand handbook bookcase casework workbook bookworm
wormwood woodcock cocksure surefire firewood woodland landmark
markdown downhill hillside sideshow showdown downturn turnover
overcome comeback backhand handrail railroad roadkill killdeer
deerskin skinhead headword wordplay playback backstop stopover
overhand handsome somewhat whatever evermore moreover overhang
hangover overhung hungover overtake takeover overdraw drawback
backrest restrain raindrop dropouts outshone honester sternest
nestling lingered at 54577.844678 seconds
HenHanna
2024-07-07 05:42:38 UTC
Permalink
Post by James Waldby
Post by HenHanna
Post by James Waldby
Post by HenHanna
given (a list of 3-letter words)
Dict=(act, ATT, eat, sat, sit, cat, bat, dog, god, mat, tim, kim, ...)
The object is to make a long chain (no repeats) with 2-letter overlaps.
e.g. -- [cat, ate, tea, eat, ATT, ...]
What's a good approach (in Python)?
According to ref 1, longest-path problems are NP-complete. At the
moment there's no method known that is "good" in general for the
problem. However, if all of the dictionary words are chosen from a
natural-language, then we have a special (not general) case. I think
in this special case a method like finding pairs, then combining pairs
to triple, then triples to fives, fives to nines, etc, might work
well, given obvious fallbacks to combining different-length sequences
when at some length same-length combinations don't exist.
*Ref 1 <https://en.wikipedia.org/wiki/Longest_path_problem#NP-hardness>
*Ref 2 <https://en.wikipedia.org/wiki/NP-completeness>
Post by HenHanna
in Mathematica, it's easy to find THE Longest chain?
is this a typical NP-complete problem?
As noted in ref 2, "A problem p in NP is NP-complete if every other
problem in NP can be transformed (or reduced) into p in polynomial
time", so there is a sense in which every NP-complete problem is a
typical NP-complete problem.
...
Post by HenHanna
word Chain with 3-letter overlaps (between successive words)
10 (shtoomest, estrayed, yedo, edomitish,
ish, ishime, imelle, ller, lerps, rps)
I wrote some programs to look for word chains, first on the plan
mentioned earlier -- finding pairs, then combining pairs to triples,
triples to fives, etc which worked ok to find chains of up to 30 or 40
words, but this turned out to be fairly slow and of course memory
intensive, running out of memory after a few hours on my 64GB-ram
system. Then I tried depth-first recursion, which in .1 second can
find chains of 150+ words, using 6-letter dictionary words with
3-letter overlaps, but taking 1.5 hours to find its first 171-word
chain, 11 hours to find a 174-chain, 30 hours for a 177-chain, then no
progress in the next 142 hours of computation. With a couple of minor
program changes I got the 174 time below 6 minutes, and time to 177
around 14 minutes, but no progress beyond that in the next 18 hours.
Here's an 8-letter/4-overlap 10-chain example, based on an
10 : ( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts ) at 0.201194 seconds
menomini + minidisc +

( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts )
Post by James Waldby
177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
bonbon bongos gospel pellet letups upshot hotbed bedbug bugles lessee
seeped pedant anthem hempen pennon noncom combat bather herbal balder
dermis misfit fitful fulcra craven vendor dormer merman manful fulfil
filthy thymus muscat catnap napped pedlar largos gossip siphon honcho
choice icecap captor torpor portal talent entrap rapper permit mitten
tenpin pintos tosses sesame amebas basins instil tildes despot potash
ashcan cancan cancel cellar larvas vassal salver verbal ballad ladles
lessen sensor sorbet betcha chapel pelvis viscus custom tombed bedlam
lambed bedpan pantry tryout outlay layout outran rancor cornea neared
redden denser server vermin minute uterus russet settee teepee peewee
weevil villas lassie siesta stamen mentor torque queasy asylum lumbar
barfed fedora orator torrid ridden dentin tinsel selves vessel selfie
fiesta stadia dialog logout outrun runway waylay layman mantra trains
insult ultras rascal callus lustre tremor mortar tartar tarpon poncho
chores resend endear earwig wigwam wampum pumper person sonnet nether
hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds
79: abjuring ringside sidekick kickback backbone bonehead headland
landlady ladyship shipload loadstar starfish fishtail tailgate
gatepost postdate dateline linefeed feedback backfire fireside
sidelong longhand handbook bookcase casework workbook bookworm
wormwood woodcock cocksure surefire firewood woodland landmark
markdown downhill hillside sideshow showdown downturn turnover
overcome comeback backhand handrail railroad roadkill killdeer
deerskin skinhead headword wordplay playback backstop stopover
overhand handsome somewhat whatever evermore moreover overhang
hangover overhung hungover overtake takeover overdraw drawback
backrest restrain raindrop dropouts outshone honester sternest
nestling lingered at 54577.844678 seconds
my PC is slow... but i got curious... (about 6-letter/3-overlap)

........, massig,signal,nalfon, +

........, reshun,hungry,gryfon, +

95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles,
lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops,
opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes,
seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal,
malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon,
gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux,
auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure,
uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes,
oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans,
anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica,
icarus, rushes, hestia, tiamat, mating, ingots )



i think... A long (straight) chain like this (below) is harder to find
(no dict for them)

Survey -
Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
- Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing
James Waldby
2024-07-07 20:41:41 UTC
Permalink
...
Post by HenHanna
Post by James Waldby
Here's an 8-letter/4-overlap 10-chain example, based on an
10 : ( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts ) at 0.201194 seconds
menomini + minidisc +
( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts )
Post by James Waldby
177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
...
Post by HenHanna
Post by James Waldby
hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds
79: abjuring ringside sidekick kickback backbone bonehead headland
landlady ladyship shipload loadstar starfish fishtail tailgate
gatepost postdate dateline linefeed feedback backfire fireside
...
Post by HenHanna
Post by James Waldby
backrest restrain raindrop dropouts outshone honester sternest
nestling lingered at 54577.844678 seconds
my PC is slow... but i got curious... (about 6-letter/3-overlap)
........, massig,signal,nalfon, +
........, reshun,hungry,gryfon, +
95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles,
lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops,
opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes,
seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal,
malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon,
gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux,
auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure,
uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes,
oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans,
anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica,
icarus, rushes, hestia, tiamat, mating, ingots )
Do you have a link to the dictionary you're using? Among the first 17
words (for example) of that sequence, only sedate and hiatus are in
linux's /usr/share/dict/american-english file.
Post by HenHanna
i think... A long (straight) chain like this (below) is harder to find
(no dict for them)
Survey -
Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
- Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing
I think the basics of the problem are not more difficult, if a
dictionary of collocations were to be found. I don't know of a free
one, or whether the following would be accessible if paid for.
https://www.oxfordlearnersdictionaries.com/us/definition/collocations
HenHanna
2024-07-07 21:23:29 UTC
Permalink
Post by James Waldby
...
Post by HenHanna
Post by James Waldby
Here's an 8-letter/4-overlap 10-chain example, based on an
10 : ( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts ) at 0.201194 seconds
menomini + minidisc +
( discover overhang hangover overhung hungover overtake
takeover overturn turnover overacts )
Post by James Waldby
177 : abacus cuspid pidgin ginger gerbil billet lethal halter terser
serene enemas mascot cotton tonsil silent entice icebox boxcar carbon
...
Post by HenHanna
Post by James Waldby
hernia niacin cinder derail ailing ingest esters ersatz at 820.516 seconds
79: abjuring ringside sidekick kickback backbone bonehead headland
landlady ladyship shipload loadstar starfish fishtail tailgate
gatepost postdate dateline linefeed feedback backfire fireside
...
Post by HenHanna
Post by James Waldby
backrest restrain raindrop dropouts outshone honester sternest
nestling lingered at 54577.844678 seconds
my PC is slow... but i got curious... (about 6-letter/3-overlap)
........, massig,signal,nalfon, +
........, reshun,hungry,gryfon, +
95 ( fondak, dakota, otakus, kuskus, kussos, sossed, sedate, ateles,
lesses, sestet, tethys, hyssop, sophia, hiatus, tuscan, canell, ellops,
opsins, insoul, oulder, dermol, molten, tenure, uretal, talpas, passes,
seskin, kindie, dietal, tallis, lisbon, bonagh, aghast, astony, onymal,
malice, icecap, caplin, lingas, gassit, situps, upsend, endore, oregon,
gonads, adsorb, orbits, itself, elfish, ishtar, tarpot, potboy, boyaux,
auxins, inship, hippus, pusley, leymus, muslin, linsey, seyens, ensure,
uredia, diamat, matlos, losels, elsins, insunk, unkill, illing, ingoes,
oesogi, ogived, vedism, ismdom, domett, ettles, lessee, seeing, ingans,
anshar, hardim, dimply, plying, inglut, lutzes, zester, ternar, narica,
icarus, rushes, hestia, tiamat, mating, ingots )
Do you have a link to the dictionary you're using? Among the first 17
words (for example) of that sequence, only sedate and hiatus are in
linux's /usr/share/dict/american-english file.
Do you prefer that these words be excluded?
dublin, otaku, otakus, shinju, ohtani,
jonathan, eliza, mary, Biden, boston, oregon
Post by James Waldby
Post by HenHanna
i think... A long (straight) chain like this (below) is harder to find
(no dict for them)
Survey -
Monkey - Business - Casual - Sex - Work - Visa - Card - Shark - Tank
- Top - Secret - Agent - Orange - Julius - Caesar - Salad - Dressing
I think the basics of the problem are not more difficult, if a
dictionary of collocations were to be found. I don't know of a free
one, or whether the following would be accessible if paid for.
https://www.oxfordlearnersdictionaries.com/us/definition/collocations
i've used a bigger dict before, but stopped using that one,
when i realized that it was giving me so many words
i'd never seen and will never use or learn.

https://github.com/possibly-wrong/word-frequency/blob/main/word-frequency.txt



_________________ Rearrange the words into a chain of 2-word phrases:

Rocky Break Road Point Compound
Game Eye Open Level



(Hint: the ends (Rocky ... Level) don't move)

Loading...