Poetry Scanner (Ftrain.com)

Wednesday, December 6, 2000

Poetry Scanner

A possible tool to figure out where the accents are.

I want a computer program to scan poetry and show you the accents and the syllabic inflections. Such a program should be able to guess the meter of a poem, except that poetry often uses archaic forms of language. I wonder if, ultimately, the problem is intractable for all but the most formally constructed and properly punctuated poems.

What's the point of doing this? Well, once you've broken everything up and gotten the syllables and accents, accurate by part of speech, broken out into a big array of data in the computer's memory, you can put it back together and display it in unique ways. For instance, you could show the number of stresses, counted to the right of each line, or you could analyze the pattern (heptameter! iambic pentameter!). The enterprising poetry analyst, undergraduate, graduate, or even professor-level, has a nifty tool to quickly map the linguistic patterns of a poem, especially if he or she uses the tool in conjunction with the Etymologizer.
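As a rough sketch of the stress-counting and pattern-naming idea, suppose each line has already been reduced to a string of per-syllable marks ("1" for stressed, "0" for unstressed). That reduction is the hard part and is simply assumed here; the table names and the functions `count_stresses`, `guess_meter`, and `annotate` are hypothetical.

```python
from typing import Optional

# Hypothetical lookup tables; the names and coverage are illustrative only.
FEET = {"01": "iambic", "10": "trochaic", "001": "anapestic", "100": "dactylic"}
LENGTHS = {1: "monometer", 2: "dimeter", 3: "trimeter", 4: "tetrameter",
           5: "pentameter", 6: "hexameter", 7: "heptameter", 8: "octameter"}


def count_stresses(marks: str) -> int:
    """Count stressed syllables in a string of per-syllable marks."""
    return marks.count("1")


def guess_meter(marks: str) -> Optional[str]:
    """Name the meter if the marks are an exact repetition of one foot."""
    for foot, name in FEET.items():
        feet, remainder = divmod(len(marks), len(foot))
        if remainder == 0 and feet in LENGTHS and marks == foot * feet:
            return f"{name} {LENGTHS[feet]}"
    return None  # irregular line, or a foot this sketch does not know


def annotate(line: str, marks: str) -> str:
    """Show the stress count to the right of the line."""
    return f"{line}  [{count_stresses(marks)}]"
```

With these definitions, `guess_meter("0101010101")` returns `"iambic pentameter"`. Real verse substitutes feet all the time, so an exact-match test like this would probably need to become a "closest pattern" score before it was of any use.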

One major function would be a rhythmical-semantic-suggestotron. You could point to a word you wanted to replace, and the tool would go out and find all the cognitively related words it could that had the same rhythm. Voilà!
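A toy version of the suggestotron might look like the following. Everything in it is an assumption made for illustration: the stress marks are hand-entered, the `RELATED` table stands in for whatever source of "cognitively related" words a real tool would use, and `suggest` is a hypothetical name.

```python
from typing import List

# Hypothetical hand-entered data: stress marks per word ("1" = stressed
# syllable, "0" = unstressed) and a toy table of related words.
STRESS = {
    "happy": "10", "joyful": "10", "cheerful": "10",
    "glad": "1", "content": "01", "delighted": "010",
}
RELATED = {"happy": ["joyful", "cheerful", "glad", "content", "delighted"]}


def suggest(word: str) -> List[str]:
    """Return related words whose stress marks match those of `word`."""
    target = STRESS.get(word)
    if target is None:
        return []  # no rhythm on file for this word
    return sorted(
        candidate
        for candidate in RELATED.get(word, [])
        if STRESS.get(candidate) == target
    )
```

Here `suggest("happy")` returns `["cheerful", "joyful"]`: the other candidates are related but have a different number or placement of stresses.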

You could also create some amusing, random, rhythmic poetry via these methods.

You could also play around with such functions as “phoneme sort,” where you split out poems and lyrics by phonemes and sort the results to find out which phonemes are most prevalent, or a “stress sort,” to find out which syllables get stress. I mean, it's not thrill-a-minute, but it would have helped me during that sophomore Poetry 305 class I took with Dr. Howard.
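Both sorts are easy to sketch if you assume each word arrives as a line of phoneme codes, with a trailing digit on each vowel marking stress. The assumption here is that the digits follow the convention of the CMU Pronouncing Dictionary (0 for no stress, 1 for primary, 2 for secondary). The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def phoneme_sort(entries: Iterable[str]) -> List[Tuple[str, int]]:
    """Count phonemes (stress digits removed), most frequent first."""
    counts: Counter = Counter()
    for entry in entries:
        _word, *phonemes = entry.split()
        counts.update(p.rstrip("012") for p in phonemes)
    # Ties are broken alphabetically so the order is repeatable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def stress_sort(entries: Iterable[str]) -> List[Tuple[str, int]]:
    """Count only the vowels that carry primary stress (digit 1)."""
    counts: Counter = Counter()
    for entry in entries:
        _word, *phonemes = entry.split()
        counts.update(p[:-1] for p in phonemes if p.endswith("1"))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Fed the two entries `"ABROGATED AE1 B R AH0 G EY2 T IH0 D"` and `"ABRON AH0 B R AA1 N"`, `phoneme_sort` puts `AH`, `B`, and `R` at the top with two occurrences each, and `stress_sort` reports `AA` and `AE` once each.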

Given a block of poetry:

Parse it into sentences, not lines, but remember where the linebreaks are. We need to have full sentences in order to--
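One minimal way to keep the line breaks while regrouping by sentence is to tag every word with the number of the line it came from. This is only a sketch: it treats any word ending in `.`, `!`, or `?` as the end of a sentence, which will trip over abbreviations and unpunctuated verse, and the function name is made up.

```python
import re
from typing import List, Tuple

# A word ends a sentence if it finishes with . ! or ?, optionally followed
# by closing quotes or brackets. Deliberately naive.
SENTENCE_END = re.compile(r"""[.!?]["')\]]*$""")


def sentences_with_lines(poem: str) -> List[List[Tuple[str, int]]]:
    """Group words into sentences, keeping each word's source line number."""
    sentences: List[List[Tuple[str, int]]] = []
    current: List[Tuple[str, int]] = []
    for line_no, line in enumerate(poem.splitlines()):
        for word in line.split():
            current.append((word, line_no))
            if SENTENCE_END.search(word):
                sentences.append(current)
                current = []
    if current:  # trailing words with no closing punctuation
        sentences.append(current)
    return sentences
```

A sentence that runs across a line break comes back as one list, and the change in line number between neighboring words shows where the break fell.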

Apply a link grammar to each sentence, so that we know the part of speech of each word.

Break each word down into syllables, making sure you're doing it with the right part of speech. This is a problem. While there are web tools to tell you the phonetic breakdown of words, they use data intended for machine reading:

ABROGATED AE1 B R AH0 G EY2 T IH0 D

ABROGATING AE1 B R AH0 G EY2 T IH0 NG

ABROGATION AE2 B R AH0 G EY1 SH AH0 N

ABRON AH0 B R AA1 N

ABRUZZO AA0 B R UW1 Z OW0

See the problem? It doesn't show you where the words break; abrogated should be a*bro*gat*ed; abrupt should be ab*rupt, and trying to deduce where the sounds split by reverse engineering the pronunciation codes is very difficult. One way around this is to parse the syllable information out of the Merriam Webster dictionary, originally published via Project Gutenberg but now part of The GNU Dictionary (an ambitious dictionary project oriented around GNU philosophy; it currently offers an XML markup of a massive Merriam Webster dictionary of the early 1900s, integrated with WordNet data), and generate a syllabic (not pronouncing) dictionary. Except this doesn't have full syllabic information for every word, just a large number of root words. But it does have a good heap of information, and it does accent the syllables; it might be a starting point.
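For what it's worth, the pronouncing entries do give up the syllable count and the stress pattern easily, even if they say nothing about where the spelling breaks. A small sketch, assuming the digit on each vowel means what it means in the CMU Pronouncing Dictionary (0 unstressed, 1 primary stress, 2 secondary); `parse_entry` and `stress_pattern` are hypothetical names.

```python
from typing import List, Tuple


def parse_entry(line: str) -> Tuple[str, List[str]]:
    """Split 'WORD PH1 PH2 ...' into the word and its phoneme codes."""
    word, *phonemes = line.split()
    return word, phonemes


def stress_pattern(phonemes: List[str]) -> str:
    """Collect the stress digit of each vowel, one digit per syllable."""
    return "".join(p[-1] for p in phonemes if p[-1].isdigit())


if __name__ == "__main__":
    for line in (
        "ABROGATED AE1 B R AH0 G EY2 T IH0 D",
        "ABROGATION AE2 B R AH0 G EY1 SH AH0 N",
        "ABRON AH0 B R AA1 N",
    ):
        word, phonemes = parse_entry(line)
        pattern = stress_pattern(phonemes)
        print(word, pattern, len(pattern), "syllables")
```

ABROGATED comes out as `1020`, four syllables with the main stress on the first. What the pattern cannot tell you is that those four syllables are spelled a, bro, gat, and ed, which is the gap the syllabic dictionary would have to fill.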

Actually, this idea isn't going to go anywhere. But it's nice to think about a little poetry machine, I find.

Ftrain.com is the website of Paul Ford and his pseudonyms. It is showing its age. I'm rewriting the code but it's taking some time.

About the author: I've been running this website from 1997. For a living I write stories and essays, program computers, edit things, and help people launch online publications. (LinkedIn). I wrote a novel. I was an editor at Harper's Magazine for five years; then I was a Contributing Editor; now I am a free agent. I was also on NPR's All Things Considered for a while. I still write for The Morning News, and some other places.

If you have any questions for me, I am very accessible by email. You can email me at ford@ftrain.com and ask me things and I will try to answer. Especially if you want to clarify something or write something critical. I am glad to clarify things so that you can disagree more effectively.
