Citizendia

This article is about the phonetic algorithm. For the Rock n' Soul band, see the SoundEx.

Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal is for names with the same pronunciation to be encoded to the same representation so that they can be matched despite minor differences in spelling.[1] Soundex is the most widely known of all phonetic algorithms and is often used (incorrectly) as a synonym for "phonetic algorithm". Improvements to Soundex are the basis for many modern phonetic algorithms.

History

Soundex was developed by Robert Russell and Margaret Odell and patented in 1918[2] and 1922[3]. A variation called American Soundex was used in the 1930s for a retrospective analysis of the US censuses from 1890 through 1920. The Soundex code came to prominence in the 1960s when it was the subject of several articles in the Communications and Journal of the Association for Computing Machinery (CACM and JACM), and especially when described in Donald Knuth's magnum opus, The Art of Computer Programming.

The National Archives and Records Administration (NARA) maintains the current rule set for the official implementation of Soundex used by the U.S. Government.[1] These encoding rules are available from NARA, upon request, in the form of General Information Leaflet 55, "Using the Census Soundex".

Rules

The Soundex code for a name consists of a letter followed by three digits: the letter is the first letter of the name, and the digits encode the remaining consonants. Similar-sounding consonants share the same digit so, for example, the labials B, F, P, and V are each encoded as 1. Vowels can affect the coding, but are not coded themselves except as the first letter.

The correct value can be found as follows:

  1. Replace consonants with digits as follows (but do not change the first letter):
    • b, f, p, v => 1
    • c, g, j, k, q, s, x, z => 2
    • d, t => 3
    • l => 4
    • m, n => 5
    • r => 6
  2. Collapse adjacent identical digits into a single digit of that value.
  3. Remove all non-digits after the first letter.
  4. Return the starting letter and the first three remaining digits. If needed, append zeroes to make it a letter and three digits.

Using this algorithm, both "Robert" and "Rupert" return the same string "R163" while "Rubin" yields "R150".
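
The four steps above can be sketched in Python as follows. This is a minimal illustration of the simplified rules as listed here, not the official NARA rule set. The function name soundex, the helper table _CODES, and the choice to return an empty string for input that contains no letters are assumptions made for this example.

```python
# Digit assigned to each consonant group, as listed in step 1.
_GROUPS = [
    ("bfpv", "1"),
    ("cgjkqsxz", "2"),
    ("dt", "3"),
    ("l", "4"),
    ("mn", "5"),
    ("r", "6"),
]
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(name: str) -> str:
    """Encode a name using the four simplified steps given in this article."""
    # Ignore anything that is not a letter (spaces, hyphens, apostrophes).
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""  # assumption: no letters means no code

    first, rest = letters[0], letters[1:]

    # Step 1: replace coded consonants with digits; the first letter is kept.
    replaced = [_CODES.get(c, c) for c in rest]

    # Step 2: collapse runs of adjacent identical digits into one digit.
    collapsed = []
    for c in replaced:
        if c.isdigit() and collapsed and collapsed[-1] == c:
            continue
        collapsed.append(c)

    # Step 3: drop everything after the first letter that is not a digit.
    digits = [c for c in collapsed if c.isdigit()]

    # Step 4: first letter plus three digits, padded with zeroes if needed.
    return (first.upper() + "".join(digits[:3])).ljust(4, "0")


if __name__ == "__main__":
    for example in ("Robert", "Rupert", "Rubin"):
        print(example, soundex(example))
```

Under these assumptions, soundex("Robert") and soundex("Rupert") both give "R163" and soundex("Rubin") gives "R150", matching the values stated above.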

Soundex variants

A similar algorithm called "Reverse Soundex" prefixes the last letter of the name instead of the first.

The NYSIIS algorithm was introduced by the New York State Identification and Intelligence System as an improvement to the Soundex algorithm. NYSIIS handles some multi-character n-grams and maintains relative vowel positioning, whereas Soundex does not.

The Celko Improved Soundex algorithm was introduced by Joe Celko in his book SQL For Smarties: Advanced SQL Programming.

As a response to deficiencies in the Soundex algorithm, Lawrence Philips developed the Metaphone algorithm for the same purpose. Philips later developed an improvement to Metaphone, which he called Double-Metaphone. Double-Metaphone includes a much larger encoding rule set than its predecessor, handles a subset of non-Latin characters, and returns a primary and a secondary encoding to account for different pronunciations of a single word in English.

Daitch-Mokotoff Soundex (D-M Soundex) was developed by genealogist Gary Mokotoff and later improved by genealogist Randy Daitch because of problems they encountered while trying to apply the Russell Soundex to Jews with Germanic or Slavic surnames (such as Moskowitz vs. Moskovitz or Levine vs. Lewin). D-M Soundex is sometimes referred to as "Jewish Soundex" or "Eastern European Soundex"[4], although the authors discourage the use of these nicknames. The D-M Soundex algorithm can return as many as 32 individual phonetic encodings for a single name. Results of D-M Soundex are returned in an all-numeric format between 100000 and 999999. This algorithm is much more complex than Russell Soundex.

References

  1. The Soundex Indexing System. National Archives and Records Administration (2007-05-30). Retrieved on 2007-06-07.
  2. US1,261,167 (1918-04-02) R. C. Russell (title unknown)
  3. US1,435,663 (1922-11-14) R. C. Russell (title unknown)
  4. Mokotoff, Gary (2007-09-08). Soundexing and Genealogy. Retrieved on 2008-01-27.

Dictionary

Soundex

-noun

  1. (informatics) A phonetic algorithm for indexing names by their English pronunciation, based on the consonants most likely to be significant, so that a search for a misspelled name may still find the intended one.
© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org