
In information theory, an entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium.

One of the main types of entropy coding assigns codes to symbols so as to match code lengths with the probabilities of the symbols. Typically, these entropy encoders compress data by replacing symbols represented by equal-length codes with codewords whose length is proportional to the negative logarithm of the symbol's probability. Therefore, the most common symbols use the shortest codes.
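As an illustration of matching code lengths to symbol probabilities, the following minimal Python sketch builds a Huffman code, one well-known entropy coding technique, from the symbol frequencies of a string. The function name huffman_code and the tie-breaking order are assumptions made for this example; other implementations may produce different codewords of the same lengths.

```python
import heapq
from collections import Counter


def huffman_code(text: str) -> dict[str, str]:
    """Return a prefix code (symbol -> bit string) built from symbol counts."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tiebreak, {symbol: partial codeword}).
    # The unique tiebreak integer keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codewords
        # with 0 and 1 respectively.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1

    return heap[0][2]


if __name__ == "__main__":
    for symbol, codeword in sorted(huffman_code("aaaabbc").items()):
        print(symbol, codeword)
```

For the input "aaaabbc", the most frequent symbol (a) receives a one-bit codeword, while the rarer symbols (b and c) receive two-bit codewords.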

According to Shannon's source coding theorem, the optimal code length for a symbol is −log_b P (the negative base-b logarithm of P), where b is the number of symbols used to make output codes and P is the probability of the input symbol.
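The optimal code length can be computed directly from this formula. The sketch below is only an illustration: the function names ideal_code_length and entropy are hypothetical, and b defaults to 2 on the assumption that output codes are binary. The entropy function returns the average of the ideal lengths weighted by symbol probability.

```python
import math


def ideal_code_length(p: float, b: int = 2) -> float:
    """Ideal code length -log_b(p) for a symbol of probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    return -math.log(p, b)


def entropy(probs, b: int = 2) -> float:
    """Average ideal code length over a probability distribution."""
    return sum(p * ideal_code_length(p, b) for p in probs if p > 0)


if __name__ == "__main__":
    distribution = [0.5, 0.25, 0.125, 0.125]
    for p in distribution:
        print(p, ideal_code_length(p))
    print("average bits per symbol:", entropy(distribution))
```

With binary output (b = 2), a symbol of probability 1/2 ideally takes 1 bit and a symbol of probability 1/8 ideally takes 3 bits, so this example distribution averages 1.75 bits per symbol.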

Two of the most common entropy encoding techniques are Huffman coding and arithmetic coding. If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), a simpler static code may be useful. These static codes include universal codes (such as Elias gamma coding or Fibonacci coding) and Golomb codes (such as unary coding or Rice coding).
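Static codes of this kind need no frequency table, because each integer maps to a fixed codeword. The following Python sketch shows three of them under stated assumptions: unary coding is taken to represent a positive integer n as n − 1 ones followed by a zero, the Rice code writes its quotient as that many ones followed by a zero, and codewords are returned as strings of "0" and "1" for readability. Other conventions (for example, swapping the roles of zeros and ones) are also in use.

```python
def unary(n: int) -> str:
    """Unary code for a positive integer n: n - 1 ones, then a zero."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return "1" * (n - 1) + "0"


def elias_gamma(n: int) -> str:
    """Elias gamma code: floor(log2 n) zeros, then n in binary."""
    if n < 1:
        raise ValueError("n must be >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def rice(n: int, k: int) -> str:
    """Rice code (Golomb code with divisor 2**k) for an integer n >= 0."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be >= 0")
    quotient, remainder = n >> k, n & ((1 << k) - 1)
    # Quotient: that many ones followed by a zero; remainder: k plain bits.
    tail = format(remainder, "b").zfill(k) if k else ""
    return "1" * quotient + "0" + tail


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, unary(n), elias_gamma(n), rice(n, 2))
```

Small integers receive short codewords in all three schemes, so these codes suit data in which small values are known in advance to be the most probable.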

Entropy as a measure of similarity

Besides using entropy encoding as a way to compress (and losslessly recover) digital data, an entropy encoder can also be used to measure the amount of similarity between streams of data. This is done by generating an entropy coder/compressor for each class of data; unknown data is then classified by feeding the uncompressed data to each compressor and seeing which compressor yields the highest compression. The coder with the best compression is probably the coder trained on the data that was most similar to the unknown data.
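The classification procedure described above can be sketched in Python. This example makes several simplifying assumptions: each "coder" is just an order-0 byte probability model with add-one smoothing, and the ideal code length in bits stands in for the output size of a real entropy coder. The names train_model, code_length_bits and classify are hypothetical.

```python
import math
from collections import Counter


def train_model(sample: bytes) -> list[float]:
    """Estimate a probability for each byte value from training data."""
    counts = Counter(sample)
    total = len(sample) + 256  # add-one smoothing over 256 byte values
    return [(counts[b] + 1) / total for b in range(256)]


def code_length_bits(model: list[float], data: bytes) -> float:
    """Ideal size of data, in bits, when coded with the given model."""
    return sum(-math.log2(model[b]) for b in data)


def classify(models: dict[str, list[float]], data: bytes) -> str:
    """Return the class whose model yields the shortest code for data."""
    return min(models, key=lambda name: code_length_bits(models[name], data))


if __name__ == "__main__":
    models = {
        "letters": train_model(b"abcabcabc" * 20),
        "digits": train_model(b"0123456789" * 20),
    }
    print(classify(models, b"cabbac"))
    print(classify(models, b"90210"))
```

The shortest ideal code length corresponds to the highest compression, so the returned class is the one whose training data was statistically closest to the unknown data under this simple model.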

External links


An earlier (open content) version of the above article was posted on PlanetMath.


© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org