compression,
The problem with compression league tables is that there are too many "what ifs". If you define the source file as a long string of pseudo-random binary data, every compressor will have a very hard time making much of it, i.e. it becomes difficult to find any one algorithm that performs much better than any other.
If you then use a more "real world" dataset as your source data, such as Word documents, BMPs etc., you can achieve colossal compression rates because there is so much junk "space" that repeats itself. But then you get drawn into a lengthy argument about what exactly IS a good (read "fair") input data file.
It is the very predictability of the data stream being fed into the compression algorithm that determines the results. The more predictable or repeated patterns a source file contains, the more of its data can be described with less and less output data.
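To put rough numbers on that, here is a minimal sketch using Python's standard zlib module. The two sample inputs and the helper name compression_ratio are my own choices for illustration, not any agreed benchmark.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


# Assumed test inputs: 100,000 random bytes versus a short phrase repeated 10,000 times.
random_data = os.urandom(100_000)
repetitive_data = b"junk space " * 10_000

print(f"random:     {compression_ratio(random_data):.3f}")
print(f"repetitive: {compression_ratio(repetitive_data):.3f}")
```

The random input should come out at roughly its original size (or a touch larger), while the repetitive one shrinks to a tiny fraction. Which of the two counts as a "fair" test file is exactly the argument described above.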
There is also a further complication in that different input data stream patterns compress better with different algorithms. Also, if you intend the resultant file to be repairable in case of file damage, then extra redundancy has to be added (interleaved with bit spreading) to the output data stream, thus increasing the file size again.
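As a rough illustration of that size penalty, here is a deliberately crude sketch: three copies of the payload stored one after another, so the copies of any given byte sit far apart, with a majority vote on reading. The function names are made up for this example, and real recovery schemes typically use far more efficient codes such as Reed-Solomon.

```python
def add_redundancy(payload: bytes) -> bytes:
    """Store three copies back to back; copies of a given byte are len(payload) apart."""
    return payload * 3


def repair(stored: bytes) -> bytes:
    """Rebuild the payload by majority vote across the three copies."""
    n = len(stored) // 3
    a, b, c = stored[:n], stored[n:2 * n], stored[2 * n:]
    # If the first copy agrees with either other copy it wins; otherwise use the second.
    return bytes(x if x in (y, z) else y for x, y, z in zip(a, b, c))


payload = bytes(range(256)) * 4            # 1,024 bytes of sample data
stored = bytearray(add_redundancy(payload))
stored[100:600] = bytes(500)               # simulate a 500-byte burst of damage
print(len(payload), len(stored), repair(bytes(stored)) == payload)
```

The stored data is three times the size of the payload, which is the trade-off in a nutshell: the file survives a burst of damage, but a good part of what compression saved is given back.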
If one had loads of time available or very fast processing speeds, the best solution would be a compression utility that can switch algorithm on the fly, dependent on the input data content. This can easily be done using parallel processing threads, with each thread running a different algorithm. One could then compare the resultant file sizes of each output stream and save whichever was best. (Thinks: Doesn't Winzip do this anyway if you choose "best compression"?)
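A minimal sketch of that "run them all and keep the smallest" idea, using the three compressors that ship with Python (zlib, bz2, lzma). The CODECS table and the best_compression name are assumptions for illustration only; this is not a description of how Winzip or any other real archiver works.

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical codec table: name -> function taking bytes and returning compressed bytes.
CODECS = {
    "zlib": lambda d: zlib.compress(d, 9),
    "bz2": lambda d: bz2.compress(d, 9),
    "lzma": lzma.compress,
}


def best_compression(data: bytes):
    """Run every codec in its own thread and return (name, blob) for the smallest output."""
    with ThreadPoolExecutor(max_workers=len(CODECS)) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in CODECS.items()}
        results = {name: future.result() for name, future in futures.items()}
    winner = min(results, key=lambda name: len(results[name]))
    return winner, results[winner]


name, blob = best_compression(b"the same old junk space, over and over. " * 2_000)
print(name, len(blob))
```

A real utility would also have to record which algorithm won so the decompressor knows what to do with the output, and could make the choice per block rather than per file.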
I used a similar technique when I wrote a decryption algorithm for a certain military radio modem. The software ran multiple parallel decoding streams with different seeds, and whichever one produced the highest correlation of plain text output was chosen to generate the actual output data stream. Although the genuine equipment was supposed to keep rotating keys on a constant basis to "foil the enemy", my decoder was able to keep a steady output en clair all the time, as long as the input data was clean and error free, of course.
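The general shape of "try several seeds and keep whichever output looks most like plain text" can be sketched with a toy example. Everything here is invented for illustration: the XOR keystream built on Python's random module, the scoring rule and the seed range bear no relation to the real equipment or to the actual decoder, and the candidates are tried one after another rather than in parallel to keep the sketch short.

```python
import random
import string


def keystream_xor(data: bytes, seed: int) -> bytes:
    """Toy cipher: XOR each byte with a keystream from a seeded PRNG (same call decodes)."""
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)


# Bytes considered "plausible plain text" for scoring purposes (an assumption).
PLAUSIBLE = set((string.ascii_letters + " .,'").encode())


def plaintext_score(data: bytes) -> float:
    """Fraction of bytes that look like ordinary English text."""
    return sum(b in PLAUSIBLE for b in data) / max(len(data), 1)


def best_seed(ciphertext: bytes, candidate_seeds) -> int:
    """Decode with every candidate seed and return the one with the most text-like output."""
    return max(candidate_seeds,
               key=lambda seed: plaintext_score(keystream_xor(ciphertext, seed)))


message = b"the quick brown fox jumps over the lazy dog " * 3
ciphertext = keystream_xor(message, 1234)
seed = best_seed(ciphertext, range(1200, 1300))
print(seed, keystream_xor(ciphertext, seed)[:19])
```

With a wrong seed the output is effectively random bytes, so only a small fraction look like text, and the correct seed stands out clearly. As noted above, this kind of scoring only holds up while the input data is clean.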
But, anyway, I digress. (Often, I've been told!)
Politics: 'Poli' in Latin means 'many' and 'tics' means 'bloodsucking creatures'.
Surely he can't be a dyslexic programmer...
Sits on verandah, opens a beer, loads shotgun and awaits "politically correct" misinterpretation of posting...

"And regrettably your planet is one of those scheduled for demolition"
Rgds
Mike
Dead-Fish, Deep Sea Daddies...
My DVDs
Seahorse wrote: Surely he can't be a dyslexic programmer...
Now this time you HAVE completely lost me.
p.s. why did some evil person come up with a word that no dyslexic person could ever spell right anyway!? And, why is "abbreviated" such a long word? Answers on a £20 note to me asap please.
Sparks wrote: Now this time you HAVE completely lost me.
Explanation: One of my pet hates is Leet (sic) speak...
If music is the food of love, why do CDs taste so bad

There are acceptable internet abbreviations (BTW) and then there is Leet...
Mobile phone users are in a world of their own...
- Spey
- Mega-pe0n
- Posts: 226
- Joined: Mon Oct 28, 2002 10:59 am
- Location: behind you.... Harder than: Michael Dorn in heat baby!!
Sparks wrote:
Now this time you HAVE completely lost me.
p.s. why did some evil person come up with a word that no dyslexic person could ever spell right anyway!? And, why is "abbreviated" such a long word? Answers on a £20 note to me asap please.
and why does rythym have NO VOWELS ffs...what a freak of a word
narcissism, my only pleasure in life.
Because y is not a vowel: it can be used as a consonant sound, as in 'yay', as well as a vowel sort of thingy, as in rhythm.
and they choose sound!
It's a lot like science: when faced with 2 choices, "I wonder which end the electrons come out of", they generally feck it up to annoy us all! And we end up with electrons coming out of the negative, but of course we have to put that current flows from the positive because it is a 'convention' and the exam board are in love with it.
You haven't considered that science pretty much revolves around conventions and standards? We wouldn't get very far without them.
mid_gen - www.the-midfield.com
Spey wrote: and why does rythym have NO VOWELS ffs...what a freak of a word
Perhaps his deodorant is made by Lynx...
It is also a word with no vowels, continuing the thread. Maintaining the 'rythm' as it were...