I first heard about data compression through perceptual coding more than ten years ago, having lunch in a pub just outside of the SSL factory, which I was visiting for the day.
The guy from SSL was telling me 'off the record' that they had linked up with a company that was able to encode audio using just four bits per sample, when CD quality uses sixteen.
CD-quality is pretty good at 16-bit resolution. Some of the early samplers worked at 12-bit and even 8-bit resolution, and they were pretty grungy. So 4-bit resolution - how could that possibly work?
That part was kept secret from me at the time. But the technology did work. And further developments led to MP3, which can reduce an audio file's size to one-eleventh or less and still sound pretty good.
And the rest is history (as will MP3 be, when people realize that AAC - Advanced Audio Coding - is much better!).
But how does it work?
Simple - MP3 discards any aspect of the audio that we are not likely to notice. And it's amazing how much we do ignore.
For instance, if there is a particularly high level at a certain frequency at any moment, the ear won't notice another nearby frequency that is at a lower level. So it might as well be discarded.
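To make the idea concrete, here is a toy sketch of frequency masking in Python. It is not how an MP3 encoder actually works; real psychoacoustic models are far more elaborate. The function name `audible_components` and the two parameters `offset_db` and `slope_db_per_octave` are made up for illustration, and their values are guesses rather than measured figures. The sketch simply assumes a loud tone hides quieter tones nearby, with the effect fading the further apart in pitch they are.

```python
import math

def audible_components(components, offset_db=10.0, slope_db_per_octave=30.0):
    """Return the (frequency_hz, level_db) pairs not hidden by a louder neighbour.

    Toy model with assumed numbers: a masker hides anything below
    masker_level - offset_db - slope_db_per_octave * distance_in_octaves.
    """
    kept = []
    for freq, level in components:
        masked = False
        for masker_freq, masker_level in components:
            if masker_level <= level:
                continue  # only a louder tone can mask a quieter one
            distance_octaves = abs(math.log2(freq / masker_freq))
            threshold = masker_level - offset_db - slope_db_per_octave * distance_octaves
            if level < threshold:
                masked = True
                break
        if not masked:
            kept.append((freq, level))
    return kept

# A loud 1 kHz tone, a quiet neighbour at 1.1 kHz, and a quiet tone far away at 4 kHz.
tones = [(1000.0, 80.0), (1100.0, 50.0), (4000.0, 50.0)]
kept = audible_components(tones)
print(kept)
```

With these assumed numbers the 1.1 kHz tone sits too close to the loud 1 kHz tone and is dropped, while the 4 kHz tone is far enough away to survive.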
Same with time. If a loud sound occurs at a certain time, a quieter sound that occurs just before or just after will not be noticed, so it can again be discarded.
Of course, it depends on how far you want to take all of this. For most people, reducing the data rate to 128 kilobits/second is far enough. I would put that on a par with the cassette format in terms of the degree of degradation. MP3 can encode to even lower data rates, but the side effects will show.
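The 'one-eleventh' figure and the 128 kilobits/second figure are really the same number seen from two sides. A quick check of the arithmetic, assuming standard CD audio (44,100 samples per second, 16 bits per sample, two channels) and taking a kilobit as 1,000 bits:

```python
# Standard CD audio parameters (assumed here; the text only mentions 16 bits).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

cd_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS  # bits per second on a CD
mp3_bps = 128_000                                     # 128 kilobits/second

ratio = cd_bps / mp3_bps

# Size of one minute of audio in bytes at each rate.
cd_bytes_per_minute = cd_bps * 60 // 8
mp3_bytes_per_minute = mp3_bps * 60 // 8

print(f"CD data rate: {cd_bps} bits/second")
print(f"Reduction at 128 kbit/s: about {ratio:.1f} to 1")
print(f"One minute: {cd_bytes_per_minute} bytes on CD, {mp3_bytes_per_minute} bytes as MP3")
```

So a minute of CD audio at roughly 10.6 million bytes comes down to just under a million bytes at 128 kilobits/second.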
One thing puzzles me though. What I would really love to be able to hear is what the audio that is thrown away actually sounds like! Now that would be interesting indeed.
Anyone know how that can be done?