Tech ARP Forums

Reviews & Articles There will be a post for every Tech ARP article. Come in here to discuss about your favourite article!

Old 28th Feb 2003, 12:31 PM   #21 (permalink)
Mgz
Newbie
 
Join Date: 28 Feb 2003
Posts: 1
Reputation: 0
Mgz is an unknown quantity at this point
Rep Power: 0
Default

I just wonder:


Why doesn't this super guide cover the CAB format (with the "unbeatable" LZX algorithm)? This is the format M$ uses to pack its OS.

You can use the M$ tool (freeware) @ http://support.microsoft.com/default...;en-us;310618& or some shareware like Cabinet Manager 2002.


How about stating the algorithm each program uses, like LZX (LZW, LZ77/LZ78, etc.), Blowfish, Burrows-Wheeler block sorting, Huffman? Just some brief info.
Old 18th Mar 2003, 05:10 PM   #22 (permalink)
Hyperactive
 
drab's Avatar
 
Join Date: 4 Jan 2003
Location: spain
Posts: 2,509
Reputation: 76
drab will become famous soon enough
Rep Power: 8
Default

The only compressor I use is to dig the drive up! Discs are so big these days, don't bother.
__________________
NOTE.This is a private parking space.non permit holders will be clamped.
Old 4th Sep 2004, 06:35 AM   #23 (permalink)
Newbie
 
Join Date: 4 Sep 2004
Location: Netherlands
Posts: 1
Reputation: 0
Fulcrum2000 is an unknown quantity at this point
Rep Power: 0
Default Maximum Compression

Very nice compressor comparison you have on your site! Great work.

Also have a look at Maximum Compression (http://www.maximumcompression.com/) for some more up-to-date benchmarks.
Old 27th Dec 2005, 11:50 AM   #24 (permalink)
Newbie
 
Join Date: 27 Dec 2005
Posts: 2
Reputation: 0
MisterE is an unknown quantity at this point
Rep Power: 0
Default

Hi there,

I found my way to your article via Slashdot and I wanted to commend you on your efforts. That's a lot of work. I do have a few comments about your results.

(Full disclosure: I am a QA Engineer at Aladdin/Allume/Smith Micro)

First, you should make it clear that you are comparing compression formats, not compression applications. There is a distinction. Some applications can create compressed files in a variety of formats (StuffIt Standard can create archives in StuffIt X, Zip, gzip, bzip2 among others).

You might want to go into some detail about what each compression format is doing. One format may be using a very different algorithm than another. There are several common compression algorithms:

Deflate - LZ-Huffman - [see Wikipedia - Huffman Coding and Wikipedia - Deflate ] - a matching algorithm.

LZ-Arithmetic - [see Wikipedia - Arithmetic Coding ] - a matching algorithm.

BWT - "Burrows-Wheeler transform", also called "block-sorting compression" - [see Wikipedia - BWT and Wikipedia - Burrows-Wheeler Transform ] - the order of characters in a file is rearranged so that similar characters cluster together, which makes the existing redundancy easier to exploit and reduces the compressed size.

PPM - "Prediction by Partial Matching" - [see Wikipedia - PPM ] - a prediction algorithm, especially suited for text.
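
To make the block-sorting idea a bit more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions (the `$` end marker and the function names are made up for the example), and it is far too slow for real use; bzip2 and friends use much smarter sorting and add further coding stages on top.

```python
def bwt(text: str, end_marker: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort every rotation, keep the last column."""
    assert end_marker not in text, "end marker must not occur in the input"
    s = text + end_marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, end_marker: str = "$") -> str:
    """Rebuild the input by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(ch + row for ch, row in zip(last_column, table))
    original = next(row for row in table if row.endswith(end_marker))
    return original[:-1]  # drop the end marker


transformed = bwt("banana")
print(transformed)               # identical letters end up next to each other
print(inverse_bwt(transformed))  # the transform is reversible
```

The transform itself does not shrink anything. It only groups similar characters so that a later stage (run-length or Huffman coding, for instance) has an easier job.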

Zip, GZip use Deflate. BZip2 uses the BWT algorithm. Knowing what algorithm is being used can help explain why one format might compress a particular data set smaller, or faster. The StuffIt X format can use any of the above algorithms alone or in combination to get optimal compression for a given data set. The "Faster" or "Better" set is not necessarily best for any particular data set, but the custom settings give users a lot of control over how their data is compressed.
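
If you want to see the effect of the algorithm choice for yourself, a rough sketch using Python's standard library follows. `zlib` implements Deflate and `bz2` implements the BWT-based bzip2 format; I have added `lzma` as a stand-in for the LZ-plus-arithmetic-style family, which is my own assumption and not one of the formats from the article. The sample data is made up, and results will vary a lot with the input.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Compress the same bytes with three stdlib codecs and report output sizes."""
    return {
        "deflate (zlib)": len(zlib.compress(data, 9)),
        "bwt (bz2)": len(bz2.compress(data, 9)),
        "lz + range coder (lzma)": len(lzma.compress(data)),
    }


# Made-up, highly redundant sample text; swap in your own files to compare.
sample = b"the quick brown fox jumps over the lazy dog\n" * 2000

for name, size in compressed_sizes(sample).items():
    print(f"{name:>24}: {size} bytes (from {len(sample)})")
```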

Your tests focus mainly on compression size and compression speed. Users need to consider a number of other factors as well when deciding which format to use for their archiving needs: decompression speed, cross-platform compatibility, open standards (i.e., a non-proprietary format), security, or some combination of all of these. If using an open standard compression format is very important, then one might choose a format that doesn't compress as well. Also, compression speed usually trades off against compression ratio. If you are only compressing files a little, you can usually compress them really fast. If you are trying to compress them as much as possible, then it usually takes a long time. Since compression algorithms look for the redundancy in files, files that have little redundancy (e.g., lossy compressed files such as MP3) take a lot of effort and give little return. It's like putting a crushed soda can into a trash compactor. It may get a little smaller, but you probably aren't going to see great results, so you may not want to bother.
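
Both points (the speed versus ratio trade-off, and the poor return on data with little redundancy) can be tried out with a small sketch like the one below. It uses Deflate via Python's `zlib`; the sample inputs are my own assumptions, with random bytes standing in for an MP3 or JPEG payload. Timings depend entirely on the machine, so treat them as indicative only.

```python
import os
import time
import zlib


def measure(data: bytes, level: int):
    """Return (compressed size, seconds taken) for one zlib compression level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    return len(compressed), time.perf_counter() - start


# Made-up inputs: very redundant text, and random bytes with no redundancy.
redundant = b"All work and no play makes Jack a dull boy.\n" * 5000
random_like = os.urandom(len(redundant))  # stand-in for already-compressed data

for label, data in (("redundant", redundant), ("random-like", random_like)):
    for level in (1, 9):  # 1 = fastest, 9 = smallest output
        size, seconds = measure(data, level)
        print(f"{label:>12} level {level}: {size} bytes in {seconds:.4f}s")
```

On the random-like input the output stays about as large as the input no matter how much effort is spent, which is the crushed soda can situation.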

A few issues with your StuffIt results:

You give StuffIt high marks for JPEG compression, but you should note that there are two different StuffIt formats: the older StuffIt (.sit) format and the newer StuffIt X format introduced in 2000. JPEG compression is an enhancement added to StuffIt X at the beginning of 2004. It is not part of the older StuffIt format. It is proprietary and requires StuffIt to expand, but it does get up to 30% compression on JPEG files which previously were considered difficult to compress (see below).

You fault the StuffIt (X) format for not compressing certain file types (such as MP3 files), but there is a default setting in StuffIt (the application) to not compress already compressed items (I believe the Windows version says "Do not Recompress Items"). If you uncheck this option and re-run your tests you will see results more comparable to the other compression applications. But as noted above, the time and effort used to compress files that have little redundancy can be better spent, so the StuffIt app defaults to adding the files to the archive without further compression.
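
The general idea of skipping already-compressed items can be sketched with Python's `zipfile` module. To be clear, this is not how StuffIt does it; the extension list, the `recompress` flag, and the function name are hypothetical and only illustrate the store-versus-compress decision in a Zip archive.

```python
import io
import os
import zipfile

# Hypothetical list; a real tool would likely inspect file contents as well.
ALREADY_COMPRESSED = {".mp3", ".jpg", ".jpeg", ".gz", ".bz2", ".zip"}


def build_archive(files: dict, recompress: bool = False) -> bytes:
    """Build an in-memory Zip, storing already-compressed members as-is."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in files.items():
            extension = os.path.splitext(name)[1].lower()
            skip = extension in ALREADY_COMPRESSED and not recompress
            method = zipfile.ZIP_STORED if skip else zipfile.ZIP_DEFLATED
            archive.writestr(name, payload, compress_type=method)
    return buffer.getvalue()


# Made-up members: random bytes stand in for an MP3, plus some plain text.
example = {"song.mp3": os.urandom(4096), "notes.txt": b"hello " * 1000}
print(len(build_archive(example)), "bytes with the skip enabled")
print(len(build_archive(example, recompress=True)), "bytes when recompressing")
```

Passing `recompress=True` corresponds to unchecking the option: every member goes through Deflate, which costs time and typically gains next to nothing on the MP3-like data.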

This post is already too long. Sorry...I'm not trying to sell StuffIt here, I just want you to give it a fair review.

Thanks!

--Eric K
Old 28th Dec 2005, 02:05 AM   #25 (permalink)
Da Boss
 
Join Date: 10 Oct 2002
Location: In front of my ASUS F8V notebook!
Posts: 30,124
Reputation: 3081
Adrian Wong has a reputation beyond repute
Rep Power: 67
Default

Hello Eric!

Thanks for your comments. Yeah, it is a lot of work, and I've only compressed them at the fastest settings. It's going to take a lot more time for the other settings.

Actually, I did consider testing all supported formats of each compressor, but that would take a lot more time and effort. If the response is good and there is a significant number of requests for non-native formats to be tested in each compressor, I wouldn't mind adding them to the results.

But I would prefer to think that I was testing each data compressor in its native format. I was not actually comparing compression format per se since different compressors using the same format will have different results. It was really about the data compressors' ability at compressing their native formats.

Yes, I agree that users should consider other factors and not only the compression speed and performance when they choose a data compressor. This comparison guide was never meant to advise readers on the other factors. It is essentially a performance comparison of the few popular data compressors.

Now, regarding the issues on StuffIt.

1. Yup, I know about the old format. In fact, it was covered in the first version of this comparison guide. In this guide, we used the latest StuffIt X format to ensure maximum performance.

2. I think I overlooked that setting. I will need to check it out and redo the tests. Thanks for letting me know about it. :thumbs:

Thanks again for your comments. I appreciate you taking the time to help us improve the article. Do let us know if you have further comments on the article.

Thanks!
__________________
Dr. Adrian Wong
Tech ARP | Blog @ Tech ARP | The Free Trade Zone


Old 28th Dec 2005, 04:41 AM   #26 (permalink)
Newbie
 
Join Date: 28 Dec 2005
Posts: 2
Reputation: 0
cranstone is an unknown quantity at this point
Rep Power: 0
Default

Hi Adrian,

Can you send me a link to the full report? I'm interested in learning more. Many thanks.

(Full disclosure - I'm one of the co-inventors of mod_gzip for Apache, the first PD module of its kind for compressing data in real time from an Apache web server)

All the best,

Peter
Old 28th Dec 2005, 04:06 PM   #27 (permalink)
Newbie
 
Join Date: 28 Dec 2005
Posts: 1
Reputation: 0
CeeJay is an unknown quantity at this point
Rep Power: 0
Default

Adrian ...

If you're trying to find the fastest compressor, you really need to test LZOP http://www.lzop.org/
Other compression comparisons consistently rank it as the fastest compressor on the planet.

Seeing Peter Cranstone post here, impels me to point out that rojakpot.com is not being served gzip-compressed and you could save a huge amount of bandwidth by doing that.
And the site would load a lot faster to boot.
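
For a rough feel of the savings, here is a small sketch that gzips a page body offline with Python's `gzip` module. The HTML below is made up for the example (real forum pages will compress differently), and the function name is my own. In practice the server only sends a compressed body when the browser advertises support through the `Accept-Encoding` request header, and it marks the reply with `Content-Encoding: gzip`.

```python
import gzip


def gzip_savings(body: bytes, level: int = 6) -> float:
    """Fraction of bytes saved by gzip-compressing a response body."""
    if not body:
        return 0.0
    return 1.0 - len(gzip.compress(body, compresslevel=level)) / len(body)


# Made-up page: repetitive table markup, loosely like a forum thread.
page = (
    "<html><body><table>"
    + "<tr><td class='post'>reply text goes here</td></tr>" * 500
    + "</table></body></html>"
).encode("utf-8")

print(f"{gzip_savings(page):.1%} of {len(page)} bytes saved")
```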
Old 29th Dec 2005, 03:27 AM   #28 (permalink)
Newbie
 
Join Date: 28 Dec 2005
Posts: 2
Reputation: 0
cranstone is an unknown quantity at this point
Rep Power: 0
Default

Hi Adrian,

A quick query on Netcraft shows that your site, http://forums.rojakpot.com, is running Microsoft-IIS on Windows Server 2003.

Many moons ago, Kevin and I actually built a server-side IIS compression filter which was faster than Microsoft's version.

I don't know what the current state of the art is on IIS but I would recommend turning on compression. All the current blog sites are using it and bandwidth savings are considerable - let alone faster load times for the viewer (customer) (ad server)

All the best,


Peter
Old 29th Dec 2005, 09:26 AM   #29 (permalink)
ARP Webmaster
 
peaz's Avatar
 
Join Date: 13 Oct 2002
Location: http://atpeaz.placidthoughts.com/
Posts: 8,501
Reputation: 1633
peaz has a brilliant future
Rep Power: 31
Default

Hi there Peter.. thanks for the response!

OK, this may be a bit off topic for this forum, but...
We have already turned on IIS compression. But I guess, as you said, it's just not as efficient. I'll look into the available IIS compression filters. And I'd also like to try out the version you built as well, if you don't mind.

Anyways, email me at ken[at]rojakpot{dot}com for further discussions on this topic. Thanks and cheers!
Old 30th Dec 2005, 04:07 AM   #30 (permalink)
Da Boss
 
Join Date: 10 Oct 2002
Location: In front of my ASUS F8V notebook!
Posts: 30,124
Reputation: 3081
Adrian Wong has a reputation beyond repute
Rep Power: 67
Default

Hello Peter,

I'm still working on the other tests. The Fastest tests alone took 4-5 days of solid testing.

I will try to get the other test results out ASAP.
All times are GMT +8.


Copyright © 1998-2007 Tech ARP. All rights reserved.