{"id":14045,"date":"2017-08-04T15:06:10","date_gmt":"2017-08-04T13:06:10","guid":{"rendered":"https:\/\/codisec.com\/?p=14045"},"modified":"2023-03-22T16:29:57","modified_gmt":"2023-03-22T15:29:57","slug":"compression-vs-encryption-visual-recognition","status":"publish","type":"post","link":"https:\/\/codisec.com\/compression-vs-encryption-visual-recognition\/","title":{"rendered":"Compression vs Encryption – Visual Recognition"},"content":{"rendered":"

When dealing with firmware images we often find large blobs of binary data which look totally random. The key question is then whether they are encrypted or compressed, and which algorithm was used.<\/p>\n

During Veles<\/a> development I noticed that most compression algorithms are relatively easy to recognize visually, which is the main subject of this article.<\/p>\n

Visual artifacts<\/h2>\n

One might expect compressed data to look the same as encrypted data in statistical tests, but this turns out not to be true for most compression algorithms, even when looking at a data stream without headers. Although there are already some good articles about this property from the \/dev\/ttyS0 team (the authors of binwalk<\/a>; first art<\/a>, second art<\/a>), I decided to discuss this topic as well, extending it to telling different compression algorithms apart and adding some fancy visualizations from Veles<\/a> (if you are interested in how they work, you can read this article<\/a>).<\/p>\n
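The statistical claim above can be checked by hand with a rough sketch like the one below. It is not how Veles or binwalk do it. It assumes Python with only the standard library, uses a chi-square test on the byte histogram as one possible statistical test, and lets os.urandom stand in for well-encrypted data such as AES-CBC output. The helper names (chi_square_bytes, raw_deflate, compare) and the synthetic sample input are made up for illustration.

```python
import os
import random
import zlib
from collections import Counter


def chi_square_bytes(data):
    # Pearson chi-square statistic of the byte histogram against a uniform
    # distribution over all 256 byte values (255 degrees of freedom).
    if not data:
        raise ValueError('data must not be empty')
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts[b] - expected) ** 2 / expected for b in range(256))


def raw_deflate(data):
    # A negative wbits value asks zlib for a raw deflate stream,
    # that is, one without zlib or gzip headers and trailers.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def compare(plain):
    compressed = raw_deflate(plain)
    # Assumption: os.urandom stands in for encrypted data of the same size.
    random_like = os.urandom(len(compressed))
    return {
        'deflate': chi_square_bytes(compressed),
        'random': chi_square_bytes(random_like),
    }


if __name__ == '__main__':
    # Hypothetical sample input: pseudo-text built from a made-up vocabulary.
    # Substitute the contents of any real file here.
    rng = random.Random(0)
    vocabulary = [
        bytes(rng.choices(range(97, 123), k=rng.randint(3, 10)))
        for _ in range(5000)
    ]
    plain = b' '.join(rng.choices(vocabulary, k=400000))
    for name, value in compare(plain).items():
        print(name, round(value, 1))
```

For uniformly random bytes the statistic should stay close to 255, the number of degrees of freedom. How far a deflate stream drifts from that depends on the input and on the stream length, so treat the printed numbers as a hint, not as a classifier.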

For example, open Doom1.WAD<\/a> compressed using deflate<\/a> (wrapped in the gzip<\/a> format) in the Veles trigram view and scroll to a random part of it (to skip the headers). At first glance the data may look totally random, but if you start rotating the cube you\u2019ll notice many subtle artifacts. Let’s take a look at them with AES-CBC-encrypted data for comparison:<\/p>\n