Will deduplication deduplicate data between two files with identical content if one is compressed but the other is not?


Last modified date: 2024-10-25

Applicable Products

  • QuTS hero
  • Storage & Snapshots

Short Answer

In most cases, only part of the data shared by the two files will be deduplicated.


Background

Compression and deduplication are data reduction features in QuTS hero that help reduce the amount of space occupied by your data. 

  • Compression removes redundant data within a file.
  • Deduplication eliminates duplicates of data within a shared folder or LUN.

You can enable or disable each feature separately for each shared folder and LUN in Storage & Snapshots > Storage > Storage/Snapshots, under the Data Reduction column. (Note: To enable deduplication, your NAS must have at least 8 GB of memory.) 


Scenario

You first add a file to a shared folder with both compression and deduplication enabled. Later, you disable compression on the shared folder, and then add the same file under a different file name. 

You end up with two files with identical content, but the first one is compressed and the second one is uncompressed.

Question: Will deduplication consider the data in the two files to be the same and therefore deduplicate the data between them?


Long Answer

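Compression and deduplication both operate on data blocks rather than on whole files. Deduplication determines redundancy by comparing data blocks as they are stored, so a compressed block and an uncompressed block are treated as different even if they hold the same original data.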

If all data in the first file is successfully compressed, then all the data blocks in the first file would be different from all the data blocks in the second file, where compression did not occur. Therefore, there will be no deduplication between the two files.

However, it is uncommon for every part of a file to contain redundancy, so in most cases only some of the data blocks within a file are compressed. The blocks in the first file that were left uncompressed remain identical to the corresponding blocks in the second file, and those blocks are deduplicated.
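The reasoning above can be illustrated with a minimal Python sketch. It is a toy model, not the QuTS hero implementation: the 4096-byte block size, the use of zlib and SHA-256, the rule that a block is stored compressed only if that saves at least one eighth of its size, and all function names are assumptions made for this example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def split_blocks(data, size=BLOCK_SIZE):
    """Split data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def stored_block(block, compression_enabled):
    """Return the bytes that this toy model would store for one block.

    Assumption: a block is kept compressed only if compression saves
    at least 1/8 of its size; otherwise it is stored as-is.
    """
    if compression_enabled:
        packed = zlib.compress(block)
        if len(packed) <= len(block) - len(block) // 8:
            return packed
    return block


def fingerprints(data, compression_enabled):
    """Hash each block in its stored form, as a dedup engine might."""
    return [
        hashlib.sha256(stored_block(b, compression_enabled)).hexdigest()
        for b in split_blocks(data)
    ]


def shared_blocks(data):
    """Count blocks of the uncompressed copy that match the compressed copy."""
    first = set(fingerprints(data, compression_enabled=True))
    second = fingerprints(data, compression_enabled=False)
    return sum(1 for fp in second if fp in first), len(second)


# Four highly compressible blocks (each a single repeated byte) ...
compressible = b"".join(bytes([65 + i]) * BLOCK_SIZE for i in range(4))
# ... followed by four blocks of hash output, which does not compress.
incompressible = b"".join(
    hashlib.sha256(i.to_bytes(4, "big")).digest()
    for i in range(4 * BLOCK_SIZE // 32)
)

shared, total = shared_blocks(compressible + incompressible)
print(f"{shared} of {total} blocks could be deduplicated")
```

In this model, only the blocks that compression left untouched have the same fingerprint in both copies, which mirrors the partial deduplication described above.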

Note
  • Enabling or disabling compression only affects new data. Existing data remains compressed or uncompressed according to the compression setting when the data was added to the shared folder or LUN.
  • Enabling or disabling deduplication only affects new data. Existing data remains deduplicated or not deduplicated according to the deduplication setting when the data was added to the shared folder or LUN.
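As a rough illustration of the note on compression, the following Python sketch models a shared folder whose setting is consulted only when data is written. The `ToyShare` class and its methods are hypothetical names invented for this example and do not correspond to any QuTS hero interface.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToyShare:
    """Hypothetical model of a shared folder; not a QuTS hero API."""
    compression: bool = True
    # Maps file name -> (stored bytes, whether they were compressed)
    files: dict = field(default_factory=dict)

    def write(self, name, data):
        # The setting is checked only at write time.
        if self.compression:
            self.files[name] = (zlib.compress(data), True)
        else:
            self.files[name] = (data, False)

    def is_compressed(self, name):
        return self.files[name][1]


share = ToyShare(compression=True)
payload = b"example " * 1024

share.write("first.bin", payload)   # stored compressed
share.compression = False           # setting changed afterwards
share.write("second.bin", payload)  # stored uncompressed

# The earlier file keeps the state it had when it was written.
print(share.is_compressed("first.bin"), share.is_compressed("second.bin"))
```

Changing the setting does not rewrite `first.bin`, so the two files end up in the mixed state described in the scenario.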
