QuTS MEGA Operating System
for Enterprise Scale-out Storage

QuTS MEGA is built on a Ceph distributed storage architecture, integrating Service High Availability and data protection mechanisms to provide a stable and scalable storage platform for growing enterprise data.

Erasure Coding further enhances the system by delivering node-level fault tolerance while maintaining high storage efficiency.

Why Enterprises Need a QuTS MEGA Scale-out Solution

Enterprises across industries face common challenges: sustained data growth, zero tolerance for service disruption, and the need for predictable data protection and operational manageability.

Financial Industry

Regulation-driven long-term data retention

Transaction records, call recordings, and audit data must be retained long term without risk of loss. With high availability and robust data protection mechanisms, capacity can be continuously expanded without service interruption.

Healthcare & Research

Massive, continuously growing research data

Genomic, imaging, and research datasets continue to grow. With high-efficiency data protection and automated self-healing mechanisms, the platform provides long-term stability to support analytics and research workloads.

Semiconductor & Manufacturing

High-volume, long-term image data retention

Process images and surveillance recordings accumulate rapidly. The Scale-out architecture expands alongside production growth, while automatic rebalancing prevents performance and management bottlenecks.

Comprehensive Capabilities for Diverse Storage Needs

Built on a unified scale-out architecture, QuTS MEGA integrates file and object services with mainstream protocol support, enabling enterprises to scale as data grows.

One Platform Covering Services, Protocols, and Scalability

Designed for enterprise-grade availability, with clear capabilities and deployment specifications for POC and production.

Storage Types

A unified platform for multiple data formats

File Storage: Ideal for shared folders, departmental collaboration, and image/file archiving scenarios.
Object Storage: Designed for long-term retention, application integration, and S3 API connectivity.

Protocols

Compatible with enterprise applications and access methods

SMB: Commonly used file-sharing protocol in Windows and Active Directory environments.
NFS: Widely adopted file service protocol in Linux and R&D environments.
S3 API: Standard object storage interface for application integration and data lake architectures.

Scalable Architecture

A clear path from initial deployment to PB-scale growth

3–96 Node Scale-out: Start with 3 nodes and scale up to 96 nodes, achieving PB-scale storage with high availability.
Non-disruptive expansion: Nodes can be added as needed, with automatic rebalancing and built-in data protection.

※ Actual capacity and performance may vary depending on cluster size, service configuration, and data protection policies (such as EC or Replication).

Core Capabilities

Built on a Linux and Ceph distributed architecture with high availability, delivering an enterprise-grade storage platform with redundancy, fault tolerance, and scalability.

High Availability

Continuous service operation even during node failures

Data Redundancy

Replication: Ensures data availability through multiple data copies, ideal for scenarios requiring fast access and high reliability.
Erasure Coding: Leverages Ceph’s distributed EC mechanism to provide efficient data protection through mathematical algorithms, maintaining fault tolerance while optimizing storage efficiency.

Fault Tolerance

Service Distribution: Services run across multiple nodes and automatically recover or migrate when a node fails, maintaining external service availability.
Self-healing: Automatically reconstructs lost data using replicas or parity, preserving data integrity while minimizing manual intervention.

Operational Continuity

Dynamic Rebalancing: Automatically redistributes data when nodes are added or removed, maintaining redundancy consistency and preventing hotspots to ensure balanced system performance.
Rolling Upgrades: Performs system upgrades and maintenance without service interruption, ensuring continuous operations and service availability.
Data Storage Sustainability: Built on a Ceph distributed architecture, enabling linear scaling of capacity and performance from a minimum of 3 nodes up to 96 nodes, supporting long-term enterprise data growth.

Enterprise Security and Compliance

Active Directory Integration: Integrates with existing enterprise AD environments to provide centralized authentication and unified access control, simplifying user management.
Audit Log: Records system operations and data access activities, providing a complete audit trail to meet compliance and security analysis requirements.
Write Once Read Many (S3 WORM): Immutable object locking mechanism that prevents data modification or deletion, meeting regulatory compliance requirements in industries such as finance and healthcare.

Erasure Coding (EC) Protection Overview

Using EC 4+2 as an example: “4 data fragments” + “2 parity fragments” distributed across 6 nodes, allowing up to 2 nodes to fail simultaneously without data loss.
QuTS MEGA supports multiple EC configurations (such as 8+2, 8+3, etc.), enabling flexible selection of capacity efficiency and protection levels based on requirements.

Visual Explanation: 4 Data + 2 Parity

A file is split into 6 fragments across 6 nodes: 4 data (D1–D4) and 2 parity (P1–P2). Even if 2 nodes fail, the data can still be reconstructed.

Note: This illustrates EC 4+2. Other configurations such as 8+2, 8+3, or 16+4 are available to meet different capacity and protection requirements.

EC 4+2 Node Distribution (Example Configuration)

NODE 1: Data (D1)
NODE 2: Data (D2)
NODE 3: Data (D3)
NODE 4: Data (D4)
NODE 5: Parity (P1)
NODE 6: Parity (P2)

Scenario 1: 2 Node Failures ✔︎ Data Protected

Even if Node 2 and Node 5 fail, the system can reconstruct the complete dataset from the remaining fragments (D1, D3, D4, P2), with no data loss.

Scenario 2: 3 Node Failures ✕ Data Loss

When 3 or more nodes fail simultaneously, the remaining fragments are insufficient to reconstruct the data, which may result in data loss. This exceeds the fault tolerance scope of EC 4+2.
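The two scenarios above can be checked with a short counting sketch. It assumes one fragment per node and an MDS-style code in which any 4 of the 6 fragments are sufficient to rebuild the data; the function name `survives` is hypothetical and is not part of QuTS MEGA or Ceph.

```python
from itertools import combinations

def survives(failed_nodes, k=4, m=2):
    """Return True if data is still recoverable after the given nodes fail.

    Assumption: one fragment per node, and any k of the k + m fragments
    are enough to reconstruct the data (MDS-style erasure code).
    """
    remaining = (k + m) - len(set(failed_nodes))
    return remaining >= k

nodes = range(1, 7)  # NODE 1..NODE 6 in the EC 4+2 illustration

two_node_cases = list(combinations(nodes, 2))
three_node_cases = list(combinations(nodes, 3))

print(survives({2, 5}))                             # Scenario 1: Node 2 and Node 5 fail
print(all(survives(c) for c in two_node_cases))     # every possible 2-node failure
print(any(survives(c) for c in three_node_cases))   # any possible 3-node failure
```

Under these assumptions, it does not matter which nodes fail, only how many: any two failures are survivable and no three-node failure is.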

Fault Tolerance (EC 4+2 Example): Supports up to 2 simultaneous node failures without data loss. Other configurations, such as 8+3, can tolerate up to 3 node failures.

Capacity Efficiency (Configurable): EC 4+2 ≈ 66.7%; EC 8+2 ≈ 80%; EC 8+3 ≈ 72.7%. Configurations can be selected based on desired protection level and storage efficiency.

※ This illustrates protection capability at the node-level failure domain. QuTS MEGA supports multiple EC configurations (such as 4+2, 8+2, 8+3, 16+4, etc.), allowing selection based on cluster size, workload characteristics, and protection requirements. Actual read/write availability may depend on cluster settings (e.g., min_size, service-layer HA, and load design).
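The efficiency figures above follow from the ratio of data fragments to total fragments, k / (k + m). The sketch below reproduces them and adds 3-way replication for comparison. The helper names are hypothetical, and the calculation ignores metadata overhead and cluster settings such as min_size.

```python
def ec_profile(k, m):
    """Capacity efficiency and node-failure tolerance for an EC k+m layout."""
    return {"efficiency": k / (k + m), "tolerated_failures": m}

def replication_profile(copies):
    """Same figures for N-way replication (every copy stores the full data)."""
    return {"efficiency": 1 / copies, "tolerated_failures": copies - 1}

for k, m in [(4, 2), (8, 2), (8, 3), (16, 4)]:
    p = ec_profile(k, m)
    print(f"EC {k}+{m}: {p['efficiency']:.1%} usable, "
          f"tolerates {p['tolerated_failures']} node failures")

r = replication_profile(3)
print(f"3-replica: {r['efficiency']:.1%} usable, "
      f"tolerates {r['tolerated_failures']} node failures")
```

EC 8+2 and EC 16+4 have the same efficiency, but 16+4 tolerates twice as many node failures at the cost of needing at least 20 nodes for one fragment per node.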

Service Distribution

Services run across multiple nodes. When a node fails, services automatically recover and migrate to ensure continuous cluster availability.

Automatic Service Recovery Mechanism

Reduce single points of failure and enhance overall system availability

Service Distribution and Automatic Failover Illustration

✔︎ Normal State: Services distributed across multiple nodes

NODE 1: SMB Service, NFS Service
NODE 2: S3 Service, MGR Service
NODE 3: MON Service, OSD Service
NODE 4: RGW Service, MDS Service

⬇

⚠ Node 2 Failure → Automatic Service Migration

NODE 1: SMB Service, NFS Service
NODE 2: Failed
NODE 3: MON Service, OSD Service, S3 ↺ Migrated
NODE 4: RGW Service, MDS Service, MGR ↺ Migrated

When Node 2 fails, the S3 and MGR services originally running on it automatically migrate to Node 3 and Node 4,
ensuring uninterrupted service with no user impact.
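As a rough illustration of the migration step, the sketch below moves the services of a failed node onto the least-loaded healthy nodes. The least-loaded rule, the function name `fail_over`, and the node names are assumptions made for this example. The actual placement logic in QuTS MEGA is not described here, so the targets it picks may differ from those in the illustration.

```python
def fail_over(placement, failed_node):
    """Move services off a failed node onto the least-loaded healthy nodes.

    `placement` maps node name -> list of service names. The least-loaded
    rule is a hypothetical policy chosen for illustration only.
    """
    healthy = {n: list(s) for n, s in placement.items() if n != failed_node}
    if not healthy:
        raise RuntimeError("no healthy nodes left to host services")
    for service in placement.get(failed_node, []):
        # Pick the node currently running the fewest services (name breaks ties).
        target = min(healthy, key=lambda n: (len(healthy[n]), n))
        healthy[target].append(service)
    return healthy

cluster = {
    "node1": ["SMB", "NFS"],
    "node2": ["S3", "MGR"],
    "node3": ["MON", "OSD"],
    "node4": ["RGW", "MDS"],
}

after = fail_over(cluster, "node2")
for node, services in sorted(after.items()):
    print(node, services)
```

Every service from the failed node ends up on a healthy node, and no node takes on more than one extra service in this example.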

Automatic Failure Detection

Continuously monitors node health, quickly identifying failures and triggering recovery processes.

Automatic Service Migration

Automatically migrates services from failed nodes to healthy nodes, ensuring uninterrupted service.

Load Distribution

Intelligently distributes services across multiple nodes to prevent overload on a single node and improve overall performance.

Zero Manual Intervention

Fully automated failure recovery reduces operational burden and minimizes the risk of human error.

Business Value

Ideal for 24×7 operations, high-concurrency workloads, and mission-critical applications requiring high availability. Significantly reduces the business impact of service disruptions while improving user experience.

Self-healing

Automatically detects and reconstructs lost or corrupted data to maintain data integrity and protection status, without manual intervention.

Intelligent Data Recovery Mechanism

Automatically rebuild data using replicas or parity to ensure long-term data integrity

Automatic Data Reconstruction Illustration (3-Replica Example)

✔︎ Normal State: Data stored with 3 replicas across nodes

NODE A (Primary): File_001.mp4, File_002.jpg, File_003.pdf
NODE B (Replica 1): File_001.mp4, File_002.jpg, File_003.pdf
NODE C (Replica 2): File_001.mp4, File_002.jpg, File_003.pdf

⬇

⚠ Disk failure detected on Node B, data loss occurs

NODE A (Primary): File_001.mp4, File_002.jpg, File_003.pdf
NODE B: ✖︎ Data Lost, Rebuilding...
NODE C (Replica 2): File_001.mp4, File_002.jpg, File_003.pdf

⬇

✔︎ Self-healing: Data automatically rebuilt to a new disk from Node A or Node C

NODE A (Primary): Copying...
NODE B (Data ↺ Rebuilt): File_001.mp4 ✔︎, File_002.jpg ✔︎, File_003.pdf ✔︎
NODE C (Replica 2): File_001.mp4, File_002.jpg, File_003.pdf

When data loss is detected on Node B, the system automatically copies intact data from Node A or Node C,
restoring the 3-replica protection level without manual intervention and ensuring long-term data reliability.
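The 3-replica repair flow can be sketched as a small simulation: find files missing from a node and copy them from any node that still holds them. The function name `self_heal` and the in-memory data model are hypothetical and greatly simplified compared with how a Ceph-based system tracks and rebuilds data.

```python
def self_heal(replicas, expected_files):
    """Restore every node to a full copy of `expected_files`.

    `replicas` maps node name -> set of files currently readable on it.
    Missing files are copied from any other node that still holds them.
    Returns a list of (file, source, target) copy operations.
    """
    repairs = []
    for target, files in replicas.items():
        for name in sorted(expected_files - files):
            source = next(
                (n for n, f in replicas.items() if n != target and name in f),
                None,
            )
            if source is None:
                raise RuntimeError(f"{name} has no surviving replica")
            files.add(name)
            repairs.append((name, source, target))
    return repairs

files = {"File_001.mp4", "File_002.jpg", "File_003.pdf"}
replicas = {
    "A": set(files),  # Primary
    "B": set(),       # disk failure: all data on this node is lost
    "C": set(files),  # Replica 2
}

for name, source, target in self_heal(replicas, files):
    print(f"copied {name}: node {source} -> node {target}")
```

After the call, all three nodes hold the full file set again, which corresponds to the restored 3-replica protection level in the illustration.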

Continuous Health Monitoring

Regularly scans data integrity and proactively detects corrupted or missing data blocks.

Automatic Data Reconstruction

Rebuilds lost data automatically using replication copies or Erasure Coding parity, restoring full data integrity without manual intervention.

Protection Level Restoration

Automatically restores the original protection level after reconstruction, preventing prolonged degraded states.

Repair Progress Tracking

Reduces operational overhead and human error, making it ideal for long-term data retention, compliance, and mission-critical data protection.

Business Value

Reduces operational burden and labor costs while minimizing human error risks. Ideal for long-term data retention, regulatory compliance, and mission-critical data protection scenarios, ensuring long-term data reliability.

Dynamic Rebalancing

Automatically redistributes data when nodes are added or removed, maintaining redundancy consistency and preventing storage hotspots.

Intelligent Data Redistribution Mechanism

Ensures balanced cluster resource utilization for optimal performance and capacity efficiency

Dynamic Rebalancing Illustration: Automatic Data Migration After Adding a Node

⚠ Before Adding a Node: Uneven capacity usage across 3 nodes

NODE 1: Capacity Usage 85% | Data Chunks: 850 | Used: 8.5 TB
NODE 2: Capacity Usage 82% | Data Chunks: 820 | Used: 8.2 TB
NODE 3: Capacity Usage 88% | Data Chunks: 880 | Used: 8.8 TB

⚠ Uneven capacity usage — Node 3 is nearing full capacity and may become a performance bottleneck

Add NODE 4 to the Cluster

⬇

Rebalancing in Progress: Data automatically migrating to the new node

NODE 1: Capacity Usage 70% | Chunks: 700 (-150) | Used: 7.0 TB
NODE 2: Capacity Usage 68% | Chunks: 680 (-140) | Used: 6.8 TB
NODE 3: Capacity Usage 72% | Chunks: 720 (-160) | Used: 7.2 TB
NODE 4: Capacity Usage 45% | Chunks: 450 (+450) | Used: 4.5 TB

⬇

✔︎ Rebalancing Complete: Balanced capacity across 4 nodes, optimized performance

NODE 1: Capacity Usage 64% | Chunks: 640 ✔︎ | Used: 6.4 TB
NODE 2: Capacity Usage 62% | Chunks: 620 ✔︎ | Used: 6.2 TB
NODE 3: Capacity Usage 66% | Chunks: 660 ✔︎ | Used: 6.6 TB
NODE 4: Capacity Usage 63% | Chunks: 630 ✔︎ | Used: 6.3 TB

✔︎ Even capacity distribution (62–66%) prevents hotspots and maintains optimal performance

After adding Node 4, the system automatically migrates part of the data from Nodes 1–3 to the new node,
balancing capacity usage across all four nodes (approximately 62–66%) and preventing overload on any single node.
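The numbers in the illustration can be verified with simple arithmetic: the total chunk count stays the same while the average usage per node drops. The sketch below uses the chunk counts from the illustration. The constant `CHUNKS_AT_FULL` is inferred from the illustration (850 chunks shown as 85% usage) and is an assumption, not a product parameter.

```python
before = {"node1": 850, "node2": 820, "node3": 880}   # chunks per node, before adding Node 4
after = {"node1": 640, "node2": 620, "node3": 660, "node4": 630}
CHUNKS_AT_FULL = 1000  # inferred: 850 chunks is shown as 85% usage

def mean_usage(chunks_by_node):
    """Average capacity usage across nodes, as a fraction of a full node."""
    return sum(chunks_by_node.values()) / (len(chunks_by_node) * CHUNKS_AT_FULL)

# Chunks that left each original node during rebalancing.
moved = {n: before[n] - after[n] for n in before}

print(sum(before.values()) == sum(after.values()))   # no chunks created or lost
print(f"{mean_usage(before):.2%} -> {mean_usage(after):.2%}")
print(moved, sum(moved.values()) == after["node4"])
```

The average usage falls from 85% to 63.75%, which sits inside the 62–66% range shown for the balanced state, and the chunks removed from Nodes 1–3 add up to exactly what Node 4 holds.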

Automatic Integration of New Nodes

When a new node joins the cluster, the system automatically migrates part of the data to balance storage utilization.

Data Protection During Node Removal

Before a node is removed, data is automatically migrated to other nodes to ensure no data loss and maintain the configured protection level.

Hotspot Prevention

Automatically detects uneven load and redistributes data to prevent hotspots. Supports automatic data disk metadata migration to optimize data and metadata placement during operation.

I/O Performance Priority

Provides Client I/O First and Recovery I/O First scheduling modes to safeguard critical service performance during rebalancing or data recovery operations.

Business Value

Supports flexible enterprise scalability, allowing nodes to be added gradually as business grows without service disruption. Maintains long-term performance stability, prevents capacity imbalance from degrading performance, and reduces expansion and operational complexity.

Monitoring & Alerting

Enhances operational responsiveness and collaboration through deep hardware monitoring, flexible alerting rules, and broad integration capabilities.

Hardware Monitoring & Diagnostics

Comprehensive Hardware Status Visibility

Monitors system fans, temperatures, and power module status in real time.

Hardware LED & Drive Locate

Quickly identify failed drives with visual hardware indicators.

S.M.A.R.T. Health Monitoring

Continuously tracks disk health to provide early warnings of potential risks.

Alert Notifications & Monitoring Integration

Prometheus + Alertmanager

Supports real-time notifications via Email, SNMP Traps, and Microsoft Teams.

SNMP & Third-Party Monitoring Platforms

Integrates with existing monitoring systems (such as PRTG Network Monitor).

QNAP Service War Room

Centralized visibility of cluster health, alerts, and events, enabling vendor-supported remote monitoring and proactive notification.

Cluster Scale & Node Models

A cluster can be established with as few as 3 nodes and expanded up to 96 nodes. Node models are available to cover entry-level, large-capacity, high-density, and high-performance workload requirements.

QSN-3000

Entry-level Scale-out Node

Processor: Intel® Xeon® E-2336, 6 Cores / 12 Threads
Memory: 128GB DDR4 ECC UDIMM
Drive Configuration: 12 × 3.5" SATA, 6 × 2.5" SATA
Network Interface: 2 × 10GbE BASE-T, 2 × 2.5GbE BASE-T
Form Factor: 2U Rackmount

QSN-3050

High-capacity Node

Processor: Intel® Xeon® E-2378, 8 Cores / 16 Threads
Memory: 128GB DDR4 ECC UDIMM
Drive Configuration: 24 × 3.5" SATA, 6 × 2.5" SATA
Network Interface: 2 × 10GbE BASE-T, 2 × 2.5GbE BASE-T
Form Factor: 4U Rackmount

QSN-7530

High-performance Dense Node

Processor: AMD Ryzen™ 9 PRO 7945, 12 Cores / 24 Threads
Memory: 192GB DDR5 ECC RDIMM
Drive Configuration: 30 × 2.5" SATA
Network Interface: 2 × 10GbE BASE-T, 2 × 2.5GbE BASE-T
Form Factor: 2U Rackmount

QuTS MEGA: A Scale-out Storage OS Built for Long-Term Growth

Built on a Ceph distributed storage architecture, QuTS MEGA delivers a continuously scalable storage foundation through a Scale-out design. It combines high availability and Erasure Coding data protection with monitoring, alerting, and non-disruptive upgrade mechanisms to reduce operational risk.
