Why do the fastest high-performance disks have only about 10 percent of the capacity of high-capacity disks? They do not use older technology, and they have the same platters; they simply use only a fraction of the cylinders in order to constrain the range of arm movements. Why does this limitation matter?

Performance requires parallel processing

When microprocessors hit the performance wall, the solution was to introduce multiple cores. This means that you must parallelize your algorithms so that the threads can run concurrently on different cores or hardware threads. The problem then becomes how to write out the results.

One solution could be to have a separate thread collect the results of all compute threads and write the overall result file. However, this thread would become a bottleneck, so this is not a good approach. A different solution could be to have each thread write its own file and then combine the results in a post-processing step. This is not a good approach either, because the merge step runs in a single thread.

Fortunately, high-performance computing has always been parallel, so there are good solutions at hand. The most popular one is the Message Passing Interface (MPI), for which there are several open-source implementations. The parallel IO feature is sometimes called MPI-IO, and it can challenge a disk system. Consequently, in analytics we now have IO problems at a level previously known only for web services and database transaction systems.
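To make the idea concrete, here is a minimal sketch of parallel output with MPI-IO, using the open-source mpi4py bindings (the file name and block size are made up for illustration): every rank writes its part of the result into a single shared file at its own byte offset, so neither a collector thread nor a post-processing merge step is needed.

```python
# Minimal MPI-IO sketch: all ranks write one shared file in parallel.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank produces a block of results (dummy data here).
block = np.full(1024, rank, dtype=np.float64)

# Open one shared file collectively across all ranks.
fh = MPI.File.Open(comm, "results.bin",
                   MPI.MODE_WRONLY | MPI.MODE_CREATE)

# Write at a rank-specific offset; no coordinating thread is needed.
offset = rank * block.nbytes
fh.Write_at_all(offset, block)   # collective variant of Write_at

fh.Close()
```

Run with, for example, mpiexec -n 8 python write_results.py; on a parallel file system all eight ranks then stream to disk at once, which is exactly the kind of load that stresses the IO system.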

With this, computing the number of required terabytes for an application is the easy part. The difficult metric to determine is the maximum number of IOs serviced per second (IOPS), or its inverse, the disk service time. Along with the disk utilization rate U, the disk service time determines the IO response time for an application. It is the sum of the seek time T, the rotational latency L, and the internal transfer time X: TS = T + L + X.

The EMC book “Information Storage and Management” (account required) has an example on page 43:

Assume a disk with a seek time of 5 ms for random IO, 15,000 rpm, and a 40 MB/s internal transfer rate with a 32 KB block size. At 15,000 rpm the platter completes 250 rotations per second, so the average rotational latency is half a rotation, 0.5/250 s = 2 ms, and the internal transfer time is 32 KB / 40 MB/s = 0.8 ms. Then TS = 5 + 2 + 0.8 = 7.8 ms, yielding a maximum number of IOs serviced per second of 1/TS = 1/(7.8 × 10⁻³) ≈ 128 IOPS. The application response time R increases with disk controller utilization U. In the above example, at 96% utilization we obtain a response time of R = TS / (1 − U) = 7.8 / (1 − 0.96) = 195 ms. The response time can be reduced by keeping the disk utilization below 70%, but that also reduces the number of IOPS the disk can deliver.
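As a quick check, the same numbers can be computed directly; this is a small sketch with the example values above (the variable names are ours):

```python
# Disk service time, maximum IOPS, and response time for the example.
seek_ms = 5.0                  # average seek time T
rpm = 15_000
block_kb = 32
transfer_mb_s = 40.0

latency_ms = 0.5 / (rpm / 60) * 1000                    # half a rotation: 2.0 ms
transfer_ms = block_kb / (transfer_mb_s * 1000) * 1000  # 32 KB / 40 MB/s: 0.8 ms
service_ms = seek_ms + latency_ms + transfer_ms         # TS = 7.8 ms

max_iops = 1000 / service_ms                            # ~128 IOPS

utilization = 0.96
response_ms = service_ms / (1 - utilization)            # R = 195 ms

print(f"TS = {service_ms:.1f} ms, max IOPS = {max_iops:.0f}, "
      f"R at {utilization:.0%} utilization = {response_ms:.0f} ms")
```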

When DC is the number of disks required to meet the capacity requirement and DI is the number of disks required to meet the application IOPS requirement, the number of disks required is DR = max(DC, DI). Continuing the above example: if an application requires 1.46 TB and generates a peak workload of 9,000 IOPS, then using 146 GB, 15,000 rpm drives capable of 180 IOPS each gives DC = 1.46 TB / 146 GB = 10 disks and DI = 9,000/180 = 50 disks. However, if the application is response-time sensitive, 70% disk utilization is more realistic, resulting in 180 × 0.7 = 126 IOPS per disk, so DI = 9,000/126 = 72 (rounded up) and DR = max(DC, DI) = 72 disks.
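The same sizing arithmetic as a small sketch (again, the names are ours):

```python
import math

# Application requirements (from the example above).
capacity_tb = 1.46
peak_iops = 9_000

# Drive characteristics: 146 GB, 15,000 rpm, 180 IOPS at full utilization.
drive_tb = 0.146
drive_iops = 180
utilization = 0.70           # keep disks at 70% to protect response time

effective_iops = drive_iops * utilization          # 126 IOPS per disk

disks_capacity = math.ceil(capacity_tb / drive_tb)      # DC = 10
disks_iops = math.ceil(peak_iops / effective_iops)      # DI = 72
disks_required = max(disks_capacity, disks_iops)        # DR = 72

print(f"DC = {disks_capacity}, DI = {disks_iops}, DR = {disks_required}")
```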

Choosing the right disk for the job

In summary, in this case 10 disks would be sufficient from a capacity point of view, but 72 disks are needed to meet the disk service time requirement. This is why, in practice, solid state drives (SSDs, or flash memory) are often the most economical solution to such IOPS-bound storage problems. It is also a reason why enterprise disks for storage typically have a small capacity: only a small number of cylinders are used, so the arm never has to travel across capacity that is not needed anyway, which keeps seek times short.

This example shows how you can plan your data access system to meet the IO requirements typical of today’s parallel analytics workloads.