I have heard a lot of FUD flying around about Thin Provisioning recently. People are blogging and putting stuff out there as if Thin Provisioning were a very dangerous concept, too risky to deploy. All Thin Provisioning really boils down to is oversubscription. Oversubscription seems to be a dirty word in some people's vocabulary. There is nothing wrong with oversubscription if it is done properly. Just about every technology out there has some form of oversubscription going on.
Ethernet, telcos, ISPs, bank vaults, electricity, highways
All of the above are oversubscribed. If everyone were to consume resources simultaneously, there would not be enough. Ethernet switches, for example, are oversubscribed. Even though the backplane may be non-blocking, the uplinks are typically oversubscribed. Fibre Channel storage is oversubscribed. If we weren't oversubscribing we would be doing DAS, which is the opposite direction. Instead we use SANs, which by their very nature are oversubscribed. Obviously there are limits to this oversubscription, so we work with generally accepted fan-in/fan-out ratios and, most importantly, we look at what resources our applications actually need. There are a host of technologies that help us deal with oversubscription; in FC, for example, we have FCC, QoS, etc.
The telephone system and ISPs are heavily oversubscribed, in the sense that if everyone tried to make a phone call or download at maximum speed simultaneously, the system would not be able to support it. But these systems are based on statistical multiplexing, which means nothing more than that someone has done the numbers to figure out how much capacity is needed for acceptable performance during the busiest time. And the telephone system has safeguards in place for special situations. For example, some customers have priority and preemption, so no matter what they will be able to place their call and someone else will get dropped.
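To put some numbers behind "someone has done the numbers", here is a minimal Python sketch. The function names, the port counts, and the simple model (every user is independently active with the same probability) are my own assumptions for illustration, not how any particular switch vendor or carrier actually sizes things.

```python
from math import comb


def oversubscription_ratio(edge_ports, edge_gbps, uplink_ports, uplink_gbps):
    """Fan-in ratio: total edge bandwidth divided by total uplink bandwidth."""
    return (edge_ports * edge_gbps) / (uplink_ports * uplink_gbps)


def prob_demand_exceeds(users, p_active, capacity):
    """Probability that more than `capacity` of `users` are active at once.

    Assumes each user is active independently with probability `p_active`
    (a binomial model). Real traffic is burstier, so treat this as a
    rough first estimate only.
    """
    return sum(
        comb(users, k) * p_active**k * (1 - p_active) ** (users - k)
        for k in range(capacity + 1, users + 1)
    )


if __name__ == "__main__":
    # Hypothetical switch: 48 x 1 Gb edge ports sharing 2 x 10 Gb uplinks.
    print("fan-in ratio:", oversubscription_ratio(48, 1, 2, 10))
    # Hypothetical busy hour: 1000 subscribers, each on a call 5% of the
    # time, with trunks for 70 simultaneous calls.
    print("chance of a blocked call:", prob_demand_exceeds(1000, 0.05, 70))
```

The point is not the exact figures but the approach: you pick an acceptable probability of running short during the busiest time, and size capacity to meet it rather than sizing for everyone at once.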
What makes Thin Provisioning different?
Nothing really. Thin Provisioning is the ability to hand out to systems more storage than we really have as true usable capacity. Once again, you don't just configure these things blindly; you base the amounts you're giving out on real data and low-risk probabilities. The big fear is that some admin is going to get a call at 3am from his monitoring system or NOC saying that the disk is full on the company's high-priority server. This would be very bad indeed. The reality is that even with static storage this scenario can happen if things go terribly wrong. The key is to make sure servers are properly sized, with enough drive space. You should also have enough space for some decent growth, plus an amount of space for contingency if something goes wrong. But with thin volumes you can look at all this space as a whole, aggregated, and then calculate how much space you would need in a “worst-case scenario”. Then you should carefully monitor the situation so that you can be comfortable that you have sized everything properly. In the end you have gained a much more efficient system, and the money saved can go into other parts of the system, like high-performance disk cache or SSDs.
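The sizing and monitoring described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Volume` fields, the 12-month horizon, the 20% contingency, and the 70%/85% alert thresholds are assumptions I picked for illustration, and you would substitute numbers from your own environment.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    provisioned_gb: float     # size the server sees
    used_gb: float            # space actually written
    monthly_growth_gb: float  # growth rate taken from real monitoring data


def required_physical_gb(volumes, months=12, contingency=0.20):
    """Aggregate physical space needed: used + growth, plus contingency."""
    projected = sum(
        # A volume can never consume more than its provisioned size.
        min(v.provisioned_gb, v.used_gb + v.monthly_growth_gb * months)
        for v in volumes
    )
    return projected * (1 + contingency)


def pool_status(volumes, physical_gb, warn=0.70, crit=0.85):
    """Return (oversubscription ratio, pool utilization, alert level)."""
    provisioned = sum(v.provisioned_gb for v in volumes)
    used = sum(v.used_gb for v in volumes)
    utilization = used / physical_gb
    if utilization >= crit:
        level = "critical"
    elif utilization >= warn:
        level = "warn"
    else:
        level = "ok"
    return provisioned / physical_gb, utilization, level


if __name__ == "__main__":
    vols = [Volume("db", 1000, 300, 20), Volume("mail", 500, 100, 50)]
    print("physical GB to buy:", required_physical_gb(vols))
    print("pool status:", pool_status(vols, physical_gb=1300))
```

The alert thresholds are what keep the 3am call from happening: the pool warns well before it fills, which leaves time to add disk or move volumes.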
So Thin Provision everything?
Definitely not! There are trade-offs with Thin Provisioning, and every application needs to be looked at separately. Some applications are not supported on thin volumes, and some are too unpredictable for thin volumes. Essentially, when you make volumes thin, you are compressing the footprint of the I/O in the SAN, so the same I/O lands on less physical disk. This has performance impacts, but they can be dealt with to some degree by rebalancing the drives in the SAN. So there are parts of a SAN where it absolutely makes sense to use Thin Provisioning, and parts where it may make sense not to use it. Also, in some cases, to meet your RPO/RTOs you may need to change the RAID level of the underlying storage to a more expensive design, for example going from RAID-5 to RAID-6. That is a cost that needs to be considered.
Thin Provisioning should be looked at as just another tool in the toolbox. It is not a black-or-white choice about how you set up your entire storage environment, but rather a choice made per application. Just as with many other storage applications out there, consideration must be given to performance, scalability, efficiency, and the benefits you stand to gain from implementation. Obviously disk manufacturers don't directly benefit from Thin Provisioning, so there is bound to be some misinformation out there. We are already oversubscribing a lot of things in the datacenter, and storage should be looked at in just the same way: there are situations where you would want to do it and situations where you would not. I believe, however, that in almost any environment there are places for Thin Provisioning to be used, and I welcome the technology and its future improvements.