Okay, I am determined to get this site up and running! This is just a placeholder, but I will be updating this site shortly with a lot more information. I do have a clear idea of what I think it should be, so stay tuned!
Not huge news, nor that new in fact, but nonetheless kind of interesting.
The NetApp DS4243 can now support the new 3TB drives! I am trying to work out the rebuild times in my head…. Anyways, some details:
- 18TB per rack unit
- 72TB per shelf
- Must be running ONTAP 8.0.2 or later
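The capacity figures above are easy to sanity-check with some back-of-the-envelope math. A quick sketch, assuming a 24-drive 4U shelf (which matches the 72TB figure) and a hypothetical sustained rebuild rate of 50 MB/s; real rebuild rates vary widely with load and RAID implementation:

```python
# Back-of-the-envelope capacity and rebuild-time math for a 3TB DS4243 shelf.
# Assumptions: 24 drives per 4U shelf, ~50 MB/s sustained rebuild rate
# (illustrative only, not a NetApp spec).

DRIVES_PER_SHELF = 24
RACK_UNITS = 4
DRIVE_TB = 3

shelf_tb = DRIVES_PER_SHELF * DRIVE_TB   # 72 TB per shelf
tb_per_u = shelf_tb / RACK_UNITS         # 18 TB per rack unit

REBUILD_MB_PER_S = 50                    # assumed, conservative
rebuild_hours = (DRIVE_TB * 1_000_000) / REBUILD_MB_PER_S / 3600

print(f"{shelf_tb} TB/shelf, {tb_per_u:.0f} TB/U, "
      f"~{rebuild_hours:.0f}h to rebuild one 3TB drive")
```

At that assumed rate a single 3TB rebuild runs the better part of a day, which is exactly why the rebuild times made me pause.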
You may ask why this is interesting.
Well, it helps when configuring your array. If you can purpose a full shelf for tier 3 data and just archive stuff, you could get rid of tape, the Data Domain box, etc. I don't know the cost as of now for a fully configured array, but it will drop over time and you could get away with a lot of space, albeit a tad slow since they are only 7.2K drives.
Also, I find it interesting that large-capacity drives are still getting R&D and support when performance depends so much on spindle count and near-line memory. I do wonder how high it will go and what purposes these 3TB+ drives will serve in the future.
Just a thought.
“The only constant is change.” Heraclitus
It is funny how fast time can seem to go by. I took a small break from blogging due to my hectic work life and found that it ended up being months since my last real post. This morning I was gently reminded of how long it has been and here I am back at it.
For my first post back from my unofficial hiatus, I wanted to give a little background on some of the personal/work life shifts (positive ones) that prompted the forced break. Warning: this has nothing to do with VMware, virtualization, or technology. Read at your own discretion.
First and foremost my amazing wife (hoping she is reading this) and I are expanding our family by one in November. This would be our first child and it has been, to say the least, a colossal paradigm shift full of excitement, anxiety, and accelerated transformation. We are more than thrilled, however it has made us (me mostly) think about every detail and different aspects of our lives and actually focus on the future with a much stronger intent.
This new and very mature train of thought led us down a path of some more immense changes. For example, we decided to basically gut and remodel our house. I decided a minor career change was in order (more on that later), and we began the adjustment period of preparing for a larger family. Our 2-year-old Boston Terrier has no idea what is in store. For that matter, I don't think I do either.
Ok, now back to the job change. I decided to leave my wonderfully cozy, familiar office (with a view) and some amazing people to work from home (no view) for a very tiny ("tiny" may not even be the right adjective) virtualization and storage consulting company. Let me put it another way: I went from a 5+ year job at a nationally awarded, highly recognized 1000+ person technology services company to a 20-person local, although well known, hyper-focused consulting firm. It was an enormous professional adjustment. As time goes on I will post more on this change; overall it has been amazing and I am neck deep in the newest, most advanced technology out there. It is really cool.
So to recap: I have been so exceedingly busy learning the new gig, putting my house back together one wall at a time, and trying to pick which college my child will attend that most of my "outside of work" things have been put on hold.
The good/great news: I am settled, getting back into the swing of things, and I have a lot to write about over the next few weeks. The industry has again changed, rapidly, and with VMworld coming up there is a significant amount of new and exciting stuff coming!
Thanks for your patience during my vacation.
Some exciting news today!
NetApp made a pretty significant announcement about a stronger partnership with CommVault to resell their SnapProtect suite. This will include Simpana 9's snapshot copies, replication, and tape management. If you are unfamiliar with CommVault, they make a software-based backup solution.
What does this really mean?
Well, in short, it will provide a much-needed improvement to their disk-to-disk-to-tape process. There will be a new console that provides a single source to manage Snapshots, replication, and tape backups. It should reduce response times and help improve the overall experience, since you are not managing a bunch of different windows. Some pretty cool things: you will be able to monitor the creation of a snapshot, when it is copied, and where it is moved, with a quicker search-and-recover experience. I could see some good uses for this.
As well, it is application aware, integrating nicely with virtualization, databases, and other applications. This will provide faster backups and a more granular experience. Also, since it uses the SnapProtect Agent, it can adjust to specific applications to help with data in transit.
Anyone who has used and restored from tapes can tell you it was a pain, and NetApp didn't have it down by any stretch of the imagination. Their NDMP method wasn't for the faint of heart, and most people required a 3rd-party agent such as Backup Exec or CommVault to make it useful. This will be a HUGE step forward.
Please note, this isn't free and will cost extra; details on price are still unknown.
Well, this shows that secondary storage is anything but dead! It is going to be a point of concern for a lot of people because data is still growing at a rapid pace and we are running out of places to put it. Disk may get cheaper, but long-term archiving doesn't.
It was a nice idea, and people have been trying for years to get rid of tape from their environments, but this new partnership tells me that a true tapeless environment may be harder to achieve than anticipated, especially with some of the larger industry compliance requirements (HIPAA, etc.).
I am curious to see the responses from EMC, HP, and Dell; I am sure they have something coming soon or will highlight some of their current technologies. So we should see things heat up. This is why I love this stuff!!
In April (sorry for the late post), EMC announced the release of Avamar 6.0, which is a pretty significant release in many ways. For starters, it is the first true integration with Data Domain technology; leveraging that technology will provide a more optimized and scalable recovery tool. This will also allow Avamar to expand into much larger customers and actually play in the enterprise field, a big relief for those larger environments.
What else will this include? It will allow for centrally managed backups, simultaneous workloads, and selective processing based on a Data Domain or Avamar data store. Finally, because Data Domain distributes the deduplication load differently from Avamar, it will allow for larger streaming loads and much faster restores.
I was wondering when EMC would include this. Prior to the acquisition of Data Domain, Avamar was considered a direct competitor. While the two product lines lived in very specific use cases, the merging of these technologies will go a long way. If you want more information, I would suggest checking out this link on the details of Avamar.
Beyond the Data Domain addition, Avamar is now leveraging the VMware vStorage APIs with full integration with VMware vCenter. This incorporation allows for changed block tracking, virtual proxy server pooling, and flexible image restore, simplifying recovery. All very useful updates.
There was a ton of activity in the storage arena this last week. While every week lately seems to be rich with news, the coexistence of EMC World and Interop down in Las Vegas really helped drive the flow of new information. There were the frequent press releases, new product introductions, model updates, and my personal favorite, the exaggerated claims from all vendors to have the solution to all your storage needs in a single appliance!
I hate to be the one to break the bad news, but there is no single solution to our epic storage problems. I use the word epic because it is a perfect way to describe the current problem. This large-scale issue will not be addressed with a single manufacturer or single appliance or even single piece of software. In fact, I would argue these types of solutions are making the problem worse, more on that later.
Before I get too far, I do want to point out that I don't necessarily think this is a bad thing. In fact, it is why I love the industry; no matter what, there will be something newer, faster, better, etc. It is a gift that keeps on giving. Technology gets better, applications get smarter, and it is a wonderful cycle. It has also created a multi-billion, soon to be trillion, dollar industry that keeps a lot of people, including myself, employed.
Everyone knows that the problem comes from the fact that as people, companies, society, even countries, we are all information hoarders. We keep a ridiculous amount of data on just about anything and everything. I have witnessed it firsthand at client sites. Go through a public file store and come across a "1999 HR handbook rev2." Trash it? Why, goodness no, we may need that one day. Well, what about the "1999 HR handbook Rev. 5 or 6 or 7?" Nope, need all of those too. Okay, makes sense? No, not really!
There is a general fear that one calm day your boss or a legal department will ask you to produce a document from nine years ago about god knows what and you wouldn’t be able to find it. Highly unlikely, but still a consistent fear among people I speak to.
So what does all this hoarding cause? It causes a lot of stress and anxiety for your IT staff and puts a large burden on available resources. On top of trying to find room for their production (money-making) applications, they now have to find a place to put your family vacation photos or 80's music collection (money-costing) that ended up on the corporate network somehow. That is another article altogether, but it does happen often.
Here is the raw truth: data is going to double, and then double again. I don't need a cool IDC chart or Gartner fact here. It is going to happen; everyone is seeing it. I remember when my 1.44MB floppy was a good size, or when I thought my 100MB hard drive would never fill. We have been and are still living in a naïve world about the amount of data and its growth potential. Just know this: if data is in a flexible, portable form (e-mail, pdf, mp3, doc, xls, etc.) it will be copied and copied and copied again. So now we know that data is going to explode; what can you do about it?
My favorite new solution, cloud storage! Not really, I am being sarcastic. Cloud not only doesn’t solve this problem, it actually makes it much worse.
Case in point: Customer A pays good money to ship information to the cloud. The customer is then advised to keep a local copy of the data onsite in case of failure, you know, because it isn't 100% perfect yet. The cloud vendor will have a failure sometime (Murphy's law) and you are happy because you had the local data the whole time. But guess what, same problem: you have all the data on site and now out in the cloud. So in a way you have actually doubled the amount of data produced, local and remote, and worse, you have paid another operational expense for something that may or may not work.
Plus, don't get me started on other factors of cloud networks: restore times, network accessibility, visibility of your data, and much more. I will address this in a later article, but know that there is a lot to consider before jumping to a public cloud infrastructure. I won't beat up on cloud too much; I think it is a good idea with large possibilities, but it is still new and effectively in beta.
SAN, NAS, and Appliances:
This is the most common way to fix the problem, right? There is always a new box with faster specs, bigger hard drives, more efficient software, and who knows what else. What I can tell you is that this is making the problem much worse. A typical engagement for me with a customer is to ask how long a new SAN solution should last.
Most say four to five years; I say cut that in half and then double your budget (not a popular decision, but it will pay off in the long run). It has to do with two points of theory I have:
First, your data will grow faster than you would ever guess, even in best-case scenarios, period! I have never seen a customer's data grow less than predicted or, for that matter, grow at the rate they thought; everyone is always off the mark, by a lot. I know why they are, but I won't talk about it here; just know it has to do with saving face with management when asking for a lot of money.
Second, with the rise in the amount of data and the size of files, the ability to access this data faster always comes into play. And if any vendor says they are upgrade proof, I would take that with a grain of salt. There are some creative solutions to help future-proof a SAN, but no one can predict if the new 10000GB card will work with anything on your current SAN, so even if you didn't fill it you will need to upgrade to match what your environment is demanding based on SLAs. In short, your best guess will always be inadequate.
How to fix it?
Here is the bad news. There isn't a way to fix this, as far as I am concerned. Remember when I mentioned that these new SANs and other appliances are making it worse?
It is because no vendor is saying stop and re-evaluate whether the information you are storing is really credible and worthwhile. They say, "Buy more, and buy bigger." It goes back and forth, and there are new technologies like deduplication and inline compression that help alleviate the situation, but they are a Band-Aid at best. Plus, if you have 15TB of data today and hit 30TB five months later, a vendor can just sell you a new box, then another, and again, and pretty soon you will be at 100TB and beyond!
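The 15TB-to-30TB example is just compound growth doing its thing. A quick sketch of where that trajectory leads (the doubling rate is illustrative, taken from the example, not a forecast):

```python
# Illustrative only: project capacity if data doubles every ~5 months,
# as in the 15TB -> 30TB example above.

def project(start_tb, doubling_months, horizon_months):
    """Return (month, size_tb) checkpoints at each doubling interval."""
    sizes = []
    size = start_tb
    for month in range(0, horizon_months + 1, doubling_months):
        sizes.append((month, size))
        size *= 2
    return sizes

for month, tb in project(15, 5, 20):
    print(f"month {month:2d}: {tb} TB")
```

Under 20 months on that curve you blow past 100TB, which is exactly the "new box, then another" treadmill.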
One thing that always cracks me up is that customers will say the cost of data is getting cheaper. Yes, a 2TB drive from Best Buy costs $150 vs. $1,000 a few years ago, but that doesn't address the fact that because of compliance, poor storage planning, and meager data policies you end up buying a lot more disk to hold the exact same data! Even with dedup you end up having a primary SAN and then a second appliance (more disk) to store all that data! Plus, the cost of enterprise drives stays the same because they get faster, smarter, more reliable, and so forth.
What needs to be done?
Companies need to start putting serious capital into their budgets every year, not just every 3-5 years. Policies need to be in place to help reduce the amount of data that needs to be kept on hand. Finally, there needs to be a good, affordable, truly dependable archiving solution, and I am not talking tapes: a medium that will last forever if need be and costs pennies per TB to store for as long as needed. Still science fiction.
Long article short: just understand that storage isn't static, and you will always need to think about ways to make your data storage more efficient.
It seems that Compellent doesn't get talked about as much as it should. Not that I am particularly biased, but I find that they offer a wide variety of features and benefits that work great, yet don't carry the same marketing glamour that the other vendors provide. I think Dell will change this, of course.
Even during the Dell acquisition, Compellent was treated as a consolation prize to 3PAR and was sort of an afterthought for Dell and the rest of the storage industry. While I don't know the business politics behind it, I do know that Compellent makes a great storage platform that has a pretty strong following, and for good reason.
There is a lot to talk about regarding features; however, I wanted to spend some time on what I think is their little-known secret sauce: RAID 6:10, which was released in 2010.
Let's start out with their RAID system:
First and foremost, RAID 5 is no longer best practice for secure protection. Just last week, I had two customers complain about lost data and SAN failures around a RAID 5 set. The old adage that it is highly unlikely two drives could fail at the same time just doesn't seem practical anymore. While some people argue it is statistically worth the risk, I would argue that due to larger RAID groups and disk sizes, RAID rebuilds almost guarantee a second disk failure. For example, I have seen a 2TB drive take up to a day or more to fully rebuild; that is a long time to wait and hope nothing more goes wrong.
RAID 6 is commonly suggested but tends to be ruled out due to poor write performance and poor storage efficiency.
Compellent's solution is to use a "Fluid Data Architecture," which is really a system of combining RAID 10 with a secondary array at RAID 6. The idea is to use the speed and full redundancy of RAID 10 DM (dual mirror) with RAID 6 to provide more efficient use of disk in conjunction with high-speed read access. They have labeled this RAID 6 with Data Progression, or RAID 6:10.
- Full protection against dual drive failures
- RAID 10 write speeds while eliminating the typical 40% RAID 6 write overhead
- 80% storage efficiency with 8 data disks and 2 parity disks
- 1,000x the data protection compared to RAID 5
RAID 6 Usage:
It is made for larger, slower drives: SATA, around 900GB or larger. It isn't really needed for 10K or 15K SAS/FC, since those drives rebuild much faster and would be used for more important tiers; RAID 6 is aimed at the lower tiers of Data Progression.
Some Things to consider:
Since RAID 6 stores twice the parity information of RAID 5, it will require more raw disk to net the same amount of usable space. This will incur a cost increase when building a solution. For example, RAID 5 would give you 10TB usable with 11.25TB raw; to get the same amount with RAID 6 you would need 12.5TB raw, which is 11% more storage. But the benefits usually offset the increase in cost, and you can end up saving money on the expensive disk to offset anything extra you would pay on SATA disk.
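The raw-versus-usable numbers above fall out of simple parity math. A sketch, assuming the group sizes implied by the example (RAID 5 as 8 data + 1 parity, RAID 6 as 8 data + 2 parity, the "80% efficiency" layout):

```python
# Raw disk needed to net a target usable capacity for a given RAID layout.
# Group sizes assumed from the example in the text: RAID 5 = 8+1, RAID 6 = 8+2.

def raw_needed(usable_tb, data_disks, parity_disks):
    """Usable capacity divided by the group's storage efficiency."""
    efficiency = data_disks / (data_disks + parity_disks)
    return usable_tb / efficiency

raid5 = raw_needed(10, 8, 1)   # 11.25 TB raw for 10 TB usable
raid6 = raw_needed(10, 8, 2)   # 12.50 TB raw for 10 TB usable

print(f"RAID 5: {raid5:.2f} TB raw, RAID 6: {raid6:.2f} TB raw, "
      f"{(raid6 / raid5 - 1) * 100:.0f}% more")
```

Change the group sizes and the penalty changes too, which is why vendors quote efficiency per configuration rather than per RAID level.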
Over the last few years I have had a ridiculous number of conversations around deduplication. The benefits are abundant (up to a 90% reduction in storage) and for the most part purchasing is a no-brainer: you can get rid of tapes, get faster restores, see limited to no impact on performance, etc.
One obstacle that tends to stand in the way, however, is cost: such a useful technology as an appliance can start around $30,000, and for a proper solution you would need two of them (primary and secondary).
With that said, I did come across an open-source option today, and if you are brave enough you can have your very own deduplication appliance with nothing more than some time and old server equipment!
I would note that this is "use at your own risk." Data (in my humble opinion) is the most valuable asset any company can have, and while the idea of a free solution is nice, I would not trust my tier 1 data (or tier 2, for that matter) with this until you have really, really tested it. Remember, there is no support number or anyone to call if something goes wrong. You are on your own!
What is it?
Opendedup makes a file system called SDFS that is not only free but does legitimate inline deduplication! Below are the highlights from the website:
- Cross Platform Support – Works on Linux or Windows.
- Reduced Storage Utilization – SDFS Deduplication can reduce storage utilization by up to 90%-95%
- Scalability – SDFS can dedup a Petabyte or more of data. Over 3TB per gig of memory at 128k chunk size.
- Speed – SDFS can perform deduplication/reduplication at line speed, 290 MB/s+
- VMware support – Works with VMs – can dedup at 4k block sizes. This is required to dedup virtual machines effectively
- Flexible storage – deduplicated data can be stored locally, on the network across multiple nodes, or in the cloud.
- Inline and Batch Mode deduplication – The file system can dedup inline or periodically based on needs. This can be changed on the fly
- File and Folder Snapshot support – Support for file or folder level snapshots.
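To make the chunk-based deduplication idea concrete, here is a toy sketch of fixed-size chunking with content hashing at SDFS's 128k default chunk size. This is purely illustrative; it looks nothing like the real SDFS internals:

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # SDFS's default 128k chunk size

class DedupStore:
    """Toy content-addressed store: identical chunks are stored only once."""

    def __init__(self):
        self.chunks = {}  # sha256 digest -> chunk bytes

    def write(self, data: bytes):
        """Split data into fixed-size chunks, keep unique ones, and return
        the list of digests (the 'recipe') that reconstructs the data."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # dedup happens here
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        return b"".join(self.chunks[d] for d in recipe)

store = DedupStore()
data = b"A" * CHUNK_SIZE * 10       # ten identical chunks of data
recipe = store.write(data)
print(f"logical chunks: {len(recipe)}, stored chunks: {len(store.chunks)}")
```

Ten logical chunks land as a single stored chunk, which is the whole trick: redundant data collapses to hash lookups, and the 90%+ reduction claims come from how repetitive real backup data tends to be.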
The flexibility and performance look good, and it supports 128k, 64k, 32k, and 4k block sizes; not bad! Below are some graphs showing performance:
Today it was leaked, by Pat Gelsinger, that EMC will be launching a project aimed at accelerating the I/O between servers and backend storage. This idea has been addressed by another company, Fusion-io; however, there have been some known technical hiccups. The new product is known as "Project Lightning" and there aren't too many details as of now, but this is what we do know:
- It is a partnership between Intel (Mr. Gelsinger spent 30 years at Intel) and Micron, labeled IM Flash Technologies (IMFT); they have been working on this for a while
- As of now, the cards will be called P320h and rely on the P300 SSD and SLC NAND flash memory
- The cards should be in beta customers' hands by the second half of this year, depending on how successfully the trials go
- There are talks to integrate this with EMC's other SSD technology, FAST Cache
- As of now, there are no plans to compete against HP or IBM on the server side, but to add another option for high-end needs
At EMC World today, Pat Gelsinger, EMC President of Information Infrastructure Products, detailed a multi-faceted strategy designed to further drive adoption of this technology, lower costs for customers, and dramatically speed storage and application performance. The strategy includes:
- A new PCIe/flash-based server cache technology– code-named “Project Lightning” – due later this year that will move data closer to the processor to dramatically accelerate performance. Integrated flash in the server as cache and as storage in the array, combined with EMC FAST software, creates a single intelligent I/O path – from the application to the data store. The result is a networked infrastructure dynamically optimized for performance, cost and availability and significantly more reliable than implementations relying on flash as direct-attached storage in the server.
- EMC plans to design, test and qualify MLC-based SSDs for enterprise-class applications and incorporate them into EMC systems later this year, making enterprise flash storage more affordable.
- EMC has sold and delivered several all-flash Symmetrix VMAX arrays to customers with extremely demanding I/O workloads. Later this quarter, all-flash Symmetrix VMAX arrays will be offered as a standard configuration option.
- EMC later this year also plans to introduce a new all-flash configuration of its VNX unified storage system that will enable support of more virtual servers and more intense workloads. As part of industry benchmark testing, an all-flash VNX system recently demonstrated record performance.
- To help facilitate these projects, EMC has created a dedicated Flash business unit to identify and exploit new market opportunities, new technologies and create and manage strategic partner and supplier relationships.
No word from Fusion-io as yet…
Link to the press release here:
Well, it didn't take long for NetApp to use some of the Engenio products from their LSI purchase last year. NetApp announced a new line of products, the E-Series platform, that will leverage some of the technology gained from the LSI acquisition.
According to a press release today, NetApp will have the E5400 storage system available for OEMs. It was classified as high performance for big-bandwidth applications, with extreme storage density and exceptional uptime. It is apparent that this will be targeted toward the enterprise-class market and will focus on scalability and reliability.
The focus of the announcement seems to be centered around a full-motion video storage solution and the adoption of big-analytics applications. This looks like a direct play against the EMC acquisition of Isilon last year. The video solution mentions the use of StorNext from Quantum and is directed toward the U.S. public sector, and the analytics solution is made specifically for Hadoop and comes in preconfigured nodes. From the press release:
- The E5400 provides OEMs with a proven architecture, with over 300,000 E-Series systems shipped and a history of more than 20 years of design knowledge and firmware development.
- The E5400's compact 4U form factor integrates controllers and drives to maximize storage density while reducing operational expenditures.
- The E5400's fully redundant design and online administration enable continuous high-speed data access.
Pretty exciting stuff. I would say this is a great lead in to the true Enterprise-Class storage systems that NetApp has sought over the last couple of years. More details on the E5400 series can be found here.