By Thomas Krafft
December 4, 2012 07:45 AM EST
Wikibon has published an interesting paper that compares NoSQL databases that store data on flash-based SSDs with those that store data in DRAM. The paper looks like it was paid for by Aerospike, a NoSQL database vendor that recently emerged by resurrecting the failed CitrusLeaf and acqui-hiring AlchemyDB, and whose product is, of course, the one recommended in the end.
There are a number of factual problems with that paper, and I want to point them out.
Note that Wikibon doesn’t mention GridGain in this study (we are not a NoSQL datastore per se, after all), so I have no stake in this game other than annoyance with biased and factually incorrect writing.
“Minimal” Performance Advantage of DRAM vs SSD
The paper starts with a simple statement: “The minimal performance disadvantage of flash, relative to main memory…”. Minimal? I’ve seen a number of studies in which the performance difference between SSDs and DRAM ranges from 100 to 10,000 times. For example, this University of California, Berkeley study claims that SSDs bring almost no advantage to the Facebook Hadoop cluster and that DRAM pre-caching is the way forward.
Let me provide an even shorter explanation. Assume we are dealing with Java. SSD devices are visible to a Java application as typical block devices and are accessed as such. That means a typical object read from such a device involves the same steps as reading that object from a file: the hardware I/O subsystem, the OS I/O subsystem, OS buffering, Java I/O and its buffering, Java deserialization, and the GC activity it induces. If you read the same object from DRAM, it takes a few bytecode instructions, and that’s it.
Native C/C++ apps (like MongoDB) can take a slightly quicker route with memory-mapped files (or various other IPC methods), but the performance gain will not be significant, for the obvious reason that entire pages have to be read or swapped in, versus the single-object access pattern in DRAM.
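To make the two read paths concrete, here is a minimal sketch, assuming a hypothetical `Quote` value type and standard Java serialization (neither comes from the Wikibon paper or from any particular product). It is not a benchmark; it only shows how many layers sit between the caller and the object on each path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class ReadPathSketch {

    /** Hypothetical value type, used only for illustration. */
    static final class Quote implements Serializable {
        private static final long serialVersionUID = 1L;
        final String symbol;
        final double price;

        Quote(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    /** Helper that persists one object so there is something to read back. */
    static void writeToFile(Path file, Quote quote) throws IOException {
        try (OutputStream out = Files.newOutputStream(file);
             ObjectOutputStream objects = new ObjectOutputStream(out)) {
            objects.writeObject(quote);
        }
    }

    /** Block-device path: file I/O, stream buffering, then deserialization into a new object. */
    static Quote readFromFile(Path file) throws IOException, ClassNotFoundException {
        try (InputStream in = Files.newInputStream(file);
             ObjectInputStream objects = new ObjectInputStream(in)) {
            return (Quote) objects.readObject();
        }
    }

    /** DRAM path: a hash lookup that returns a reference to an object already on the heap. */
    static Quote readFromMemory(Map<String, Quote> cache, String key) {
        return cache.get(key);
    }

    public static void main(String[] args) throws Exception {
        Quote quote = new Quote("ACME", 42.0);
        Path file = Files.createTempFile("quote", ".bin");
        try {
            writeToFile(file, quote);
            Map<String, Quote> cache = new HashMap<>();
            cache.put(quote.symbol, quote);

            Quote fromFile = readFromFile(file);
            Quote fromMemory = readFromMemory(cache, "ACME");
            // The file path allocates a fresh copy (more work for the GC);
            // the memory path hands back the very same instance.
            System.out.println("file read is a new object: " + (fromFile != quote));
            System.out.println("memory read is the same object: " + (fromMemory == quote));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Whether the file lives on an SSD or a spinning disk, `readFromFile` goes through the same software layers; only the bottom one gets faster.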
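Java has its own analogue of memory-mapped files, so the same point can be sketched without leaving the JVM. The example below is an illustration under stated assumptions: the fixed-size record layout and the `MappedReadSketch` class are hypothetical, and the 4 KiB page size mentioned in the comment is typical but platform-dependent.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedReadSketch {

    /** Each hypothetical record is a single 8-byte long. */
    static final int RECORD_BYTES = Long.BYTES;

    /** Writes `count` records to the file; record i holds the value i * 10. */
    static void writeRecords(Path file, int count) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(count * RECORD_BYTES);
        for (int i = 0; i < count; i++) {
            buffer.putLong(i * 10L);
        }
        Files.write(file, buffer.array());
    }

    /**
     * Reads one record through a memory mapping. Only 8 bytes are requested,
     * but the OS brings in the whole page that contains them (commonly 4 KiB;
     * that figure is an assumption about the platform).
     */
    static long readRecord(Path file, int index) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return mapped.getLong(index * RECORD_BYTES);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("records", ".bin");
        writeRecords(file, 1000);
        System.out.println("record 7 = " + readRecord(file, 7));
    }
}
```

The mapping skips a copy into a user-space buffer, but the unit of transfer is still a page, not an object.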
Yet another recent technical explanation of the disadvantages of SSD storage can be found here (talking about Oracle’s “in-memory” strategy).
MongoDB, Cassandra, CouchDB DRAM-based?
Amid all the confusion on this topic it’s no wonder the author got it wrong. None of MongoDB, Cassandra, or CouchDB is an in-memory system. They are disk-based systems with support for memory caching. There’s nothing wrong with that and nothing new: every database developed in the last 25 years naturally provides in-memory caching to augment its main disk storage.
The fundamental difference here is that in-memory data systems like GridGain, SAP HANA, GigaSpaces, GemFire, SqlFire, MemSQL, VoltDB, etc. use DRAM (memory) as the main storage medium and use disk for optional durability and overflow. This focus on RAM-based storage makes it possible to completely re-optimize all the main algorithms used in these systems.
For example, the ACID implementation in GridGain, which supports full-featured distributed ACID transactions, beats every NoSQL database based on eventual consistency (EC) out there in read and even write performance: there are no single-key limitations, no consistency trade-offs to make, no application-side MVCC, no user-based conflict resolution or other crutches. It just works the same way as it works in Oracle or DB2, but faster.
2TB Cluster for $1.2M :)
If there was one piece of the original paper that was completely made up to fit the predefined narrative, it was the price comparison. If the author thinks that a 2TB RAM cluster costs $1.2M today, I have not one but two Golden Gate bridges to sell just for him…
Let’s see. A typical Dell/HP/IBM/Cisco blade with 256GB of DRAM costs below $20K at list price (Cisco seems to offer the best prices, starting at around $15K for 256GB blades). That brings the total cost of a 2TB cluster well below $200K (with all network and power equipment included, plus hundreds of TBs of disk storage).
Is this more expensive than an SSD-only cluster? Yes, 2.5-3x more expensive. But with the right software you get a dramatic performance increase that more than justifies the price difference.
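The blade arithmetic above is easy to check. The sketch below uses only the figures quoted in this post (256GB per blade, $15K-$20K per blade, $1.2M in the paper) and assumes 2TB means 2,048GB. It counts blades only, so network and power gear are left out; the class and method names are hypothetical.

```java
final class ClusterCostSketch {

    /** Number of blades needed to reach the target capacity (ceiling division). */
    static int bladesNeeded(int targetGb, int gbPerBlade) {
        return (targetGb + gbPerBlade - 1) / gbPerBlade;
    }

    /** Cost of the blades alone, excluding network and power equipment. */
    static long bladeCostUsd(int targetGb, int gbPerBlade, long usdPerBlade) {
        return bladesNeeded(targetGb, gbPerBlade) * usdPerBlade;
    }

    public static void main(String[] args) {
        int targetGb = 2048;   // 2TB, assuming binary units
        int gbPerBlade = 256;

        long atListPrice = bladeCostUsd(targetGb, gbPerBlade, 20_000L);
        long atBestPrice = bladeCostUsd(targetGb, gbPerBlade, 15_000L);

        System.out.println("blades needed: " + bladesNeeded(targetGb, gbPerBlade)); // 8
        System.out.println("blades at $20K each: $" + atListPrice);                 // 160000
        System.out.println("blades at $15K each: $" + atBestPrice);                 // 120000
        System.out.println("paper's $1.2M vs $20K blades: "
                + (1_200_000.0 / atListPrice) + "x");                               // 7.5x
    }
}
```

Eight blades at the upper $20K figure come to $160K, which leaves about $40K under the $200K ceiling for everything else.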
Conclusion
A 2-3x price difference is nonetheless important, and it gives our customers a very clear choice. If price is an issue and high performance is not, there is a wide variety of disk-based systems. If high performance and sub-second response on processing TBs of data are required, the hardware will be proportionally more expensive.
However, with 1GB of DRAM costing less than 1 USD and DRAM prices dropping 30% every 18 months, the era of disks (flash or spinning) is clearly coming to its logical end. It’s normal… it’s progress, and we all need to learn how to adapt.
Has anyone seen tape drives lately?
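For readers who want to see what "30% every 18 months" compounds to, here is a rough sketch. It assumes the decline continues at a constant rate, which is an extrapolation and not a forecast, and it works with prices relative to today's price and not with dollar figures. The class and method names are hypothetical.

```java
final class DramPriceSketch {

    static final double DROP_PER_PERIOD = 0.30;  // figure quoted in the post
    static final double PERIOD_MONTHS = 18.0;    // figure quoted in the post

    /** Price after `monthsAhead` months, assuming the decline compounds steadily. */
    static double projectedPrice(double priceToday, double monthsAhead) {
        return priceToday * Math.pow(1.0 - DROP_PER_PERIOD, monthsAhead / PERIOD_MONTHS);
    }

    /** Months until the price falls to the given fraction of today's price. */
    static double monthsToFraction(double fraction) {
        return PERIOD_MONTHS * Math.log(fraction) / Math.log(1.0 - DROP_PER_PERIOD);
    }

    public static void main(String[] args) {
        System.out.printf("relative price after 18 months: %.2f%n", projectedPrice(1.0, 18));
        System.out.printf("relative price after 36 months: %.2f%n", projectedPrice(1.0, 36));
        System.out.printf("months until the price halves: %.1f%n", monthsToFraction(0.5));
    }
}
```

Under that assumption the price roughly halves every three years, so a 2-3x hardware premium shrinks to parity with today's flash-only budget within one or two such halvings, if flash prices stood still (which they will not).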
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Thomas Krafft
Working on my unfunded startup, and contributing to several early stage ventures - all of which is technically different than being unemployed. No. Really. You can find me professionally at http://www.linkedin.com/in/krafft