Category Archives: Fusion

SLC and MLC NAND Chips for Fusion IO Cards

On Wednesday I attended a training put on by Fusion IO. The training is a precursor to a future certification program for Fusion cards. It was the first time the training was given, so in a way we were beta testing the program. I have been testing the cards for about a year now and have had them running in production servers for about 6 months. So I had a good working knowledge of the cards and how to use them, but I was eager to learn more about the internals of how the cards work and some of the things I should look for when working with them.

The training did not disappoint. As I mentioned, this was the first time anyone was going through the training, so we hit minor speed bumps, but our instructor Thom did an excellent job presenting and really showed that he knew his stuff and was passionate about the product. I love to see presenters who really enjoy what they are talking about because it makes for a great presentation. I don’t have all my notes from the event summarized yet, but I did want to get out some of the information I learned at the session.

Fusion cards come in 2 flavors, SLC and MLC. The names refer to the type of NAND chips on the card.

SLC = Single Level Cell – 2 states per memory cell

MLC = Multi Level Cell – 4 states per memory cell

The primary differences between the chips are speed, storage and life of the card. The SLC memory cell has only 2 states, empty and full. This means it performs better than MLC because it can conserve energy when managing the electrical charge during operations on the memory cell. MLC has 4 states: empty, 1/3, 2/3 and full. MLC must expend more energy to maintain the states of the data, causing it to have slower performance.

MLC does have the advantage when it comes to storage. Because each cell can hold more than 2 states, the same number of cells can hold more data. For the Fusion line this usually means the 320 and 640 cards are the MLC cards.
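To make the capacity difference concrete: a cell that can tell apart N states encodes log2(N) bits, so a 2-state SLC cell stores 1 bit and a 4-state MLC cell stores 2. Here is a minimal Python sketch of that arithmetic. The function name is my own, purely for illustration, and it says nothing about how a given card is actually sized.

```python
def bits_per_cell(states: int) -> int:
    """Bits one memory cell can encode when it distinguishes `states` charge levels.

    Hypothetical helper for illustration; assumes `states` is a power of two.
    """
    if states < 2 or states & (states - 1):
        raise ValueError("states must be a power of two and at least 2")
    # For a power of two, bit_length() - 1 is exactly log2(states).
    return states.bit_length() - 1


SLC_STATES = 2  # empty, full
MLC_STATES = 4  # empty, 1/3, 2/3, full

slc_bits = bits_per_cell(SLC_STATES)
mlc_bits = bits_per_cell(MLC_STATES)

print(f"SLC: {slc_bits} bit(s) per cell")
print(f"MLC: {mlc_bits} bit(s) per cell")
print(f"Raw bits per cell, MLC vs SLC: {mlc_bits / slc_bits:.0f}x")
```

This only covers raw bits per cell. The capacity you actually get on a card also depends on how many chips are on it.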

The MLC card will wear out quicker than an SLC card because it expends more of its energy to get the state of the data.

So let’s look at the pros and cons here.

SLC – Pros

  • Better performance/lower Latency
  • Longer Card life

SLC – Cons

  • Smaller size

MLC – Pros

  • Large size

MLC – Cons

  • Slower performance/more latency
  • Less Card life

Now let’s put some things in perspective. When I say lower latency, I’m talking about microseconds. We didn’t get quoted a number on the difference, but I do plan to re-run my original Fusion tests with an MLC card and compare it to the SLC card I originally used in those tests. Basically, if you want the max performance, your best bet is the SLC cards.

For storage, you’re talking about double or triple the size for MLC cards. Again, if you need the extra storage, look to the MLC cards to cover your needs.

For the life of the cards, it really depends on how much they are used and how frequently they are written to. Both SLC and MLC are rated in the 5+ year range before they even start to degrade in most normal applications. I have hard drives that have gone over 5 years of use, but I know of many more that haven’t made it past 3 years.

An MLC card is still going to squash any conventional spinning media you have in performance numbers. The question you should ask when you’re looking to buy is the same question you have for hard drives: what ratio of performance to storage do I need, and what can I afford for the project?

When someone asks me what to buy with standard disks, I always suggest buying the smallest, fastest disks, and as many of them as you can, if you have performance needs. If you need space and can give up some performance, then buy the larger disks and get fewer of them. That still holds true here. The difference with the Fusion drive is that an MLC card can do much more than a normal set of drives can. So even at its worst performance, it’s better than the standard drives.

Another important thing I learned at the training was to keep up to date on the drivers. The new 2.0 driver from FusionIO increases the speed of the cards. I need to get upgrading and re-run my tests.

Fusion IO DW stats update

 

I did a webinar on Tuesday with Fusion IO about using Fusion cards in a data warehousing server. I also posted this blog post on Tuesday related to the numbers. I had a few questions about specific stats on the DMX I was testing on, so I wanted to give an update today.

The DMX had 120 SAS 15K spinning drives. They were set up in a RAID 10 configuration, giving 60 effective disks. They were in use by other systems while I was running my IO tests and the DW tests that I quoted in the blog post. As I mentioned in the post, for me this is a typical scenario. I don’t have a dedicated SAN for my DW, so these numbers work for me. Whether these are the numbers you will see in your testing depends on your environment.
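For anyone wondering where the 60 comes from: RAID 10 mirrors every drive, so only half of the physical drives count toward usable capacity. A quick Python sketch of that arithmetic, assuming plain two-way mirroring (the function name is just mine for illustration):

```python
def raid10_effective_disks(total_disks: int) -> int:
    """Effective (non-mirror) disks in a RAID 10 set, assuming two-way mirrors.

    Hypothetical helper for illustration only.
    """
    if total_disks < 4 or total_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    # Each mirrored pair contributes one disk's worth of capacity.
    return total_disks // 2


print(raid10_effective_disks(120))
```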

I also promised some additional numbers about some other Fusion RAID sets I was testing, and I’m getting a post ready for next week on those. For now I would like to point out an excellent post by the SQLCat team on what it saw with Fusion IO and data warehousing. It is well worth a read; here is the link.

SQL Cat Team Distinct Count and Xbox Live

New Testing numbers with FusionIO

As I was working on some new numbers for the FusionIO card I have been testing, a friend who works there asked if I would join him on a webinar about BI and FusionIO. Since I have a Fusion card in my testing BI server, I thought this would be a great opportunity to talk about some numbers I’ve gotten off of it.

Since this is a testing server for BI right now, I’m able to test various ETL configurations against it. For these tests I was running a 320GB FusionIO card (MLC; details here on their site). I decided I would pull over a Dailyreports table for my staging data. The table has 66 million rows in it (no, it’s not one day’s worth). The table has 9 facts and 5 dimensions, including a time dimension. I created a procedure to make my fact table and populate the dimensions based on the distinct list of items in the original table. Then I created a separate process to populate the fact table by looking up the dimension keys and populating the measures off the business key. Basically the same thing you would do for any typical ETL process.

I ran this with the DB on our DMX 1000 (what we consider Tier 1 storage). The DMX 1000 is configured as RAID 10. I’m not using it exclusively, so there are other things running on it. I’ll be adding the SQLIO profile for the DMX to the Excel sheet that I posted previously. That file is located on Google Docs; here is a link. I consider this a typical SAN situation that many BI professionals will find themselves in. They usually don’t have the slowest disk, but they also don’t get the fastest disk possible. I ran the process on the DMX, then moved the DB to the Fusion and ran it again. Here are the results.

| DW Procedure | SAN DMX 1000, Sum of Time (sec) | Fusion Drive, Sum of Time (sec) | Percent Change in Time |
| Create Business Key | 303 | 291 | -3.96% |
| Populate Acct Dimension | 12644 | 4244 | -66.43% |
| Populate Bank Dimension | 794 | 226 | -71.54% |
| Populate Group Name | 2044 | 669 | -67.27% |
| Populate Measures | 1318 | 878 | -33.38% |
| Populate time Dimension | 1166 | 760 | -34.82% |
| Populate TradeServer | 962 | 645 | -32.95% |
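If you want to check the percent column or run the same comparison on your own timings, the calculation is just (Fusion time - SAN time) / SAN time, so a negative number means the Fusion run was faster. Here is a small Python sketch using the numbers above. The function and variable names are mine; this is not the script I used to produce the results.

```python
# (procedure, SAN DMX 1000 seconds, Fusion drive seconds), copied from the results above
timings = [
    ("Create Business Key", 303, 291),
    ("Populate Acct Dimension", 12644, 4244),
    ("Populate Bank Dimension", 794, 226),
    ("Populate Group Name", 2044, 669),
    ("Populate Measures", 1318, 878),
    ("Populate time Dimension", 1166, 760),
    ("Populate TradeServer", 962, 645),
]


def percent_change(before: float, after: float) -> float:
    """Relative change in run time, in percent. Negative means `after` is faster."""
    return (after - before) / before * 100


for name, san_sec, fusion_sec in timings:
    change = percent_change(san_sec, fusion_sec)
    speedup = san_sec / fusion_sec  # how many times faster the Fusion run was
    print(f"{name:<26}{san_sec:>7}{fusion_sec:>7}{change:>9.2f}%{speedup:>7.2f}x")
```

I added the speedup factor because it can be easier to read than a percentage. It is simply the SAN time divided by the Fusion time.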

The lesson we can take from these results is that even if you don’t have SSD, even if you just have fast disks, you can speed up your ETL processes by placing your staging data and lookup data on the fastest disks possible. My suggestion is that when you’re spec’ing out your DW servers, don’t forget your disk subsystem. DWs usually require lots of space. Most of that space doesn’t have to be on very fast disks, but you should try to keep your ETL process on fast disk when possible.

In my experience with many different BI situations, the DW is the space where BI professionals usually have some say in what hardware they use, and they have control over it. Staging databases typically do not need huge disaster recovery plans, as they are frequently re-created from the source data coming in. This makes staging an ideal fit for Fusion type drives/SSD and fast disks.

I did some backup tests as well with the card and will get those posted as soon as some tests finish running.  🙂