Wednesday, April 23, 2008

PART IV



Price Matrix


Let's now calculate how much our superNAS is going to cost:


Item Name / Price
3ware 9650SE-12ML PCI Express x8 SATA II Controller Card: $734
Corsair CM73DD1024R-667 1GB DDR2-667 PC2-5400 ECC: $320
3ware Battery Backup Unit for 3ware 9650SE: $114
WD RE2 RAID Edition SATA Hard Drive, 500GB, 7200 RPM, 16MB Cache x 4: $600
AMD Opteron 2212 2.0GHz 2 x 1MB L2 Cache Socket F Dual Core Processor: $255
TYAN S2927A2NRF ATX Server Motherboard: $302
Supermicro SuperChassis 745TQ-800: $620
Accessories (rubber grommets to prevent vibration, static discharge protection etc.): $50
Total: $2995

*Prices are the best deals available from various e-stores as of 05/15/2007

That did not seem so bad... looks like we are on target. We jumped from $2000 to $3000 because we assembled the parts ourselves. If some billion-dollar company made this, I'm guessing it would cost around half as much, because of mass production.
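The total in the price matrix can be sanity-checked with a one-liner; the figures are copied straight from the table above.

```shell
# Sum the line items from the price matrix (all in USD).
total=$((734 + 320 + 114 + 600 + 255 + 302 + 620 + 50))
echo "Total: \$$total"
```

It matches the $2995 total in the table.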


Feature List

We can now officially announce the feature list of our fileserver! Combining all of the enterprise class features of the products that we used, we can state the storage server configuration as follows (the proud moment!):


Product Name: FS4V01
Form Factor: Tower / 4U chassis (178mm x 452mm x 648mm)
Processor:
    Standard: AMD Opteron 2212 Santa Rosa 2.0GHz, 2 x 1MB L2 Cache
    Maximum: 2x AMD Opteron 2212 Santa Rosa 2.0GHz, 2 x 1MB L2 Cache
Memory:
    Standard: 4 GB PC2-5400 667 MHz ECC DDR2 RAM
    Maximum: 32 GB
Storage:
    Standard: 2.0 TB SATA2 (500GB x 4), 3.0 Gbps, Hot Swappable
    Maximum: 9.0 TB SATA2 (750 GB x 12) + 4.5 TB SATA2 (750 GB x 6) + 1.5 TB IDE (750 GB x 2)
Network Interface:
    Standard: 4 GbE ports + 2 GbE ports
    Maximum: 4 GbE ports + 2 GbE ports / FC interface
Expansion Slots: 3x PCI v2.3 32-bit/33MHz slots
Power Supply: 800W Redundant Power Supply
Protocol Support: NFS, CIFS, iSCSI, FTP, HTTP, WebDAV
Authentication: NIS, LDAP, Kerberos, Local
Network Servers Support: DNS, Active Directory, NTP


Theoretical Metrics:

Maximum Throughput: 4 Gbps (4 x 1 GbE ports bonded, 802.3ad link aggregation)

User Base: 25

Maximum Raw Storage: 9 TB, optional 4.5 TB SATA + 1.5 TB IDE
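A quick back-of-envelope check of what that theoretical 4 Gbps means per user (using 1 Gbps = 125 MBps at line rate; real-world numbers will be lower):

```shell
# Aggregate and per-user throughput at theoretical line rate.
ports=4
mbps_per_port=125                      # 1 Gbps = 125 MBps
users=25
aggregate=$((ports * mbps_per_port))   # MBps across the bonded link
per_user=$((aggregate / users))        # MBps if all 25 users pull at once
echo "aggregate: ${aggregate} MBps, per user: ${per_user} MBps"
```

Even with every user active simultaneously, each one still sees about 20 MBps, well above the 1 MBps working estimate used later in this series.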

Comparisons

As this is a hardware design concept, the actual performance and user base have not been measured. The configurations and prices are compared below; any performance figures are deduced from the configuration and back-of-envelope computations. Once the actual hardware is available, it can be subjected to various benchmarks and tests. (That is, if any of you are ready to fund me ;)... it's just 3000 bucks, c'mon!)

Configuration Table:

Products compared: Our Fileserver (FS4V01); HP StorageWorks 600 All-in-One Storage System - 3.0TB SATA; EMC NS350; Sun StorageTek 5220 NAS Appliance; NetApp StoreVault S500.

Price:
    FS4V01: $2,995
    HP StorageWorks 600: $9,649.00
    EMC NS350: $47,000
    Sun StorageTek 5220: $27,000.00
    NetApp StoreVault S500: >$6,000

Processor:
    FS4V01: AMD Opteron 2212 Santa Rosa 2.0GHz, 2 x 1MB L2 Cache
    HP StorageWorks 600: Dual Core Intel Xeon 2.67 GHz/1333 MHz FSB with 4MB (1x4MB) L2 Cache
    EMC NS350: Dual 1.6 GHz Pentium IV
    Sun StorageTek 5220: One 2.2 GHz AMD Opteron processor
    NetApp StoreVault S500: Not specified

Memory:
    FS4V01: 4 GB DDR2 ECC memory (2 x 2 NUMA configuration); 256 MB DDR2 533 ECC on the RAID controller
    HP StorageWorks 600: 1GB/4GB Fully Buffered DIMM PC2-5300
    EMC NS350: 4 GB 266 MHz DDR RAM
    Sun StorageTek 5220: Two GB DDR1/400 ECC-registered DRAM
    NetApp StoreVault S500: 1024MB DDR2 system RAM; 256MB nonvolatile SDRAM (NVRAM) to protect in-flight transactions for 3 days

Form Factor:
    FS4V01: 4U (tower or rack)
    HP StorageWorks 600: 5U (tower or rack)
    EMC NS350: 8U
    Sun StorageTek 5220: NAS head: 1U; storage enclosure: 3U
    NetApp StoreVault S500: 2U, 19" rack-mountable

Storage:
    FS4V01: 2.0 TB: 4 x 500GB SATA2 (hot plug)
    HP StorageWorks 600: 3.0 TB: 6 x 500GB SATA (hot plug)
    EMC NS350: Not included
    Sun StorageTek 5220: 2.0 TB (8 x 250 GB) 7200 rpm SATA disks
    NetApp StoreVault S500: Not specified

Maximum Raw Storage Expansion:
    FS4V01: 9.0 TB (12 x 750 GB SATA2) + 4.5 TB (6 x 750 GB SATA2) + 1.5 TB (2 x 750 GB IDE)
    HP StorageWorks 600: 3.0 TB
    EMC NS350: 18 TB (10 TB usable); 60 disks of 73 GB and 146 GB 15K rpm FC drives; 73 GB, 146 GB and 300 GB 10K rpm FC drives; 500 GB FC 7200 rpm drives; 500 GB SATA II 7200 rpm drives
    Sun StorageTek 5220: SATA: 24 TB (raw) / 18 TB (usable, RAID 5 with hot spare)
    NetApp StoreVault S500: 6 TB; up to 12 x 250GB SATA I and 500GB SATA II

Expansion Slots:
    FS4V01: (3) PCI v2.3 32-bit/33MHz slots
    HP StorageWorks 600: (3) x4 PCI-Express (x8 connectors); (1) 64-bit, 133MHz 3.3V PCI-X; (2) 64-bit, 100MHz 3.3V PCI-X
    EMC NS350: Not specified
    Sun StorageTek 5220: Two internal PCI-X 64-bit slots (one available for expansion)
    NetApp StoreVault S500: 68-pin VHDCI connector for LVD SCSI devices

Backup:
    FS4V01: Remote synchronisation to disk or tar to tape
    HP StorageWorks 600: Software to back up to disk, tape, or other removable media
    EMC NS350: NDMP
    Sun StorageTek 5220: Optional NDMP
    NetApp StoreVault S500: NDMP

SAN Protocol Support:
    FS4V01: Optional
    HP StorageWorks 600: No
    EMC NS350: 2 x 2 Gbps Fibre Channel ports for array/switch connectivity; 1 x 2 Gbps Fibre Channel port for tape connection
    Sun StorageTek 5220: One standard dual-port 2 Gb/sec FC HBA; one port for RAID connectivity and another for tape backup
    NetApp StoreVault S500: Fibre Channel and iSCSI, up to 64 LUNs; QLogic SAN Starter Kit available from the StoreVault Division

Redundancy:
    FS4V01: Redundant power
    HP StorageWorks 600: Redundant power
    EMC NS350: Redundant power, bus and I/O subsystems
    Sun StorageTek 5220: Redundant hot-swappable power supplies
    NetApp StoreVault S500: Dual, redundant, hot-pluggable, integrated power supply/fan; dual, redundant loop resiliency circuit (LRC) modules or dual, redundant electronically switched hub (ESH) modules

Network Interface:
    FS4V01: 4 GbE ports + 2 onboard GbE ports
    HP StorageWorks 600: Embedded NC373i Multifunction Gigabit Network Adapter with TCP/IP Offload Engine
    EMC NS350: 4 copper Gigabit Ethernet (GigE) ports per data mover, maximum of 8 ports
    Sun StorageTek 5220: Four standard 10/100/1000BaseT Ethernet ports; one optional dual-port (optical or copper) Gb Ethernet NIC
    NetApp StoreVault S500: 2 Ethernet 10/100/1000 copper

Snapshots:
    FS4V01: Yes
    HP StorageWorks 600: Yes
    EMC NS350: Yes
    Sun StorageTek 5220: Yes
    NetApp StoreVault S500: Yes, up to 255 per volume

Battery Backup:
    FS4V01: Yes
    HP StorageWorks 600: Yes, 3 hours
    EMC NS350: Yes
    Sun StorageTek 5220: Yes
    NetApp StoreVault S500: Yes, up to 3 days

Local File System:
    FS4V01: Ext2/ReiserFS/NTFS
    HP StorageWorks 600: NTFS
    EMC NS350: Extended Universal File System (UxFS)
    Sun StorageTek 5220: 64-bit journaling file system, NTFS streams support
    NetApp StoreVault S500: WAFL

File Access Protocols:
    FS4V01: NFS, CIFS, iSCSI, FTP, HTTP, WebDAV
    HP StorageWorks 600: CIFS and NFS
    EMC NS350: NFS v2, v3 & v4, CIFS, FTP, TFTP
    Sun StorageTek 5220: CIFS/SMB, NetBIOS, NFS v2 and v3, FTP
    NetApp StoreVault S500: NFS v2/v3/v4 over UDP or TCP, NFS client authentication, Microsoft CIFS

Block Access Protocol:
    FS4V01: iSCSI target
    HP StorageWorks 600: Microsoft iSCSI Software Target
    EMC NS350: iSCSI
    Sun StorageTek 5220: iSCSI target
    NetApp StoreVault S500: Fibre Channel and iSCSI, up to 64 LUNs

Directory and Name Services:
    FS4V01: NIS, LDAP, Kerberos, Local, DNS, Active Directory, NTP, SNTP
    HP StorageWorks 600: CIFS, NFS
    EMC NS350: NTP, SNTP, NIS, LDAP, Kerberos, Local, DNS, Active Directory
    Sun StorageTek 5220: ADS (LDAP, Kerberos v5), NT 4.0 multiple master domains (MMD), DNS, WINS, NIS, NIS+, local files
    NetApp StoreVault S500: UNIX NIS, Macintosh NIS, Windows Active Directory and Windows Workgroup (local) integration

System Administration and Monitoring:
    FS4V01: Web-based Openfiler interface, Linux tools (ssh)
    HP StorageWorks 600: iLO, All-in-One Storage Manager
    EMC NS350: EMC Celerra Manager, SNMP, web interface, telnet
    Sun StorageTek 5220: Web (HTTP, Java platform) GUI, telnet, rlogin, rsh, SSH, console command line interface (CLI), SNMP, remote syslog
    NetApp StoreVault S500: Web interface, SNMP

Dimensions (HxWxD) mm:
    FS4V01: 178 x 452 x 648
    HP StorageWorks 600: 468 x 220 x 640
    EMC NS350: 268.8 x 450 x 603.3
    Sun StorageTek 5220: 438 x 445 x 640
    NetApp StoreVault S500: 133 x 447 x 559

OS:
    FS4V01: Updated Openfiler Linux-based ramdisk OS
    HP StorageWorks 600: Windows Storage Server 2003 R2
    EMC NS350: Not specified
    Sun StorageTek 5220: StorageTek 5000 NAS OS storage-optimized operating system
    NetApp StoreVault S500: Network Appliance Data ONTAP StoreVault Edition




*Some important aspects such as heating and power consumption have not been considered because the data is not available for our fileserver.

Conclusions

Our disk server matches some of the best features of commercially available NAS boxes, at a much lower price. Better disk configurations would also be possible if the product were actually manufactured, and OEM agreements or bulk purchasing could cut the price to as low as 30-50% of the current figure. An upgrade path and future roadmap for the product have not been thought out, but they can be developed if the product expands into a product line. We may be able to reduce the price a bit more through our choice of chassis, by eliminating a few of the high-end features. We have also planned a second type of NAS server, where the storage controller and the NAS controller box are separate, interconnected by a high-speed FC link, with more redundancy support built in. It will be more expensive than the proposed model, but still significantly cheaper than the competitive models available in the market.

References

Prices

http://www.newegg.com

http://www.ebay.com

http://www.directcanada.com

http://www.hp.com

http://www.pricegrabber.com

Sample Disk Server Configurations

http://www.accs.com/p_and_p/TeraByte/index.html

http://arstechnica.com/guides/buyer/guide-200701.ars

Product Details

http://www.tyan.com

http://www.3ware.com/products/serial_ata2-9650.asp

http://www.intel.com/network/connectivity/products/pro1000pt_quad_server_adapter.htm

http://www.supermicro.com/products/chassis/4U/745/SC745TQ-800.cfm

OS

http://www.openfiler.com

http://www.openfiler.com/docs/openfiler-guide-1.1/#d0e77

http://servers.linux.com/servers/06/04/10/179239.shtml?tid=31&tid=100

Comparisons

http://www.sun.com/storagetek/nas/5220/specs.xml

http://www.emc.com/products/networking/servers/ns_series_i/pdf/H2051_NS350_series_SS_ldv.pdf

http://searchstorage.techtarget.com/generic/0,295582,sid5_gci1232435,00.html?topic=298663#snapshot07
http://h71016.www7.hp.com/dstore/MiddleFrame.asp?page=config&ProductLineId=450&FamilyId=2446&BaseId=19681&oi=E9CED&BEID=19701&SBLID=

http://www.storevault.com/products/hw_s500.html

http://www.gamepc.com/labs/view_content.asp?id=o2000&page=2

Other References

http://www.quepublishing.com/articles/article.asp?p=481869&seqNum=9&rl=1

http://linas.org/linux/raid.html

http://www.networkworld.com/news/2004/101104infports.html?page=2

http://www.directron.com/wd5000ys.html

http://www.gamepc.com/labs/view_content.asp?id=caviarre&page=1

http://www.supermicro.com/newsroom/pressreleases/2006/press091806.cfm

*All registered trademarks belong to their respective owners.

Disclaimer

Just to wash my hands of it... don't believe anything said above, if you don't want to. I am a professional, but professionals are still human. There may be mistakes, which, if you point them out, I'll be glad to correct. There may be disagreements, which, if you point them out, nothing is gonna happen. Because everyone has an opinion, and mine is the one you just read. Put yours in the comments if you want to. ;)

And on top of it, I'm not getting any incentives to write this article, and I am not answerable to any person or firm, except my professional integrity (ahem!). That being said, all my best efforts have been put in to make sure the information here is complete and correct. Do post your comments!

Tuesday, April 22, 2008

Part III

Operating System: The operating system is the core software component that runs the servers (NFS, CIFS, etc.) serving data, and it is the platform on which the manageability tools run. Most high-end disk server vendors have their own OS, specifically designed and tweaked for the sole purpose of file serving. Considering the market audience and cost structure of our fileserver, we cannot afford such an attempt. The storage server OS from Microsoft is another alternative, but it too costs money, and further licensing makes it impossible to scale in a cost-effective manner. On the other hand, there are many ready-to-use general purpose distributions like Ubuntu, Red Hat and CentOS which are available in server configurations and can be installed to perform the functions of a fileserver, but with a lot of tweaking. There are also some ready-to-use Linux-based NAS distributions, stripped down to contain only a tweaked kernel, the servers (NFS, CIFS, iSCSI targets, etc.) and related filesystem utilities. They have been tuned heavily, and can be tuned further, to act as a fileserver. They also contain custom open source software written for managing storage. Some examples are NASLite, and Openfiler, which offers a complete management web server. Or we could always start from scratch with Linux, adding just the required device drivers, servers and tools; but that would take an eon. I would also suggest taking the OS off the hard disk and putting it completely on a ramdisk, making performance comparable to the firmware-based proprietary OSes. The memory spent on it is well worth the performance improvement.

Finally, Mr. X decided to use Openfiler because of its many features, and to tweak it further as necessary. Moreover, its development and support seem pretty decent, and it seems to be the best in open source. There is a lot of room for performance tweaks, but that is possible only after the fileserver is assembled.
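As a rough sketch of the ramdisk idea (the mount points, the tmpfs size and the paths are illustrative assumptions, not Openfiler specifics), the early boot environment would do something like this:

```shell
# Create a RAM-backed root, copy the installed OS into it, and switch over.
mount -t tmpfs -o size=512m tmpfs /mnt/ramroot
cp -a /mnt/disk-root/. /mnt/ramroot/   # stage the on-disk root filesystem into RAM
# ...then, from the initrd, pivot into the RAM copy so the disks hold only data:
# pivot_root /mnt/ramroot /mnt/ramroot/oldroot
```

After the pivot, every OS-level read and write hits RAM instead of the spindles, which is what makes the performance comparable to firmware-based appliances.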
Display and Audio: Not really an important factor in a fileserver configuration, now is it? The Tyan Thunder n3600B (S2927) motherboard has an onboard display chip and audio, if you want the occasional luxury ;).
Management Software: Openfiler provides a wide variety of management tools, including snapshots! All the fileserver features are available in a web interface, greatly simplifying configuration and management. All services, like NFS, CIFS, user configuration, group configuration, LDAP, NIS, Samba, quotas, snapshots and much more, can be configured through the Openfiler web interface. Check it out – www.openfiler.org

Down with the theory! Let's build an official configuration list now.
Some of the "great" features of the products we have used in this configuration are listed here:

3ware 9650SE RAID Controller
8th-generation StorSwitch™ non-blocking switched architecture
On-board I/O RISC processor and RAID offload
SCSI device driver model
Bootable array support
Variable stripe size
64-bit LBA support (greater than 2TB volume)
256 MB DDR2 533 memory with ECC protection
32 pooled DMA channels
Complete configuration management suite - 3ware BIOS Manager, 3ware Disk Manager, CLI
SNMP support
SMTP support for email/pager notification
Auto carving allows LUNs > 2TB
Battery Backup Unit (BBU) support
Multiple logical unit sizes and RAID levels on one card
Hot-swap and hot-spare support
Dynamic sector repair
S.M.A.R.T. disk drive monitoring
Emergency Flash Recovery
Online Capacity Expansion and RAID Level Migration
Write journaling for improved performance and data protection against accidental drive removal
Chassis Control Unit for enclosure management via I2C
RoHS compliant

Intel® PRO/1000 PT Quad Port Server Adapter
Two Intel® 82571GB Gigabit Controllers
Load balancing on multiple CPUs
Intel® I/OAT2
Virtualization, Interrupt moderation
PCI Express slots
Remote management support
RoHS compliant
Advanced cable diagnostics

Tyan Thunder n3600B (S2927) Motherboard
Processor - Dual 1207-pin sockets support AMD Opteron™ (Rev. F) 2000 series processors
1 GHz Hyper-Transport link support
Chipset - nVIDIA NFP3600, SMSC SCH5017
Memory - Dual channel memory bus 8 x 240-pin ECC DDR2 400/533/667 DIMM, up to 32GB
Expansion Slots - PCI-E x16 (x 16 signals), PCI-E x16 (x 8 signals), PCI v2.3 32-bit/33MHz slots
Integrated PCI Graphics
Integrated PCI IDE
6 x SATA2 3.0Gb/s ports
Monitoring - Temperature & voltage monitoring, Watchdog timer support, Port 80 code display LED
Two integrated GbE ports
PCI FireWire (1394) controller
Two 1394 ports
Optional Modules - M3291, IPMI 2.0 Remote System Mgmt. card
BIOS - AMI BIOS on 8Mbit Flash, Supports ACPI 2.0PnP, DMI 2.0, WfM 2.0
Form Factor - ATX footprint

AMD Opteron Processor
AMD Opteron 2212 2.0 GHz
NUMA Architecture
Direct Connect Architecture
Socket F, 1207 pin
AMD Virtualization (AMD-V) technology
Dual-core design


Supermicro SuperChassis 745TQ-800
Form Factor - Tower / 4U chassis support.
Dimensions
178mm x 452mm x 648mm
Gross Weight - 58 lbs (26.3kg)
Expansion Slots - 7x PCI expansion slots
Drive Bays - Hot-swap - 8x 3.5" Hot-swap (SAS / SATA) Drive Bays
Peripheral Bay(s) - 2x 5.25" Peripheral Drive Bay, 1x 5.25" Bay For Floppy Drive
SAS / SATA Hard Drive Backplane with SES2
System Cooling - 3x 5000 RPM, 2x 5000 RPM Rear Exhaust Cooling Fans, Hot-Swappable
System Monitoring - Chassis intrusion switch
Operating Temperature Range - 10 - 35°C (50° to 95° F)
Humidity Range - 8 - 90% non-condensing
Power Supply - 800W AC power supply w/ PFC
AC Voltage - 100 - 240V, 50-60Hz, 10-4 A
Output Ratings - +5V standby: 4 A; +12V: 66 A; +5V: 30 A; +3.3V: 24 A; -12V: 0.6 A


Proceed to the next part

Part II






Well...after all this groping around, let's see what Mr. X came up with!

Storage Scalability requirement: 13 TB
User Base: 25 simultaneous clients.
Mr. X decided to go with a standard, mid-tower setup with a standard ATX form factor. The first and the most crucial component is the selection of the chipset and a server motherboard that utilizes the full features of the chipset. There are quite a few server motherboards available in market. What do we have to look for w.r.t building a fileserver? Here:

Chipset and Motherboard: The front-side bus connects the processor and the chipset, which in turn interfaces to the other devices of the system through various interfaces like PCI, AGP, etc. The speed of the FSB is a crucial factor which, if selected improperly, can lead to bottlenecks in the system. A slow FSB will cause the CPU to spend significant amounts of time waiting for data to arrive from system memory. In today's market, the FSB is being overtaken by other technologies, like Intel's shared bus architecture, or the more industry-standard HyperTransport technology adopted by many vendors. We shall choose HyperTransport technology and AMD processors for their various advantages in both compatibility and performance. HyperTransport speeds extend from 400MHz to 2.6GHz; the most common speed available in today's chipsets is in the range of 1 GHz. There are various vendors making AMD chipsets in the market; we pick the nVidia nForce Professional 3600 chipset for our purpose. Considering the processor and chipset we chose, and comparing various server motherboards in the market, we've reached a conclusion on the Tyan Thunder n3600B (S2927) dual-processor motherboard with the nVidia nForce Professional 3600 chipset. A detailed explanation of the board's features can be found at http://www.tyan.com/product_board_detail.aspx?pid=175.

I really wish it could be this one, I really really do:
http://www.tyan.com/product_board_detail.aspx?pid=554
But as you see, we've got a budget of $2000 to stick to, and this buddy's a big stretch. However, if your disk server really has a huge load to serve, go for it (along with scaling the other components).
The most notable features are:

FSB: 1 GHz HyperTransport link
Processor Support: 2 AMD Socket F (1207) sockets supporting AMD Opteron™ 2000 series Rev. F processors
Chipset: nVidia nForce Professional™ 3600 chipset
Memory Support: 32 GB DDR2 ECC 667/533/400 DIMMs
Expansion Slots: 1 PCI-E x16 + 1 PCI-E x8 + PCI

Processor: We have limited the choice of processor to the vendors Intel® and AMD, as they are the most commonly available in the market today. As both provide comparable performance on a single-processor motherboard, what is left to check is the architecture and scalability to more than one processor. Intel has been following a shared bus architecture, which utilizes a single bus to fetch from and write to the memory and devices. When we use more than one processor, this becomes difficult to scale. Intel has created many workarounds, mainly various caching and prefetching techniques, but the architectural constraint still remains. The AMD architecture, on the other side, has adopted the industry-standard point-to-point HyperTransport technology, eliminating the need for a shared FSB, and simplifying the architecture by eliminating many interfacing chipsets and using standard tunnel chips for interfacing. (Wow, I didn't know I could create such a big and complex sentence!) In multiprocessor systems AMD uses the Direct Connect Architecture, where the microprocessor is directly connected to the other CPUs through a proprietary extension running on top of two additional natively implemented HyperTransport interfaces; whereas Intel multiprocessors have to hit the shared bus each time they access memory, resulting in latency. In Xeon processors, they try to circumvent this limitation by providing a good amount of cache (4MB). The argument between the Intel and AMD architectures can go on, with both sides having their pros and cons. Anyway, I feel it's silly to fight over the powers of processors, and if you wanna, check out the various discussion forums where they do that. However, in light of the points mentioned here, and considering the lower cost of the AMD chip, we chose to go with Socket 1207 AMD 2000 series Rev F processors; specifically, the AMD Opteron 2212 (dual proc) 2.0 GHz. Besides, we chose an AMD motherboard, remember?

Memory: The speed of the memory is directly related to its interconnect with the processor. For a data-centric application we require the system to be highly stable, so we eliminate any possibility of overclocking and choose ECC technology. Fully buffered memory is used in some systems, especially HP and Mac systems, for more stability; but considering the cost and the performance degradation of fully buffered memory, we choose against using it. ECC, proper cooling, and proper clocking should suffice for our requirement. We can limit the speed to 667MHz, as a higher speed would increase the possibility of memory errors. As memory in our setup is in a NUMA (local to each processor) configuration, the speed does not have to scale as more processors are added. The memory supported by the chipset is 32GB, 16GB per processor, so we have plenty of room to scale.

Disk Array: The nVidia 3600 chipset has support for RAID 0, 1, 0+1, 5, and JBOD. However, it is better to offload the RAID handling from the chipset to a separate controller, specifically built to handle RAID, with more manageability and backup features. The main features we look for here are the interconnect speed and the manageability and functionality it provides. Manageability includes local and remote monitoring, support for standards like SNMP, firmware updates, etc. The interconnect for most modern systems is PCIe, in various lane configurations (32x, 16x, 8x, 4x and 1x). Considering the speed of the SATA disks available in the market today, which support 3.0 Gbps transfer rates, and the number of disks handled by the controller, we would go for a 3ware 9650SE-12ML, which has superb RAID level support, including RAID 6. It also provides an 8-lane PCIe interface, which theoretically should give a maximum transfer speed of 20 Gbps under the PCIe 1.0 specification of 2.5 Gbps per lane. The 3ware card does automatic assignment of LUNs for volumes greater than 2TB on OSes which don't support them, and it also provides 64-bit LBA addressing. If we check, at a very rough level, whether there might be any bottlenecks, by taking the average disk speed as 50-54 MBps for a 12-disk configuration, we find that it leaves plenty of headroom on the PCIe 8x bus.
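That rough check, with the figures assumed above (54 MBps per disk, 2.5 Gbps per PCIe 1.0 lane, and dividing raw Mbps by 10 to account for 8b/10b encoding when converting to data MBps), looks like this:

```shell
# Worst-case streaming load of the array vs. usable PCIe x8 bandwidth.
disks=12
disk_mbps=54                         # assumed average MBps per SATA disk
array=$((disks * disk_mbps))         # total array throughput, MBps
lane_raw_mbps=2500                   # PCIe 1.0: 2.5 Gbps per lane (raw)
bus=$((8 * lane_raw_mbps / 10))      # x8 link; /10 converts raw Mbps to data MBps (8b/10b)
echo "array: ${array} MBps, bus: ${bus} MBps"
```

The array's sustained throughput comes to roughly a third of what the x8 link can carry, so the bus is nowhere near being the bottleneck.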


Disks: The cost per GB of SATA hard disks had dropped steeply by 2007, and it is still going down. Capacity and speed are the main metrics of hard disk performance, but there are many other server-class features in today's hard disks, like reliability (measured as MTBF, mean time between failures), noise level, heat generation, NCQ, RAID optimization, etc. The traditional fileserver storage component is the SCSI disk, offering high reliability and speed. However, today the SATA and SAS technologies have become so mature that major storage giants have entire product lines released on them. We choose SATA disks for our storage server, due to the increased reliability at a relatively low price. Specifically, we choose the Western Digital RE2 RAID Edition 500GB SATA disks. They have special optimizations for performing better in RAID environments, and they have server-class storage metrics like high MTBF, 16MB cache, NCQ support, etc. They are not as fast as 10k RPM disks, but overall, comparing RAID performance, reliability and other metrics, we can say these disks are one of the best choices for building a storage server at this price.
Network Interface Cards:
No matter what the performance of the disk server is, what comes out of the NAS box is what the user gets, and that's what really matters. Whatever throughput the server can deliver under a specified load should be what the user base gets, so we have to carefully choose a NIC which can handle such throughputs. Comparing various products in the market, we have chosen the Intel PRO/1000 PT Quad Port Server Adapter. It has a quad-port configuration (4 x 1Gbps) and a 4-lane PCIe interface to the mainboard. Dual controllers handle the network traffic of the 4 ports, and the ports can be bonded in failover or aggregation mode, providing a virtual 4Gbps link under a single IP address. The PCIe interface ensures that data can be supplied to the controllers at the rate they demand, eliminating any bottleneck on the interface side (4 x 2.5 = 10 Gbps). Remember that, these being standard PCIe devices, the card can be replaced by an FC interface card to convert the box to SAN storage in the future, if needed (provided you have the software to support it).
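On the Linux side, the bonded 4 Gbps link would be brought up roughly like this. The interface names (eth1-eth4), the bond name and the address are illustrative assumptions, and the switch must also support 802.3ad for the aggregate to form:

```shell
# Load the bonding driver in 802.3ad (LACP) mode with link monitoring.
modprobe bonding mode=802.3ad miimon=100
# Give the aggregate a single IP, then enslave the four GbE ports.
ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth1 eth2 eth3 eth4
```

Clients then see one address; the driver and the switch distribute flows across the four physical links.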


Chassis: The choice of chassis is one of the often-overlooked aspects of building a server, not just for cooling, but also for the various features that can be integrated into it. Among the important features are redundant power supplies, hot-swappable disks, cooling, and expansion. The stability of the system depends a lot on the running temperature, the power input, proper shielding from interference, etc. For these reasons, even though it is the costlier option, I'm gonna go with a chassis which is specifically built to handle fileserver-type loads and environments. The Supermicro SuperChassis 745TQ-800 is a high-availability chassis with a lot of useful features needed in server-type architectures. Some of the key features are a redundant 800W high-efficiency power supply, 8 x 3.5" SAS/SATA hot-swappable drive bays, 7 full-height, full-length expansion slots, 3x hot-swappable cooling fans and 2x 5000 RPM hot-swappable rear-exhaust fans.

Thursday, April 17, 2008

How to make a NAS box in less than 2000 $ a.k.a "Cheap tension-free storage for SME's"

Part I



Storage, as we know, is one of the main troubles in the head of an IT dude taking care of his company’s infrastructure.

Especially if the company is a small one going large, and the CTO is screaming at you for every dollar being spent on IT goods.
Here's the story of a system administrator Mr X. (...and let's call his manager Mr. Pointy Hair - for obvious reasons...).
He was recruited by a startup company (let's call it Conundrum Exporters Limited). His responsibility was to set up and lead the technical department (not that there would be many people to lead; the tech department in a non-tech SME numbers around… 1). When he joined, there were around 10 people across the various departments. The company was projected to grow to around 100 people, and thereafter maintain a slow rate of recruitment.

When you are a small company, you start small. With a population of only around 10 people, Mr. X thought, "Well, what do we need more than a RAID box and rsync backup to a 500GB local disk??? I love my simple life!"


Hmmm... well, that happiness lasted about 20 days, 3 hours and 43 seconds. Then the enchanting moment came... and the RAID storage crashed. Not just one disk, but two. Great old Sir Murphy used Mr. X to prove his theory, as he regularly does to most of us.
After 14 cups of coffee, hours of fiddling around with rsync options and the backup drives, and running around the cubicles asking people to restore whatever they had on their local machines, he got around 90% of the data restored. For the remaining 10%, he was barbecued by management (without sauce!).
Mr. X knew now the time had come to get a proper storage infrastructure. He went to his boss:

Mr X: I need 10,000$ to build a good storage and backup system here.
Mr. Pointy Hair: I’ll give you 10 bucks!
Mr. X: Wha??...buta…10…dadda…mama…jaba…
Mr. Pointy Hair: Okay, now you are making no sense!

After hours of crying, begging, pleading and signing a bond to stay in the company for one more year in return for the budget allocated for storage, Mr. X bargained his way up to a low but decent sum – $3000. Mr. X had to stop bargaining when his boss started to look something like this:


He knew it was time to escape. But what was he supposed to do with 3000$? He could not even build a chicken farm with that!
So Mr. X sat down, got his genius working, and set out to build something that is the gist of this entire post...he built a NAS box with his own hands; and mankind was never the same again!

But how did he do that with the pocket money that Mr. Pointy Hair gave him? Read on…




Mr. X is just one of the hundreds of system administrators in emerging markets like India and China. There are a lot of SMEs who cannot afford enterprise-class storage and backup solutions, but who need them all the same. And to top it off, these organizations typically have multinational clients, and have to maintain high standards when it comes to data protection and reliability.
Struggling with all these issues, a typical Mr. X would look for a solution which is as cost-effective as possible to start with, and which is scalable in both size and speed to grow with the organization, all without compromising on reliability.
Mr. X had been reading a few tech journals, and remembered once reading an article on NAS. It was something about a black box which stores your files, was easy to set up and maintain, and it did the laundry and the cooking too! At least that is what Mr. X remembered. So he decided to do some digging, and found out more about NAS boxes. Let's do that too.

Well, what is NAS? The definition of NAS is very broad. It reminds me of the p2p concept: there are a hundred ways to implement both, the concepts being so broad. Network Attached Storage – does it mean that anything I can attach to the network which serves files can be a NAS?
Frankly, I don't know. And I don't want to know. I have seen people sell NAS boxes which are nothing but extended ATX boxes with SATA disks attached to a RAID controller, running Windows Storage Server 2003. I have also seen better, more modular and highly extensible solutions, where external storage arrays are attached to NAS controllers through an interconnection fabric, running embedded Linux. Let's just leave the definition at "a multi-platform black box which exports various network filesystems, is easy to set up, and is supposed to run without much downtime or maintenance". Whatever comes beyond that is fancy stuff... desirable, but fancy (like Jeremy Jones).

So Mr. X finally settled on a NAS box as the solution to his burdens. No fancy stuff to start with… no clustering, no virtualization… just a plain vanilla NAS box. But for the money he has, he cannot buy a good NAS box. He looked around some online shops and found that a decent NAS box with the features he needs would cost around $5000. If he wants good returns for every penny he's got, he has to build it on his own.
The main feature Mr. X focused on was low cost, but he had to achieve it without compromising on those much-talked-about enterprise standards.

Most of us know that a growing business needs a scalable and extensible storage solution. A SAN would not make sense, because most SMEs (Small and Medium Enterprises) never grow to that kind of complexity, and more importantly, the company would close down (because Mr. Pointy Hair got a stroke after reading the check for the FC switches). The ideal solution is a NAS box. But again, not just "any" NAS box...

Mr. X sat down and did a small requirement analysis for his company. And finally, when the vending machine was out of coffee, and when the clock struck midnight, his notebook looked something like this:




1. Bandwidth Required
Mr. X’s company was a typical SME. They had a steadily increasing workforce, and most of the people work on spreadsheets, documents and some images. Some of them store movies in their home directories too, but such people are not accounted for (they will be hunted down and torched!). This means that the bandwidth per user is not huge, and the IOPS (I/O operations per second) are not that demanding. Later we will talk about another guy, Mr. Y (ya, right), who works in a completely different scenario: a visual effects studio, or a geologic or weather research center, for example.

So in the end he tossed the coin around a couple of times, did some adjustments, and came up with a ballpark figure of about 8 Mbps (1 MBps) per user. (Okay, it's not a ballpark figure; he netstat-stalked a dozen of the pretty employees and averaged the results.)
A 100 Mbps backplane would be congested if everyone started using the network at the same time, because with a hundred users we need 1 MBps x 100 = 100 MBps total, and a 100 Mbps network practically gives only around 10 MBps. Even a 1 Gbps backplane will not be enough if not properly planned, because 1 Gbps = 125 MBps in theory, and around 90 MBps in practice. So Mr. X had better think ahead and plan the network properly, with separate VLANs, routes and gateways. Anyway, this is not a networking tutorial, so we will concern ourselves with storage. Since 1 Gbps gear is cheap nowadays, Mr. X decided that Gigabit would be a better option than 100 Mbps.
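The sizing arithmetic above fits in a few lines. This is just a back-of-the-envelope sketch: the per-user rate and the link-efficiency factors are Mr. X's rough estimates, not measurements.

```python
# Back-of-the-envelope backplane sizing, using Mr. X's assumed numbers:
# 1 MBps per user, 100 users, ~80% usable efficiency on Fast Ethernet
# and ~72% on Gigabit (matching the ~10 MBps and ~90 MBps figures above).

def required_mbps(users, mbps_per_user=1.0):
    """Total demand in MBps if every user is active at once."""
    return users * mbps_per_user

def practical_mbps(link_mbps, efficiency):
    """Usable MBps for a link: rated Mbps / 8, scaled by real-world efficiency."""
    return link_mbps / 8.0 * efficiency

demand   = required_mbps(100)           # 100 MBps worst-case demand
fast_eth = practical_mbps(100, 0.80)    # ~10 MBps usable on Fast Ethernet
gigabit  = practical_mbps(1000, 0.72)   # ~90 MBps usable on Gigabit

print(demand, fast_eth, gigabit)        # 100.0 10.0 90.0
```

Note that even the Gigabit figure falls short of the worst-case demand, which is why bonding multiple GbE ports (or careful VLAN planning) matters.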


2. Reasonable Scalability
There are NAS boxes on the market which offer scalability in the range of 2 TB to hundreds of TB. A customer wants his storage to be as scalable as possible, so that it expands with the needs of the organization. Scalability is a feature on which most SMEs are ready to make a one-time investment, because typically their data will not span more than a few dozen terabytes, but they want to add storage only when required.
But the box should be built in such a way that if, tomorrow, by some wild stroke of luck, Mr. X's company grows into a big enterprise, they can still reuse it. Most vendors today provide this through iSCSI support.
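For a flavor of what that iSCSI support looks like on a Linux-based box of this era: with the iSCSI Enterprise Target, exporting a block device takes a couple of lines in /etc/ietd.conf. The IQN and device path below are made up for illustration.

```
Target iqn.2008-04.com.example:supernas.lun0
    # Export one block device as LUN 0 (hypothetical path)
    Lun 0 Path=/dev/sdb,Type=fileio
```

An initiator on a bigger SAN can then log in to this target and treat the exported LUN as local disk, which is what makes the box reusable if the company outgrows plain NAS.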


3. Sustained Availability

Usually, the most important component of a system fails when it is most needed. And it remains a headache to avoid, or deal with, failures in equipment, especially storage. Typical levels of fault tolerance that we can provide include:
· different levels of RAID protection
· multiple standby power inputs
· redundant system components (multiple interconnects between the NAS engine and storage arrays, redundant internal buses, etc.)
Most SMBs have one or two run-of-the-mill, general-purpose system administrators who are not experts in storage. So if you give them a storage server and tell them to decide the RAID levels, assign LUNs and trunk the network ports, all based on their needs, then even if you are a friendly sales guy, they will think you are hostiles from Mars. So instead of attracting an angry mob with torches, make the box as user-friendly as possible, hiding most of the unnecessary complexity. Put in a few self-healing abilities if possible, and some redundant failover hardware.
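To see why the RAID-level decision is worth hiding from a generalist admin, here is a rough sketch (my own illustrative helper, not anything from a real controller's tooling) of how usable capacity trades against redundancy, assuming identical disks:

```python
# Usable capacity for common RAID levels, assuming identical disks.
# raid1 models a simple n-way mirror; raid10 assumes an even disk count.

def usable_tb(level, disks, disk_tb=0.5):
    if level == "raid0":
        return disks * disk_tb          # striping only: no redundancy
    if level == "raid1":
        return disk_tb                  # mirror: one disk's worth of space
    if level == "raid10":
        return (disks // 2) * disk_tb   # mirrored pairs, striped together
    if level == "raid5":
        return (disks - 1) * disk_tb    # one disk's worth of parity
    if level == "raid6":
        return (disks - 2) * disk_tb    # two disks' worth of parity
    raise ValueError("unknown RAID level: %s" % level)

# Four 500 GB drives, as in the build above:
for lvl in ("raid0", "raid10", "raid5", "raid6"):
    print(lvl, usable_tb(lvl, 4))
```

With four 500 GB drives, RAID 5 keeps 1.5 TB usable while surviving one disk failure; RAID 6 and RAID 10 drop to 1.0 TB but survive more failure patterns. That trade-off is exactly the kind of thing the box should decide, or at least default sensibly, for the user.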


4. Performance

The performance of the system must be acceptable and must scale up to a certain threshold. That means that once you say the box gives x MBps of bandwidth, it should always give x MBps, no matter what. Typically, performance starts to drop either when the user base increases or when the free space on the NAS shrinks beyond a limit. We can define limits on the number of users, the IOPS and the attainable bandwidth, but within those limits we should deliver the promised performance irrespective of external factors.


5. Price

Price is one of the main deciding factors for an SMB customer. Especially in a non-IT-sector company, the management has to be convinced of the advantages of the investment made in storage and servers. Most of the time, the drive from management will be to start with whatever is immediately necessary: you want 2 TB now, buy 2 TB now, and upgrade later if we have to. “Why should we invest in something which will not give us value for the next year?” would be one sensible question that management would ask most of the time.
Let’s see what Mr. X came up with!