How to build cheap cloud storage (67 terabyte 4U servers for $7,867) - Backblaze Blog - http://blog.backblaze.com/2009...
Sep 2, 2009
Amazing level of transparency and detail about their custom storage servers. HN discussion at http://news.ycombinator.com/item... (discusses why this is appropriate for backup, but perhaps not generic storage needs)
- Bret Taylor
Put me down for two.
- Jason Shellen
45 drives per unit and many units means they must be constantly replacing failed hard drives - just due to the sheer quantity of them in use
- Jacob Old
It wasn't entirely clear to me from the blog post what you have to do to replace a drive. Looks like at minimum you have to remove the unit from the rack, and I don't see any drawer guides or similar to assist with that. And do they have to take the unit offline to replace a single drive?
- Jason Wehmhoener
Geez. Back in 1998, Microsoft was bragging about their 1 TB cloud... :-) Millions of $ then I think.
- Mitchell Tsai
One happy Backblaze customer checking in.
- Russellreno
sounds neat - now what to do with 67 TB of storage...
- Matt Ellsworth
So, they store their data "securely" in Palo Alto? That makes me scared.
- Jonas S Karlsson
Quoted from the blog: "Backblaze Storage Pods are building blocks upon which a larger system can be organized that doesn’t allow for a single point of failure." They have indicated an amazing amount of cost savings.
- Wins Fern
Mitchell: I don't think 1TB was "millions of dollars" in 1998.
- Steve de Mena
Nice idea. Pity that it only supports an HTTPS interface, not surprising at that cost though (the software that runs the filesystems on the NetApp and other devices isn't exactly cheap to write). Anyone see if they quoted transfer speeds? I'm wondering what impact the four SATA cards, each with SATA port multipliers on them, have on access speeds.
- Russ
Steve: according to http://www.littletechshoppe.com/ns1625... disk cost ~$0.08 / MB in 1998, which comes out to >$800,000 for 1 TB or just over a million bucks in today's dollars. So maybe not millions, but a million!
- Karl Rosaen
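The per-terabyte figure above is simple multiplication, so it is worth checking. A minimal sketch, taking the quoted ~$0.08/MB at face value and assuming 1 TB = 1,000,000 MB (the binary variant is shown for comparison; the function name is illustrative only). Under those assumptions the result is roughly $80,000 per TB, an order of magnitude below the $800,000 quoted, so either the per-MB price or the total deserves a second look.

```python
# Back-of-the-envelope check of the 1998 disk-cost figure from the comment.
COST_PER_MB_1998 = 0.08   # USD per MB, as quoted in the comment (not verified)
MB_PER_TB_DECIMAL = 1_000_000      # assumption: decimal units
MB_PER_TB_BINARY = 1024 * 1024     # assumption: binary units, for comparison


def cost_per_tb(cost_per_mb: float, mb_per_tb: int) -> float:
    """Scale a per-megabyte price up to a per-terabyte price."""
    return cost_per_mb * mb_per_tb


decimal_cost = cost_per_tb(COST_PER_MB_1998, MB_PER_TB_DECIMAL)
binary_cost = cost_per_tb(COST_PER_MB_1998, MB_PER_TB_BINARY)

print(f"1 TB at $0.08/MB (decimal units): ${decimal_cost:,.0f}")
print(f"1 TB at $0.08/MB (binary units):  ${binary_cost:,.0f}")
```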
Russ: It runs Debian. If you were rolling your own (and they don't sell these units), you could turn on NFS or some other protocol (CIFS, iSCSI). They only use HTTP because it's cloud storage. NFS license is a major expense on NetApp, but all the major Linux distributions can act as NFS servers, CIFS servers, and probably iSCSI targets.
- Andy Dustman
Andy: I know that you could do that on them, but it leaves the problem of what to do with the storage. You could merge the 3 volumes into an LVM VG, but performance could become an issue with any load on it. It seems I wasn't the only one to question the performance. While the views of a Sun engineer aren't exactly unbiased, this post does highlight some of the downsides: http://www.c0t0d0s0.org/archive...
- Russ
Russ, great find, thanks.
- Jason Wehmhoener
Fascinating article; but more questions: "In rough terms, every time one of our customers buys a hard drive, Backblaze needs another hard drive." -- so what happens when a drive fails; how much redundancy is there? What happens when a meteorite destroys the whole building; is there off site backup too? (I know this *is* the off-site backup, but still...) I wonder how much data flows in and out over time. Maybe I should just read their website.
- Rob Fisher
Rob: they mention using 15-drive RAID6 volumes, each of which can lose up to 2 drives without losing data
- Mike Chelen
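To put numbers on that layout, here is a rough sketch of what RAID6 parity costs per pod. It uses the 45 drives and 15-drive volumes mentioned in this thread, and assumes the 67 TB in the title is raw capacity split evenly across drives. Filesystem overhead is ignored, and the helper name is made up for illustration.

```python
def raid6_usable_drives(drives_per_volume: int) -> int:
    """RAID6 spends two drives' worth of capacity on parity per volume."""
    if drives_per_volume < 4:
        raise ValueError("RAID6 needs at least 4 drives")
    return drives_per_volume - 2


DRIVES_PER_POD = 45       # from the thread
DRIVES_PER_VOLUME = 15    # from the thread
RAW_TB_PER_POD = 67       # from the post title; assumed to be raw capacity

volumes = DRIVES_PER_POD // DRIVES_PER_VOLUME
tb_per_drive = RAW_TB_PER_POD / DRIVES_PER_POD           # derived, about 1.49 TB
usable_drives = volumes * raid6_usable_drives(DRIVES_PER_VOLUME)
usable_tb = usable_drives * tb_per_drive

print(f"{volumes} volumes, {usable_drives} data drives out of {DRIVES_PER_POD}")
print(f"Usable capacity: about {usable_tb:.1f} TB of {RAW_TB_PER_POD} TB raw")
print(f"Drive failures tolerated per volume: 2")
```

Under these assumptions about 13% of the raw capacity goes to parity, and each volume keeps working through any two drive failures.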
The worst part about this cluster design is the fact that I couldn’t shut up about it for the first couple days after finding out about it. It was the solution I proposed to every problem. There were complaints.
- A Mitchell
IMO RAID6 is not that great. Granted, it's highly unlikely to lose 3 drives at the same time, but there's still a possibility. Besides, for a write-intensive app, parity calculation is quite time-consuming. I personally prefer RAID 10 (a striped array of RAID1 pairs). Yes, the effective usable space is only half the total capacity, but for backups -- which will sooner or later be used to restore something -- I prefer data integrity over usage efficiency.
- Pandu ● IT Optimizer
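The RAID6 versus RAID 10 trade-off above can be made concrete with a little counting. This is a simplified sketch: it assumes drive failures are simultaneous, independent, and uniformly random, and it ignores rebuild windows and correlated failures. The 7-pair (14-drive) RAID 10 array is a hypothetical size picked to be close to a 15-drive RAID6 volume, and the function names are illustrative.

```python
from math import comb


def raid10_survival(pairs: int, failed: int) -> float:
    """Chance a RAID 10 array survives `failed` random simultaneous failures.

    The array survives only if no mirror pair loses both of its drives.
    """
    total_drives = 2 * pairs
    if not 0 <= failed <= total_drives:
        raise ValueError("failed must be between 0 and the number of drives")
    if failed > pairs:
        return 0.0  # more failures than pairs forces some pair to lose both
    # Choose which pairs are hit, then which drive in each of those pairs.
    surviving_patterns = comb(pairs, failed) * 2 ** failed
    return surviving_patterns / comb(total_drives, failed)


def raid6_survival(failed: int) -> float:
    """RAID6 survives any pattern of up to two failed drives, and none beyond."""
    return 1.0 if failed <= 2 else 0.0


PAIRS = 7  # hypothetical 14-drive RAID 10 array
for k in range(1, 4):
    print(f"{k} failed: RAID6 {raid6_survival(k):.0%}, "
          f"RAID 10 {raid10_survival(PAIRS, k):.0%}")
```

Under these assumptions RAID6 always survives two failures, while the RAID 10 array survives two about 92% of the time. With three failures RAID6 is lost, and RAID 10 survives about 77% of the time. Neither layout is strictly safer; which failure pattern matters more depends on the workload and rebuild times.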