Bakul Shah wrote:
>>Hmm. I'm not sure if it can or not. I'll try to explain what I'm
>>dreaming of. I currently have about 1000 clients needing access to the
>>same pools of data (read/write) all the time. The data changes
>>constantly. There is a lot of this data. We use NFS currently.

> Sounds like you want SGI's clustered xfs....

Maybe so - GFS also looks perfect; the only drawback is the license,
but that isn't *that* big of an issue. Porting it may be easier?

>>I'll be honest here - I'm not a code developer. I would love to learn
>>some C here, and 'just do it', but filesystems aren't exactly simple, so
>>I'm looking for a group of people that would love to code up something
>>amazing like this - I'll support the developers and hopefully learn
>>something in the process. My goal personally would be to do anything I
>>could to make the developers work most productively, and do testing. I
>>can probably provide equipment, and a good testbed for it.

> If you are not a seasoned programmer in _some_ language, this
> will not be easy at all.

Well, that depends on whether you call Perl programming 'real'
programming. Perl just isn't quite the right tool for this job
(although I'm sure some out there would argue otherwise).

> One suggestion is to develop an abstract model of what a CFS
> is. Coming up with a clear detailed precise specification is
> not an easy task either but it has to be done and if you can
> do it, it will be immensely helpful all around. You will
> truly understand what you are doing, you have a basis for
> evaluating design choices, you will have made choices before
> writing any code, you can write test cases, writing code is
> far easier etc. etc. Google for clustered filesystems.
> The citeseer site has some papers as well.

Thanks - this is a great suggestion. I'll try to come up with
something. Really, the truth is (now that I have read even more docs)
that Red Hat's GFS is exactly what I would like for FreeBSD. They
already have all the components, etc. I would prefer a BSD-licensed
piece of software, but mostly I just want something that works on
FreeBSD.
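To make the "spec first" suggestion above concrete: one cheap way to start is a toy executable model of the filesystem, small enough to state an invariant (here, "no node's cache ever holds stale data") and test design choices against it before writing any kernel code. Everything below (class names, the write-invalidate rule) is invented purely for illustration, not a real CFS design:

```python
# Toy abstract model of a clustered FS: one authoritative store
# plus a per-node cache, with writes invalidating remote copies.

class ClusterModel:
    def __init__(self, nodes):
        self.store = {}                       # authoritative blocks
        self.caches = {n: {} for n in nodes}  # each node's cached view

    def write(self, node, block, data):
        # Spec: a write updates the store and invalidates every
        # other node's cached copy of that block.
        self.store[block] = data
        self.caches[node][block] = data
        for n, cache in self.caches.items():
            if n != node:
                cache.pop(block, None)

    def read(self, node, block):
        # Spec: a read fills the node's cache from the store on a miss.
        if block not in self.caches[node]:
            self.caches[node][block] = self.store[block]
        return self.caches[node][block]

    def coherent(self):
        # The invariant the model exists to check: no stale caches.
        return all(cache[b] == self.store[b]
                   for cache in self.caches.values()
                   for b in cache)
```

A model like this is trivially wrong in all the interesting ways (no concurrency, no failures), but it gives you a precise statement of what "in sync" means, which is exactly the point of writing the spec first.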

> A couple FS specific suggestions:
> - perhaps clustering can be built on top of existing
> filesystems. Each machine's local filesystem is considered
> a cache and you use some sort of cache coherency protocol.
> That way you don't have to deal with filesystem allocation
> and layout issues.

I see - that's an interesting idea. Each machine could mount the
shared version read-only, then add a layer on top connected to a cache
coherency manager (maybe a daemon on each node, with the nodes syncing
their caches over the network) to keep the filesystems 'in sync'. Then
perhaps only one elected node actually writes the data to the disk; if
that node dies, another node is elected.
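The elected-writer idea above can be sketched in a few lines. The "lowest live node ID wins" rule here is an invented placeholder; a real election/lock protocol has to handle partitions, ties, and nodes that only look dead, and is far more involved:

```python
# Sketch of a single elected writer: only the current master touches
# the shared disk, and when it fails the survivors elect a new one.

class WriterElection:
    def __init__(self, node_ids):
        self.live = set(node_ids)
        self.master = min(self.live)   # initial election
        self.disk = {}                 # stands in for shared storage

    def fail(self, node_id):
        self.live.discard(node_id)
        if node_id == self.master:
            # The writer died: elect the lowest-numbered survivor.
            self.master = min(self.live)

    def write(self, node_id, block, data):
        # Non-master nodes would forward the write to the master over
        # the network; here we just route it and note who committed.
        assert self.master in self.live
        self.disk[block] = data
        return self.master
```

The attraction of this design is that only one node ever mutates on-disk metadata, so the coherency protocol only has to propagate invalidations, not arbitrate concurrent writers.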

> - a network wide stable storage `disk' may be easier to do
> given GEOM. There are at least N copies of each data block.
> Data may be cached locally at any site but writing data is
> done as a distributed transaction. So again cache
> coherency is needed. A network RAID if you will!

I'm not sure how this would work. A network RAID with geom+ggate is
simple (I've done this a couple times - cool!), but how does that get me
shared read-write access to the same data?
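As I understand the suggestion, the answer is the "distributed transaction" part: a write only commits once every replica has accepted it, so all N copies stay identical and any node can then serve reads from its local copy. A toy two-phase-commit sketch (all names invented; real GEOM/ggate plumbing would live underneath this, and a real protocol also needs recovery for a coordinator that dies mid-commit):

```python
# Each block lives on N replicas; a write either lands on all of
# them or on none of them (toy two-phase commit, no crash recovery).

class Replica:
    def __init__(self):
        self.blocks = {}    # committed data
        self.staged = {}    # prepared-but-uncommitted writes
        self.alive = True

    def prepare(self, block, data):
        if not self.alive:
            return False
        self.staged[block] = data
        return True

    def commit(self, block):
        self.blocks[block] = self.staged.pop(block)

    def abort(self, block):
        self.staged.pop(block, None)

class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, block, data):
        # Phase 1: every replica must stage the write.
        ready = [r for r in self.replicas if r.prepare(block, data)]
        if len(ready) < len(self.replicas):
            for r in ready:
                r.abort(block)      # someone refused: undo everywhere
            return False
        # Phase 2: all replicas commit, so every copy stays identical.
        for r in ready:
            r.commit(block)
        return True
```

So the shared read-write access comes from the invariant the transaction buys you: after any successful write, all copies agree, and cache coherency is only needed to invalidate readers who cached the old value.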

> But again, let me stress that one must have a clear *model*
> of the problem being solved. Getting distributed programs
> right is very hard even at an abstract model level.
> Debugging a distributed program that doesn't have a clear
> model is, well, for masochists (nothing against them -- I
> bet even they'd rather get their pain some other way:-)

I understand. Any nudging in the right direction here would be
appreciated.


Eric Anderson
Sr. Systems Administrator
Centaur Technology
A lost ounce of gold may be found, a lost moment of time never.