Managing data sprawl - Storage


Thread: Managing data sprawl

  1. Managing data sprawl

    Over the last couple of years, certain types of data have been
    exploding in our enterprise (probably true for others too). These
    include source code, binary images (and many, many variations of
    those), all kinds of documents (Word, Excel), wikis, etc. They are
    kept as regular files (available via NAS) because they must be
    available on demand. They are not used heavily, but they are not
    something that can simply be archived to backup and restored when
    needed. In a sense they are like reference data, but some
    modifications do happen from time to time.

    This kind of data is mushrooming: it grows by roughly 20-40TB+ per
    year, and the rate is increasing. Is there a way to reduce the amount
    of this data by:
    - Inline compression. I saw some discussion about storewiz, but
    nobody seems to have used it. I wonder why?
    - Moving it to some kind of CAS box transparently and then pulling
    it back in on access?

    Is this type of growth happening in other enterprises and businesses?
    I would suspect it is.
    How do you manage this?

    Sorry if I am repeating some earlier discussion.

    Thanks

    Sam


  2. Re: Managing data sprawl

    On Feb 6, 2:15 pm, lal2g...@gmail.com wrote:
    > - inline compression. I saw some discussion about storewiz, but
    > nobody seems to have used that. Wonder why?


    Some new companies are looking into these types of technologies. It's
    very new, but promising. I know we could use something like a StoreWiz
    in our data center. The issue I have had with them so far is
    scalability. If they performed faster, I think my boss would probably
    consider them.

    Dvy


  3. Re: Managing data sprawl

    On Feb 6, 3:13 pm, dvymil...@yahoo.com wrote:
    > Some new companies are looking into these types of technologies. It's
    > very new, but promising. I know we could use something like a StoreWiz
    > in our data center. The issue I have had with them so far is
    > scalability. If they performed faster, I think my boss would probably
    > consider them.
    >
    > Dvy

    Thanks.
    What other companies? Any pointers?

    Also, come to think of it, doing this might require a newer type of
    filesystem. That means letting go of our big iron, which may not
    happen so easily. If the data were read-only it might be easier, but
    read/write access (I should say updates) is going to make it harder.
    Do you see this sprawl in your datacenter?


  4. Re: Managing data sprawl

    On Feb 6, 4:46 pm, lal2g...@gmail.com wrote:
    > Thanks.
    > What other companies? Any pointers?
    >
    > Also, come to think of it, doing this might require a newer type of
    > filesystem. That means letting go of our big iron, which may not
    > happen so easily. If the data were read-only it might be easier, but
    > read/write access (I should say updates) is going to make it harder.
    > Do you see this sprawl in your datacenter?


    I don't know of the companies by name. I do see this sprawl in my
    DC... I suspect everyone has similar problems...


  5. Re: Managing data sprawl

    Where do you store these not-so-often-used files (but used often
    enough)... NetApp R300 types?

    On Feb 6, 4:57 pm, dvymil...@yahoo.com wrote:
    > I don't know of the companies by name. I do see this sprawl in my
    > DC... I suspect everyone has similar problems...




  6. Re: Managing data sprawl

    On Feb 7, 9:43 am, lal2g...@gmail.com wrote:
    > Where do you store these not-so-often-used files (but used often
    > enough)... NetApp R300 types?

    We leave them on our 960. We don't know when a file will be accessed,
    so we leave them there.


  7. Re: Managing data sprawl

    On 7 Feb 2007 10:02:25 -0800, dvymiller@yahoo.com wrote:

    >We leave them on our 960. We don't know when a file will be accessed,
    >so we leave them there.


    Simply put in a DFS or automount infrastructure (depending on CIFS or
    NFS) and move them as you see fit. As long as users are not mounting
    the filers directly, you can move the data to R200s, 3020s with SATA,
    or even a Sun host with disk hanging off of it.
    Anything that's cheaper than a 900-class filer.

    ~F

  8. Re: Managing data sprawl

    You could look at products such as the Centera from EMC, or RISS from
    HP. Both are designed to store reference data. They both have good
    indexing tools to enable you to find the information once you off line
    it!

    Jc


  9. Re: Managing data sprawl

    On Feb 8, 9:50 am, "Jc" wrote:
    > You could look at products such as the Centera from EMC, or RISS from
    > HP. Both are designed to store reference data. They both have good
    > indexing tools to enable you to find the information once you off line
    > it!
    >
    > Jc


    Yes. But that requires you to move data from one type of storage to
    another, so the data has to be classified first. I haven't looked at
    these products; do they offer an NFS front end? Also, modification is
    rare but not exactly uncommon. In that case the data has to move from
    a RISS type of platform to another filer and then be migrated again.
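
    The classification step mentioned above could start as something as
    simple as an age report. Below is a minimal sketch, assuming that
    "last modified more than N days ago" is an acceptable first cut for
    "reference data". The function name classify_by_age and the 180-day
    threshold are made up for illustration; modification time is used
    because access times are often unreliable on volumes mounted with
    noatime.

```python
import os
import time
from pathlib import Path


def classify_by_age(root, cold_after_days=180, now=None):
    """Split files under root into 'active' and 'cold' by last modification time.

    cold_after_days is an arbitrary illustrative threshold, not a recommendation.
    """
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * 86400
    result = {"active": [], "cold": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Files untouched since the cutoff are candidates for a cheaper tier.
            result["cold" if mtime < cutoff else "active"].append(path)
    return result
```

    Running it over one export and summing the sizes in the "cold" list
    gives a rough idea of how much data would be a candidate for moving,
    before committing to any particular product.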


  10. Re: Managing data sprawl

    On Feb 8, 6:23 pm, lal2g...@gmail.com wrote:
    > Yes. But that requires you to move data from one type of storage to
    > another, so the data has to be classified first. I haven't looked at
    > these products; do they offer an NFS front end? Also, modification is
    > rare but not exactly uncommon. In that case the data has to move from
    > a RISS type of platform to another filer and then be migrated again.


    Sumandra, there are many types of platforms that try to move data
    inline, like NeoPath and Acopia. From my experience with them, they
    are like a forklift change to my network and too much hassle. If
    what you are experiencing is too much file growth, then do look at
    storewiz; it will actually help reduce duplicates and manage chaotic
    growth. I'd caution you to really test out their performance: unless
    they have made major improvements, you are dead in the water.
    Another area where we could use their help is with backup. I had a
    separate post related to this, which Faender eloquently helped me with
    as usual, but if I could keep my backups on tier 2 and have them
    compressed with a device like storewiz, that would truly aid me. I
    don't know if they can do that or not. It would involve them speaking
    to a backup server, I suppose.
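
    Before relying on any device to "reduce duplicates", it may be worth
    measuring how many exact duplicates exist at all. The following is a
    rough sketch of whole-file duplicate detection by content hash. It
    is not how any of the products named in this thread work internally;
    the function names find_duplicates and reclaimable_bytes are
    hypothetical, and SHA-256 is simply a convenient choice from the
    Python standard library.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return {sha256: [paths]} for every group of byte-identical files under root."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        for path in paths:
            try:
                by_hash[sha256_of(path)].append(path)
            except OSError:
                continue
    return {h: sorted(p) for h, p in by_hash.items() if len(p) > 1}


def reclaimable_bytes(duplicates):
    """Bytes that would be freed if only one copy per group were kept."""
    return sum(os.path.getsize(p[0]) * (len(p) - 1) for p in duplicates.values())
```

    Grouping by size first avoids hashing files that cannot possibly
    match, which matters on a filer holding tens of terabytes. Note that
    this only finds exact copies; near-identical build images would need
    block-level or variable-chunk techniques that this sketch does not
    attempt.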


  11. Re: Managing data sprawl

    On 8 Feb 2007 20:17:05 -0800, dvymiller@yahoo.com wrote:

    >Sumandra, there are many types of platforms that try to move data
    >inline, like NeoPath and Acopia. From my experience with them, they
    >are like a forklift change to my network and too much hassle. If
    >what you are experiencing is too much file growth, then do look at
    >storewiz; it will actually help reduce duplicates and manage chaotic
    >growth. I'd caution you to really test out their performance: unless
    >they have made major improvements, you are dead in the water.
    >Another area where we could use their help is with backup. I had a
    >separate post related to this, which Faender eloquently helped me with
    >as usual, but if I could keep my backups on tier 2 and have them
    >compressed with a device like storewiz, that would truly aid me. I
    >don't know if they can do that or not. It would involve them speaking
    >to a backup server, I suppose.


    I do not recall StorWiz doing any de-duplication, merely compression.

    And like so many other things in life, your mileage will vary with
    compression.
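
    One way to find out how much the mileage varies is to sample your
    own data before talking to a vendor. Here is a minimal sketch that
    streams files through zlib and reports the compressed/original ratio
    per file extension. zlib is only a stand-in from the Python standard
    library; an inline appliance may use a different algorithm and get
    different numbers. The function names are made up for illustration.

```python
import os
import zlib
from collections import defaultdict


def compressed_size(path, level=6, chunk_size=1 << 20):
    """Return (original_bytes, compressed_bytes) for one file, streamed through zlib."""
    compressor = zlib.compressobj(level)
    original = compressed = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            original += len(chunk)
            compressed += len(compressor.compress(chunk))
    compressed += len(compressor.flush())
    return original, compressed


def ratio_by_extension(root):
    """Return {extension: compressed/original} aggregated over all files under root."""
    totals = defaultdict(lambda: [0, 0])  # extension -> [original, compressed]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                original, compressed = compressed_size(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished file
            totals[ext][0] += original
            totals[ext][1] += compressed
    # Skip extensions whose files were all empty to avoid dividing by zero.
    return {ext: comp / orig for ext, (orig, comp) in totals.items() if orig}
```

    A ratio near 1.0 means that file type is already compressed or
    otherwise incompressible, so inline compression buys little there.
    Source code and plain-text wikis typically land much lower.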

    ~F
