Gigaswift adapter in an old e250 - SUN

Thread: Gigaswift adapter in an old e250

  1. Gigaswift adapter in an old e250

    There was an unused E250 sitting in a closet and I thought, hey, that
    could be a solid fileserver. So I bought some disks, memory, and a
    Gigaswift adapter, threw Solaris 10 on it, and created a nice ZFS NFS
    server for all our Solaris and Linux clients. About 10 clients and 20
    users. Everything seemed to work fine in my tests, but once I rolled it
    out into production things started slowing down. Not all the time, just
    transient slowdowns where all the users' interactive response went
    horribly wrong. Then it would speed back up and be fine.
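
    For anyone curious, the setup boils down to something like the
    following (disk names and the pool/filesystem names are placeholders,
    not the actual ones):

        # build a raidz pool out of several disks and carve out home dirs
        zpool create tank raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0
        zfs create tank/home
        # export it over NFS to the Solaris and Linux clients
        zfs set sharenfs=on tank/home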

    I believe I tracked it down to the Gigaswift board being too fast for
    the two 400MHz CPUs. The old rule of thumb I had heard was 1MHz of CPU
    power for every 1Mbit/s of Ethernet throughput, so I knew I was cutting
    it close.

    What happens is that during an iperf test, or multiple users doing a
    large cvs checkout, or multiple users doing a build, or any single
    large (>500MB) file transfer, the machine runs at 50% to 70% kernel
    CPU time.
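
    For reference, an iperf run of this sort looks roughly like the
    following (iperf 2.x syntax; the hostname is a placeholder):

        # on the E250 (server side)
        iperf -s
        # on a client: run for 60 seconds, report every 5 seconds
        iperf -c e250-hostname -t 60 -i 5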

    I've updated to the latest ce driver, applied the latest cluster patch
    for Solaris 10, tried the board in both the 33MHz and 66MHz PCI slots,
    played with /etc/system parameters, changed the ce.conf file, and
    tweaked other ce parameters with ndd; nothing really works. I can slow
    my iperf down from 350Mbits/sec to 120Mbits/sec by increasing
    rx_intr_time and rx_intr_pkts to 300 and 600 respectively. This
    decreases kernel CPU utilization to around 35%, but it still slows down
    other users' access to their home directories. Also, kstat reports no
    overflows, failures, drops, errors, or anything else that screams out
    at me.
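
    For anyone wanting to try the same interrupt-blanking change, it
    amounts to something like this (assuming the first ce instance, ce0;
    the 300/600 values are the ones above, and ndd settings don't survive
    a reboot, so they still need to go into ce.conf to persist):

        # point ndd at the first ce instance (ce0)
        ndd -set /dev/ce instance 0
        # coalesce receive interrupts: wait longer and batch more packets
        # before interrupting the CPUs
        ndd -set /dev/ce rx_intr_time 300
        ndd -set /dev/ce rx_intr_pkts 600
        # read one back to confirm it took
        ndd -get /dev/ce rx_intr_time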

    I'm guessing that I shouldn't have put a 1Gbit Ethernet card in an old
    dual-400MHz-CPU E250, or Solaris 10 is optimized for newer machines and
    its defaults just don't match my case, or I don't know. I had a lousy
    weekend trying to get things to work.

    Any ideas on how I can get this to work?
    Thanks,
    Mark

  2. Re: Gigaswift adapter in an old e250

    On Apr 9, 7:18 pm, Mark Valery wrote:

    > I believe I tracked it down to the Gigaswift board being too fast for
    > the two 400MHz CPUs.

    [...]
    > What happens is that during an iperf test, or multiple users doing a
    > large cvs checkout, or multiple users doing a build, or any single
    > large (>500MB) file transfer, the machine runs at 50% to 70% kernel
    > CPU time.
    >

    [...]
    > Any ideas on how I can get this to work?


    Maybe http://freshmeat.net/projects/slowdown/ will help?

    "slowdown is a program that contends with I/O-hungry processes. The
    "nice" program does a good job of handling CPU priorities, but doesn't
    help much when you have a process that is moving tons of data; other
    processes can continue to starve for I/O, making a system painful to
    use, as during a backup, while tripwire is running, etc. slowdown
    manages another process by sleeping for a user-specified number of
    seconds or fractions of seconds, each time some data is moved using,
    for example, read(), write(), send(), recv(), etc."


  3. Re: Gigaswift adapter in an old e250

    Hi,

    Mark Valery wrote:
    > There was an unused E250 sitting in a closet and I thought, hey, that
    > could be a solid fileserver. So I bought some disks, memory, and a
    > Gigaswift adapter, threw Solaris 10 on it, and created a nice ZFS NFS
    > server for all our Solaris and Linux clients. About 10 clients and 20
    > users. Everything seemed to work fine in my tests, but once I rolled it
    > out into production things started slowing down. Not all the time, just
    > transient slowdowns where all the users' interactive response went
    > horribly wrong. Then it would speed back up and be fine.
    >
    > I believe I tracked it down to the Gigaswift board being too fast for
    > the two 400MHz CPUs. The old rule of thumb I had heard was 1MHz of CPU
    > power for every 1Mbit/s of Ethernet throughput, so I knew I was cutting
    > it close.

    I wouldn't say that the GbE board is too fast; rather, as I understand
    your posting, moving from 100Mb/s to 1000Mb/s simply exposes the next
    bottleneck.


    >
    > What happens is that during an iperf test, or multiple users doing a
    > large cvs checkout, or multiple users doing a build, or any single
    > large (>500MB) file transfer, the machine runs at 50% to 70% kernel
    > CPU time.
    >
    > I've updated to the latest ce driver, applied the latest cluster patch
    > for Solaris 10, tried the board in both the 33MHz and 66MHz PCI slots,
    > played with /etc/system parameters, changed the ce.conf file, and
    > tweaked other ce parameters with ndd; nothing really works. I can slow
    > my iperf down from 350Mbits/sec to 120Mbits/sec by increasing
    > rx_intr_time and rx_intr_pkts to 300 and 600 respectively. This
    > decreases kernel CPU utilization to around 35%, but it still slows down
    > other users' access to their home directories. Also, kstat reports no
    > overflows, failures, drops, errors, or anything else that screams out
    > at me.
    >
    > I'm guessing that I shouldn't have put a 1Gbit Ethernet card in an old
    > dual-400MHz-CPU E250, or Solaris 10 is optimized for newer machines and
    > its defaults just don't match my case, or I don't know. I had a lousy
    > weekend trying to get things to work.
    >
    > Any ideas on how I can get this to work?
    > Thanks,
    > Mark

    What about disk I/O? Check iostat -xtcn 1 and vmstat 1 output when
    performance goes down.
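
    Something along these lines, for example, watching the disk busy and
    service-time columns and the kernel CPU time:

        # extended per-disk stats plus tty/CPU summary; watch %b (busy)
        # and asvc_t (active service time) when things slow down
        iostat -xtcn 1
        # watch the cpu "sy" column (kernel time) and the procs "b" column
        # (threads blocked waiting for I/O)
        vmstat 1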

    Also, isn't ZFS more CPU intensive than UFS?

    We only have an E250 as a fileserver; sure, we can bring it to its
    knees using the GbE interface and transferring lots of files, but
    that's just the way it is.

    I would guess that the SCSI interface is the next bottleneck before the
    CPU; if the disk is blocked then nothing happens.



    /michael

  4. Re: Gigaswift adapter in an old e250

    On Thu, 12 Apr 2007 00:19:42 +0200, Michael Laajanen wrote:

    > Hi,
    >


    > What about disk I/O? Check iostat -xtcn 1 and vmstat 1 output when
    > performance goes down.
    >
    > Also, isn't ZFS more CPU intensive than UFS?

    Yes, and I was making it worse by running compression on the raidz
    pool.
    >
    > We only have an E250 as a fileserver; sure, we can bring it to its
    > knees using the GbE interface and transferring lots of files, but
    > that's just the way it is.

    I like the E250 as a fileserver for a group our size. These old
    machines are like tanks: a bit slow, but they just keep running, and I
    feel much safer with all the home dirs in a raidz pool.
    >
    > I would guess that the SCSI interface is the next bottleneck before the
    > CPU; if the disk is blocked then nothing happens.

    I don't get close to this bottleneck yet when using NFS. I can max it
    out when doing copies on the fileserver itself, but no other users log
    into this machine, so that's only a test to let me know what speeds I
    can get into the raidz pool.
    >

    This is what I ended up doing:
    I've got it to a manageable level now until we get a real system
    administrator. I turned off compression on the zfs raidz storage and
    cleaned up a user's Windows machine that was causing Samba on another
    Solaris machine to constantly communicate with ypserv running on the
    E250; looking at it with Ethereal, ypserv was sending 1/4 of the amount
    of data that NFS was. I also slowed the 1Gbit Ethernet board down to
    around 125Mbits/sec of throughput, as measured by iperf, by increasing
    rx_intr_time and rx_intr_pkts to 300 and 600 respectively. All of this
    together has brought kernel CPU percentage down to around 20% during
    peak usage, and I haven't had any complaints about interactive response
    going to hell.
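
    For completeness, the compression change itself is just a property
    flip, roughly as below (pool/filesystem names are placeholders; data
    already on disk stays compressed, only new writes are affected):

        # see what is set now, then turn it off
        zfs get compression tank/home
        zfs set compression=off tank/home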

  5. Re: Gigaswift adapter in an old e250

    On Wed, 11 Apr 2007 11:57:53 -0700, dan.nygre wrote:

    >> Maybe http://freshmeat.net/projects/slowdown/ will help?

    >
    > "slowdown is a program that contends with I/O-hungry processes. The
    > "nice" program does a good job of handling CPU priorities, but doesn't
    > help much when you have a process that is moving tons of data; other
    > processes can continue to starve for I/O, making a system painful to
    > use, as during a backup, while tripwire is running, etc. slowdown
    > manages another process by sleeping for a user-specified number of
    > seconds or fractions of seconds, each time some data is moved using,
    > for example, read(), write(), send(), recv(), etc."


    Thanks. Not sure how I can specifically apply this to the Ethernet
    board's TCP communications. If I do it to nfsd, then I slow all
    connections.

    I've got it to a manageable level now until we get a real system
    administrator. I turned off compression on the zfs raidz storage and
    cleaned up a user's Windows machine that was causing Samba on another
    Solaris machine to constantly communicate with ypserv running on the
    E250; looking at it with Ethereal, ypserv was sending 1/4 of the amount
    of data that NFS was. I also slowed the 1Gbit Ethernet board down to
    around 125Mbits/sec of throughput, as measured by iperf, by increasing
    rx_intr_time and rx_intr_pkts to 300 and 600 respectively. All of this
    together has brought kernel CPU percentage down to around 20% during
    peak usage, and I haven't had any complaints about interactive response
    going to hell.

  6. Re: Gigaswift adapter in an old e250

    > On Wed, 11 Apr 2007 11:57:53 -0700, dan.nygre wrote:
    >> Maybe http://freshmeat.net/projects/slowdown/ will help?

    >
    > > "slowdown is a program that contends with I/O-hungry processes. The
    > > "nice" program does a good job of handling CPU priorities, but doesn't
    > > help much when you have a process that is moving tons of data; other
    > > processes can continue to starve for I/O, making a system painful to
    > > use, as during a backup, while tripwire is running, etc. slowdown
    > > manages another process by sleeping for a user-specified number of
    > > seconds or fractions of seconds, each time some data is moved using,
    > > for example, read(), write(), send(), recv(), etc."


    > On Apr 18, 6:41 pm, Mark Valery wrote:
    > Thanks. Not sure how I can specifically apply this to the Ethernet
    > board's TCP communications. If I do it to nfsd, then I slow all
    > connections.
    >
    > I've got it to a manageable level now until we get a real system
    > administrator.


    I'm glad you have the problem resolved, and thanks for the follow up,
    I was curious how it turned out.

    As for "slowdown", I though you could apply it to CVS or other
    applications that were causing I/O related problems. It seemed like
    you knew what the I/O intensive applications were that were causing
    the problem, and by just capping the I/O allowed on them, the rest of
    the I/O accesses would not be starved.


  7. Re: Gigaswift adapter in an old e250

    On Apr 18, 6:41 pm, Mark Valery wrote:
    > Thanks. Not sure how I can specifically apply this to the Ethernet
    > board's TCP communications. If I do it to nfsd, then I slow all
    > connections.


    In other words, I didn't realize this E250 was configured only as a
    fileserver. I thought it was also running some client applications
    accessing the filesystem, which you could throttle with "slowdown".



  8. Re: Gigaswift adapter in an old e250

    On Mon, 23 Apr 2007 12:46:52 -0700, dan.nygre wrote:

    > On Apr 18, 6:41 pm, Mark Valery wrote:
    >> Thanks. Not sure how I can specifically apply this to the Ethernet
    >> board's TCP communications. If I do it to nfsd, then I slow all
    >> connections.

    >
    > In other words, I didn't realize this E250 was configured only as a
    > fileserver. I thought it was also running some client applications
    > accessing the filesystem, which you could throttle with "slowdown".


    I wasn't thinking beyond the fileserver. I could slow the client
    applications on the users' computers that access their home
    directories, where their cvs workspaces reside.
