Context Switching and Flash Memory - Linux


  1. Context Switching and Flash Memory

    Hi,

    I was thinking about doing a small undergraduate research project for a
    class and wanted to get some thoughts on whether or not this is worth
    doing, or if there is something obviously wrong with the concept.

    As I understand it, traditional filesystem calls like read()/write()
    have the following general flow:

    1) the system call will cause one context switch to jump into the kernel
    2) the kernel will process that call and queue up a request with the DMA
    driver
    3) the kernel will then context switch into the next scheduled process
    4) the second process runs for a while
    5) the DMA interrupt occurs, causing an additional switch back into the
    kernel
    6) the kernel wakes up the original process (one more switch)
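
    For concreteness, here is a minimal program (an illustrative sketch
    only, not part of the flow above) whose read() can be watched crossing
    the user/kernel boundary with strace -T, which prints the time spent
    inside each system call:

        /* minimal_read.c -- compile with: gcc -o minimal_read minimal_read.c
         * Run as: strace -T ./minimal_read <file> */
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            char buf[4096];
            int fd;
            ssize_t n;

            if (argc < 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
            }
            fd = open(argv[1], O_RDONLY);
            if (fd < 0) {
                perror("open");
                return 1;
            }
            /* If the data is not already cached, the process sleeps here
             * until the disk interrupt arrives (steps 3-6 above). */
            n = read(fd, buf, sizeof buf);
            printf("read %zd bytes\n", n);
            close(fd);
            return 0;
        }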

    Context switching is generally one of the most expensive operations in
    the OS. First, there is the direct cost of saving and loading the CPU
    state for each context, but there are also the secondary costs of cache
    misses once you're in the new context.

    Traditionally, each such system call puts the calling process to sleep
    while the DMA request is being processed. Hard drives have extremely
    long seek times (~10 ms), so it makes sense to schedule another
    process in the meantime.

    With newer flash memory and new solid state disks, however, seek time is
    several orders of magnitude faster (e.g., I found one solid state drive
    with a 20 microsecond access time here:
    http://www.curtisssd.com/products/drives/hyperxclr/).

    For comparison, I found a recent paper here
    (http://portal.acm.org/citation.cfm?id=1281700.1281702) that quantifies
    the cost of a context switch as anywhere from several microseconds to a
    thousand microseconds per switch.

    So here's my question: Would it perhaps be worthwhile to just stay in
    the kernel and busy-wait rather than switch into another process? This
    would save two context switches.

    Here are some more questions:

    How could this be implemented?

    Use PIO in the kernel? Could DMA actually be used in this scheme?

    If this scheme does yield any performance gains, at what point do the
    cycles wasted busy-waiting outstrip the cycles that could have been
    spent doing useful work (after the costs of the switches plus cache
    misses are factored in)? A rough back-of-the-envelope is sketched
    below.

    Is this even worth looking into?
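
    To make the trade-off concrete (all numbers assumed purely for
    illustration, not measured): call the per-switch cost t_switch and the
    device latency t_dev. Blocking wastes roughly 2 * t_switch plus the
    cache-refill penalty, while busy-waiting wastes the full t_dev of CPU
    time, so very roughly:

        polling wins  when  t_dev  <  2 * t_switch + cache-refill cost

    With t_switch = 5 us the break-even sits somewhere above 10 us, which
    is why a 20 us flash access is borderline interesting while a 10 ms
    disk seek is hopeless for polling.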

    I do not have a huge amount of experience with the Linux kernel, so
    forgive me if any of these are stupid questions.

    thanks,
    -perry

  2. Re: Context Switching and Flash Memory

    Perry Hung wrote:

    > With newer flash memory and new solid state disks, however, seek time is
    > several orders of magnitude faster (e.g., I found one solid state drive
    > with a 20 microsecond access time here:
    > http://www.curtisssd.com/products/drives/hyperxclr/).


    > So here's my question: Would it perhaps be worthwhile to just stay in
    > the kernel and busy-wait rather than switch into another process? This
    > would save two context switches.


    It probably depends on how much data you're trying to read/write. If
    the actual data transfer will take a significant amount of time, you're
    better off doing something else while you wait.

    It may well be worth doing for small accesses on fast devices.
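
    To put illustrative numbers on that (assumed figures, not from the
    post): at a sustained transfer rate of, say, 100 MB/s,

        4 KB / 100 MB/s  ~  40 us   (polling is at least conceivable)
        1 MB / 100 MB/s  ~  10 ms   (plenty of time to run something else)

    so the answer likely flips somewhere between small block reads and
    bulk transfers.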

    Chris

  3. Re: Context Switching and Flash Memory

    Perry Hung wrote:

    > As I understand it, traditional filesystem calls like
    > read()/write() have the following general flow:
    >
    > 1) the system call will cause one context switch to jump into
    > the kernel
    > 2) the kernel will process that call and queue up a
    > request with the DMA driver
    > 3) the kernel will then context switch into the next scheduled
    > process
    > 4) the second process runs for a while
    > 5) the DMA interrupt occurs, causing an additional switch back
    > into the kernel
    > 6) the kernel wakes up the original process (one more switch)


    You forgot about the I/O cache. Linux uses almost all free RAM as a
    cache, which it fills with readahead data (there is actually a syscall,
    readahead(2), that one can use to force the kernel to read ahead a
    given file). Once a file has been read ahead into the cache, the read
    call can finish within a very short time. Storage I/O is only
    initiated if the requested data is not in the cache.
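
    A minimal sketch of that (my illustration; readahead(2) is
    Linux-specific, and posix_fadvise() with POSIX_FADV_WILLNEED is the
    portable way to ask for roughly the same thing):

        /* prefetch.c -- ask the kernel to pull a whole file into the page
         * cache ahead of time. Compile: gcc -o prefetch prefetch.c */
        #define _GNU_SOURCE             /* for readahead(2) */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int main(int argc, char **argv)
        {
            struct stat st;
            int fd;

            if (argc < 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
            }
            fd = open(argv[1], O_RDONLY);
            if (fd < 0 || fstat(fd, &st) < 0) {
                perror(argv[1]);
                return 1;
            }
            /* Populate the page cache so later read() calls on this file
             * can be served from RAM instead of going to the disk. */
            if (readahead(fd, 0, st.st_size) < 0)
                perror("readahead");
            close(fd);
            return 0;
        }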

    The more RAM loaded applications use, the smaller that cache gets, so
    if you need a highly responsive system under large I/O loads, having a
    lot of RAM can give you a great performance boost.

    Also, storage I/O is not performed on a per-process basis (it was, up
    until 2.4.18), as kernel 2.6 introduced a sophisticated I/O scheduler
    subsystem that queues up the I/O requests of multiple programs and
    executes them in batches, thus saving a lot of usermode/kernelmode
    transitions. And as far as I know, a usermode->kernelmode transition
    is not that expensive, as it is not a full context switch.

    Wolfgang Draxinger
    --
    E-Mail address works, Jabber: hexarith@jabber.org, ICQ: 134682867


  4. Re: Context Switching and Flash Memory

    Perry Hung writes:

    > Hi,
    >
    > I was thinking about doing a small undergraduate research project for
    > a class and wanted to get some thoughts on whether or not this is
    > worth doing, or if there is something obviously wrong with the concept.
    >
    > As I understand it, traditional filesystem calls like read()/write()
    > have the following general flow:
    >
    > 1) the system call will cause one context switch to jump into the kernel


    There are many different kinds of context switches. A system call, for
    one, is a highly optimized path and cheaper than a context switch
    between threads. A context switch between threads sharing the same
    memory is comparatively cheap, and a context switch to or from a
    kernel thread is similarly cheap. Then there are the context switches
    that also have to switch the virtual memory context; those are the
    expensive ones.

    > 2) the kernel will process that call and queue up a request with the
    > DMA driver
    > 3) the kernel will then context switch into the next scheduled process
    > 4) the second process runs for a while
    > 5) the DMA interrupt occurs, causing an additional switch back into the
    > kernel
    > 6) the kernel wakes up the original process (one more switch)
    >
    > Context switching is generally one of the most expensive operations in
    > the OS.


    Even the most expensive context switches tend to be cheaper than, say,
    a page fault that requires zeroing a fresh page.


    > First, there is the direct cost of saving and loading the CPU
    > state for each context, but there are also the secondary costs of
    > cache misses once you're in the new context.


    Before you commit to grand theoretical designs, I would suggest you
    measure how costly things really are. Using a profiler like oprofile
    can be very instructive.

    Sometimes great designs are built on very faulty assumptions. For
    example, I remember reading the data sheets of a now long-dead ethernet
    NIC. Most modern NICs are quite similar (all small variations on the
    old AMD Lance theme), but this one had a very unusual design that
    seemed to be built on the basic assumption that copies are faster than
    converting virtual memory addresses to physical ones. On Linux such a
    conversion is, in most cases, only a simple subtraction from the
    address, so that assumption was totally wrong. Still, they built a
    briefly commercially shipped design around that faulty idea.

    Your assumptions seem to be a bit similar I'm afraid.

    > For comparison, I found a recent paper here
    > (http://portal.acm.org/citation.cfm?id=1281700.1281702) that
    > quantifies the cost of a context switch as anywhere from several
    > microseconds to a thousand microseconds per switch.


    Relying on third-party numbers is always dangerous. Measure, measure,
    measure.
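
    In that spirit, one classic way to measure it yourself (a minimal
    sketch, not a rigorous benchmark): two processes ping-pong a byte over
    a pair of pipes, so every round trip forces two context switches.
    Pinning both to one CPU, e.g. with taskset -c 0, gives cleaner numbers,
    since otherwise the two processes may simply run on different cores.

        /* ctxbench.c -- rough context-switch cost via pipe ping-pong.
         * Compile: gcc -O2 -o ctxbench ctxbench.c  (add -lrt on old glibc)
         * Run:     taskset -c 0 ./ctxbench */
        #include <stdio.h>
        #include <time.h>
        #include <unistd.h>

        #define ITERS 100000

        int main(void)
        {
            int ab[2], ba[2];               /* parent->child, child->parent */
            char c = 0;
            struct timespec t0, t1;
            double ns;
            int i;

            if (pipe(ab) < 0 || pipe(ba) < 0) {
                perror("pipe");
                return 1;
            }
            if (fork() == 0) {              /* child: echo each byte back */
                for (i = 0; i < ITERS; i++) {
                    read(ab[0], &c, 1);
                    write(ba[1], &c, 1);
                }
                _exit(0);
            }
            clock_gettime(CLOCK_MONOTONIC, &t0);
            for (i = 0; i < ITERS; i++) {   /* one round trip = two switches */
                write(ab[1], &c, 1);
                read(ba[0], &c, 1);
            }
            clock_gettime(CLOCK_MONOTONIC, &t1);

            ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
            printf("~%.0f ns per switch (including pipe overhead)\n",
                   ns / (2.0 * ITERS));
            return 0;
        }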

    > Traditionally, each such system call puts the calling process to sleep
    > while the DMA request is being processed. Hard drives have extremely
    > long seek times (~10 ms), so it makes sense to schedule another
    > process in the meantime.


    In most cases the idle task takes over during the delay, and switching
    from/to idle is very cheap.

    > Use PIO in the kernel? Could DMA actually be used in this scheme?


    PIO is generally very, very slow compared to anything else in a modern
    system. Don't even think about it.

    > Is this even worth looking into?


    I doubt it is worthwhile with hard disks, to be honest. It might be
    useful with extremely fast devices -- e.g. if you program a GPU
    (graphics card) you would probably poll for its completion. But hard
    disks are still incredibly slow compared to the CPU.

    > I do not have a huge amount of experience with the Linux kernel, so
    > forgive me if any of these are stupid questions.


    I would suggest learning a profiler like oprofile and using it.

    -Andi
