What still uses the block layer? - Kernel



Thread: What still uses the block layer?

  1. Re: What still uses the block layer?

    On Mon, 15 Oct 2007 03:04:00 CDT, Rob Landley said:
    > I note that the eth0 and eth1 names are dynamically assigned on a first come
    > first serve basis (like scsi). This never causes me a problem because the
    > driver loading order is constant, and once you figure out that eth0 is
    > gigabit and eth1 is the 802.11g it _stays_ that way across reboots, reliably.
    > Yeah, it's a heuristic. Hands up everybody relying on such a heuristic in
    > the real world.


    I've gotten burned by that heuristic enough times to not rely on it. My last
    laptop had an ethernet on the motherboard, a *separate* ethernet in the docking
    station, an ethernet on a multifunction pcmcia card (I usually just used the
    modem side), and a wireless that looked like an ethernet - so it was possible
    for a given interface to be eth1 (if no dock and no pcmcia card) or eth3 (if
    both were present). And that's on a laptop from almost 5 years ago.

    And then there's the recent Sun and Dell 1U rack-mounts that have 4 ethernets
    on the motherboard, and they *never* seem to assign in a 0,1,2,3 order that
    matches the 0 1 2 3 printed above the 4 RJ45s.

    So I have for years been a proponent of 'ethN is nailed by MAC address'
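
A minimal sketch of what "ethN is nailed by MAC address" means in practice, assuming a hand-maintained table of MAC addresses. The table, the MAC values, and the function name are made up for illustration; this is not how udev or any distro tool actually implements it.

```python
# Hypothetical table: MAC address -> the name the administrator wants.
PINNED = {
    "00:11:22:33:44:55": "eth0",  # made-up MAC for the onboard port
    "00:11:22:33:44:66": "eth1",  # made-up MAC for the wireless card
}


def name_interfaces(probe_order, pinned=PINNED):
    """Return {mac: name} for MACs listed in the order drivers found them.

    Pinned MACs always get their configured name. Anything unknown gets
    the lowest ethN that is neither reserved nor already handed out.
    """
    taken = set(pinned.values())
    names = {}
    next_free = 0
    for mac in probe_order:
        if mac in pinned:
            names[mac] = pinned[mac]
            continue
        while f"eth{next_free}" in taken:
            next_free += 1
        names[mac] = f"eth{next_free}"
        taken.add(names[mac])
    return names
```

Calling `name_interfaces` with the two pinned MACs in either order returns the same mapping, which is the whole point: the name no longer depends on which driver happened to load first.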



  2. Re: What still uses the block layer?

    On Wed, 17 Oct 2007, Valdis.Kletnieks@vt.edu wrote:

    > On Mon, 15 Oct 2007 03:04:00 CDT, Rob Landley said:
    >> I note that the eth0 and eth1 names are dynamically assigned on a first come
    >> first serve basis (like scsi). This never causes me a problem because the
    >> driver loading order is constant, and once you figure out that eth0 is
    >> gigabit and eth1 is the 802.11g it _stays_ that way across reboots, reliably.
    >> Yeah, it's a heuristic. Hands up everybody relying on such a heuristic in
    >> the real world.

    >
    > I've gotten burned by that heuristic enough times to not rely on it. My last
    > laptop had an ethernet on the motherboard, a *separate* ethernet in the docking
    > station, an ethernet on a multifunction pcmcia card (I usually just used the
    > modem side), and a wireless that looked like an ethernet - so it was possible
    > for a given interface to be eth1 (if no dock and no pcmcia card) or eth3 (if
    > both were present). And that's on a laptop from almost 5 years ago.
    >
    > And then there's the recent Sun and Dell 1U rack-mounts that have 4 ethernets
    > on the motherboard, and they *never* seem to assign in a 0,1,2,3 order that
    > matches the 0 1 2 3 printed above the 4 RJ45s.
    >
    > So I have for years been a proponent of 'ethN is nailed by MAC address'


    on the other hand, I have two systems in my lab with identical hardware,
    loaded with the same OS image, but one calls the interfaces eth0, eth1,
    eth2 while the other calls them eth12, eth13, eth14 because it had three
    quad cards installed in it for a few days several months ago.

    also think what happens to a system if you replace a failed NIC with a
    card that is identical except for its MAC address. instead of everything
    just working as before, you now have new ethX devices and are missing the
    old ethX devices.

    both ways of doing things can yield nonsense results in cases where the
    other one gives perfectly useable results.

    nobody is arguing that the ability to nail things down by MAC address
    (or drives by UUID) should be removed, we're just arguing that the option
    to get useable consistent names from hardware that is consistent is being
    removed and that it shouldn't be, it has its place just like the 'best
    effort' naming.
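
The two failure cases described in this post can be reproduced with a toy model, sketched here under the assumption that the naming scheme remembers every MAC it has ever seen and never reuses a name. The class, the short string labels standing in for MAC addresses, and the numbering policy are illustrative only.

```python
class PersistentNamer:
    """Toy model of 'remember every MAC ever seen' interface naming."""

    def __init__(self):
        self.rules = {}  # mac -> name, entries are never forgotten

    def boot(self, macs_present):
        """Name the NICs present at this boot, recording any new ones."""
        for mac in macs_present:
            if mac not in self.rules:
                self.rules[mac] = f"eth{len(self.rules)}"
        return [self.rules[mac] for mac in macs_present]


# Placeholder labels instead of real MAC addresses.
onboard = ["on0", "on1", "on2"]
quad_cards = [f"quad{i}" for i in range(12)]  # three quad-port cards

box_a = PersistentNamer()
names_a = box_a.boot(onboard)

box_b = PersistentNamer()
box_b.boot(quad_cards + onboard)   # the few days with the cards installed
names_b = box_b.boot(onboard)      # cards removed again

# Replacing a failed NIC: same slot, new MAC.
names_after_swap = box_a.boot(["on0", "on1", "replacement"])
```

In this model `names_a` is eth0, eth1, eth2 while `names_b` is eth12, eth13, eth14 for identical hardware, and the replacement card comes up as eth3 with eth2 gone.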

    David Lang
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: What still uses the block layer?

    On Tue, Oct 16, 2007 at 01:55:07PM -0700, david@lang.hm wrote:

    > why is this any different from the external enclosures? they have always
    > appeared as the type of device that connects them to the motherboard, (and
    > even with SCSI, there are some controllers that don't generate sdX devices)


    In the past enclosures supported only one kind of connector so this
    assumption was fine. But nowadays an external disk may have several
    connectors (like USB, Firewire and eSata). Why should the disk's name
    depend on what type of cable did I manage to grab first? It is the
    _same_ disk regardless of the cable type.

    There is one thing however that could be improved: renaming a disk in an
    udev rule should propagate the new name back to the kernel, just like
    renaming an ethernet interface does. That way mapping error messages to
    physical disk locations could be made much easier.

    Gabor

    --
    ---------------------------------------------------------
    MTA SZTAKI Computer and Automation Research Institute
    Hungarian Academy of Sciences
    ---------------------------------------------------------

  4. Re: What still uses the block layer?

    Gabor Gombas wrote:
    > On Tue, Oct 16, 2007 at 01:55:07PM -0700, david@lang.hm wrote:
    >> why is this any different from the external enclosures? they have always
    >> appeared as the type of device that connects them to the motherboard, (and
    >> even with SCSI, there are some controllers that don't generate sdX devices)

    >
    > In the past enclosures supported only one kind of connector so this
    > assumption was fine. But nowadays an external disk may have several
    > connectors (like USB, Firewire and eSata). Why should the disk's name
    > depend on what type of cable did I manage to grab first? It is the
    > _same_ disk regardless of the cable type.


    Yes, but even udev won't give you one and the same symlink to the disk's
    device file then (*). There isn't a persistent unique target/unit property
    which all of these transports have in common.

    The only thing that could be common in the best case is the symlink to
    the partition's device file, based on filesystem UUID or filesystem label.

    (*) unless you write your own rule specific to this one particular enclosure
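
As a rough illustration of the filesystem-UUID symlink mentioned above: on systems where udev populates `/dev/disk/by-uuid`, the link can be resolved to whatever device node the partition currently has. The helper name and its `by_uuid_dir` parameter are assumptions made for this sketch so that it can be pointed at a test directory.

```python
import os


def device_for_uuid(uuid, by_uuid_dir="/dev/disk/by-uuid"):
    """Return the device node a filesystem-UUID symlink points at, or None.

    The result depends only on the filesystem's UUID, not on whether the
    enclosure was attached via USB, Firewire or eSATA this time.
    """
    link = os.path.join(by_uuid_dir, uuid)
    if not os.path.islink(link):
        return None
    return os.path.realpath(link)
```

The same UUID may resolve to a different node after the disk is replugged over another cable, but the symlink name itself stays the same, which is what makes it usable in a mount configuration.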
    --
    Stefan Richter
    -=====-=-=== =-=- =---=
    http://arcgraph.de/sr/

  5. Re: What still uses the block layer?

    On Wed, 17 Oct 2007, Gabor Gombas wrote:

    > On Tue, Oct 16, 2007 at 01:55:07PM -0700, david@lang.hm wrote:
    >
    >> why is this any different from the external enclosures? they have always
    >> appeared as the type of device that connects them to the motherboard, (and
    >> even with SCSI, there are some controllers that don't generate sdX devices)

    >
    > In the past enclosures supported only one kind of connector so this
    > assumption was fine. But nowadays an external disk may have several
    > connectors (like USB, Firewire and eSata). Why should the disk's name
    > depend on what type of cable did I manage to grab first? It is the
    > _same_ disk regardless of the cable type.


    the right type for the type of cable you choose to use. yes it's the same
    disk, but by choosing to hook it up in a different way you get different
    results from it (different performance, different predictability)

    again, if you want to have a udev rule that then maps these different names
    onto the same name, more power to you, but why do you insist on making
    _everyone_ work that way (or go to significant extra effort to find the
    info in the changing directory structure of sysfs to track down the info
    that you throw away)

    > There is one thing however that could be improved: renaming a disk in an
    > udev rule should propagate the new name back to the kernel, just like
    > renaming an ethernet interface does. That way mapping error messages to
    > physical disk locations could be made much easier.


    definitely.

    David Lang

  6. Re: What still uses the block layer?

    Jeff Garzik wrote:

    >> But again, please remember that these USB devices are really SCSI
    >> devices. Same for SATA devices. There is a reason they are using the
    >> SCSI layer, and it isn't just because the developers felt like it

    >
    > /somewhat/ true I'm afraid: libata uses the SCSI layer for ATAPI
    > devices because they are essentially bridges to SCSI devices. It uses
    > the SCSI layer for ATA devices because the SCSI layer provided a huge
    > amount of infrastructure that would need to have been otherwise
    > duplicated, /then/ massaged into coordinating between the new ATA
    > layer and the SCSI layer when dealing with ATAPI.
    >
    > There is also a detail that was of /huge/ value when introducing a new
    > device class: distro installers automatically work, if you use SCSI. If
    > you use a new block device type, that behaves differently from other
    > types and is on a different major, you have to poke the distros into
    > action or do it yourself.
    >
    > IOW, it was the high Just Works(tm) value of the SCSI layer when it came
    > to ATA (not ATAPI) devices.
    >
    > For the future, ATA will eventually be more independent (though the SCSI
    > simulator will be available as an option, for compat), but the value is
    > big enough to put that task on the back-burner.
    >

    I remember being told that I didn't understand the problem when I
    suggested using ide-scsi for everything and just hiding the transport. I
    get great pleasure from having been (mostly) right on that one. I still
    have old systems running ZIP drives as scsi...

    --
    Bill Davidsen
    "We have more to fear from the bungling of the incompetent than from
    the machinations of the wicked." - from Slashdot

  7. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Tue, Oct 16, 2007 at 05:34:15PM +1000, Nick Piggin wrote:
    > > It's a hard call. The I/O time for 1MB of contiguous disk data
    > > is about the I/O time of 512 bytes of contiguous disk data.

    >
    > And if you're thrashing, then by definition you need to throw
    > out 1MB of your working set in order to read it in.


    Right. But you need a differential hit rate of only a few percent on
    that 1020 extra KB of data you swapped in versus the 1MB of data you
    swapped out for this to be advantageous.

    With "differential hit rate" I mean the chances of getting a hit on
    the 1MB of data just paged in, minus the chances of getting a hit on
    the 1MB of data just paged out.

    With a little luck that 1MB that is paged out didn't get used for
    quite a while, while there is a hint that the 1MB you're paging in
    is active, as one of its sub-pages just got a hit.
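
The break-even point in this argument can be estimated with a back-of-the-envelope model. This is only a sketch: the 12 ms positioning time and 60 MB/s transfer rate are assumed figures, not measurements from the thread, and the model counts each net hit on an extra page as exactly one avoided single-page fault.

```python
def io_time_ms(nbytes, seek_ms=12.0, bandwidth_mb_s=60.0):
    """Positioning time plus transfer time for one contiguous read."""
    return seek_ms + nbytes / (bandwidth_mb_s * 1e6) * 1e3


def break_even_hit_rate(cluster_bytes=1 << 20, page_bytes=4096, **disk):
    """Per-page differential hit rate at which clustering pays for itself.

    'Differential' means hit probability of a page just paged in minus
    hit probability of a page just paged out, as defined in the post.
    """
    t_page = io_time_ms(page_bytes, **disk)
    t_cluster = io_time_ms(cluster_bytes, **disk)
    extra_pages = cluster_bytes // page_bytes - 1
    # Extra time spent on the big read, divided by the time each net hit
    # on one of the extra pages would save later.
    return (t_cluster - t_page) / (extra_pages * t_page)
```

With the assumed disk figures the break-even rate comes out below one percent per extra page, which is in line with the "only a few percent" estimate; a slower transfer rate pushes it up.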

    So... IMHO, it would be useful to implement something that pages out
    chunks of memory larger than a single hardware page. This would reduce
    the size of the memory management tables (*), as well as improve disk
    throughput if things DO come to paging....

    This should of course be configurable. Some workloads are better off
    with a virtual page size of 8k, some with 128k, some with 1M.

    As far as I can see, the "page-cluster" parameter defines how many
    pages are selected for page-out at a time. This increases
    the page-out efficiency. Improving the page-in efficiency is also
    useful: It is the other half of the equation.
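
For reference, the current value can be read from `/proc/sys/vm/page-cluster`. The kernel's sysctl documentation describes it as a logarithmic value governing how many consecutive pages are read in from swap in a single attempt, so it may already be closer to the page-in side than the paragraph above assumes. A small sketch (the function name and the `path` parameter are illustrative):

```python
def swap_readahead_pages(path="/proc/sys/vm/page-cluster"):
    """Return the number of pages implied by the page-cluster sysctl.

    The file holds a base-2 logarithm: 0 means 1 page, 3 means 8 pages.
    """
    with open(path) as f:
        return 2 ** int(f.read().strip())
```

On a 4 KB page machine, multiplying the result by 4096 gives the cluster size in bytes, which can be compared against the 8k to 1M range suggested above.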

    Roger.


    (*) If the kernel starts working with a 1MB virtual page size, you
    need a 256 times smaller mapping table between processes and memory or
    swap. Of course, the hardware doesn't support this (actually, it does
    for 1MB virtual pages), so you'll have to create 256 page table
    entries for the hardware instead of just one.
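
The factor of 256 in the footnote follows directly from 1 MB / 4 KB, assuming a 4 KB hardware page. As a quick check, with a hypothetical helper:

```python
def mapping_entries(mapped_bytes, page_bytes):
    """Number of mapping entries needed to cover mapped_bytes."""
    return -(-mapped_bytes // page_bytes)  # ceiling division


GIB = 1 << 30
entries_4k = mapping_entries(GIB, 4096)      # one entry per hardware page
entries_1m = mapping_entries(GIB, 1 << 20)   # one entry per 1 MB virtual page
ratio = entries_4k // entries_1m
```

For 1 GiB of mapped memory this gives 262144 entries at 4 KB versus 1024 at 1 MB, a ratio of 256.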



    --
    ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
    ** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
    *-- BitWizard writes Linux device drivers for any device you may have! --*
    Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
    Does it sit on the couch all day? Is it unemployed? Please be specific!
    Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ

  8. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Thursday 18 October 2007 8:00:49 am Rogier Wolff wrote:
    > So... IMHO, it would be useful to implement something that pages out
    > chunks of memory larger than a single hardware page. This would reduce
    > the size of the memory management tables (*), as well as improve disk
    > throughput if things DO come to paging....


    I believe that was more or less the topic of this paper:
    http://kernel.org/doc/ols/2006/ols20...ages-73-78.pdf

    Although these seem sort of tangentially related:
    http://kernel.org/doc/ols/2006/ols20...es-369-384.pdf
    http://kernel.org/doc/ols/2006/ols20...es-125-130.pdf

    Rob
    --
    "One of my most productive days was throwing away 1000 lines of code."
    - Ken Thompson.

  9. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Fri, Oct 19, 2007 at 01:49:31AM -0500, Rob Landley wrote:
    > On Thursday 18 October 2007 8:00:49 am Rogier Wolff wrote:
    > > So... IMHO, it would be useful to implement something that pages out
    > > chunks of memory larger than a single hardware page. This would reduce
    > > the size of the memory management tables (*), as well as improve disk
    > > throughput if things DO come to paging....

    >
    > I believe that was more or less the topic of this paper:
    > http://kernel.org/doc/ols/2006/ols20...ages-73-78.pdf


    Not really. They are talking about doing this for the page
    cache. That's where filesystem files are cached in memory. I'm talking
    about the memory that programs use while they are running.

    Roger.

    --
    ** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
    ** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
    *-- BitWizard writes Linux device drivers for any device you may have! --*
    Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
    Does it sit on the couch all day? Is it unemployed? Please be specific!
    Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ

  10. Re: OOM killer gripe (was Re: What still uses the block layer?)

    Hi!

    > > Would an oom-kill-someone-now sysrq be of help, I wonder?

    >
    > *shrug* It might. I was letting it run hoping it would complete itself when


    sysrq-f, IIRC.

    > it locked solid. (The keyboard LEDs weren't flashing, so I don't _think_ it
    > panicked. I was in X so I wouldn't have seen a message...)
    >
    > (To be honest, I can never remember how to trigger sysrq on a laptop keyboard.
    > Presumably X won't intercept it the way it does alt-f1 and ctrl-alt-del...)


    sysrq works even in X, and should be pressable on today's laptop
    keyboards...
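
When the key combination is hard to find on a laptop, the same request can be issued by writing the command letter to `/proc/sysrq-trigger`, provided the shell is still responsive and you are root. A minimal sketch; the function name and the `trigger_path` parameter are made up here so the helper can be exercised against an ordinary file.

```python
def trigger_sysrq(key, trigger_path="/proc/sysrq-trigger"):
    """Write a single SysRq command letter to the trigger file.

    'f' asks the kernel to run the OOM killer, like sysrq-f on the
    keyboard. Writing to the real file needs root privileges.
    """
    if len(key) != 1:
        raise ValueError("SysRq commands are single characters")
    with open(trigger_path, "w") as f:
        f.write(key)
```

This only helps while something can still run and write to the file; on a machine that has locked solid, the keyboard route is the one left.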
    Pavel
    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html

  11. Re: OOM killer gripe (was Re: What still uses the block layer?)

    Hi!

    > I suppose I should just configure suspending to a file instead of a
    > swap partition, but I've just historically trusted suspend/resume to a
    > swap partition much more than to a file. Or maybe I should hack in a
    > sysctl to prevent any swapping even though the swap partition is
    > configured (so only suspend/resume will use it).


    swapon -a; swsusp; swapoff -a?
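
The one-liner above could be wrapped as follows. This is a sketch under assumptions: "swsusp" is read as shorthand for triggering suspend-to-disk, which is done here by writing `disk` to `/sys/power/state`; the function name and its `run` and `state_path` parameters are introduced only for illustration and testing.

```python
import subprocess


def hibernate_with_temporary_swap(run=subprocess.check_call,
                                  state_path="/sys/power/state"):
    """Enable swap only for the duration of a suspend-to-disk cycle."""
    run(["swapon", "-a"])          # activate the swap areas from /etc/fstab
    try:
        with open(state_path, "w") as f:
            f.write("disk")        # returns after the machine has resumed
    finally:
        run(["swapoff", "-a"])     # back to running without swap
```

Note that `swapoff -a` has to pull anything that did get swapped out back into RAM, so it can take a while or fail if memory is tight after resume.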

    Pavel

    --
    (english) http://www.livejournal.com/~pavelmachek
    (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pav...rses/blog.html
