What still uses the block layer? - Kernel


Thread: What still uses the block layer?

  1. Re: What still uses the block layer?

    On Monday 15 October 2007 1:00:15 am Greg KH wrote:
    > If you hate USB storage devices using scsi, please use the ub driver,
    > that is what it was written for.


    For the embedded space, the ability to configure out the scsi layer is
    interesting from a size perspective. I bookmarked that a while back, but had
    forgotten about it. Thanks for the reminder.

    For the desktop I don't object to the scsi layer. I object to the naming.
    Merging a half-dozen different types of devices into a single name space, and
    then warning us that the order they appear within that namespace could be the
    result of race conditions... Seems like an artificially inflated problem to
    me. Don't merge them together and each namespace is a smaller problem, often
    with only a single device or with a stable relationship between the devices.

    (That said, the answer to my original question, "is the block layer still in
    use" seems to be yes, so creating a 00-INDEX for Documentation/block is a
    good thing, and I'll go do that. I acknowledge that I asked this question
    _horribly_, due to having other unresolved issues with the scsi layer...)

    > When did usb-storage devices ever show up as /dev/usb0? USB flash disks
    > are really SCSI devices, look at the USB storage spec for proof of that.


    Um, possibly I _was_ playing with the ub driver and got a /dev/ub0. (I
    vaguely recall playing with it back around... February? When did it wander
    across Pavel's blog... I don't actually remember if I got it to work or
    not.) Possibly this is from playing with a usb scanner back around 2004. (I
    just dragged out my other USB device from that period, an ethernet dongle,
    but it doesn't create /dev anything. Just shows up as usb2.)

    The point I was trying to make is that it seems to me like it would be
    possible to keep the namespace separate here, and thus reduce the enumeration
    problems to the point where common cases (like my laptop) aren't impacted by
    them during early boot. I don't think anybody (outside the embedded space)
    is actually upset that /dev/hda now goes through the scsi layer: they're
    upset Ubuntu 7.04 no longer calls it /dev/hda.

    > thanks,
    >
    > greg k-h


    Thank you,

    Rob
    --
    "One of my most productive days was throwing away 1000 lines of code."
    - Ken Thompson.
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Monday 15 October 2007 19:52, Rob Landley wrote:
    > On Monday 15 October 2007 8:37:44 am Nick Piggin wrote:
    > > > Virtual memory isn't perfect. I've _always_ been able to come up with
    > > > examples where it just doesn't work for me. This doesn't mean VM
    > > > overcommit should be abolished, because it's useful more often than
    > > > not.

    > >
    > > I hate to go completely offtopic here, but disks are so incredibly
    > > slow when compared to RAM that there is really nothing the kernel
    > > can do about this.

    >
    > I know.
    >
    > > Presumably the job will finish, given infinite
    > > time.

    >
    > I gave it about half an hour, then it locked solid and stopped writing to
    > the disk at all. (I gave it another 5 minutes at that point, then held
    > down the power button.)


    Maybe it was a bug then. Hard to say without backtraces.


    > > You really shouldn't configure
    > > so much unless you do want the kernel to actually use it all, right?

    >
    > Two words: "Software suspend". I've actually been thinking of increasing
    > it on the next install...


    Kernel doesn't know that you want to use it for suspend but not
    regular swapping, unfortunately.


    > > Because if we're not really conservative about OOM killing, then the
    > > user who actually really did want to use all the swap they configured
    > > gets angry when we kill their jobs without using it all.

    >
    > I tend to lower "swappiness" and when that happens all sorts of stuff goes
    > weird. Software suspend used to say it can't free enough memory if I
    > put swappiness at 0 (dunno if it still does). This time the OOM killer
    > never triggered before hard deadlock. (I think I had it around 20 or 40 or
    > some such.)
    >
    > > Would an oom-kill-someone-now sysrq be of help, I wonder?

    >
    > *shrug* It might. I was letting it run hoping it would complete itself
    > when it locked solid. (The keyboard LEDs weren't flashing, so I don't
    > _think_ it panicked. I was in X so I wouldn't have seen a message...)


    If you can work out where things are spinning/sleeping when that happens,
    along with sysrq+M data, then it could make for a useful bug report. Not
    entirely helpful, but if it is a reproducible problem for you, then you
    might be able to get that data from outside X.

  3. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Monday 15 October 2007 8:37:44 am Nick Piggin wrote:
    > > Virtual memory isn't perfect. I've _always_ been able to come up with
    > > examples where it just doesn't work for me. This doesn't mean VM
    > > overcommit should be abolished, because it's useful more often than not.

    >
    > I hate to go completely offtopic here, but disks are so incredibly
    > slow when compared to RAM that there is really nothing the kernel
    > can do about this.


    I know.

    > Presumably the job will finish, given infinite
    > time.


    I gave it about half an hour, then it locked solid and stopped writing to the
    disk at all. (I gave it another 5 minutes at that point, then held down the
    power button.)

    Lost about 50 open konqueror tabs...

    > How much swap do you have configured?


    2 gigs, same as ram.

    > You really shouldn't configure
    > so much unless you do want the kernel to actually use it all, right?


    Two words: "Software suspend". I've actually been thinking of increasing it
    on the next install...

    > Because if we're not really conservative about OOM killing, then the
    > user who actually really did want to use all the swap they configured
    > gets angry when we kill their jobs without using it all.


    I tend to lower "swappiness" and when that happens all sorts of stuff goes
    weird. Software suspend used to say it can't free enough memory if I
    put swappiness at 0 (dunno if it still does). This time the OOM killer never
    triggered before hard deadlock. (I think I had it around 20 or 40 or some
    such.)
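The "swappiness" setting discussed here is exposed on Linux as the file /proc/sys/vm/swappiness. A minimal sketch for checking the current value before starting a heavy job follows; the function name and its `path` parameter are illustrative and do not come from the thread.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an integer.

    `path` is a parameter only so the function can be pointed at another
    file; on Linux the default is the real sysctl file.
    """
    return int(Path(path).read_text().strip())
```

Calling `read_swappiness()` with no arguments on a Linux machine reads the live setting; changing it requires root and is not shown here.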

    > Would an oom-kill-someone-now sysrq be of help, I wonder?


    *shrug* It might. I was letting it run hoping it would complete itself when
    it locked solid. (The keyboard LEDs weren't flashing, so I don't _think_ it
    panicked. I was in X so I wouldn't have seen a message...)

    (To be honest, I can never remember how to trigger sysrq on a laptop keyboard.
    Presumably X won't intercept it the way it does alt-f1 and ctrl-alt-del...)

    Rob
    --
    "One of my most productive days was throwing away 1000 lines of code."
    - Ken Thompson.

  4. Re: What still uses the block layer?

    On Monday 15 October 2007 4:06:20 am Julian Calaby wrote:
    > On 10/15/07, Rob Landley wrote:
    > > I note that the eth0 and eth1 names are dynamically assigned on a first
    > > come first serve basis (like scsi). This never causes me a problem
    > > because the driver loading order is constant, and once you figure out
    > > that eth0 is gigabit and eth1 is the 80211g it _stays_ that way across
    > > reboots, reliably. Yeah, it's a heuristic. Hands up everybody relying on
    > > such a heuristic in the real world.

    >
    > Umm, not quite, from my experiences with pre-production wireless
    > drivers, (another story, another time) fancy stuff is being done in
    > udev to make sure that your gigabit card is always assigned to eth0.


    I remember building a 2.4 kernel, statically linking in all the drivers, and
    getting the ethernet devices showing up in a reliable order for years. Where
    does the need for fancy stuff come in?

    Rob
    --
    "One of my most productive days was throwing away 1000 lines of code."
    - Ken Thompson.

  5. Re: What still uses the block layer?

    2007/10/15, Rob Landley :
    > On Sunday 14 October 2007 8:45:03 pm Theodore Tso wrote:
    >> On Sun, Oct 14, 2007 at 06:45:44PM -0500, Rob Landley wrote:
    >>> I admit a certain amount of personal annoyance that once the SCSI
    >>> layer consumes a category of device (USB, SATA, PATA), they can
    >>> often _only_ be used by going through the SCSI midlayer. (This
    >>> strikes me as analogous to TCP/IP claiming ethernet and PPP devices
    >>> so thoroughly that you can no longer address them as eth1 or
    >>> /dev/ttyS0.)

    >>
    >> That's because modern USB, ATAPI (what was once known as IDE), SATA
    >> are really *all* using the SCSI command protocols at the low level,

    >
    > Ok, I'll bite. If it's all "real" scsi, why does ioctl(SG_EMULATED_HOST)
    > exist?


    How do you define real SCSI ? The definition of SCSI in the kernel is
    "a device that accepts the SCSI command set" (more precisely "a
    suitably large subset of the SCSI command set"). It looks as if your
    definition of SCSI is "a device that is sold with SCSI written on the
    box and that attaches to a card with SCSI written on the box"; is it
    correct ?

    The host is the expansion card that connects the device to the
    motherboard. If it is emulated this means that it is not a native
    SCSI host. In case of USB drives/keys this is probably the case.

    >> just as Ethernet and PPP interfaces really are fundamentally the
    >> same thing.

    >
    > They're the same thing?
    >
    > Do you mean that on a system with both, going:
    > ifconfig eth1 66.92.53.140
    > ifconfig ppp 192.168.0.42
    >
    > Would be functionally equivalent to:
    > ifconfig eth1 192.168.0.42
    > ifconfig ppp 66.92.53.140
    >
    > So if on one boot the addresses are assigned the first way, and upon reboot
    > they're assigned in the second way by exactly the same set of commands... well
    > that's not IMPORTANT, is it? (Or is it that everyone everywhere should use
    > dhcp for everything, and static addressing is obsolete and no longer
    > supported?


    You are really looking like you are out for a fight.

    > Apparently dhcp addresses should be delivered by machines with
    > only one network interface of any type...)


    I don't understand this one.

    > This is my objection. Even when enumerating multiple devices of the same type
    > is tricky, enumerating multiple devices of _different_ types should not be.
    > There's a great big type indicator that is being _deliberately_ ignored, and
    > large classes of devices (millions of laptops) where you know there's only
    > going to be _one_ instance of a given type.


    Your objection is interesting. It is lost in the middle of e-mails which,
    to the untrained eye, look like you are trying to fight everyone and
    everybody.

    > By the way, ethernet cards contain a unique MAC address. Hard drives do not
    > seem to, or if they do it's not being consistently exposed in a way I can
    > find. This is sad. (No, reading data from the device to determine this gets
    > us back to the "spinning up the external USB drive to find my root partition"
    > gripe mentioned earlier.)


    As far as I can tell the hard drives do not have serial numbers easily
    readable by the kernel (I think it's only printed on the label). However
    (feverishly plugging his USB key in the laptop), you can tell how a drive
    is attached to the motherboard:

    Laptop's SATA drive:
    cognac $ readlink /sys/block/sda/device
    .../../devices/pci0000:00/0000:00:12.0/host0/target0:0:0/0:0:0:0

    USB key:
    cognac $ readlink /sys/block/sdb/device
    .../../devices/pci0000:00/0000:00:13.5/usb6/6-3/6-3:1.0/host4/target4:0:0/4:0:0:0
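The two readlink outputs above differ in one obvious way: the USB key's path passes through a "usb6" component and the SATA drive's path does not. A minimal sketch of using that to tell the two apart follows. The function name and the "usb followed by digits" rule are assumptions made for illustration, and the path alone cannot distinguish SATA from PATA or native SCSI.

```python
def attachment_from_sysfs_path(device_link):
    """Guess how a disk is attached from its sysfs device path.

    `device_link` is the string printed by
    `readlink /sys/block/<disk>/device`. A path component such as "usb6"
    (the letters "usb" followed by digits) is taken to mark a USB bus;
    everything else is reported as "non-usb".
    """
    for part in device_link.split("/"):
        if part.startswith("usb") and part[3:].isdigit():
            return "usb"
    return "non-usb"
```

On a live system the input string could come from `os.readlink("/sys/block/sda/device")`.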

    By the way, did you look in /dev/disk/by-id (udev magic) ? It's probably
    not very difficult to reconfigure udevd to not read the UUIDs of the
    partitions and not spin up your holy external disk at each reboot. I think
    the one that is spinning up your holy external hard drive is udevd. By
    the way, how many times do you reboot instead of resuming from
    suspend-to-disk ? Have you given a try to TuxOnIce ?

    If you had asked your first question in a way similar to this one:

    "I have my laptop hard drive that shows as different devices depending
    on whether there are USB drives plugged in or not, what should I do ?
    Shouldn't SATA/USB drives/PATA/iSCSI drives be enumerated in different
    queues ?"

    You would probably have received more interesting answers and fewer
    insults.

    >> You can rail against it, but that's the mark of someone who
    >> refuses to accept reality.

    >
    > Let me clarify: I'm talking about device enumeration.
    >
    > I've never had trouble enumerating a device that was _not_ routed through the
    > scsi layer, largely because the systems I work with don't usually have more
    > than one device of the same type. (There are millions of laptop and desktop
    > devices out there where this is the common case. As I said, I may have four
    > USB ports and the ability to plug hubs into them, but you can't add another
    > SATA hard drive to my laptop without a soldering iron.)
    >
    > However, as soon as a device _is_ routed through the scsi layer (as PATA was a
    > few versions back), it gets conflated with numerous other devices. This
    > creates problems. SATA isn't hard to enumerate in my laptop, USB potentially
    > is. Dumping all the SATA devices into the same bucket with the USB devices
    > makes both harder to enumerate.


    Indeed. Propose a solution. Remember that it is indispensable that all
    goes through the SCSI layer(s) because all those devices respond to the
    same command set thus you do not want several implementations of
    the routines.

    >>> This has the annoying effect of bundling together different types of
    >>> devices and making device enumeration unnecessarily difficult: my
    >>> laptop only has one SATA hard drive and can't gain another without a
    >>> soldering iron, but that drive could move from /dev/sda to /dev/sdb
    >>> if I reboot the system with a USB key plugged in. This seems like a
    >>> regrettable loss of orthogonality to me. I remember back when
    >>> /dev/usb0 and /dev/hda were separate devices that showed up in /dev,
    >>> but these days "it's SCSI" seems to trump "it's USB", "it's ATA", or
    >>> "it's SATA". (Even though none of those are actually SCSI hardware,
    >>> they just send a similar packet protocol across the wire.)

    >>
    >> You're showing your ignorance here.

    >
    > I have buckets of ignorance. It's why I ask questions.


    Once again. You are so aggressive in your asking that it does not
    lead to an interesting discussion.

    >> In fact in the past few years,
    >> ATA and SCSI have been converging significantly,

    >
    > And down far enough all these devices are powered by electricity. Are we
    > going to wind up with /dev/electric[1-999]?


    Out for a fight ?

    > SATA != PATA != USB. But /dev/sda can be PATA, /dev/sdb SATA, and /dev/sdc
    > USB. And they can move relative to each other. This didn't used to be the
    > case. Why is it considered an improvement?


    Each device is different from each other (they do not share their atoms).
    Where do you want to put the line between using a single driver for them
    or not ?

    In that case the sd driver registers all disk-like devices that respond to
    (a suitably large subset of) the SCSI command set. It is an improvement
    to have all devices share a driver because if you improve it, you improve
    it for all the devices; if you debug it, you debug it for all your
    devices; you use less memory.

    The enumeration of the devices is not the nightmare you are trying to
    imply. If it is hard in your particular case, many people are likely to want
    to help you. Just try to ask politely, without shouting, without saying that
    the block layer is useless, without saying that device enumeration in the
    SCSI layer (or sd device driver) is braindead. If you want to propose a
    change, propose it: "we could also do it that other way". "The way the
    SCSI disk is attached should show in the name of the device". I suspect
    this will be refused anyway because that would mean that you need
    a (series of) major block number(s) for each type of SCSI attachment
    (if the device does not have a different major block number, there is
    nothing short of udev that can give it a different name).
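To illustrate the point about major numbers: the sd driver's first block major is 8, and each disk takes a group of 16 minors, so the kernel name follows from the numbers alone and says nothing about whether the disk is SATA, USB or anything else. The sketch below covers only that first major (the first 16 disks); the function name is illustrative.

```python
import string

SD_MAJOR = 8           # first block major used by the sd driver
MINORS_PER_DISK = 16   # minor 0 of each group is the whole disk


def sd_name(major, minor):
    """Return the conventional /dev name for an sd (major, minor) pair.

    Only the first 16 disks (major 8, minors 0-255) are handled; any
    other pair returns None.
    """
    if major != SD_MAJOR or not 0 <= minor < 256:
        return None
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk]
    return name if part == 0 else name + str(part)
```

On a live system the pair can be obtained with `os.major()` and `os.minor()` applied to `os.stat("/dev/sda").st_rdev`.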

    >> with the ATAPI
    >> specification essentially incorporating the SCSI protocol by
    >> reference and by value --- with the point that SAS was developed by
    >> the SCSI Trade Association, and SAS is effectively a superset of SATA,
    >> to the point where with care, you can actually mix SAS and SATA drives
    >> in the same enclosure (SAS and SATA are physically compatible on
    >> the connector level).

    >
    > I'm aware of this, and under the impression they're both modified gigabit
    > ethernet at the PHY level. Should the hard drive become eth2?


    Out for a fight ?

    >> More to the point, with SATA, hot plugging has been designed in, so
    >> probing order is not going to be well defined,

    >
    > The spec may define the capability to hotplug, but your average
    > laptop does not offer the capability to hotplug anything into its
    > SATA controllers.


    How long before eSATA enabled laptops (with eSATA enumerated
    before SATA obviously) ?

    > The hard drive is screwed in (due to the portability part of laptopness),
    > all the controllers wired onto the motherboard are accounted for, none
    > are exposed externally. What _is_ exposed externally is USB, and if you
    > want to add an extra hard drive you can buy a cheap USB one at Fry's.
    >
    > In such a case, which is common, the first SATA hard drive is reliably the
    > disk containing the root partition, and there's no need to stick a UUID
    > in /etc/fstab.
    >
    > The problem is, "the first SATA hard drive" is not a stable identifier in a
    > system where SATA and USB devices are dumped in the same bucket
    > and given a big stir. Dumping SATA and USB devices into the same
    > bucket (because they smell a bit like SCSI) is what I am objecting to.


    You should have said so in the first place -- with a cooler tone.

    >> just as with USB
    >> devices. And there are already relatively common situations where the
    >> same disk can show up via multiple different interfaces.

    >
    > It was also possible to buy a hotplug PATA ide enclosure. So what? The vast
    > majority of traditional IDE users happily ignored this, and went on with
    > their lives.
    >
    > > For example, if you have a modern Thinkpad with a secondary SATA hard
    > > drive in an Ultrabay, and you plug it into the Ultrabay in your T60,
    > > it will show up as a SATA drive.

    >
    > I remember the config option about enumerating onboard IDE controllers first.
    > It didn't really matter what order they were enumerated in as long as it was
    > controllable.
    >
    > Presumably if the primary SATA hard drive was /dev/sata and the slot
    > with "secondary" in its name got /dev/satb, life would be good. And the
    > presence or absence of /dev/satb wouldn't affect USB devices and such if they
    > weren't in the same namespace.
    >
    >> However, if you plug it into the
    >> Advanced dock, it shows up as a USB device.

    >
    > You plug it in somewhere else, it shows up somewhere else. This sounds
    > familiar to old IDE users.
    >
    > How is it harder for udev to make a stable symlink for this drive that
    > sometimes points to /dev/satb and sometimes to /dev/usb1? (Harder than a
    > symlink that sometimes points to /dev/sdb and sometimes to /dev/sdd? You
    > don't have persistent naming _now_, so the objection seems to be that
    > maintaining the distinction between device types would not be a perfect
    > solution in all cases. I agree. So?)
    >
    > > And with iSCSI not only
    > > can you encapsulate a SCSI command stream over USB, you can do so over
    > > IP as well.

    >
    > Yup. And you've been able to make a network block device for years. They
    > showed up as /dev/nd0, a distinct type of block device which you (and your
    > scripts) could find. Now yet another way of doing the same thing is mixed
    > into the same scsi bucket and given a stir...
    >
    > > In any case, regardless of how the physical SATA drive is
    > > attached to the system, you want it to show up as the same device and
    > > be mounted in the same location.

    >
    > If my laptop's hard drive reliably showed up as /dev/sda every time, and I
    > could count on that, I wouldn't be complaining about it. The entire problem
    > is that it isn't guaranteed to do that, and thus /etc/fstab is a nightmare I
    > can't edit.
    >
    > You could meet this definition of "the same" by having every block device in
    > the system show up as /dev/block[a-z] no matter what type it was, and all the
    > char devices show up as /dev/char[aa-zz], shuffle them all each reboot, and
    > then have all the programs iterate through all of them any time they wanted
    > something specific.
    >
    > I'm rather glad that /dev/ttyS0 and /dev/zero aren't easy to mix up.
    >
    > > That's why identifying filesystem by UUID's or Labels is so critical.
    > > This is not a new concept; we've had the capability to do this for
    > > over a decade, and I always knew it would be necessary for us to do
    > > this sooner or later --- which is why I added the UUID support to ext2
    > > back in 1996.

    >
    > It's necessary for IBM big iron to do this. It's generally not necessary for
    > laptops or embedded systems to do this if they distinguish between _types_ of
    > devices, which is something they until recently did for the types of devices
    > I was interested in, and something they _stopped_ doing when everything got
    > merged into the scsi layer, and I consider this a regression.
    >
    > No, distinguishing between types of devices is not a perfect solution to
    > device enumeration, but it was sufficient for all my use cases for many
    > years, and would still be if the kernel still did it, and I'm not alone here.
    >
    > > > The fact that udev can theoretically unwind this hairball is not an
    > > > excuse for conflating different categories of devices in the first
    > > > place.

    > >
    > > See the thinkpad Ultrabay drive example above.

    >
    > Last week I drove my laptop so deep into swap (with a "make -j" on qemu) that
    > after half an hour trying to repaint my kmail window, it locked solid.
    > Again. You'd think the oom killer would come to the rescue, but it didn't.
    > Maybe Ubuntu disabled it. I have _2_gigs_ of ram in this sucker, on a stock
    > Ubuntu 7.04 install (with the "upgrade all" tab pressed a few times), and yet
    > I managed to make it swap itself to death one more time.
    >
    > Virtual memory isn't perfect. I've _always_ been able to come up with
    > examples where it just doesn't work for me. This doesn't mean VM overcommit
    > should be abolished, because it's useful more often than not.
    >
    > So you have a counterexample. Ok. I can't actually see how your
    > counterexample would be worse off than it is now; just differently worse off.
    >
    > > You address hosts by
    > > IP address; it doesn't matter whether you access them via a PPP
    > > interface, or a wireless interface, or an ethernet interface.

    >
    > It does when I'm configuring the interfaces.
    >
    > > Similarly, a disk could in theory be accessible over USB, SATA, or
    > > iSCSI, and the Thinkpad example is only one such where the same
    > > filesystem might be accessible over multiple interfaces. And with
    > > multipath fiber channel SAN's (and I hate to break it to you, but FC
    > > also uses SCSI protocols) storage is very much looking more and more
    > > like networking.

    >
    > And in the networking world I'm able to say that this local machine has a
    > static IP that is not world-routable. It is separate, manually configured, I
    > put it _right_here_, and I personally know that it's not going to move
    > because I'm the one who put it there and I'm the only one who would move it.
    >
    > Over on the networking side of things I can "ifconfig lo 127.0.0.1" without
    > first probing all the interfaces to figure out which one's loopback and which
    > one's the wireless card.
    >
    > I note that the eth0 and eth1 names are dynamically assigned on a first come
    > first serve basis (like scsi). This never causes me a problem because the
    > driver loading order is constant, and once you figure out that eth0 is
    > gigabit and eth1 is the 80211g it _stays_ that way across reboots, reliably.
    > Yeah, it's a heuristic. Hands up everybody relying on such a heuristic in
    > the real world.


    It's also easier because you have the MAC address to help.
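A minimal sketch of what "the MAC address helps" can mean in practice: on Linux each network interface exposes its hardware address in /sys/class/net/<name>/address, so a script can look an interface up by MAC instead of trusting the eth0/eth1 order. The function name and the `sysfs_net` parameter are illustrative and do not come from the thread.

```python
import os


def interfaces_by_mac(sysfs_net="/sys/class/net"):
    """Map each interface's MAC address to its current kernel name."""
    result = {}
    for name in sorted(os.listdir(sysfs_net)):
        address_file = os.path.join(sysfs_net, name, "address")
        try:
            with open(address_file) as f:
                mac = f.read().strip().lower()
        except OSError:
            continue  # entry has no readable address file; skip it
        result[mac] = name
    return result
```

If two interfaces report the same address (the loopback device reports all zeros, for instance), the later name wins, so this is only a starting point.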

    > Possibly one solution here is to document that the SATA drivers load
    > before any other scsi device (and the driver subsystem _waits_ for
    > that to finish enumerating before trying any other kind of scsi device,
    > with a barrier of some kind), and then any SATA devices present at
    > boot time will reliably get those names in that order (no races, no
    > variation) and anything after that is a separate problem. (Of course
    > this would involve making it true if it currently isn't. It's still a mess to
    > dump all sorts of different devices in the same namespace, but at
    > least for the common case of a laptop with a SATA root partition this
    > would let us get the UUID out of /etc/fstab).


    There is also the need to ensure that each distribution loads the ahci
    driver before any other SCSI or emulated SCSI host. It looks as if Ubuntu
    does not do it that way (I've tried to understand the module probing logic
    in the initramfs but have miserably failed; too much magic probably).

    Remember that udevd starts your holy external hard disk, not the
    UUID in /etc/fstab (obviously if you have a UUID in fstab, you need udev
    to probe for it, but it'll do it regardless of whether or not a UUID is
    present in fstab).

    Regards,

    Loïc

  6. Re: What still uses the block layer?

    On Monday October 15, rob@landley.net wrote:
    >
    > This is my objection. Even when enumerating multiple devices of the same type
    > is tricky, enumerating multiple devices of _different_ types should not be.
    > There's a great big type indicator that is being _deliberately_ ignored, and
    > large classes of devices (millions of laptops) where you know there's only
    > going to be _one_ instance of a given type.


    My perspective is different.

    The range of addressing options for "all disk devices" is far too rich
    to be able to assign a stable device number to every device: there are
    multiple, multi-dimensional addressing schemes, and some devices might
    not even have a stable address at all (e.g. USB?).
    So the reality of dealing with disk devices is that you cannot provide
    a stable single-number naming scheme for all devices on all machines.

    Therefore it is best to not have stable single-number naming schemes
    for any devices on any machines. Why? Because it ensures there will
    not be any second class citizens.

    If some devices that are even reasonably common (e.g. IDE drives) are
    stable, then some application developers or system integrators will
    work under the assumption of stability and whatever they build will
    break when you try it on different hardware. This happened during the
    early days of SCSI support - code assumed the stability of
    major/minor numbers and so did not work properly with SCSI which
    cannot provide that stability in general.

    Having a totally uniform approach makes development and testing a lot
    easier - there are fewer special cases.

    I would prefer that 'total uniformity' went even further than
    /dev/sd?? to /dev/disk??. i.e. Anything that is or behaves
    substantially like a disk drive should be "/dev/diskXX", where numbers
    are assigned sequentially on discovery. (I wonder why we need
    /dev/scdX to be separate from /dev/sdX).

    Note that stable names are still a very real option. udev provides
    several. /dev/disk/by-path/XXX will be stable for lots of "screwed
    in" devices. /dev/disk/by-id will be stable for devices that report a
    unique id. etc.
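The stable names udev provides are symlinks that point at whatever kernel name the device received on this boot. A minimal sketch that lists them follows; the function name and the `directory` parameter are illustrative, and the entries present depend on the machine and its udev rules.

```python
import os


def stable_names(directory="/dev/disk/by-id"):
    """Map each udev-managed stable name to the kernel device it points at."""
    mapping = {}
    for entry in sorted(os.listdir(directory)):
        link = os.path.join(directory, entry)
        if os.path.islink(link):
            # Resolve the symlink and keep only the final name, e.g. "sdb".
            mapping[entry] = os.path.basename(os.path.realpath(link))
    return mapping
```

Passing "/dev/disk/by-path" instead lists the path-based names, which is the variant more likely to stay fixed for a drive that is screwed into the machine.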

    The difference between IDE, SATA, SCSI and even USB is peripheral for
    the large majority of uses, and I think maintaining the distinction in
    the major/minor number or in the "primary" /dev name is - for the
    above reasons - more of a cost than a value.

    NeilBrown

  7. Re: OOM killer gripe (was Re: What still uses the block layer?)

    On Mon, Oct 15, 2007 at 11:37:44PM +1000, Nick Piggin wrote:
    > I hate to go completely offtopic here, but disks are so incredibly
    > slow when compared to RAM that there is really nothing the kernel
    > can do about this. Presumably the job will finish, given infinite
    > time.


    About 6 weeks ago, on a 2.6.23-rc kernel, I accidentally typed "make
    -j", and left off the 4 before I hit the return key. About 2-3
    minutes later, the box locked pretty tight. I managed to switch to a
    VT console before I lost total control of X (took many, many minutes
    to do the switch), but after many minutes, managed to get logged into
    the console, but I wasn't able to get a ps command to complete so I
    could start killing processes. (I probably should have just done a
    "killall make" right away, but hindsight is 20/20.)

    The console was showing that the OOM killer was attempting to kill
    processes, but apparently not fast enough to stem the tide of all of
    the new processes getting generated by the make -j. (I'm guessing
    that it was killing the gcc processes and not the make processes.)

    > Would an oom-kill-someone-now sysrq be of help, I wonder?


    I tried sysrq-f (oom_kill), but no dice. Given that the oom killer
    was active and apparently triggering on its own, this wasn't all that
    surprising.

    The interesting thing is I tried to do a sysrq-e (send SIGTERM to all
    processes except init), waited 5 minutes or so, then tried an alt-sysrq-i
    (send SIGKILL to all processes except init), and the system was still
    thrashing itself to death, even after giving it plenty of time to try
    to recover.

    I finally gave up and held down the power button. This was on a box
    with 4 gigs memory (but only 3 gigs visible thanks to a cheap
    BIOS/chipset) and 4 gigs swap (mainly intended for suspend/resume).

    I chalked it up to me being stupid (I should have noticed and
    Ctrl-C'ed the make -j much more quickly, or if I were a sysadmin on a
    time-sharing system with users I didn't trust, configured RLIMIT_NPROC
    and/or per-user container resource limits) and the OOM killer not
    being aggressive enough in such a situation. But having better things
    to do, I didn't go whining on LKML about it, although I have to say
    that the kernel behavior isn't exactly ideal. One of these days when
    I have time, I'll try investigating it with a few memlocked processes
    running at real-time priorities and Systemtap and figure out what the
    heck was going on....

    I suppose I should just configure suspending to a file instead of a
    swap partition, but I've just historically trusted suspend/resume to a
    swap partition much more than to a file. Or maybe I should hack in a
    sysctl to prevent any swapping even though the swap partition is
    configured (so only suspend/resume will use it).

    - Ted

  8. Re: What still uses the block layer?

    > For the desktop I don't object to the scsi layer. I object to the naming.
    > Merging a half-dozen different types of devices into a single name space, and


    They *are* SCSI devices. USB storage is a SCSI over USB transport. ATAPI
    is a SCSI over ATA transport. SAS is much the same thing, as is FC, and
    it continues.

    With the exception of ATA disk for historical reasons SCSI essentially
    won the battle of command formats.

    > problems to the point where common cases (like my laptop) aren't impacted by
    > them during early boot. I don't think anybody (outside the embedded space)
    > is actually upset that /dev/hda now goes through the scsi layer: they're
    > upset Ubuntu 7.04 no longer calls it /dev/hda.


    For the embedded CF using world we could do with a truly dumb ATA only CF
    driver, possibly even with pure polled support that used neither the IDE
    or the ATA layer.

    Alan

  9. Re: What still uses the block layer?

    On Sun, 2007-10-14 at 18:45 -0500, Rob Landley wrote:
    > On Sunday 14 October 2007 5:24:32 pm James Bottomley wrote:
    > > On Sat, 2007-10-13 at 16:05 -0600, Matthew Wilcox wrote:
    > > > On Thu, Oct 11, 2007 at 08:11:21PM -0500, Rob Landley wrote:
    > > > > My impression from asking questions on the linux-scsi mailing list is
    > > > > that the scsi upper/middle/lower layers don't use the block layer
    > > > > described in Documentation/block/*.
    > > >
    > > > Entirely incorrect.

    > >
    > > OK, right ... could we please get a sense of decorum back on this list.

    >
    > Did I reply to the insult?
    >
    > > Rob, if you didn't ask your alleged questions in such a pejorative
    > > manner, we'd get a lot further

    >
    > I'm not attempting to be pejorative.


    OK, so could we get back to the original discussion? The question I
    think you meant to ask is "does SCSI use the block layer, and if so;
    how?"

    The answer is yes (just do an ls /sys/block on any scsi machine). The
    how is that it basically uses the block layer as a service library (i.e.
    most SCSI services are built on top of those already provided by block).
    The email you cited was basically from our one area of confusion: SCSI
    and block both provide services to decode the SG_IO ioctl. This is
    partly historical; block and SCSI are very much intertwined; so much so
    that they both tend to drive each other's development. The programme
    over the last few years has been to identify features in SCSI that
    should be more generic (and hence moved to block). SG_IO is one of
    these, so we end up with the situation where Block provides this as a
    service (and sr, st and sd make use of it) while the sg driver still
    doesn't use what the block layer provides but rolls its own. I think
    the layout of how all this works is illustrated at a reasonably high
    level here on slide 15:

    http://licensing.steeleye.com/suppor...005_slides.pdf
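
    The "ls /sys/block" check mentioned above can be sketched in a few
    lines of Python. This is only an illustration: it assumes a sysfs
    where /sys/block/<name>/device/subsystem is a symlink whose last
    component names the bus, and that layout has varied between kernel
    versions. The function name block_devices is hypothetical.

```python
import os


def block_devices(sys_block="/sys/block"):
    """Return {device name: bus name or None} for entries in sys_block.

    Assumption: <sys_block>/<name>/device/subsystem, when present, is a
    symlink whose final path component is the bus (e.g. "scsi").
    Virtual devices such as loop0 usually have no "device" link at all.
    """
    result = {}
    if not os.path.isdir(sys_block):
        return result
    for name in sorted(os.listdir(sys_block)):
        link = os.path.join(sys_block, name, "device", "subsystem")
        if os.path.exists(link):
            result[name] = os.path.basename(os.path.realpath(link))
        else:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, bus in block_devices().items():
        print(name, bus or "(no device link)")
```

    On a machine where the assumption holds, SCSI-managed disks would
    show up as block devices here, which is the point being made: they
    are registered with the block layer like everything else.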


    > I admit a certain amount of personal annoyance that once the SCSI layer
    > consumes a category of device (USB, SATA, PATA), they can often _only_ be
    > used by going through the SCSI midlayer. (This strikes me as analogous to
    > TCP/IP claiming ethernet and PPP devices so thoroughly that you can no longer
    > address them as eth1 or /dev/ttyS0.)


    OK. But that's the bit I need you to separate from your inquiry into
    how SCSI actually works. You can't go on a research trip if you allow
    preconceived notions to spill over into it.

    For the record, USB and firewire are SCSI at their core, so they can
    never really be separated. SATA (but not SATAPI) is a separate
    protocol, so it can theoretically be separated later, and we are
    actually working on that. It's only in SCSI because there's a well
    defined and standardised way to place it there (called the SAT
    layer---SCSI to ATA Translation) and because it's a lot easier since
    SCSI has all the features and quite a few of the necessary ones aren't
    yet migrated to block.

    > This has the annoying effect of bundling together different types of devices
    > and making device enumeration unnecessarily difficult: my laptop only has one
    > SATA hard drive and can't gain another without a soldering iron, but that
    > drive could move from /dev/sda to /dev/sdb if I reboot the system with a USB
    > key plugged in. This seems like a regrettable loss of orthogonality to me.
    > I remember back when /dev/usb0 and /dev/hda were separate devices that showed
    > up in /dev, but these days "it's SCSI" seems to trump "it's USB", "it's ATA",
    > or "it's SATA". (Even though none of those are actually SCSI hardware, they
    > just send a similar packet protocol across the wire.)
    >
    > The fact that udev can theoretically unwind this hairball is not an excuse for
    > conflating different categories of devices in the first place. Avoiding an
    > unnecessary problem seems superior to trying to get udev to solve it. Note
    > that Ubuntu 7.04 solves it by sticking a UUID on every _partition_, and then
    > spinning up my external USB hard drive trying to find the root partition on a
    > reboot. Tell me how this can be considered progress:
    >
    > > # /etc/fstab: static file system information.
    > > #
    > > #
    > > proc /proc proc defaults 0 0
    > > # /dev/sda1
    > > UUID=04d1b984-bd65-46f1-9a77-c158cf4bed1b / ext3 defaults,errors=remount-ro,noatime 0 1
    > > # /dev/sda5
    > > UUID=cdf0936d-9f19-42c6-b131-9fefcf1321ef none swap sw 0 0
    > > /dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0
    > > UUID=86bbb512-ab7e-4a12-8618-1190f032c082 /boot ext3 defaults 0 0

    >
    > Conflating categories of hardware that cannot easily be enumerated (USB) with
    > categories that can (the SATA hard drive in my laptop, of which there can be
    > only one) strikes me as a bad thing. Putting them in a common "scsi device
    > pool" within which they do not enumerate consistently is not something I
    > enjoy dealing with.


    However, by design choice, we got the SCSI layer in the kernel out of
    the business of trying to provide a stable name space, since Richard
    Gooch did a brilliant job of demonstrating the insolubility of that
    problem. There are many ways to identify a device (UUID being just one
    of them). It seems much more desirable to give the users the choice.
    You can even have what you seem to want (SATA stably at /dev/sda) simply
    by ensuring that you have a modular kernel and that libata always loads
    before USB or any other storage device (not that I'd recommend doing
    this, because it will fail for a large configuration, but it would work
    for you).

    > However, the response to my attempts to express this dissatisfaction on the
    > SCSI list a few months ago came too close to a flamewar for me to consider
    > continuing it productive. I'd still love to update the "2.4 scsi howto" and
    > corresponding sg howto, but lack the expertise. The SCSI layer really isn't
    > my area, and I was much happier back when I could avoid using it at all.


    That was because your initial inquiry came across as "I'm trying to
    document this, and by the way it's rubbish". By all means have an
    inquiry and an argument, but saying effectively I don't understand this
    but I know it's wrong is a guaranteed way to antagonise everyone who's
    worked to try to make all of this as functional as possible. Find out
    the facts first then argue from them.

    > The question I was trying to ask _here_ was about the block layer. I seem not
    > to have asked it very well. Sorry 'bout that.


    OK, so look at the diagram and the other SCSI documents and come back
    for further clarification as you need it.

    James



  10. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 03:04:00AM -0500, Rob Landley wrote:
    > Ok, I'll bite. If it's all "real" scsi, why does ioctl(SG_EMULATED_HOST)
    > exist?


    SG_EMULATED_HOST was added before Linux 2.4, at least six or seven
    years ago. Back then the migration of ATA devices through the various
    versions of the ATAPI specification and then into SATA was very early in
    its evolution, and back then, yes there were people who considered
    anything that didn't use the honking huge parallel SCSI cables not
    "real" SCSI. Over time, that distinction at both the physical
    connector level and logical level has declined to the point of almost
    non-existence. It's not quite at the point where SAS exists only to
    justify massive price differences between commodity and "data-center
    grade" disks to the benefit of hard drive manufacturers, but it's
    darned close. (There are differences such as voltage levels so that
    the max cable distances for SAS are larger, etc., but those could
    have been optional additions to the SATA spec, and SAS drives are
    supposedly manufactured to be more robust --- although some
    recent papers published at FAST have raised some interesting questions
    about how true those marketing claims really are in practice.)

    > > just
    > > as Ethernet and PPP interfaces really are fundamentally the same
    > > thing.

    >
    > They're the same thing?
    >
    > Do you mean that on a system with both, going:
    > ifconfig eth1 66.92.53.140
    > ifconfig ppp 192.168.0.42
    >
    > Would be functionally equivalent to:
    > ifconfig eth1 192.168.0.42
    > ifconfig ppp 66.92.53.140


    No, of course not. But we don't have separate IP stacks for ethernet
    and ppp devices. And how we connect to a host via ssh makes no
    difference whether we accessed it via Ethernet or PPP. And I would
    argue that how we address a filesystem should also make no difference
    depending on the path to hard drive.

    > By the way, ethernet cards contain a unique MAC address. Hard
    > drives do not seem to, or if they do it's not being consistently
    > exposed in a way I can find.


    You can pull a Model and Serial number via hdparm -i, but it's not as
    easy to manipulate as a fixed-length MAC address. That's why people
    tend to use filesystem UUID's.

    > > More to the point, with SATA, hot plugging has been designed in, so
    > > probing order is not going to be well defined,

    >
    > The spec may define the capability to hotplug, but your average
    > laptop doesn't offer the capability to hotplug anything into its
    > SATA controllers. The hard drive is screwed in (due to the
    > portability part of laptopness), all the controllers wired onto the
    > motherboard are accounted for, none are exposed externally. What
    > _is_ exposed externally is USB, and if you want to add an extra hard
    > drive you can buy a cheap USB one at Fry's.


    That may be true for laptops today, but Linux doesn't run just on
    laptops. You can easily get home servers with hot-swap SATA bays. My
    home fileserver, which is a white box I purchased on my own nickel,
    NOT IBM big iron, has 3TB of raw storage and cost less than $10,000 a
    year ago. Today, that amount of home storage with hot-swap SATA drives and
    a battery-backed hardware RAID controller could probably be purchased
    for about half that price.

    And even for laptops, if you need the performance, you can get Cardbus
    cards that will allow you to connect eSATA drives to your laptop at
    Fry's.

    So even if you ignore "big data center" interconnects like FC, this
    problem exists even for commodity grade SATA devices.

    I agree at the moment we have an issue where if the root device isn't
    guaranteed, it forces people to use initrd's, and the quality and
    debuggability of initrd's between distro's is highly variable and not
    standardized. In practice though the /dev/sda is actually pretty
    stable on laptops, especially if you end up compiling ehci and uhci
    support as modules (which is a good idea from a power savings point of
    view anyway). The reason why Ubuntu and other distributions are using
    UUID-based labels is not just because of the root device, but also for
    all of the other disks that might be mounted on the system, including
    some that might be using USB devices that don't have stable /dev
    names.

    > It's necessary for IBM big iron to do this. It's generally not
    > necessary for laptops or embedded systems to do this if they
    > distinguish between _types_ of devices, which is something they
    > until recently did for the types of devices I was interested in, and
    > something they _stopped_ doing when everything got merged into the
    > scsi layer, and I consider this a regression.


    As another example, it's easy to see a home media server running Linux
    which doesn't have any expansion bays for additional hard drive --- so
    the only way a user could expand their storage is by using one or more
    permanently connected USB disks. So we do need to solve the general
    device enumeration problem in the general case; it's not just the case
    of IBM "big iron" as you seem to think.

    > No, distinguishing between types of devices is not a perfect
    > solution to device enumeration, but it was sufficient for all my use
    > cases for many years, and would still be if the kernel still did it,
    > and I'm not alone here.


    News flash! The kernel wasn't built just for you, and over time, more
    and more people will have multiple disk drives of the same type, so we
    will need to solve the device naming problem sooner or later. Why not
    solve it sooner, especially given that a number of companies (not just
    IBM) that are funding the organization that is paying *your* salary are
    interested in solving the general case?

    Furthermore, I've already pointed out a number of situations where the
    home user might have multiple USB devices on their system today, and
    that is probably going to go up over time, not down. Have you seen
    how cheap 500GB USB disks are at Costco? And for a typical
    unsophisticated user, plugging in another 500G USB disk when they need
    more storage is a lot easier than cracking open the computer case and
    futzing with screws and disk cables and power connectors.

    - Ted

  11. Re: What still uses the block layer?

    > You can pull a Model and Serial number via hdparm -i, but it's not as
    > easy to manipulate as a fixed-length MAC address. That's why people
    > tend to use filesystem UUID's.


    ATA8 at the moment looks set to add a true "MAC" or "WWN" type identifier
    to each device.. Right now model/serial is not always unique.


  12. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 02:29:45PM +0100, Alan Cox wrote:
    > > You can pull a Model and Serial number via hdparm -i, but it's not as
    > > easy to manipulate as a fixed-length MAC address. That's why people
    > > tend to use filesystem UUID's.

    >
    > ATA8 at the moment looks set to add a true "MAC" or "WWN" type identifier
    > to each device.. Right now model/serial is not always unique.


    True, but most manufacturers try to make the serial number unique for
    their own reasons (like warranty service), and you can have
    manufacturing errors with MAC assignment just as easily as you can
    with serial numbers.

    I still remember when SGI shipped MIT 20 SGI Indy pizza boxes that all
    had the same MAC addresses (that we knew about --- we found out
    because all 20 were installed on the same subnet). That was a mildly
    entertaining bug to track down.... especially since IIRC, Irix at the
    time didn't print warning messages when someone else with a different
    IP address responded to your MAC address.

    - Ted

  13. Re: What still uses the block layer?

    On Mon, 15 Oct 2007 03:36:15 -0500
    Rob Landley wrote:

    > The point I was trying to make is that it seems to me like it would
    > be possible to keep the namespace separate here, and thus reduce the
    > enumeration problems to the point where common cases (like my laptop)
    > aren't impacted by them during early boot. I don't think anybody
    > (outside the embedded space) is actually upset that /dev/hda now goes
    > through the scsi layer: they're upset Ubuntu 7.04 no longer calls
    > it /dev/hda.


    that's a choice Ubuntu made in their udev scripts... if you don't like
    it, complain to them.
    I'm surprised you would even need to care about what device name things
    are though.... with mount-by-label (deployed for a bunch of years now
    in most distros), and various helpful links like /dev/cdrom ....

    anyway.. if you don't like your distros udev configuration, lkml is the
    wrong forum.

  14. Re: What still uses the block layer?

    Theodore Tso wrote:
    > On Mon, Oct 15, 2007 at 03:04:00AM -0500, Rob Landley wrote:
    >> Ok, I'll bite. If it's all "real" scsi, why does ioctl(SG_EMULATED_HOST)
    >> exist?

    >
    > SG_EMULATED_HOST was added before Linux 2.4, at least six or seven
    > years ago.


    SG_EMULATED_HOST was present when I started maintaining the
    sg driver in 1997. Back then some folks (one German name
    comes to mind) toyed with the idea of sending SCSI Parallel
    Interface (SPI) messages through a pass through interface.
    SPI messages are obviously transport specific and hence any
    app trying to send them needed to ascertain what the transport
    was. There were really only two to choose from at the time
    (in linux): SPI and the ATA Packet Interface (ATAPI).

    If SG_EMULATED_HOST was ever used I'm not sure. It is just
    an historical remnant now.

    > Back then the migration of ATA devices through the various
    > versions of the ATAPI specification and then into SATA was very early in
    > its evolution, and back then, yes there were people who considered
    > anything that didn't use the honking huge parallel SCSI cables not
    > "real" SCSI. Over time, that distinction at both the physical
    > connector level and logical level has declined to the point of almost
    > non-existence.


    On the contrary, the distinction between the logical
    (command) level and the transport level (down to the
    physical/connector level) is pivotal. There is one
    industry accepted storage architecture (SAM (yes, ATA
    documents defer to it)), two command sets: ATA and SCSI
    (and ways to tunnel one within the other and translate
    between the two) and about 10 transports (interconnects)
    that I can think of.

    Comparisons between PATA and SCSI (SPI) are now history.
    More precise terminology is now required.
    For example the "ATAPI specification" IMO is a handful
    of ATA commands designed to convey a packet based protocol
    (which the rest of the ATA command set is not). So ATAPI
    could be used to send IP over ATA! Is that what you meant?

    > It's not quite at the point where SAS exists only to
    > justify massive price differences between commodity and "data-center
    > grade" disks to the benefit of hard drive manufacturers, but it's
    > darned close. (There are differences such as voltage levels so that
    > the max cable distances for SAS are larger, etc., but those could
    > have been optional additions to the SATA spec, and SAS
    > drives are supposedly manufactured to be more robust --- although some
    > recent papers published at FAST have raised some interesting questions
    > about how true those marketing claims really are in practice.)


    You should read more about SAS.

    Anyway Seagate have announced a ES.2 family of 3.5" disks
    that rotate at 7200 rpm. One would not normally expect disks
    below 10000 rpm to come with a SCSI transport (FCP, SAS or
    SPI) but the ES.2 series breaks the pattern since it
    comes with either a SATA or a SAS interface. What will be
    really interesting is how Seagate will price the two versions.
    Apart from the SAS variant having dual ports it is pretty
    close to an apples versus apples comparison.

    A port selector could be added to the SATA variant to provide
    dual port functionality. However the SCSI command set offers
    persistent reservations which are beyond the scope of ATA
    command sets which assume a logical point to point connection.

    Doug Gilbert


  15. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 04:26:04AM -0500, Rob Landley wrote:
    > For example, usb devices are never easy to order. IDE devices (back when they
    > had their own namespace) were trivial to order back when /dev/hda couldn't
    > move without use of a screwdriver.


    Ah, but it could. If you had more than one IDE controller (which is
    even possible on laptops; the Fujitsu P7120 is one that I'm familiar
    with that has more than one), the initialisation order *of the
    controllers* would change which was hda and which was hde.

    > Combining USB and IDE into the same /dev/sd? namespace makes enumerating the
    > IDE devices much harder than in the traditional "/dev/hdb doesn't move
    > without a screwdriver" model. The merger creates a new problem for IDE, one
    > which didn't exist before: the addition or removal of other unrelated types
    > of devices may change this device's location next boot. It may be possible
    > to add additional complication to the system to compensate, but what was the
    > advantage of merging the namespaces in the first place?


    It's not something anyone particularly set out to do, it's just how
    it worked out. It was justified by saying "ok, this goes from a 99%
    solution to a 96% solution, but there's 100% solution called uuids".
    I don't particularly agree with this line of argumentation, but it did
    hold sway.

    --
    Intel are signing my paycheques ... these opinions are still mine
    "Bill, look, we understand that you're interested in selling us this
    operating system, but compare it to ours. We can't possibly take such
    a retrograde step."

  16. Re: What still uses the block layer?

    Matthew Wilcox wrote:
    > On Mon, Oct 15, 2007 at 04:26:04AM -0500, Rob Landley wrote:
    >> Combining USB and IDE into the same /dev/sd? namespace makes enumerating the
    >> IDE devices much harder than in the traditional "/dev/hdb doesn't move
    >> without a screwdriver" model. The merger creates a new problem for IDE, one
    >> which didn't exist before: the addition or removal of other unrelated types
    >> of devices may change this device's location next boot. It may be possible
    >> to add additional complication to the system to compensate, but what was the
    >> advantage of merging the namespaces in the first place?

    >
    > It's not something anyone particularly set out to do, it's just how
    > it worked out. It was justified by saying "ok, this goes from a 99%
    > solution to a 96% solution, but there's 100% solution called uuids".
    > I don't particularly agree with this line of argumentation, but it did
    > hold sway.


    Low-level networking drivers suggest a default interface name (per
    interface or as a template like eth%d into which the networking core
    inserts a lowest spare number). Userspace can rename interfaces, but
    nevertheless it's nice to have different default kernel names for
    ethernet, wlan etc..
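
    The "lowest spare number" rule for a template like eth%d can be
    shown with a tiny Python sketch. This is not the kernel's code, just
    an illustration of the rule as described; the function name
    assign_name and the in_use set are made up for the example.

```python
def assign_name(template, in_use):
    """Fill a printf-style template (e.g. "eth%d") with the lowest
    number that does not collide with a name already in use."""
    n = 0
    while template % n in in_use:
        n += 1
    return template % n


if __name__ == "__main__":
    existing = {"eth0", "eth2", "wlan0"}
    print(assign_name("eth%d", existing))   # eth1
    print(assign_name("wlan%d", existing))  # wlan1
```

    Because each template has its own number sequence, adding a wlan
    interface never shifts the eth numbering, which is the property
    being asked about for storage names.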

    Could low-level SCSI drivers provide similar name templates which give a
    hint on the transport involved? It's a bit more difficult than with
    networking interfaces though because
    - SCSI devices can have sd, sr, st, osst, ch, sg interfaces,
    - SCSI device files share a namespace with all other device files.

    E.g.
    /dev/sd-ide-b - second IDE HDD,
    /dev/sd-iscsi-e - fifth iSCSI direct access device,
    /dev/sr-sata-0 - first SATA CD-ROM,
    /dev/sr-usb-0 - a USB CD-ROM,
    /dev/st-fw-0 - a FireWire tape drive,
    /dev/sda - a device whose transport driver didn't propose a name

    Of course the really interesting names will still be provided by
    udev-generated symlinks.
    --
    Stefan Richter
    -=====-=-=== =-=- -====
    http://arcgraph.de/sr/

  17. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 03:36:15AM -0500, Rob Landley wrote:
    >
    > The point I was trying to make is that it seems to me like it would be
    > possible to keep the namespace separate here, and thus reduce the enumeration
    > problems to the point where common cases (like my laptop) aren't impacted by
    > them during early boot.


    Proposals on how to do this would be gladly reviewed.

    But again, please remember that these USB devices are really SCSI
    devices. Same for SATA devices. There is a reason they are using the
    SCSI layer, and it isn't just because the developers felt like it.

    > I don't think anybody (outside the embedded space) is actually upset
    > that /dev/hda now goes through the scsi layer: they're upset Ubuntu
    > 7.04 no longer calls it /dev/hda.


    Use mount-by-label instead, it's much saner and handles device name
    movement just fine (as does the UUID method that you seem to hate.)
    Look in /dev/disk/ for a wide range of options that you have in which to
    choose how to pick your block devices.

    Oh, and this seems like a very Ubuntu specific rant, might I suggest you
    contact the Ubuntu developers about this? The kernel doesn't dictate
    that the distro has to use these long identifiers, and there is nothing
    we can do about it.

    good luck,

    greg k-h

  18. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 05:08:36AM -0500, Rob Landley wrote:
    > On Monday 15 October 2007 4:06:20 am Julian Calaby wrote:
    > > On 10/15/07, Rob Landley wrote:
    > > > I note that the eth0 and eth1 names are dynamically assigned on a first
    > > > come first serve basis (like scsi). This never causes me a problem
    > > > because the driver loading order is constant, and once you figure out
    > > > that eth0 is gigabit and eth1 is the 80211g it _stays_ that way across
    > > > reboots, reliably. Yeah, it's a heuristic. Hands up everybody relying on
    > > > such a heuristic in the real world.

    > >
    > > Umm, not quite, from my experiences with pre-production wireless
    > > drivers, (another story, another time) fancy stuff is being done in
    > > udev to make sure that your gigabit card is always assigned to eth0.

    >
    > I remember building a 2.4 kernel, statically linking in all the drivers, and
    > getting the ethernet devices showing up in a reliable order for years. Where
    > does the need for fancy stuff come in?


    Because PCI devices reorder their bus numbers all the time. And we have
    ethernet devices hanging off of USB connections now (yes, even built-in
    to the machine), and we have network connections on other hot-pluggable
    busses (remember, PCI is hot pluggable.)

    So, the distros need to name network devices in a persistent way, that
    is why the distros now do this. If you don't like the distro doing it,
    complain to them, it's not a kernel issue.
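The idea behind persistent naming can be sketched roughly as follows: key the name on something that belongs to the hardware (here the MAC address) rather than on probe order. This is not udev's actual rule format or algorithm. The function `assign_names`, the `pinned` table and the MAC addresses are all hypothetical, chosen only to show why the result no longer depends on discovery order.

```python
def assign_names(detected, pinned, prefix="eth"):
    """Return {mac: interface_name} for the MACs in `detected`.

    detected: MAC addresses in the order the kernel happened to probe them.
    pinned:   {mac: name} table remembered from an earlier boot (hypothetical).
    Unknown MACs get the first name not already reserved by the table.
    """
    pinned = {mac.lower(): name for mac, name in pinned.items()}
    taken = set(pinned.values())
    result = {}
    next_index = 0
    for mac in detected:
        mac = mac.lower()
        if mac in pinned:
            # Known hardware keeps its name regardless of probe order.
            result[mac] = pinned[mac]
            continue
        while f"{prefix}{next_index}" in taken:
            next_index += 1
        name = f"{prefix}{next_index}"
        taken.add(name)
        result[mac] = name
    return result


if __name__ == "__main__":
    table = {"00:11:22:33:44:55": "eth0", "66:77:88:99:aa:bb": "eth1"}
    print(assign_names(["66:77:88:99:aa:bb", "00:11:22:33:44:55"], table))
```

With the table in place, a reordered bus or a hot-plugged USB adapter changes which card is found first but not which card ends up as eth0.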

    thanks,

    greg k-h

  19. Re: What still uses the block layer?

    Alan Cox wrote:
    >> You can pull a Model and Serial number via hdparm -i, but it's not as
    >> easy to manipulate as a fixed-length MAC address. That's why people
    >> tend to use filesystem UUID's.

    >
    > ATA8 at the moment looks set to add a true "MAC" or "WWN" type identifier
    > to each device.. Right now model/serial is not always unique.


    WWN was added in ATA-7, AFAICS.

    However, I've seen quite a few ATA-7 devices that do not bother to fill
    it in. I wonder if ATA-8 device firmwares will act with similar
    slackness.
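A rough sketch of how software might cope with devices that leave the field empty: treat an all-zero WWN as absent and fall back to something else, such as model/serial. The word offsets used here (108 to 111 of the 256-word IDENTIFY DEVICE data) are an assumption taken from a reading of the ATA command set documents and should be checked against the standard. The "WWN supported" feature bit is deliberately not examined, and `extract_wwn` is a made-up name.

```python
def extract_wwn(identify_words):
    """Return the WWN as a 16-digit hex string, or None if it is not filled in.

    identify_words: sequence of 16-bit integers from IDENTIFY DEVICE.
    ASSUMPTION: the WWN occupies words 108..111, most significant word first.
    """
    if len(identify_words) < 112:
        raise ValueError("need at least 112 words of IDENTIFY DEVICE data")
    words = identify_words[108:112]
    if all(w == 0 for w in words):
        # Firmware did not bother to fill the field in.
        return None
    value = 0
    for w in words:
        value = (value << 16) | (w & 0xFFFF)
    return f"{value:016x}"


if __name__ == "__main__":
    blank = [0] * 256
    print(extract_wwn(blank))
```

A caller that gets None back has no hardware-level unique identifier to rely on, which is why filesystem UUIDs and labels stay in use.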

    Jeff




  20. Re: What still uses the block layer?

    On Mon, Oct 15, 2007 at 10:25:13AM -0700, Greg KH wrote:
    > Use mount-by-label instead, it's much saner and handles device name
    > movement just fine (as does the UUID method that you seem to hate.)
    > Look in /dev/disk/ for a wide range of options that you have in which to
    > choose how to pick your block devices.


    But you still have to spin up the disc to read the label (which seems
    like a legitimate complaint to me).

    --
    Intel are signing my paycheques ... these opinions are still mine
    "Bill, look, we understand that you're interested in selling us this
    operating system, but compare it to ours. We can't possibly take such
    a retrograde step."
