[devel] Fwd: Re: Discussion about UUID and ide-generic
Michael Shigorin
mike at osdn.org.ua
Wed Mar 22 12:00:03 MSK 2006
(a follow-up)
----- Forwarded message from Scott James Remnant <scott/ubuntu.com> -----
Date: Tue, 21 Mar 2006 19:41:44 +0000
From: Scott James Remnant <scott/ubuntu.com>
To: ubuntu-devel-announce/lists.ubuntu.com
Subject: Re: Discussion about UUID and ide-generic
On Tue, 2006-03-21 at 11:00 -0500, Ben Collins wrote:
> This was a lengthy discussion that actually touched on two subjects.
>
Sorry I missed this discussion, I was offline all afternoon testing NM
and other network-related issues. We just had an additional discussion
about this, attached:
<Keybuk> BenC_: so it appears I missed a discussion this afternoon?
<BenC_> Keybuk: appears so :)
<Keybuk> BenC_: which is a shame, because your conclusions were all
wrong
* Keybuk wishes someone had phoned him
<Keybuk> anyway, am replying now via e-mail
<BenC_> I don't recall having any conclusions
<Keybuk> "wait 10s before loading ide-generic"
<BenC_> that's not a conclusion, that's a suggestion :)
<Keybuk> it's a decision, according to your e-mail
<Keybuk> <g>
<BenC_> based on Mith's argument
<Keybuk> it won't help a monkey's
<Keybuk> the kernel already has that "sleep 10" in the IDE init code
<Keybuk> which is in the module loading bit
<Keybuk> so blocks the return of modprobe
<BenC_> but if the module load is blocking, so are all other modules
<Keybuk> "all other modules" ?
<BenC_> I didn't know ide-generic had a sleep(10) in it
<Keybuk> ide-generic doesn't
<BenC_> or is this ide-core?
<Keybuk> every IDE driver does
<Keybuk> ide-core
<Keybuk> modprobe real-pci-driver
<Keybuk> consistently doesn't return until udev has the events
<BenC_> is this the "Probing later" thing?
<Keybuk> it can take a few seconds to return on most things
<Keybuk> I don't know what "probing later" is
<BenC_> are you sure you read Mith's arguments that promoted me
suggesting that?
<BenC_> prompted
* BenC has quit (Connection timed out)
<BenC_> the statement was that the main reason we don't use UUID more
widely is that it has some dependency on the bus type
<Keybuk> right
<Keybuk> and the sleep 10 won't change that
<Keybuk> it doesn't have a dependency on the bus type
<Keybuk> this is the bit where me being there would have helped
<BenC_> he stated differently
<Keybuk> ok, so here's how the initramfs bits work
<Keybuk> first we look at the ROOT= to detect the bus type
<Keybuk> either IDE or "everything else" (which are always scsi)
<BenC_> then it does have code that depends on ide or !ide
<mjg59> Keybuk: Eh? That's entirely untrue, as far as I can tell
<Keybuk> All the drivers under the scsi subsystem (which includes sata,
usb, firewire) behave one way
<mjg59> Keybuk: I can't find any sign of a sleep(10) in the ide-generic
load path
<Keybuk> the modprobe returns immediately, then at some point in the
future the driver finds its devices
<Keybuk> mjg59: not a literal sleep(10); not ide-generic; the ide-core
probes and binds for the REAL PCI DRIVERS before modprobe can return
<mjg59> Keybuk: How?
<Keybuk> mjg59: please let me finish first
<Keybuk> ok, so for scsi and friends; we have to iterate the bus,
modprobe all the drivers, then sit back and wait for the devices to show
up
* BenC_ is now known as BenC
<Keybuk> sit back and wait up to three minutes
<Keybuk> for especially slow scsi controllers (fabbione delights in
owning these)
<Keybuk> so we do it with a simple while loop; while the root device
hasn't shown up on disc, sleep another tenth of a second
<Keybuk> if, at the end of that loop, the root device still hasn't
shown up, we give up
<BenC> ok, the ide portion is more important :)
<Keybuk> for dapper+1 I'd like to extend that to "wait forever", and
have some pretty on screen message saying something along the lines of
"if your root disk isn't plugged in, could you do that?"
<Keybuk> so that's scsi
<Keybuk> ide we do it differently, because the ide subsystem has
different quirks
<Keybuk> the drivers behave the other way, the modprobe doesn't return
until the driver has done the probe-and-bind
<Keybuk> there's no asynchronous component to them
<Keybuk> we also may have to load ide-generic on legacy ISA machines
<Keybuk> and in some cases with a root device on an ancient PCMCIA
controller (which counts as "legacy ISA" in my mind)
<mjg59> Keybuk: I'm still not finding anywhere where the PCI IDE drivers
differ from normal PCI drivers (where there is an asynchronous
component)
<mjg59> Hm. PCMCIA ought to come under pcmcia_cs
<mjg59> Uh, ide_cs
<BenC> regular PCI IDE drivers use pci probing, and normal device layer
callbacks
<makx> pcmcia bridges are not incl. in latest initramfs by default
<Keybuk> mjg59: critical path. the ide core already knows what devices
are available. so there's no issuing of a probe command, and waiting
for it to return. it just iterates an already-in-memory structure and
does the magic
<mjg59> There's a corner case where it's not actually PCMCIA, but an
on-board IDE interface is connected via the PCMCIA slot
<BenC> the difference is that IDE detects the drives at the same time it
detects the controllers
<BenC> so Keybuk is correct in how it works
<Keybuk> right
<Keybuk> if you detect the controller, you also just detected the drives
<BenC> how do you decide to load ide-generic?
<Keybuk> scsi after detecting the controller you have to ask the
controller nicely to go detect the drives and get back to you on that
<Keybuk> right
<Keybuk> so because of the way ide works in this regard, what we do is
<Keybuk> iterate each PCI device in order, and load the module for it,
waiting for everything to settle in between
<mjg59> Keybuk: You load piix. That registers itself as an IDE driver,
which in turn calls pci_register_driver. After pci_register_driver, the
probe routine in piix (piix_init_one) is called. If that binds to
something, it does ide_setup_pci_device (which actually allocates the
interface)
<Keybuk> mjg59: right, nothing there returns to userspace
<mjg59> Isn't that exactly the same as loading a network driver, except
that pci_register_driver is done in the IDE layer rather than in the
module itself?
<Keybuk> thanks for verifying that for us :)
<Keybuk> so we know after we've loaded the PCI drivers, if the root
device still hasn't shown up, it isn't going to show up later
<Keybuk> so we load ide-generic as a last resort
<Keybuk> scsi looks like "load drivers; wait for root device to show up;
abort after 3 minutes"
<Keybuk> ide looks like "load drivers sequentially; if root device
hasn't shown up, load ide-generic"
<mjg59> So the original discussion appears to have been based on the
false premise that we need to know whether the root device is on ide or
not in order to only load ide-generic at the appropriate time?
<Keybuk> well, no
* mxpxpod (n=bryan at unaffiliated/mxpxpod) has joined #ubuntu-kernel
<Keybuk> we still need to know whether we may or may not need to load
ide-generic
<BenC> no, it's correct
<BenC> the only incorrect info I got was that loading ide-generic was
racy and could cause problems
<mjg59> Why not load scsi drivers first, then ide drivers (at this point
based on PCI IDs), then unconditionally load ide-generic if there's
still no rootfs?
<mjg59> Loading ide-generic isn't going to hurt in the case where the
PCI driver has already bound the io-ports
<Keybuk> mjg59: because of FUCKING SATA
<mjg59> Keybuk: But you've already loaded the driver
<Keybuk> SATA as far as I'm concerned is a scsi driver
<BenC> mjg59: according to Mith, Fabio's crappy scsi controller breaks
if you load ide-generic
<Keybuk> it behaves like scsi drivers
<Keybuk> most importantly, it returns from modprobe immediately, before
performing the controller-specific initialisation
<Keybuk> so modprobe piix-sata will return straight away, and udev will
get the devices some few seconds later
<Keybuk> on some older SATA controllers, on a busy system, this can be
10-15s
<BenC> Keybuk: ok, here's the problems I see....
<Keybuk> loading ide-generic at any point before the SATA SCSI driver
has finished can sometimes "steal" the devices from it
<Keybuk> the solution would be to only load ide-generic after the three
minutes are up
<Keybuk> which means everyone on ISA would have to wait 3 minutes before
their machine started booting
<mjg59> Keybuk: ata_piix calls piix_init, which calls pci_module_init,
which will then provide a probe method which registers the ports
<BenC> Keybuk: IMO, we should detect that ide-generic is needed at
install, and make a note of it for initramfs to use :)
<mjg59> Keybuk: I'm still not seeing how this is different to the IDE
case (though there may be something very simple I'm missing)
<BenC> or when initramfs is built, it could detect that
<Keybuk> mjg59: right, and then it returns to userspace, and libata
later calls the probe method
<Keybuk> BenC: I agree. We can look in /sys in the installer and see
whether the driver is ide-generic or not; if not, use UUID
<mjg59> Keybuk: Sorry, /which/ probe method?
<Keybuk> mjg59: I can't remember offhand; when I looked through the code
(a few months ago) the SATA stuff was very much callback orientated
<Keybuk> and hooked into the SCSI bits of the kernel which are totally
async
<BenC> mjg59: from what I can tell, scsi device probing happens in the
scsiX kernel process and not in the module probe code path
<Keybuk> whereas IDE was all one obvious line of function call codes
<mjg59> BenC: What are you defining as "scsi device probing"?
<Keybuk> mjg59: "devices handled by the kernel scsi subsystem", which
includes SATA ide drives
<BenC> where it probes for devices attached to the controller, as
opposed to the controller itself
<mjg59> BenC: But by that time it's already registered the i/o ports,
surely?
<Keybuk> mjg59: not always, no
<Keybuk> loading the controller driver doesn't guarantee anything other
than you've opened the controller
<BenC> mjg59: i/o for the controller yes, but for the ports no
<Keybuk> at that point, you may be lucky enough for the ports to be
registered, but not generally
<Keybuk> device probing on the controller happens later
<mjg59> pci_request_regions is done in ata_pci_init_one
<mjg59> Which is called directly from the ata_piix PCI probe routine
<Keybuk> mjg59: there's an easy way to test it btw
<Keybuk> boot with break=top
<mjg59> Keybuk: Well, if I had any SATA machines...
<Keybuk> modprobe -a piix-sata ide-generic
<mjg59> (That is, SATA machines with legacy i/o)
<Keybuk> about 7/10 times, your drives will be /dev/hda rather
than /dev/sda
<Keybuk> or just trawl malone for the ~100 bugs about that
<mjg59> Keybuk: Right, I'm willing to believe that, but I still don't
see any mechanism by which this can be true for sata but not for IDE
<Keybuk> one of them from mdz :) his laptop loves to trigger that case
<mjg59> Uh. When did mdz get a new laptop?
<Keybuk> sorry, strike that; mdz had a different bug, I think
<mjg59> Yeah, he had ide-generic binding before piix
<Keybuk> yeah
<Keybuk> he doesn't have a sata laptop
<mjg59> (Which you've just been telling me is impossible, but :) )
<Keybuk> no we loaded ide-generic first in his case
<Keybuk> that was a clear-out bug :p
<Keybuk> mjg59: if you start with IDE from the top, you'll see that
nothing returns to userspace until the drives are already probed for and
bound
<Keybuk> mjg59: if you start with SCSI/SATA from the top, there are
plenty of return to userspace
<Keybuk> points before the drives are probed for
<mjg59> Keybuk: As far as I can tell, in both the IDE and sata cases,
the io regions are requested in the call into the core layer that occurs
at the end of the pci probe
<BenC> ok, from what I'm hearing, ide-generic should really be the
corner case instead of ide in general
<Keybuk> I'm not sure about registering of i/o ports, I wasn't looking
for that
<Keybuk> and I don't really understand that
<mjg59> Keybuk: It's the registration of i/o ports that's the /entire
point/
<Keybuk> I was looking at pure "when does the kernel create sdXY and
associate it with that driver"
<mjg59> If the i/o ports are registered, ide-generic won't bind.
Otherwise, it will.
<Keybuk> but ide-generic does bind
<mjg59> Keybuk: Yes, but that's not the problem
<BenC> yeah, which driver grabs the ports from the kernel is the winner
<mjg59> So. How can that situation arise in the sata case, but not the
ide one?
<BenC> but from what I can tell, there's not only i/o per controller, but
i/o per device
<mjg59> BenC: No
<mjg59> One i/o region per channel
<BenC> ok, one per primary and secondary
<Keybuk> I'm quite happy to change the initramfs code to look like:
<Keybuk> - load pci drivers
<Keybuk> - load ide-generic
<Keybuk> - wait for up to three minutes for the root device to show up
<Keybuk> - abort
<mjg59> Keybuk: Could you take a look at drivers/scsi/ata_piix.c and
tell me where it enters userspace between loading and hitting line 4843
of libata-core?
<mjg59> I'm genuinely failing to get this
<Keybuk> I can't see where it gets to libata-core
<Keybuk> ah, sorry
<Keybuk> at the bottom
<Keybuk> right so piix_init calls pci_module_init calls piix_init_one
calls ata_pci_init_one
<mjg59> Yes
<mjg59> Which is identical to the IDE case, as far as I can tell
<Keybuk> right
<Keybuk> and is this true for every single sata driver?
<Keybuk> if it's true that all the sata drivers reserve their regions in
init, then I'm happy
<Keybuk> that may not have been true earlier, or we may have had a
different bug
<mjg59> ahci seems to do it internally, but doesn't do legacy IDE anyway
as far as I can tell
<Keybuk> and as long as reserving the regions guarantees ide-generic
can't steal the devices
<BenC> that still doesn't help our issue where scsi in general is
delayed device probing
<Keybuk> how does it affect that issue?
<mjg59> BenC: That's fine. Load all SCSI controllers, load all PCI IDE,
load ide-generic, wait up to three minutes (or whatever)
<BenC> sata behaving like ide doesn't help anything :)
<Keybuk> this wouldn't distinguish between SCSI and PCI IDE drivers
<Keybuk> can anyone recall off-hand how we avoid loading both the SATA
and PATA driver for a controller?
<Keybuk> do we do that in the kernel?
<mjg59> They have different PCI IDs
<BenC> so basically, my point of view is: load scsi, ide,
ide-generic, loop for 3 minutes checking for device path and/or UUID
Scott
--
Scott James Remnant
scott/ubuntu.com
----- End forwarded message -----
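
For readers skimming the log, the ordering the participants converge on at the end (load the SCSI-style drivers, then the PCI IDE drivers, then ide-generic, then poll for the root device for up to three minutes) can be sketched roughly as below. This is only an illustration in Python, not the actual initramfs code: the function name, its parameters and the injected load_module/exists/sleep/clock callables are all hypothetical, and the real scripts are shell.

```python
import os
import time
from typing import Callable, Iterable


def load_then_wait_for_root(
    root_path: str,
    scsi_drivers: Iterable[str],
    pci_ide_drivers: Iterable[str],
    load_module: Callable[[str], None],
    timeout: float = 180.0,   # "up to three minutes", per the log
    interval: float = 0.1,    # "sleep another tenth of a second"
    exists: Callable[[str], bool] = os.path.exists,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Hypothetical sketch of the proposed boot ordering.

    Returns True once root_path appears, False if the timeout expires.
    load_module stands in for whatever actually loads a kernel module;
    it is an assumption of this sketch, not a real API.
    """
    # 1. SCSI-style drivers (SATA included): these may return before
    #    their devices have been discovered.
    for name in scsi_drivers:
        load_module(name)

    # 2. PCI IDE drivers: described in the log as probing and binding
    #    before the module load returns.
    for name in pci_ide_drivers:
        load_module(name)

    # 3. ide-generic goes last, as the fallback for legacy machines.
    load_module("ide-generic")

    # 4. Poll until the root device (a device path or a UUID-based
    #    path) shows up, giving up once the deadline has passed.
    deadline = clock() + timeout
    while not exists(root_path):
        if clock() >= deadline:
            return False
        sleep(interval)
    return True
```

Passing the module loader and the clock in as arguments keeps the sketch runnable without touching a real kernel; whether loading ide-generic before the wait is safe still depends on the open question in the log, namely that every SATA driver reserves its I/O regions during module init.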
--
---- WBR, Michael Shigorin <mike at altlinux.ru>
------ Linux.Kiev http://www.linux.kiev.ua/
More information about the Devel mailing list