[PATCH] Add io-mapping functions to dynamically map large device apertures - Kernel



  1. [PATCH] Add io-mapping functions to dynamically map large device apertures

    From: Keith Packard

    Graphics devices have large PCI apertures which would consume a significant
    fraction of a 32-bit address space if mapped during driver initialization.
    Using ioremap at runtime is impractical as it is too slow. This new set of
    interfaces uses atomic mappings on 32-bit processors and a large static
    mapping on 64-bit processors to provide reasonable 32-bit performance and
    optimal 64-bit performance.

    The current implementation sits atop the iomap_atomic fixmap-based mechanism
    for 32-bit processors.

    This includes some editorial suggestions from Randy Dunlap for
    Documentation/io-mapping.txt

    Signed-off-by: Keith Packard
    Signed-off-by: Eric Anholt
    ---
    Documentation/io-mapping.txt | 76 +++++++++++++++++++++++++++
    include/linux/io-mapping.h | 118 ++++++++++++++++++++++++++++++++++++++++++
    2 files changed, 194 insertions(+), 0 deletions(-)
    create mode 100644 Documentation/io-mapping.txt
    create mode 100644 include/linux/io-mapping.h

    diff --git a/Documentation/io-mapping.txt b/Documentation/io-mapping.txt
    new file mode 100644
    index 0000000..cd2f726
    --- /dev/null
    +++ b/Documentation/io-mapping.txt
    @@ -0,0 +1,76 @@
    +The io_mapping functions in linux/io-mapping.h provide an abstraction for
    +efficiently mapping small regions of an I/O device to the CPU. The initial
    +usage is to support the large graphics aperture on 32-bit processors where
    +ioremap_wc cannot be used to statically map the entire aperture to the CPU
    +as it would consume too much of the kernel address space.
    +
    +A mapping object is created during driver initialization using
    +
    + struct io_mapping *io_mapping_create_wc(unsigned long base,
    + unsigned long size)
    +
    + 'base' is the bus address of the region to be made
    + mappable, while 'size' indicates how large a mapping region to
    + enable. Both are in bytes.
    +
    + This _wc variant provides a mapping which may only be used
    + with io_mapping_map_atomic_wc or io_mapping_map_wc.
    +
    +With this mapping object, individual pages can be mapped either atomically
    +or not, depending on the necessary scheduling environment. Of course, atomic
    +maps are more efficient:
    +
    + void *io_mapping_map_atomic_wc(struct io_mapping *mapping,
    + unsigned long offset)
    +
    + 'offset' is the offset within the defined mapping region.
    + Accessing addresses beyond the region specified in the
    + creation function yields undefined results. Using an offset
    + which is not page aligned yields an undefined result. The
    + return value points to a single page in CPU address space.
    +
    + This _wc variant returns a write-combining map to the
    + page and may only be used with mappings created by
    + io_mapping_create_wc
    +
    + Note that the task may not sleep while holding this page
    + mapped.
    +
    + void io_mapping_unmap_atomic(void *vaddr)
    +
    + 'vaddr' must be the value returned by the last
    + io_mapping_map_atomic_wc call. This unmaps the specified
    + page and allows the task to sleep once again.
    +
    +If you need to sleep while holding the page mapped, you can use the
    +non-atomic variant, although it may be significantly slower.
    +
    + void *io_mapping_map_wc(struct io_mapping *mapping,
    + unsigned long offset)
    +
    + This works like io_mapping_map_atomic_wc except it allows
    + the task to sleep while holding the page mapped.
    +
    + void io_mapping_unmap(void *vaddr)
    +
    + This works like io_mapping_unmap_atomic, except it is used
    + for pages mapped with io_mapping_map_wc.
    +
    +At driver close time, the io_mapping object must be freed:
    +
    + void io_mapping_free(struct io_mapping *mapping)
    +
    +Current Implementation:
    +
    +The initial implementation of these functions uses existing mapping
    +mechanisms and so provides only an abstraction layer and no new
    +functionality.
    +
    +On 64-bit processors, io_mapping_create_wc calls ioremap_wc for the whole
    +range, creating a permanent kernel-visible mapping to the resource. The
    +map_atomic and map functions add the requested offset to the base of the
    +virtual address returned by ioremap_wc.
    +
    +On 32-bit processors, io_mapping_map_atomic_wc uses iomap_atomic_prot_pfn,
    +which uses the fixmaps to get us a mapping to a page in an atomic fashion.
    +For io_mapping_map_wc, ioremap_wc() is used to get a mapping of the region.
    diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
    new file mode 100644
    index 0000000..1b56699
    --- /dev/null
    +++ b/include/linux/io-mapping.h
    @@ -0,0 +1,118 @@
    +/*
    + * Copyright © 2008 Keith Packard
    + *
    + * This file is free software; you can redistribute it and/or modify
    + * it under the terms of version 2 of the GNU General Public License
    + * as published by the Free Software Foundation.
    + *
    + * This program is distributed in the hope that it will be useful,
    + * but WITHOUT ANY WARRANTY; without even the implied warranty of
    + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    + * GNU General Public License for more details.
    + *
    + * You should have received a copy of the GNU General Public License
    + * along with this program; if not, write to the Free Software Foundation,
    + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
    + */
    +
    +#ifndef _LINUX_IO_MAPPING_H
    +#define _LINUX_IO_MAPPING_H
    +
    +#include
    +#include
    +#include
    +#include
    +
    +/*
    + * The io_mapping mechanism provides an abstraction for mapping
    + * individual pages from an io device to the CPU in an efficient fashion.
    + *
    + * See Documentation/io-mapping.txt
    + */
    +
    +/* this struct isn't actually defined anywhere */
    +struct io_mapping;
    +
    +#ifdef CONFIG_X86_64
    +
    +/* Create the io_mapping object */
    +static inline struct io_mapping *
    +io_mapping_create_wc(unsigned long base, unsigned long size)
    +{
    +	return (struct io_mapping *) ioremap_wc(base, size);
    +}
    +
    +static inline void
    +io_mapping_free(struct io_mapping *mapping)
    +{
    +	iounmap(mapping);
    +}
    +
    +/* Atomic map/unmap */
    +static inline void *
    +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset)
    +{
    +	return ((char *) mapping) + offset;
    +}
    +
    +static inline void
    +io_mapping_unmap_atomic(void *vaddr)
    +{
    +}
    +
    +/* Non-atomic map/unmap */
    +static inline void *
    +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
    +{
    +	return ((char *) mapping) + offset;
    +}
    +
    +static inline void
    +io_mapping_unmap(void *vaddr)
    +{
    +}
    +
    +#endif /* CONFIG_X86_64 */
    +
    +#ifdef CONFIG_X86_32
    +static inline struct io_mapping *
    +io_mapping_create_wc(unsigned long base, unsigned long size)
    +{
    +	return (struct io_mapping *) base;
    +}
    +
    +static inline void
    +io_mapping_free(struct io_mapping *mapping)
    +{
    +}
    +
    +/* Atomic map/unmap */
    +static inline void *
    +io_mapping_map_atomic_wc(struct io_mapping *mapping, unsigned long offset)
    +{
    +	offset += (unsigned long) mapping;
    +	return iomap_atomic_prot_pfn(offset >> PAGE_SHIFT, KM_USER0,
    +				     __pgprot(__PAGE_KERNEL_WC));
    +}
    +
    +static inline void
    +io_mapping_unmap_atomic(void *vaddr)
    +{
    +	iounmap_atomic(vaddr, KM_USER0);
    +}
    +
    +static inline void *
    +io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
    +{
    +	offset += (unsigned long) mapping;
    +	return ioremap_wc(offset, PAGE_SIZE);
    +}
    +
    +static inline void
    +io_mapping_unmap(void *vaddr)
    +{
    +	iounmap(vaddr);
    +}
    +#endif /* CONFIG_X86_32 */
    +
    +#endif /* _LINUX_IO_MAPPING_H */
    --
    1.5.6.5


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/
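
The calling pattern described in Documentation/io-mapping.txt (create once, map one page, touch it, unmap, free at close) can be sketched in userspace. This is only a model under stated assumptions, not kernel code: a heap buffer stands in for the device aperture, every model_* name and aperture_write are hypothetical, and the map function mimics the 64-bit variant from the patch (base plus offset). A real driver would call the io_mapping_* functions and would likely use an I/O copy helper rather than plain memcpy.

```c
/* Userspace model of the io_mapping calling pattern; NOT kernel code.
 * All model_* names are hypothetical stand-ins for the io_mapping_* API. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096UL            /* assumed page size for the model */

struct io_mapping;                  /* opaque, as in the patch */

/* Stand-in for io_mapping_create_wc: a zeroed heap buffer plays the aperture. */
static struct io_mapping *model_create_wc(unsigned long size)
{
    return (struct io_mapping *)calloc(1, size);
}

/* Stand-in for io_mapping_free. */
static void model_free(struct io_mapping *mapping)
{
    free(mapping);
}

/* Stand-in for io_mapping_map_atomic_wc (64-bit style: base + offset). */
static void *model_map_atomic_wc(struct io_mapping *mapping, unsigned long offset)
{
    /* The documentation says unaligned offsets yield undefined results. */
    assert(offset % PAGE_SIZE == 0);
    return (char *)mapping + offset;
}

/* Stand-in for io_mapping_unmap_atomic: nothing to undo in this model. */
static void model_unmap_atomic(void *vaddr)
{
    (void)vaddr;
}

/* Copy len bytes to byte position pos, mapping one page at a time. */
static void aperture_write(struct io_mapping *mapping, unsigned long pos,
                           const void *src, unsigned long len)
{
    const char *s = src;

    while (len > 0) {
        unsigned long page_base = pos & ~(PAGE_SIZE - 1);
        unsigned long page_off = pos - page_base;
        unsigned long chunk = PAGE_SIZE - page_off;
        char *vaddr;

        if (chunk > len)
            chunk = len;

        vaddr = model_map_atomic_wc(mapping, page_base);
        memcpy(vaddr + page_off, s, chunk);  /* in the kernel: no sleeping here */
        model_unmap_atomic(vaddr);

        pos += chunk;
        s += chunk;
        len -= chunk;
    }
}

/* Returns 1 if a write that straddles a page boundary lands intact, -1 on OOM. */
static int demo_roundtrip(void)
{
    static const char msg[] = "spans a page boundary";
    struct io_mapping *mapping = model_create_wc(4 * PAGE_SIZE);
    int ok;

    if (!mapping)
        return -1;
    aperture_write(mapping, PAGE_SIZE - 5, msg, sizeof(msg));
    ok = memcmp((char *)mapping + PAGE_SIZE - 5, msg, sizeof(msg)) == 0;
    model_free(mapping);
    return ok;
}
```

The per-page loop is the point of the sketch: because each atomic map returns a single page, a copy that crosses a page boundary has to be split into one map/unmap pair per page.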

  2. Re: [PATCH] Add io-mapping functions to dynamically map large device apertures


    * Eric Anholt wrote:

    > From: Keith Packard
    >
    > Graphics devices have large PCI apertures which would consume a significant
    > fraction of a 32-bit address space if mapped during driver initialization.
    > Using ioremap at runtime is impractical as it is too slow. This new set of
    > interfaces uses atomic mappings on 32-bit processors and a large static
    > mapping on 64-bit processors to provide reasonable 32-bit performance and
    > optimal 64-bit performance.
    >
    > The current implementation sits atop the iomap_atomic fixmap-based mechanism
    > for 32-bit processors.
    >
    > This includes some editorial suggestions from Randy Dunlap for
    > Documentation/io-mapping.txt
    >
    > Signed-off-by: Keith Packard
    > Signed-off-by: Eric Anholt
    > ---
    > Documentation/io-mapping.txt | 76 +++++++++++++++++++++++++++
    > include/linux/io-mapping.h | 118 ++++++++++++++++++++++++++++++++++++++++++


    I've applied your three patches to tip/core/resources for testing,
    thanks!

    One small detail:

    > +++ b/include/linux/io-mapping.h


    > +#ifdef CONFIG_X86_64


    it's ugly and inflexible to put x86 dependencies into generic headers.
    (even though with a high likelihood 32-bit x86 will be the only arch
    to ever implement the iomap_atomic() APIs)

    Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig:

    config HAVE_ATOMIC_IOMAP
    def_bool y
    depends on X86_32

    ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h
    instead of #ifdef CONFIG_X86_64.

    ( Other 32-bit architectures which need an atomic iomap implementation
    for address space reasons can then implement the iomap_atomic*()
    APIs too and set this same flag, to gain the same generic io_mapping
    implementation. )

    Please send this cleanup as a delta patch, on top of your three
    patches.

    Ingo
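
The suggestion above amounts to selecting the implementation on a capability symbol instead of an architecture symbol. A minimal sketch of that shape follows, assuming the usual Kconfig convention that a `config HAVE_ATOMIC_IOMAP` entry is visible to C code as `CONFIG_HAVE_ATOMIC_IOMAP`. The function io_mapping_backend and its strings are hypothetical and exist only to make the selection observable in userspace.

```c
/* Userspace sketch of capability-based selection; not the actual header. */
#include <assert.h>
#include <string.h>

/* Uncomment to model a build where Kconfig has set the symbol
 * (per the proposal, that would be 32-bit x86). */
/* #define CONFIG_HAVE_ATOMIC_IOMAP 1 */

#ifdef CONFIG_HAVE_ATOMIC_IOMAP
/* Small address space with arch support: map one page at a time. */
static const char *io_mapping_backend(void)
{
    return "per-page iomap_atomic_prot_pfn";
}
#else
/* Everything else: map the whole aperture once with ioremap_wc. */
static const char *io_mapping_backend(void)
{
    return "static ioremap_wc";
}
#endif
```

With the symbol left undefined, as here, the generic branch is compiled. Another architecture could opt in to the per-page branch by providing the iomap_atomic*() functions and setting the same symbol, without touching the generic header.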

  3. Re: [PATCH] Add io-mapping functions to dynamically map large device apertures

    On Fri, 2008-10-31 at 10:21 +0100, Ingo Molnar wrote:

    > it's ugly and inflexible to put x86 dependencies into generic headers.
    > (even though with a high likelihood 32-bit x86 will be the only arch
    > to ever implement the iomap_atomic() APIs)
    >
    > Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig:
    >
    > config HAVE_ATOMIC_IOMAP
    > def_bool y
    > depends on X86_32
    >
    > ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h
    > instead of #ifdef CONFIG_X86_64.


    Just to clarify the issue here: there are two separate implementations
    of the io_mapping API -- one for 'large address space' machines where
    ioremap_wc can handle the typical graphics aperture within the kernel
    virtual map, and the other using iomap_atomic_prot_pfn for machines with
    puny address spaces.

    All large address space machines can provide the io_mapping API without
    any architecture-specific support. For efficient 32-bit io_mapping
    support, we require the new iomap_atomic_prot_pfn function.

    So, it seems like what I want to do is use the large address space code
    on any machine which supports it, and then use the iomap_atomic_prot_pfn
    version for small address space machines which have the
    iomap_atomic_prot_pfn function.

    What I think you're suggesting is to just assume that machines without
    iomap_atomic_prot_pfn have address spaces large enough to support the
    ioremap_wc path. The alternative is to create a third (slow) path (which
    I did before the iomap_atomic_prot_pfn API was introduced) that uses
    ioremap_wc at run time for small address space machines without
    iomap_atomic_prot_pfn.

    Let me know which you'd prefer and I'll get a patch out ASAP.

    --
    keith.packard@intel.com
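
The two options laid out in this message can be written as small decision functions. This is an illustrative sketch only: the enum, the function names, and the boolean inputs are hypothetical and do not correspond to any kernel interface.

```c
/* Illustrative decision logic for the two proposals; names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum io_path {
    PATH_STATIC_IOREMAP,   /* ioremap_wc the whole aperture at create time */
    PATH_ATOMIC_FIXMAP,    /* iomap_atomic_prot_pfn, one page per map call */
    PATH_RUNTIME_IOREMAP   /* ioremap_wc per map call (the slow third path) */
};

/* Option with a third path: small address space machines that lack
 * iomap_atomic_prot_pfn fall back to ioremap_wc at run time. */
static enum io_path pick_three_way(bool large_address_space, bool has_atomic_iomap)
{
    if (large_address_space)
        return PATH_STATIC_IOREMAP;
    if (has_atomic_iomap)
        return PATH_ATOMIC_FIXMAP;
    return PATH_RUNTIME_IOREMAP;
}

/* Simpler option: a machine without iomap_atomic_prot_pfn is assumed to
 * have enough address space for the static mapping. */
static enum io_path pick_two_way(bool has_atomic_iomap)
{
    return has_atomic_iomap ? PATH_ATOMIC_FIXMAP : PATH_STATIC_IOREMAP;
}
```

The two functions differ only for a small address space machine without iomap_atomic_prot_pfn: the three-way version gives it the slow run-time path, while the two-way version gives it the static mapping and leaves it to that architecture to add atomic support if the aperture does not fit.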



  4. Re: [PATCH] Add io-mapping functions to dynamically map large device apertures


    * Keith Packard wrote:

    > On Fri, 2008-10-31 at 10:21 +0100, Ingo Molnar wrote:
    >
    > > it's ugly and inflexible to put x86 dependencies into generic headers.
    > > (even though with a high likelihood 32-bit x86 will be the only arch
    > > to ever implement the iomap_atomic() APIs)
    > >
    > > Instead please add a HAVE_ATOMIC_IOMAP define to arch/x86/Kconfig:
    > >
    > > config HAVE_ATOMIC_IOMAP
    > > def_bool y
    > > depends on X86_32
    > >
    > > ... and use #ifndef HAVE_ATOMIC_IOMAP in include/linux/io-mapping.h
    > > instead of #ifdef CONFIG_X86_64.

    >
    > Just to clarify the issue here: there are two separate
    > implementations of the io_mapping API -- one for 'large address
    > space' machines where ioremap_wc can handle the typical graphics
    > aperture within the kernel virtual map, and the other using
    > iomap_atomic_prot_pfn for machines with puny address spaces.
    >
    > All large address space machines can provide the io_mapping API
    > without any architecture-specific support. For efficient 32-bit
    > io_mapping support, we require the new iomap_atomic_prot_pfn
    > function.
    >
    > So, it seems like what I want to do is use the large address space
    > code on any machine which supports it, and then use the
    > iomap_atomic_prot_pfn version for small address space machines which
    > have the iomap_atomic_prot_pfn function.


    Correct.

    > What I think you're suggesting is to just assume that machines
    > without iomap_atomic_prot_pfn have address spaces large enough to
    > support the ioremap_wc path. The alternative is to create a third
    > (slow) path (which I did before the iomap_atomic_prot_pfn API was
    > introduced) that uses ioremap_wc at run time for small address space
    > machines without iomap_atomic_prot_pfn.
    >
    > Let me know which you'd prefer and I'll get a patch out ASAP.


    Please let's keep it simple: i.e. always use ioremap_wc() when there's
    no iomap_atomic_prot_pfn() 32-bit API provided.

    ( and by all means ioremap_wc() will just work fine on most 32-bit
    architectures out of the box: they don't go about trying to map hundreds
    of megabytes of graphics aperture. If they nevertheless need it,
    they can implement iomap_atomic_prot_pfn() to add support. )

    Ingo
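
To put rough numbers on "hundreds of megabytes", the arithmetic below compares an aperture size against an address space size. The figures are assumptions chosen for illustration, not values taken from the thread: a 256 MiB aperture, the full 4 GiB 32-bit space, and a 1 GiB kernel share of it (as with a 3G/1G user/kernel split).

```c
/* Back-of-the-envelope address space arithmetic; all sizes are assumed. */
#include <assert.h>

#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/* Percentage of `space` bytes consumed by statically mapping `aperture` bytes. */
static double percent_used(unsigned long long aperture, unsigned long long space)
{
    return 100.0 * (double)aperture / (double)space;
}
```

Under those assumptions, percent_used(256 * MiB, 4 * GiB) is 6.25 and percent_used(256 * MiB, 1 * GiB) is 25.0, so a static mapping would take a quarter of the assumed kernel share before anything else is mapped. The same aperture is negligible against a 64-bit address space, which is why the static ioremap_wc path is acceptable there.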
