[PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite() - Kernel

Thread: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

  1. [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    It seems that with the recent usage of ->page_mkwrite() a little detail
    was overlooked.

    .22-rc1 merged OCFS2 usage of this hook
    .23-rc1 merged XFS usage
    .24-rc1 will most likely merge NFS usage

    Please consider this for .23 final and maybe even .22.x

    ---
    Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()

    All the current page_mkwrite() implementations also set the page dirty, which
    results in the set_page_dirty_balance() call _not_ calling balance, because the
    page is already found dirty.

    This allows us to dirty a _lot_ of pages without ever hitting
    balance_dirty_pages(). Not good (tm).

    Force a balance call if ->page_mkwrite() was successful.

    Signed-off-by: Peter Zijlstra
    ---
    include/linux/writeback.h | 2 +-
    mm/memory.c | 9 +++++++--
    mm/page-writeback.c | 4 ++--
    3 files changed, 10 insertions(+), 5 deletions(-)

    Index: linux-2.6/include/linux/writeback.h
    ===================================================================
    --- linux-2.6.orig/include/linux/writeback.h
    +++ linux-2.6/include/linux/writeback.h
    @@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
     			loff_t pos, loff_t count);
     int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
     			   loff_t pos, loff_t count);
    -void set_page_dirty_balance(struct page *page);
    +void set_page_dirty_balance(struct page *page, int page_mkwrite);
     void writeback_set_ratelimit(void);

     /* pdflush.c */
    Index: linux-2.6/mm/memory.c
    ===================================================================
    --- linux-2.6.orig/mm/memory.c
    +++ linux-2.6/mm/memory.c
    @@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
     	struct page *old_page, *new_page;
     	pte_t entry;
     	int reuse = 0, ret = 0;
    +	int page_mkwrite = 0;
     	struct page *dirty_page = NULL;

     	old_page = vm_normal_page(vma, address, orig_pte);
    @@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
     			page_cache_release(old_page);
     			if (!pte_same(*page_table, orig_pte))
     				goto unlock;
    +
    +			page_mkwrite = 1;
     		}
     		dirty_page = old_page;
     		get_page(dirty_page);
    @@ -1691,7 +1694,7 @@ unlock:
     		 * do_no_page is protected similarly.
     		 */
     		wait_on_page_locked(dirty_page);
    -		set_page_dirty_balance(dirty_page);
    +		set_page_dirty_balance(dirty_page, page_mkwrite);
     		put_page(dirty_page);
     	}
     	return ret;
    @@ -2238,6 +2241,7 @@ static int __do_fault(struct mm_struct *
     	struct page *dirty_page = NULL;
     	struct vm_fault vmf;
     	int ret;
    +	int page_mkwrite = 0;

     	vmf.virtual_address = (void __user *)(address & PAGE_MASK);
     	vmf.pgoff = pgoff;
    @@ -2315,6 +2319,7 @@ static int __do_fault(struct mm_struct *
     					anon = 1; /* no anon but release vmf.page */
     					goto out;
     				}
    +				page_mkwrite = 1;
     			}
     		}

    @@ -2375,7 +2380,7 @@ out_unlocked:
     	if (anon)
     		page_cache_release(vmf.page);
     	else if (dirty_page) {
    -		set_page_dirty_balance(dirty_page);
    +		set_page_dirty_balance(dirty_page, page_mkwrite);
     		put_page(dirty_page);
     	}

    Index: linux-2.6/mm/page-writeback.c
    ===================================================================
    --- linux-2.6.orig/mm/page-writeback.c
    +++ linux-2.6/mm/page-writeback.c
    @@ -460,9 +460,9 @@ static void balance_dirty_pages(struct a
     		pdflush_operation(background_writeout, 0);
     }

    -void set_page_dirty_balance(struct page *page)
    +void set_page_dirty_balance(struct page *page, int page_mkwrite)
     {
    -	if (set_page_dirty(page)) {
    +	if (set_page_dirty(page) || page_mkwrite) {
     		struct address_space *mapping = page_mapping(page);

     		if (mapping)


    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
    > It seems that with the recent usage of ->page_mkwrite() a little detail
    > was overlooked.
    >
    > .22-rc1 merged OCFS2 usage of this hook
    > .23-rc1 merged XFS usage
    > .24-rc1 will most likely merge NFS usage
    >
    > Please consider this for .23 final and maybe even .22.x
    >
    > ---
    > Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
    >
    > All the current page_mkwrite() implementations also set the page dirty,
    > which results in the set_page_dirty_balance() call _not_ calling balance,
    > because the page is already found dirty.
    >
    > This allows us to dirty a _lot_ of pages without ever hitting
    > balance_dirty_pages(). Not good (tm).
    >
    > Force a balance call if ->page_mkwrite() was successful.


    Would it be better to just have the callers set_page_dirty_balance()?




  3. Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
    > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:
    > > It seems that with the recent usage of ->page_mkwrite() a little detail
    > > was overlooked.
    > >
    > > .22-rc1 merged OCFS2 usage of this hook
    > > .23-rc1 merged XFS usage
    > > .24-rc1 will most likely merge NFS usage
    > >
    > > Please consider this for .23 final and maybe even .22.x
    > >
    > > ---
    > > Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()
    > >
    > > All the current page_mkwrite() implementations also set the page dirty,
    > > which results in the set_page_dirty_balance() call _not_ calling balance,
    > > because the page is already found dirty.
    > >
    > > This allows us to dirty a _lot_ of pages without ever hitting
    > > balance_dirty_pages(). Not good (tm).
    > >
    > > Force a balance call if ->page_mkwrite() was successful.

    >
    > Would it be better to just have the callers set_page_dirty_balance()?


    block_page_mkwrite() is just using generic interfaces to do this,
    same as pretty much any write() system call. The idea was to make it
    as similar to the write() call path as possible...

    However, unlike generic_file_buffered_write(), we are not calling
    balance_dirty_pages_ratelimited(mapping) between
    ->prepare/commit_write call pairs. Perhaps this should be added to
    block_page_mkwrite() after the page is unlocked....

    Cheers,

    Dave.
    --
    Dave Chinner
    Principal Engineer
    SGI Australian Software Group

  4. Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    On Tuesday 09 October 2007 09:36, David Chinner wrote:
    > On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote:
    > > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote:


    > > > Force a balance call if ->page_mkwrite() was successful.

    > >
    > > Would it be better to just have the callers set_page_dirty_balance()?

    >
    > block_page_mkwrite() is just using generic interfaces to do this,
    > same as pretty much any write() system call. The idea was to make it
    > as similar to the write() call path as possible...
    >
    > However, unlike generic_file_buffered_write(), we are not calling
    > balance_dirty_pages_ratelimited(mapping) between
    > ->prepare/commit_write call pairs. Perhaps this should be added to
    > block_page_mkwrite() after the page is unlocked....


    That sounds pretty sane, in terms of matching with
    generic_file_buffered_write.

  5. Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
    > > block_page_mkwrite() is just using generic interfaces to do this,
    > > same as pretty much any write() system call. The idea was to make it
    > > as similar to the write() call path as possible...
    > >
    > > However, unlike generic_file_buffered_write(), we are not calling
    > > balance_dirty_pages_ratelimited(mapping) between
    > > ->prepare/commit_write call pairs. Perhaps this should be added to
    > > block_page_mkwrite() after the page is unlocked....

    >
    > That sounds pretty sane, in terms of matching with
    > generic_file_buffered_write.


    I agree. We could also insert a call to balance_dirty_pages_ratelimited() in
    __ocfs2_page_mkwrite.
    --Mark

    --
    Mark Fasheh
    Senior Software Developer, Oracle
    mark.fasheh@oracle.com

  6. Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

    On Tuesday 09 October 2007 12:12, Mark Fasheh wrote:
    > On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote:
    > > > block_page_mkwrite() is just using generic interfaces to do this,
    > > > same as pretty much any write() system call. The idea was to make it
    > > > as similar to the write() call path as possible...
    > > >
    > > > However, unlike generic_file_buffered_write(), we are not calling
    > > > balance_dirty_pages_ratelimited(mapping) between
    > > > ->prepare/commit_write call pairs. Perhaps this should be added to
    > > > block_page_mkwrite() after the page is unlocked....

    > >
    > > That sounds pretty sane, in terms of matching with
    > > generic_file_buffered_write.

    >
    > I agree. We could also insert a call to balance_dirty_pages_ratelimited()
    > in __ocfs2_page_mkwrite.


    Hmm, Peter's patch got merged -- I suppose that's fine for 2.6.23 though...

