kmemcheck caught read from freed memory (cfq_free_io_context) - Kernel
This is a discussion on kmemcheck caught read from freed memory (cfq_free_io_context) - Kernel ; Hi,
This appeared in my logs:
kmemcheck: Caught 32-bit read from freed memory (f7042348)
Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
EIP: 0060:[ ] EFLAGS: 00210202 CPU: 0
EIP is at call_for_each_cic+0x2d/0x44
EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: ...
-
kmemcheck caught read from freed memory (cfq_free_io_context)
Hi,
This appeared in my logs:
kmemcheck: Caught 32-bit read from freed memory (f7042348)
Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
EIP: 0060:[] EFLAGS: 00210202 CPU: 0
EIP is at call_for_each_cic+0x2d/0x44
EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff4ff0 DR7: 00000400
[] kmemcheck_read+0xa8/0xe0
[] kmemcheck_access+0x1a5/0x244
[] do_page_fault+0x622/0x6fc
[] error_code+0x72/0x78
[] cfq_free_io_context+0xf/0x70
[] put_io_context+0x4f/0x58
[] exit_io_context+0x60/0x6c
[] do_exit+0x4d9/0x6f0
[] do_group_exit+0x29/0x88
[] sys_exit_group+0xf/0x14
[] sysenter_past_esp+0x6d/0xa4
[] 0xffffffff
The error occurs in cfq_free_io_context()'s call to
call_for_each_cic() which looks like this:
rcu_read_lock();
hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
func(ioc, cic);
called++;
}
rcu_read_unlock();
The function that is called is cic_free_func(). It is postulated that
hlist_for_each_entry_rcu() will dereference the previously freed list
element to get the ->next pointer.
After a short discussion with Pekka Enberg and Peter Zijlstra, it
seemed evident that this list traversal should use
hlist_for_each_entry_safe_rcu() instead, which would buffer the next
pointer before the object is freed.
Does this report seem to be valid?
The kernel is 2.6.25-rc7.
Kind regards,
Vegard Nossum
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> Hi,
>
> This appeared in my logs:
>
> kmemcheck: Caught 32-bit read from freed memory (f7042348)
>
> Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> EIP is at call_for_each_cic+0x2d/0x44
> EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff4ff0 DR7: 00000400
> [] kmemcheck_read+0xa8/0xe0
> [] kmemcheck_access+0x1a5/0x244
> [] do_page_fault+0x622/0x6fc
> [] error_code+0x72/0x78
> [] cfq_free_io_context+0xf/0x70
> [] put_io_context+0x4f/0x58
> [] exit_io_context+0x60/0x6c
> [] do_exit+0x4d9/0x6f0
> [] do_group_exit+0x29/0x88
> [] sys_exit_group+0xf/0x14
> [] sysenter_past_esp+0x6d/0xa4
> [] 0xffffffff
>
> The error occurs in cfq_free_io_context()'s call to
> call_for_each_cic() which looks like this:
>
> rcu_read_lock();
> hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> func(ioc, cic);
> called++;
> }
> rcu_read_unlock();
>
> The function that is called is cic_free_func(). It is postulated that
> hlist_for_each_entry_rcu() will dereference the previously freed list
> element to get the ->next pointer.
>
> After a short discussion with Pekka Enberg and Peter Zijlstra, it
> seemed evident that this list traversal should use
> hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> pointer before the object is freed.
>
> Does this report seem to be valid?
>
> The kernel is 2.6.25-rc7.
The missing hlist for loop would look something like so:
#define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
for (pos = (head)->first; \
rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
pos = n)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Tue, Apr 01, 2008 at 11:36:28PM +0200, Peter Zijlstra wrote:
> On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > Hi,
> >
> > This appeared in my logs:
> >
> > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> >
> > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > EIP is at call_for_each_cic+0x2d/0x44
> > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff4ff0 DR7: 00000400
> > [] kmemcheck_read+0xa8/0xe0
> > [] kmemcheck_access+0x1a5/0x244
> > [] do_page_fault+0x622/0x6fc
> > [] error_code+0x72/0x78
> > [] cfq_free_io_context+0xf/0x70
> > [] put_io_context+0x4f/0x58
> > [] exit_io_context+0x60/0x6c
> > [] do_exit+0x4d9/0x6f0
> > [] do_group_exit+0x29/0x88
> > [] sys_exit_group+0xf/0x14
> > [] sysenter_past_esp+0x6d/0xa4
> > [] 0xffffffff
> >
> > The error occurs in cfq_free_io_context()'s call to
> > call_for_each_cic() which looks like this:
> >
> > rcu_read_lock();
> > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > func(ioc, cic);
> > called++;
> > }
> > rcu_read_unlock();
> >
> > The function that is called is cic_free_func(). It is postulated that
> > hlist_for_each_entry_rcu() will dereference the previously freed list
> > element to get the ->next pointer.
> >
> > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > seemed evident that this list traversal should use
> > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > pointer before the object is freed.
> >
> > Does this report seem to be valid?
> >
> > The kernel is 2.6.25-rc7.
>
> The missing hlist for loop would look something like so:
>
> #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> for (pos = (head)->first; \
> rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> pos = n)
Why the heck is cic_free_func() immediately doing a kmem_cache_free()
on the cfq_io_context structure??? Shouldn't we have a call_rcu() or a
synchronize_rcu() in there somewhere??? Given the way this is written,
wouldn't readers on other code paths get dumped onto the freelist?
This would not be a good thing...
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Tue, 2008-04-01 at 15:51 -0700, Paul E. McKenney wrote:
> On Tue, Apr 01, 2008 at 11:36:28PM +0200, Peter Zijlstra wrote:
> > On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > > Hi,
> > >
> > > This appeared in my logs:
> > >
> > > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> > >
> > > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > > EIP is at call_for_each_cic+0x2d/0x44
> > > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > DR6: ffff4ff0 DR7: 00000400
> > > [] kmemcheck_read+0xa8/0xe0
> > > [] kmemcheck_access+0x1a5/0x244
> > > [] do_page_fault+0x622/0x6fc
> > > [] error_code+0x72/0x78
> > > [] cfq_free_io_context+0xf/0x70
> > > [] put_io_context+0x4f/0x58
> > > [] exit_io_context+0x60/0x6c
> > > [] do_exit+0x4d9/0x6f0
> > > [] do_group_exit+0x29/0x88
> > > [] sys_exit_group+0xf/0x14
> > > [] sysenter_past_esp+0x6d/0xa4
> > > [] 0xffffffff
> > >
> > > The error occurs in cfq_free_io_context()'s call to
> > > call_for_each_cic() which looks like this:
> > >
> > > rcu_read_lock();
> > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > > func(ioc, cic);
> > > called++;
> > > }
> > > rcu_read_unlock();
> > >
> > > The function that is called is cic_free_func(). It is postulated that
> > > hlist_for_each_entry_rcu() will dereference the previously freed list
> > > element to get the ->next pointer.
> > >
> > > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > > seemed evident that this list traversal should use
> > > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > > pointer before the object is freed.
> > >
> > > Does this report seem to be valid?
> > >
> > > The kernel is 2.6.25-rc7.
> >
> > The missing hlist for loop would look something like so:
> >
> > #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> > for (pos = (head)->first; \
> > rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> > ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> > pos = n)
>
> Why the heck is cic_free_func() immediately doing a kmem_cache_free()
> on the cfq_io_context structure??? Shouldn't we have a call_rcu() or a
> synchronize_rcu() in there somewhere??? Given the way this is written,
> wouldn't readers on other code paths get dumped onto the freelist?
> This would not be a good thing...
I was told, but didn't check for myself it is using SLAB_RCU. Of course
that requires an object validation pass, and I didn't check it does that
either.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Tue, Apr 01 2008, Peter Zijlstra wrote:
> On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > Hi,
> >
> > This appeared in my logs:
> >
> > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> >
> > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > EIP is at call_for_each_cic+0x2d/0x44
> > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > DR6: ffff4ff0 DR7: 00000400
> > [] kmemcheck_read+0xa8/0xe0
> > [] kmemcheck_access+0x1a5/0x244
> > [] do_page_fault+0x622/0x6fc
> > [] error_code+0x72/0x78
> > [] cfq_free_io_context+0xf/0x70
> > [] put_io_context+0x4f/0x58
> > [] exit_io_context+0x60/0x6c
> > [] do_exit+0x4d9/0x6f0
> > [] do_group_exit+0x29/0x88
> > [] sys_exit_group+0xf/0x14
> > [] sysenter_past_esp+0x6d/0xa4
> > [] 0xffffffff
> >
> > The error occurs in cfq_free_io_context()'s call to
> > call_for_each_cic() which looks like this:
> >
> > rcu_read_lock();
> > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > func(ioc, cic);
> > called++;
> > }
> > rcu_read_unlock();
> >
> > The function that is called is cic_free_func(). It is postulated that
> > hlist_for_each_entry_rcu() will dereference the previously freed list
> > element to get the ->next pointer.
> >
> > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > seemed evident that this list traversal should use
> > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > pointer before the object is freed.
> >
> > Does this report seem to be valid?
> >
> > The kernel is 2.6.25-rc7.
>
> The missing hlist for loop would look something like so:
>
> #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> for (pos = (head)->first; \
> rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> pos = n)
Good catch, I wonder why it didn't complain in my testing. I've added a
patch to fix that, please see it here:
http://git.kernel.dk/?p=linux-2.6-bl...059044083ea151
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02 2008, Peter Zijlstra wrote:
> On Tue, 2008-04-01 at 15:51 -0700, Paul E. McKenney wrote:
> > On Tue, Apr 01, 2008 at 11:36:28PM +0200, Peter Zijlstra wrote:
> > > On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > > > Hi,
> > > >
> > > > This appeared in my logs:
> > > >
> > > > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> > > >
> > > > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > > > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > > > EIP is at call_for_each_cic+0x2d/0x44
> > > > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > > > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > DR6: ffff4ff0 DR7: 00000400
> > > > [] kmemcheck_read+0xa8/0xe0
> > > > [] kmemcheck_access+0x1a5/0x244
> > > > [] do_page_fault+0x622/0x6fc
> > > > [] error_code+0x72/0x78
> > > > [] cfq_free_io_context+0xf/0x70
> > > > [] put_io_context+0x4f/0x58
> > > > [] exit_io_context+0x60/0x6c
> > > > [] do_exit+0x4d9/0x6f0
> > > > [] do_group_exit+0x29/0x88
> > > > [] sys_exit_group+0xf/0x14
> > > > [] sysenter_past_esp+0x6d/0xa4
> > > > [] 0xffffffff
> > > >
> > > > The error occurs in cfq_free_io_context()'s call to
> > > > call_for_each_cic() which looks like this:
> > > >
> > > > rcu_read_lock();
> > > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > > > func(ioc, cic);
> > > > called++;
> > > > }
> > > > rcu_read_unlock();
> > > >
> > > > The function that is called is cic_free_func(). It is postulated that
> > > > hlist_for_each_entry_rcu() will dereference the previously freed list
> > > > element to get the ->next pointer.
> > > >
> > > > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > > > seemed evident that this list traversal should use
> > > > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > > > pointer before the object is freed.
> > > >
> > > > Does this report seem to be valid?
> > > >
> > > > The kernel is 2.6.25-rc7.
> > >
> > > The missing hlist for loop would look something like so:
> > >
> > > #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> > > for (pos = (head)->first; \
> > > rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> > > ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> > > pos = n)
> >
> > Why the heck is cic_free_func() immediately doing a kmem_cache_free()
> > on the cfq_io_context structure??? Shouldn't we have a call_rcu() or a
> > synchronize_rcu() in there somewhere??? Given the way this is written,
> > wouldn't readers on other code paths get dumped onto the freelist?
> > This would not be a good thing...
>
> I was told, but didn't check for myself it is using SLAB_RCU. Of course
> that requires an object validation pass, and I didn't check it does that
> either.
Yes, it's using SLAB_DESTROY_BY_RCU. If you find faults with that,
please let me know.
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2 Apr 2008, Jens Axboe wrote:
> Good catch, I wonder why it didn't complain in my testing. I've added a
> patch to fix that, please see it here:
You probably don't have kmemcheck in your kernel ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02 2008, Pekka J Enberg wrote:
> On Wed, 2 Apr 2008, Jens Axboe wrote:
> > Good catch, I wonder why it didn't complain in my testing. I've added a
> > patch to fix that, please see it here:
>
> You probably don't have kmemcheck in your kernel ;-)
Ehm no, you are right 
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
* Jens Axboe wrote:
> On Wed, Apr 02 2008, Pekka J Enberg wrote:
> > On Wed, 2 Apr 2008, Jens Axboe wrote:
> > > Good catch, I wonder why it didn't complain in my testing. I've added a
> > > patch to fix that, please see it here:
> >
> > You probably don't have kmemcheck in your kernel ;-)
>
> Ehm no, you are right 
.... and you can get kmemcheck by testing on x86.git/latest:
http://people.redhat.com/mingo/x86.git/README
;-)
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02 2008, Ingo Molnar wrote:
>
> * Jens Axboe wrote:
>
> > On Wed, Apr 02 2008, Pekka J Enberg wrote:
> > > On Wed, 2 Apr 2008, Jens Axboe wrote:
> > > > Good catch, I wonder why it didn't complain in my testing. I've added a
> > > > patch to fix that, please see it here:
> > >
> > > You probably don't have kmemcheck in your kernel ;-)
> >
> > Ehm no, you are right 
>
> ... and you can get kmemcheck by testing on x86.git/latest:
>
> http://people.redhat.com/mingo/x86.git/README
>
> ;-)
Thanks Ingo, I'll give it a spin to verify this fix!
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02, 2008 at 08:15:56AM +0200, Peter Zijlstra wrote:
> On Tue, 2008-04-01 at 15:51 -0700, Paul E. McKenney wrote:
> > On Tue, Apr 01, 2008 at 11:36:28PM +0200, Peter Zijlstra wrote:
> > > On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > > > Hi,
> > > >
> > > > This appeared in my logs:
> > > >
> > > > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> > > >
> > > > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > > > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > > > EIP is at call_for_each_cic+0x2d/0x44
> > > > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > > > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > DR6: ffff4ff0 DR7: 00000400
> > > > [] kmemcheck_read+0xa8/0xe0
> > > > [] kmemcheck_access+0x1a5/0x244
> > > > [] do_page_fault+0x622/0x6fc
> > > > [] error_code+0x72/0x78
> > > > [] cfq_free_io_context+0xf/0x70
> > > > [] put_io_context+0x4f/0x58
> > > > [] exit_io_context+0x60/0x6c
> > > > [] do_exit+0x4d9/0x6f0
> > > > [] do_group_exit+0x29/0x88
> > > > [] sys_exit_group+0xf/0x14
> > > > [] sysenter_past_esp+0x6d/0xa4
> > > > [] 0xffffffff
> > > >
> > > > The error occurs in cfq_free_io_context()'s call to
> > > > call_for_each_cic() which looks like this:
> > > >
> > > > rcu_read_lock();
> > > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > > > func(ioc, cic);
> > > > called++;
> > > > }
> > > > rcu_read_unlock();
> > > >
> > > > The function that is called is cic_free_func(). It is postulated that
> > > > hlist_for_each_entry_rcu() will dereference the previously freed list
> > > > element to get the ->next pointer.
> > > >
> > > > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > > > seemed evident that this list traversal should use
> > > > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > > > pointer before the object is freed.
> > > >
> > > > Does this report seem to be valid?
> > > >
> > > > The kernel is 2.6.25-rc7.
> > >
> > > The missing hlist for loop would look something like so:
> > >
> > > #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> > > for (pos = (head)->first; \
> > > rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> > > ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> > > pos = n)
> >
> > Why the heck is cic_free_func() immediately doing a kmem_cache_free()
> > on the cfq_io_context structure??? Shouldn't we have a call_rcu() or a
> > synchronize_rcu() in there somewhere??? Given the way this is written,
> > wouldn't readers on other code paths get dumped onto the freelist?
> > This would not be a good thing...
>
> I was told, but didn't check for myself it is using SLAB_RCU. Of course
> that requires an object validation pass, and I didn't check it does that
> either.
But doesn't SLAB_RCU only wait for a grace period when returning a
slab to the system, rather than on each kmem_cache_free()?
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
Hi Paul,
On Wed, Apr 2, 2008 at 1:40 PM, Paul E. McKenney
wrote:
> I am still confused.
>
> o The kmem_cache has SLAB_DESTROY_BY_RCU.
>
> o This means that a given slab should not be returned to the
> system until a grace period elapses.
Yeah, that's what I thought too, that this is a SLUB bug but Peter
convinced me otherwise. SLUB keeps the _page_ around so the pointer
will be _valid_, although it might not be _your_ pointer so the caller
needs to do some validation step. Or at least that's how I understood
what Peter was saying.
Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02, 2008 at 09:17:10AM +0200, Jens Axboe wrote:
> On Tue, Apr 01 2008, Peter Zijlstra wrote:
> > On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > > Hi,
> > >
> > > This appeared in my logs:
> > >
> > > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> > >
> > > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > > EIP is at call_for_each_cic+0x2d/0x44
> > > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > DR6: ffff4ff0 DR7: 00000400
> > > [] kmemcheck_read+0xa8/0xe0
> > > [] kmemcheck_access+0x1a5/0x244
> > > [] do_page_fault+0x622/0x6fc
> > > [] error_code+0x72/0x78
> > > [] cfq_free_io_context+0xf/0x70
> > > [] put_io_context+0x4f/0x58
> > > [] exit_io_context+0x60/0x6c
> > > [] do_exit+0x4d9/0x6f0
> > > [] do_group_exit+0x29/0x88
> > > [] sys_exit_group+0xf/0x14
> > > [] sysenter_past_esp+0x6d/0xa4
> > > [] 0xffffffff
> > >
> > > The error occurs in cfq_free_io_context()'s call to
> > > call_for_each_cic() which looks like this:
> > >
> > > rcu_read_lock();
> > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > > func(ioc, cic);
> > > called++;
> > > }
> > > rcu_read_unlock();
> > >
> > > The function that is called is cic_free_func(). It is postulated that
> > > hlist_for_each_entry_rcu() will dereference the previously freed list
> > > element to get the ->next pointer.
> > >
> > > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > > seemed evident that this list traversal should use
> > > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > > pointer before the object is freed.
> > >
> > > Does this report seem to be valid?
> > >
> > > The kernel is 2.6.25-rc7.
> >
> > The missing hlist for loop would look something like so:
> >
> > #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> > for (pos = (head)->first; \
> > rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> > ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> > pos = n)
>
> Good catch, I wonder why it didn't complain in my testing. I've added a
> patch to fix that, please see it here:
>
> http://git.kernel.dk/?p=linux-2.6-bl...059044083ea151
I am still confused.
o The hlist_for_each_entry_safe_rcu() is under rcu_read_lock().
o The kmem_cache has SLAB_DESTROY_BY_RCU.
o This means that a given slab should not be returned to the
system until a grace period elapses.
o So the bugginess (or not) of this code should not be affected
by adding hlist_for_each_entry_safe_rcu() here.
(I am not seeing the checks that would be needed to avoid
something being kmem_cache_free()ed while being accessed,
but might be missing something.)
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2008-04-02 at 13:46 +0300, Pekka Enberg wrote:
> Hi Paul,
>
> On Wed, Apr 2, 2008 at 1:40 PM, Paul E. McKenney
> wrote:
> > I am still confused.
> >
> > o The kmem_cache has SLAB_DESTROY_BY_RCU.
> >
> > o This means that a given slab should not be returned to the
> > system until a grace period elapses.
>
> Yeah, that's what I thought too, that this is a SLUB bug but Peter
> convinced me otherwise. SLUB keeps the _page_ around so the pointer
> will be _valid_, although it might not be _your_ pointer so the caller
> needs to do some validation step. Or at least that's how I understood
> what Peter was saying.
Correct, that is always how i understood SLAB_DESTROY_BY_RCU to work.
Does SLAB (as opposed to SLUB) do it differently?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2 Apr 2008, Peter Zijlstra wrote:
> Correct, that is always how i understood SLAB_DESTROY_BY_RCU to work.
>
> Does SLAB (as opposed to SLUB) do it differently?
No, SLAB works this way as well.
Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02, 2008 at 09:28:46AM +0200, Ingo Molnar wrote:
>
> * Jens Axboe wrote:
>
> > On Wed, Apr 02 2008, Pekka J Enberg wrote:
> > > On Wed, 2 Apr 2008, Jens Axboe wrote:
> > > > Good catch, I wonder why it didn't complain in my testing. I've added a
> > > > patch to fix that, please see it here:
> > >
> > > You probably don't have kmemcheck in your kernel ;-)
> >
> > Ehm no, you are right 
>
> ... and you can get kmemcheck by testing on x86.git/latest:
>
> http://people.redhat.com/mingo/x86.git/README
>
> ;-)
I will check this when I get back to some bandwidth -- but in the meantime,
does kmemcheck special-case SLAB_DESTROY_BY_RCU? It is legal to access
newly-freed items in that case, as long as you did rcu_read_lock()
before gaining a reference to them and don't hold the reference past
the matching rcu_read_unlock().
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2008-04-02 at 03:40 -0700, Paul E. McKenney wrote:
> On Wed, Apr 02, 2008 at 09:17:10AM +0200, Jens Axboe wrote:
> > On Tue, Apr 01 2008, Peter Zijlstra wrote:
> > > On Tue, 2008-04-01 at 23:08 +0200, Vegard Nossum wrote:
> > > > Hi,
> > > >
> > > > This appeared in my logs:
> > > >
> > > > kmemcheck: Caught 32-bit read from freed memory (f7042348)
> > > >
> > > > Pid: 1374, comm: bash Not tainted (2.6.25-rc7 #92)
> > > > EIP: 0060:[] EFLAGS: 00210202 CPU: 0
> > > > EIP is at call_for_each_cic+0x2d/0x44
> > > > EAX: 00200286 EBX: 00000001 ECX: c200e908 EDX: f7042348
> > > > ESI: f6c26c60 EDI: c0503310 EBP: f70fff38 ESP: c082ec88
> > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > > > CR0: 8005003b CR2: f7826904 CR3: 36cd7000 CR4: 000006c0
> > > > DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> > > > DR6: ffff4ff0 DR7: 00000400
> > > > [] kmemcheck_read+0xa8/0xe0
> > > > [] kmemcheck_access+0x1a5/0x244
> > > > [] do_page_fault+0x622/0x6fc
> > > > [] error_code+0x72/0x78
> > > > [] cfq_free_io_context+0xf/0x70
> > > > [] put_io_context+0x4f/0x58
> > > > [] exit_io_context+0x60/0x6c
> > > > [] do_exit+0x4d9/0x6f0
> > > > [] do_group_exit+0x29/0x88
> > > > [] sys_exit_group+0xf/0x14
> > > > [] sysenter_past_esp+0x6d/0xa4
> > > > [] 0xffffffff
> > > >
> > > > The error occurs in cfq_free_io_context()'s call to
> > > > call_for_each_cic() which looks like this:
> > > >
> > > > rcu_read_lock();
> > > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) {
> > > > func(ioc, cic);
> > > > called++;
> > > > }
> > > > rcu_read_unlock();
> > > >
> > > > The function that is called is cic_free_func(). It is postulated that
> > > > hlist_for_each_entry_rcu() will dereference the previously freed list
> > > > element to get the ->next pointer.
> > > >
> > > > After a short discussion with Pekka Enberg and Peter Zijlstra, it
> > > > seemed evident that this list traversal should use
> > > > hlist_for_each_entry_safe_rcu() instead, which would buffer the next
> > > > pointer before the object is freed.
> > > >
> > > > Does this report seem to be valid?
> > > >
> > > > The kernel is 2.6.25-rc7.
> > >
> > > The missing hlist for loop would look something like so:
> > >
> > > #define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \
> > > for (pos = (head)->first; \
> > > rcu_dereference(pos) && ({ n = pos->next; 1; }) && \
> > > ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \
> > > pos = n)
> >
> > Good catch, I wonder why it didn't complain in my testing. I've added a
> > patch to fix that, please see it here:
> >
> > http://git.kernel.dk/?p=linux-2.6-bl...059044083ea151
>
> I am still confused.
>
> o The hlist_for_each_entry_safe_rcu() is under rcu_read_lock().
>
> o The kmem_cache has SLAB_DESTROY_BY_RCU.
>
> o This means that a given slab should not be returned to the
> system until a grace period elapses.
>
> o So the bugginess (or not) of this code should not be affected
> by adding hlist_for_each_entry_safe_rcu() here.
>
> (I am not seeing the checks that would be needed to avoid
> something being kmem_cache_free()ed while being accessed,
> but might be missing something.)
Agreed, when looking at this code its not making sense.
cfq_cic_lookup() is also mightily confusing. Only the actual
radix_tree_lookup() call is protected by RCU, I'm not seeing what
guarantees the existance of cic after rcu_read_unlock().
Nor does it do a validation check to see if cic->key == cfqd, something
that would be needed when using SLAB_DESTROY_BY_RCU.
This is most fishy code.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2008-04-02 at 03:55 -0700, Paul E. McKenney wrote:
> On Wed, Apr 02, 2008 at 09:28:46AM +0200, Ingo Molnar wrote:
> >
> > * Jens Axboe wrote:
> >
> > > On Wed, Apr 02 2008, Pekka J Enberg wrote:
> > > > On Wed, 2 Apr 2008, Jens Axboe wrote:
> > > > > Good catch, I wonder why it didn't complain in my testing. I've added a
> > > > > patch to fix that, please see it here:
> > > >
> > > > You probably don't have kmemcheck in your kernel ;-)
> > >
> > > Ehm no, you are right 
> >
> > ... and you can get kmemcheck by testing on x86.git/latest:
> >
> > http://people.redhat.com/mingo/x86.git/README
> >
> > ;-)
>
> I will check this when I get back to some bandwidth -- but in the meantime,
> does kmemcheck special-case SLAB_DESTROY_BY_RCU? It is legal to access
> newly-freed items in that case, as long as you did rcu_read_lock()
> before gaining a reference to them and don't hold the reference past
> the matching rcu_read_unlock().
I don't think it does.
It would have to register an call_rcu callback itself in order to mark
it freed - and handle the race with the object being handed out again.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, 2008-04-02 at 13:07 +0200, Jens Axboe wrote:
> On Wed, Apr 02 2008, Pekka Enberg wrote:
> > Hi Paul,
> >
> > On Wed, Apr 2, 2008 at 1:55 PM, Paul E. McKenney
> > wrote:
> > > I will check this when I get back to some bandwidth -- but in the meantime,
> > > does kmemcheck special-case SLAB_DESTROY_BY_RCU? It is legal to access
> > > newly-freed items in that case, as long as you did rcu_read_lock()
> > > before gaining a reference to them and don't hold the reference past
> > > the matching rcu_read_unlock().
> >
> > No, kmemcheck is work in progress and does not know about
> > SLAB_DESTROY_BY_RCU yet. The reason I asked Vegard to post the warning
> > was because Peter, Vegard, and myself identified this particular
> > warning as a real problem. But yeah, kmemcheck can cause false
> > positives for RCU for now.
>
> Makes sense, and to me Pauls analysis of the code looks totally correct
> - there's no bug there, at least related to hlist traversal and
> kmem_cache_free(), since we are under rcu_read_lock() and thus hold off
> the grace for freeing.
but what holds off the slab allocator re-issueing that same object and
someone else writing other stuff into it?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
-
Re: kmemcheck caught read from freed memory (cfq_free_io_context)
On Wed, Apr 02 2008, Pekka Enberg wrote:
> Hi Paul,
>
> On Wed, Apr 2, 2008 at 1:55 PM, Paul E. McKenney
> wrote:
> > I will check this when I get back to some bandwidth -- but in the meantime,
> > does kmemcheck special-case SLAB_DESTROY_BY_RCU? It is legal to access
> > newly-freed items in that case, as long as you did rcu_read_lock()
> > before gaining a reference to them and don't hold the reference past
> > the matching rcu_read_unlock().
>
> No, kmemcheck is work in progress and does not know about
> SLAB_DESTROY_BY_RCU yet. The reason I asked Vegard to post the warning
> was because Peter, Vegard, and myself identified this particular
> warning as a real problem. But yeah, kmemcheck can cause false
> positives for RCU for now.
Makes sense, and to me Pauls analysis of the code looks totally correct
- there's no bug there, at least related to hlist traversal and
kmem_cache_free(), since we are under rcu_read_lock() and thus hold off
the grace for freeing.
--
Jens Axboe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/