
Re: [Libevent-users] Re: libevent2 stuck in Linux, CPU 100%



I'll be able to check it in about 12 hours. Thanks!

On Jul 19, 2013 8:28 AM, "Nick Mathewson" <nickm@xxxxxxxxxxxxx> wrote:
On Fri, Jul 19, 2013 at 9:31 AM, Oleg Moskalenko <mom040267@xxxxxxxxx> wrote:
> Thank you Azat for the suggestion. It seems to me that UDP sockets are
> offenders, somehow it happens only in Linux (I know Linux has some weird UDP
> behavior):
>
> Process 20828 attached with 5 threads - interrupt to quit
> [pid 20831] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
> [pid 20832] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
> [pid 20831] <... clock_gettime resumed> {205614, 271115090}) = 0
> [pid 20831] gettimeofday( <unfinished ...>
> [pid 20832] <... clock_gettime resumed> {205614, 271926086}) = 0
> [pid 20831] <... gettimeofday resumed> {1374240484, 377784}, NULL) = 0
> [pid 20832] gettimeofday( <unfinished ...>
> [pid 20831] epoll_wait(20,  <unfinished ...>
> [pid 20829] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
> [pid 20830] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
> [pid 20832] <... gettimeofday resumed> {1374240484, 378418}, NULL) = 0
> [pid 20832] epoll_wait(16,  <unfinished ...>
> [pid 20830] <... clock_gettime resumed> {205614, 273231001}) = 0
> [pid 20829] <... clock_gettime resumed> {205614, 272801617}) = 0
> [pid 20829] gettimeofday( <unfinished ...>
> [pid 20830] gettimeofday( <unfinished ...>
> [pid 20829] <... gettimeofday resumed> {1374240484, 379094}, NULL) = 0
> [pid 20829] epoll_wait(28,  <unfinished ...>
> [pid 20830] <... gettimeofday resumed> {1374240484, 379317}, NULL) = 0
> [pid 20830] epoll_wait(24,  <unfinished ...>
> [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1
> EAGAIN (Resource temporarily unavailable)
> [pid 20828] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 19) = 1
> [pid 20828] clock_gettime(CLOCK_MONOTONIC, {205614, 277088474}) = 0
> [pid 20828] gettimeofday({1374240484, 386338}, NULL) = 0
> [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1
> EAGAIN (Resource temporarily unavailable)
> [pid 20828] epoll_wait(4, {{EPOLLERR, {u32=8, u64=8}}}, 32, 12) = 1
> [pid 20828] clock_gettime(CLOCK_MONOTONIC, {205614, 286419826}) = 0
> [pid 20828] gettimeofday({1374240484, 392232}, NULL) = 0
> [pid 20828] recvfrom(8, 0x7fff61df20c0, 4, 2, 0xa9bc20, 0x7fff61df20bc) = -1

Hm. So, epoll_wait is reporting EPOLLERR on fd 8.  The Libevent
epoll.c code treats EPOLLERR as (EV_READ|EV_WRITE).  But when you
recvfrom on the socket, it only says EAGAIN.

So your program sensibly decides to keep listening for events on fd 8,
and epoll keeps telling you that there was an error.

Assuming that this recvfrom is in your code, I'll echo Vsevolod's
question: what happens when you call getsockopt(...SO_ERROR...)  on
the socket in the event handler that calls the recvfrom, to see what
the queued error is?
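
For reference, here is a minimal sketch of that check in C. The helper
names fetch_socket_error() and report_socket_error() are hypothetical
and are not part of Libevent or of the code under discussion; the sketch
assumes fd is the socket descriptor handed to the read callback.
Reading SO_ERROR returns the pending error and clears it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: read (and thereby clear) the error queued on a socket.
 * Returns the pending errno value, 0 if nothing is queued,
 * or -1 if getsockopt() itself fails (errno is then set by getsockopt). */
int fetch_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Hypothetical helper: call this from the handler that sees EAGAIN from
 * recvfrom() to log whatever error the kernel has queued on the socket. */
void report_socket_error(int fd)
{
    int err = fetch_socket_error(fd);

    if (err < 0)
        perror("getsockopt(SO_ERROR)");
    else if (err != 0)
        fprintf(stderr, "fd %d: queued error %d (%s)\n", fd, err, strerror(err));
}
```

Calling report_socket_error(fd) right after the recvfrom() that returns
EAGAIN should show which error, if any, is behind the repeated EPOLLERR.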

--
Nick