[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [Libevent-users] workers for evhttp (was stable, now not!)



Ok, I made some progress, with help from Google's ThreadSanitizer! http://code.google.com/p/data-race-test/wiki/ThreadSanitizer

My libevent's http.c now allows me to hand off an evhttp request to a worker, which can call evhttp_send_reply() safely (and nothing else after that). Quite a bit closer to stable with a threadpool handling some of the connections. The big advantage of this for me is: I can parse the http request in the network thread, and then decide to hand "expensive" operations to a worker, and handle event-driven stuff in the network thread.

A few notes so far:
1. I added BEV_OPT_THREADSAFE in evhttp_connection_base_new (evcon->bufev = bufferevent_socket_new(NULL, -1, BEV_OPT_THREADSAFE); + callbacks). This means switching callbacks, etc., are threadsafe now.

2. I refcount bufev while the thread is writing. In particular, evhttp_send_reply does incref while it's running (decref after, but carefully because request may have been deleted). This prevents the network thread from deleting the bufferevent in the middle of the send.

Third patch I'm not sure about, but maybe someone can provide insight if it matters or not:
3. Move bufferevent_enable(evcon->bufev, EV_WRITE); in all cases (http.c) to the end of each function, setting up callbacks first. This ensures the network thread uses the correct callbacks, so it doesn't race. (Each call does a BEV_LOCK separately, so the order does matter.) There are a few of these I've missed below.

My #2 patch needs to be generalized (for other code paths: chunked, etc.) perhaps evhttp_write_buffer should do the refcount instead? But appears to be the important piece for this.

Here's the patch as-is (needs quite a lot of cleanup):

372,373d371
<     bufferevent_enable(evcon->bufev, EV_WRITE);
<
381a380,382
>
>     // mbh: moved after setcb, so we don't race with the network thread:
>     bufferevent_enable(evcon->bufev, EV_WRITE);
394d394
<     bufferevent_enable(evcon->bufev, EV_WRITE);
404a405,407
>
>     // mbh: moved after, to avoid races:
>     bufferevent_enable(evcon->bufev, EV_WRITE);
997a1001
>
2039a2044
> #if 0
2046a2052,2064
> #else
>     // threadsafe, using newer API
>     evcon->bufev = bufferevent_socket_new(NULL, -1, BEV_OPT_THREADSAFE);
>
>     if (evcon->bufev == NULL) {
>         printf("bufferevent_socket_net failed\n");
>         event_warn("%s: bufferevent_new failed", __func__);
>         goto error;
>     }
>
>     bufferevent_setcb(evcon->bufev, evhttp_read_cb, evhttp_write_cb, evhttp_error_cb, evcon);
>
> #endif
2145,2146c2163,2164
<     /* make sure that we get a write callback */
<     bufferevent_enable(evcon->bufev, EV_WRITE);
---
>
>     // mbh: bufferevent_enable was here
2161a2180,2183
>     // mbh: moved to avoid races
>     /* make sure that we get a write callback */
>     bufferevent_enable(evcon->bufev, EV_WRITE);
>
2370c2392,2400
<     evhttp_send(req, databuf);
---
>     if (req->evcon && req->evcon->bufev) {
>         struct bufferevent *bev = req->evcon->bufev;
>
>         bufferevent_incref(bev);
>         evhttp_send(req, databuf);
>
>         // note: req may be freed, but our incref is still around:
>         bufferevent_decref(bev);
>     }


Nick Mathewson wrote:
On Fri, May 20, 2011 at 11:26 PM, Michael Herf <herf@xxxxxxxxx> wrote:
 [...]
  
Questions: is bufferevent supposed threadsafe for this case? I suppose I
haven't verified if my technique of handing evhttp_request over the thread
boundary is supposed to work or not...it has just been working for quite a
long time under earlier versions of libevent.
    

Bufferevents are supposed to be threadsafe if you construct them with
the BEV_OPT_THREADSAFE option, but evhttp isn't threadsafe in 2.0.
(Specifically, you can use separate evhttp bases and connections in
different threads, but you can't be accessing the same one in more
than one thread at once.)  I would love a threadsafe evhttp in 2.1 if
somebody wants to work on that.

No idea why this would happen more in 2.0.10-stable than in 2.0.8-rc;
There were some bugs fixed in 2.0.9-rc and 2.0.10-stable that might
make the main thread more aggressive about running events as needed,
or about scheduling events more quickly, thus making races more
frequent in practice, but I'm just guessing there.