[Author Prev][Author Next][Thread Prev][Thread Next][Author Index][Thread Index]

Re: [Libevent-users] Help understanding the evbuffer and its chains



On Sat, Jun 9, 2012 at 5:11 AM, Filipe Miguel Campos Santos
<fc009@xxxxxxxxxxxxxxxx> wrote:
> Hey Everyone,
>
> i did use Libevent in my master thesis to programm a multithreaded
> websocket server and while i know *how to use* Libevent now, i dont really
> know how Libevent works internally. Everything i did know was that it
> relies on multiplexing-IO-mechanisms like epoll.
>
> So in order to be able to explain how libevent works internally i decided
> to look at the source and was able to understand a lot. But now i am
> having problems figuring out why the evbuffer is as it actually is.
>
> Why is the evbuffer represented as some type of linked list with
> chain-Nodes containing the actually buffers? What was the idea behind
> using a linked list of buffers instead just using on single Buffer?
>
> And what is the purpose of the misalign-space in the Buffers?

Hello, Felipe!

When you're buffering network data, a linked list of chunks is a
pretty clasic interface.  BSD has a classic "mbuf" design that does
this, and Linux uses "skbuf"s (if I recall correctly).

To see why you might want to do this, consider the alternatives.  The
most obvious one is to store all the data in one bug chunk of RAM.
But that's got some problems.  First of all, you'll need to use
realloc() to grow and shrink the one-big-chunk, and that takes O(N)
time to copy the old data.  (Also, realloc() doesn't actually do a
good job of shrinking on some platforms.)

Second, in order to avoid bad performance from too many reallocs,
you'll need to follow an amortized O(1) approach where whenever you
grow the buffer, you double its size rather than just making it a few
bytes longer (or something like that).  But that means that on
average, something like 1/4 of all your buffer space will be empty.

Third, you'll have to decide whether the data in your buffer is going
to be kept in one sequential, contiguous chunk, or whether it can wrap
around the end.  Either way, there will be trouble: if you insist that
the data be contiguous, you'll frequently need to memmove() the data
back to the beginning of the buffer to avoid running off the end.  If
it can wrap around the end, then your realloc() operation turns into a
realloc-and-memcpy, which is a little complicated.

So instead it's usually better to use a linked list of chunks.  Malloc
implementations like this: they're much better at allocating a large
number of similarly sized medium objects than they are at allocating
big objects and realloc()ing them a log.  You basically never need to
realloc() anything: you can just use malloc() and free(), which are
comparatively fast.  While the first and last chunk of any given
buffer may be partially empty, all other chunks are typically full,
which means that you're much more space-efficient.  Finally, using
chunks like this lets you do things like having some chunks with
different behavior or semantics, like Libevent's mmap()d chunks or
reference chunks.

The "misalign" of a chunk describes how much empty space there is at
the beginning.  We don't put empty space at the beginning of a chunk
on purpose{*}; rather it occurs when we have drained from the front of
the buffer, and that data isn't there anymore.


{*} But kernel mbuf/skbuf implementations _will_ often put empty space
at the beginning of a packet, so that they can fill it in later with
headers without having to move the data.


hope this helps,
-- 
Nick
***********************************************************************
To unsubscribe, send an e-mail to majordomo@xxxxxxxxxxxxx with
unsubscribe libevent-users    in the body.