Re: huge pages, was where are the exit nodes gone?



Hi Arjan,
     On Wed, 14 Apr 2010 22:03:33 +0200 Arjan
>Scott Bennett wrote:
>>      On Tue, 13 Apr 2010 19:10:37 +0200 Arjan
>> <n6bc23cpcduw@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Scott Bennett wrote:
>>>>      BTW, I know that there are *lots* of tor relays running on LINUX
>>>> systems whose operators are subscribed to this list.  Don't leave Olaf and
>>>> me here swinging in the breeze.  Please jump in with your LINUX expertise
>>>> and straighten us out.
>>> I'm not an expert, but I managed to perform some google searches.
>>>
>>> http://libhugetlbfs.ozlabs.org/
>>> From that website:
>>> libhugetlbfs is a library which provides easy access to huge pages of
>>> memory. It is a wrapper for the hugetlbfs file system. Applications can
>>> use huge pages to fulfill malloc() requests without being recompiled by
>>> using LD_PRELOAD.
>> 
>>      [Aside to Olaf:  oh.  So forcing the use of OpenBSD's malloc() might
>> prevent the libhugetlbfs stuff from ever knowing that it was supposed to
>> do something. :-(  I wonder how hard it would be to fix the malloc() in
>> libhugetlbfs, which is most likely derived from the buggy LINUX version.
>> Does libhugetlbfs come as source code?  Or is the use of LD_PRELOAD simply
>> causing LINUX's libc to appear ahead of the OpenBSD version, in which case
>> forcing reordering of the libraries might work?  --SB]
>
>If Olaf's test shows that CPU usage is reduced and throughput stays the
>same or improves, modifying Tor to support linux huge pages might be an
>option. Part 2 of this article contains some information about the
>available interfaces:
>	http://lwn.net/Articles/374424/

     Thanks.  I'll take a look at it, but I still haven't had the nap
I was going to take. :-(

>Getting the wrapper to work with (or like) the OpenBSD version will
>probably be easier.
>
     One of the reasons I'm still awake is that I was browsing through
the OpenBSD version of malloc() that is shipped with tor and libhugetlbfs's
morecore.c module.  I'm still not sure quite what is going on with how
the stuff gets linked together, so I don't know which avenue might be
the easiest approach, but modifying tor is probably the worst option.
If the LINUX side of things gets fixed, then the patches ought to be
contributed to the LINUX effort.  However, it may be easier to modify
the OpenBSD malloc() to call something in morecore.c to get memory
allocated by the kernel, falling back to whatever it currently does if
the morecore.c stuff returns an error because it can't allocate the
hugepages necessary to satisfy a request.  Of course, someone would still
need to find out how to keep the LINUX malloc() from being substituted
for the OpenBSD malloc() at runtime when the libhugetlbfs wrapper is
in use.  I doubt I can contribute much to the effort, given that I don't
have a LINUX system available to me.
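
     The try-hugepages-then-fall-back idea described above can be sketched
roughly as follows.  This is only an illustration, not the code in
libhugetlbfs's morecore.c or in the OpenBSD malloc():  it assumes a LINUX
kernel new enough to offer the MAP_HUGETLB flag to mmap(), it assumes a
2 MB huge page size, and the names alloc_pages(), round_up(), and
HUGE_PAGE_SIZE are made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MB huge pages, which is common on x86-64 but not universal. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Hypothetical helper: round len up to a multiple of align (a power of two). */
static size_t round_up(size_t len, size_t align)
{
    return (len + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: ask the kernel for huge-page-backed memory first and
 * fall back to ordinary pages if that request fails (for example, because
 * no huge pages are configured).  On success, *used_huge is set to 1 or 0
 * and *mapped_len to the length actually mapped.  Returns NULL on failure.
 */
static void *alloc_pages(size_t len, int *used_huge, size_t *mapped_len)
{
    void *p;

    assert(len > 0);

#ifdef MAP_HUGETLB
    /* Huge page mappings must be a multiple of the huge page size. */
    size_t hlen = round_up(len, HUGE_PAGE_SIZE);

    p = mmap(NULL, hlen, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        *mapped_len = hlen;
        return p;
    }
#endif

    /* Fallback: an ordinary anonymous mapping with base pages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *used_huge = 0;
    *mapped_len = len;
    return p;
}
```

An allocator would call something like alloc_pages() wherever it now asks
the kernel for more memory, and would pass the returned mapped length to
munmap() when it gives the region back.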
>
>>> Someone is working on transparent hugepage support:
>>> http://thread.gmane.org/gmane.linux.kernel.mm/40182
>> 
>>      I've now had time to get through that entire thread.  I found it
>> kind of frustrating reading at times.  It seemed to me that in one of
>> the very first few messages, the author described how they had long
>> since shot themselves in the feet (i.e., by rejecting the algorithm of
>> Navarro et al. (2002), which had already been demonstrated to work on an early
>> FreeBSD 5 system modified by Navarro's team) on emotional grounds (i.e.,
>> "we didn't feel its [Navarro's method's] heuristics were right").
><snipped the remainder of the analysis>
>
>Thanks for your analysis of the thread and the reference to the Navarro
>paper.
>
>I've located the paper and will read it when time permits:
>http://www.usenix.org/events/osdi02/tech/full_papers/navarro/
>
     Oh.  Sorry about that.  I had intended to include that at the end
of what I wrote, but apparently I spaced it.  I didn't mean to make anyone
have to search for it.  Thanks for correcting the deficiency in my message.
     I think you'll find their design is quite elegant and well thought out.
It apparently required adding fewer than 3600 lines of code to the kernel
to do it and uses a trivial amount of kernel CPU time in action.  It's
quite transparent and adaptive to conditions, but there are probably some
conditions under which it might give less benefit than the LINUX hugepages
approach.  However, it continually tries to promote a process's memory to
the next larger page size whenever the process allocates enough space in
a segment to fill a page of that size.  Its
reservation system greatly increases the chances that promotions will
occur.  It's not a perfect solution to the problem, but I suspect there
aren't any perfect solutions for it on the software side of things.  What
is really needed is for the chip manufacturers to correct the matter by
increasing their TLB sizes rather dramatically.
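
     The reservation-and-promotion idea described above can be pictured
with a toy model.  This is not code from Navarro et al. or from FreeBSD;
it is a sketch that assumes 4 KB base pages, a single 2 MB superpage size,
and promotion only once every base page in a reservation has been touched.
The struct and function names are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 4 KB base pages and one 2 MB superpage size (512 base pages). */
#define BASE_PAGES_PER_SUPERPAGE 512

/* Hypothetical bookkeeping for one superpage-sized, aligned reservation. */
struct reservation {
    unsigned char populated[BASE_PAGES_PER_SUPERPAGE]; /* 1 = page touched */
    size_t npopulated;                                 /* count of 1s above */
    bool promoted;                                     /* mapped as superpage */
};

/* Start with an empty reservation: frames set aside, nothing mapped yet. */
static void reservation_init(struct reservation *r)
{
    memset(r, 0, sizeof(*r));
}

/*
 * Record a page fault on base page idx inside the reservation.
 * Returns true only for the fault that completes the reservation and
 * therefore triggers promotion to a single superpage mapping.
 */
static bool reservation_fault(struct reservation *r, size_t idx)
{
    assert(idx < BASE_PAGES_PER_SUPERPAGE);

    if (!r->populated[idx]) {
        r->populated[idx] = 1;
        r->npopulated++;
    }
    if (!r->promoted && r->npopulated == BASE_PAGES_PER_SUPERPAGE) {
        r->promoted = true;
        return true;
    }
    return false;
}
```

Because the contiguous frames are set aside at the first fault, the later
faults always land in the same aligned region, which is why a reservation
makes eventual promotion much more likely than allocating frames one at a
time wherever they happen to be free.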


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************