
Re: [f-cpu] TLB resume

To: f-cpu@seul.org
Subject: Re: [f-cpu] TLB resume
From: nico <nicolas.boulay@ifrance.com>
Date: Tue, 06 Aug 2002 22:27:31 +0200
Delivered-To: archiver@seul.org
Delivered-To: f-cpu-outgoing@seul.org
Delivered-To: f-cpu@seul.org
Delivery-Date: Tue, 06 Aug 2002 16:06:16 -0400
References: <200208070229.09431.cedric.bail@free.fr>
Reply-To: f-cpu@seul.org
Sender: owner-f-cpu@seul.org

I'll try to explain for everyone.

The TLB is a small buffer used to translate virtual addresses to
physical memory addresses. This is used to cleanly separate the address
spaces of processes. It's possible to have 2 different TLBs: one for
data and one for code.

This piece of hardware is critical because it sits on the critical data
path whenever we access memory. Sometimes the L1 cache caches virtual
addresses to avoid accessing the TLB too much.

The TLB is critical on task switch, too. On x86, it must be almost
entirely flushed before being usable for the next task.

The TLB is in fact a kind of cache. It mainly uses CAM
(content-addressable memory, so the more entries, the slower): it's a
memory where you send a data word and it gives you back an address
(which can be used to access another, normal memory bank; it's a
key-to-content association). Its critical parameters are its size (the
number of entries) and the size of the pages used.
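
A minimal sketch of that key-to-content association, assuming fixed
4 KB pages. The class and method names (ToyTLB, cam_match) are made up
for illustration and are not part of any F-CPU design. The loop stands
in for the parallel comparison a real CAM does in hardware.

```python
PAGE_BITS = 12  # assumption: fixed 4 KB pages


class ToyTLB:
    def __init__(self):
        self.cam_tags = []  # virtual page numbers (the keys held in the CAM)
        self.ram_data = []  # physical frame numbers (a normal memory bank)

    def insert(self, vpn, pfn):
        self.cam_tags.append(vpn)
        self.ram_data.append(pfn)

    def cam_match(self, vpn):
        # Hardware compares all tags at once; software has to loop.
        for index, tag in enumerate(self.cam_tags):
            if tag == vpn:
                return index
        return None

    def translate(self, vaddr):
        index = self.cam_match(vaddr >> PAGE_BITS)
        if index is None:
            return None  # TLB miss: a miss handler would refill the entry
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        return (self.ram_data[index] << PAGE_BITS) | offset


tlb = ToyTLB()
tlb.insert(vpn=0x12345, pfn=0x42)
hit = tlb.translate(0x12345ABC)   # 0x42ABC
miss = tlb.translate(0x99999000)  # None
```

The page offset (the low 12 bits here) passes through untranslated;
only the page number goes through the CAM.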

x86 has a very weak protection model, using only 2 bits; that's why
buffer overflows are so easy to exploit in the x86 world.

cedric wrote:
> 
> Here is an extract from the TLB discussion that happened one month ago.
> 
> Michael proposes this:
> "
> I've been thinking about the TLB before; IMHO, we need at least the
> following (assuming <n> bits for the page offset):
> 
>         - virtual address (64-<n> bits)

The input.

>         - physical address (64-<n> bits)

The main output.

>         - address space identifier (ASI; 8 bits was suggested)

Could be an output, and then we check it against an SR. But if it's an
input, we don't need to flush the TLB during a task switch!

>         - user access rights (RWX, 3 bits)

Output: we check what the access is doing and raise the appropriate
trap if needed.

>         - supervisor access rights (RWX, 3 bits)

I don't really understand the use of this one.

When a process makes a system call, it uses a specific call that changes
the mode (user -> superuser). The superuser (the kernel) can access
anything. But the strangest thing is that the superuser is the one who
can change the TLB content...


>         - valid bit (indicating that the entry is valid)
>         - dirty bit (indicating that the page has been written to)
>         - used bit (indicating that the page has been accessed)

That's for the TLB miss handler, to speed up some choices and avoid
duplicate write-backs and so on...

> 
>         - page size (4K << size, 6 bits)

Page size is critical. How much memory can be accessed without a TLB
miss? In the x86 world, pages are 4 KB large. The TLB has between 32
and 128 entries: at most 512 KB addressed. That's not much!

Intel introduced big pages (4 MB), for the framebuffer for example.
Linux uses them for the kernel code (so there is no TLB miss inside the
kernel code).
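
The arithmetic behind those figures, as a small sketch. The helper name
tlb_reach is made up here; "reach" just means entries times page size.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_size):
    """Memory that can be addressed without a TLB miss, in bytes."""
    return entries * page_size


small_min = tlb_reach(32, 4 * KB)   # 128 KB with 32 entries of 4 KB
small_max = tlb_reach(128, 4 * KB)  # 512 KB with 128 entries of 4 KB
big_max = tlb_reach(128, 4 * MB)    # 512 MB with 128 entries of 4 MB

print(small_min // KB, "KB,", small_max // KB, "KB,", big_max // MB, "MB")
```

With the same 128 entries, going from 4 KB to 4 MB pages multiplies the
reach by 1024.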

If the pages are bigger you can address more memory with the same
number of entries. But it can become hard for the paging system to find
holes aligned on a power-of-2 address boundary. A process needs at
least 3 pages (code, data, stack).
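
Coming back to the ASI and access-right fields discussed above, here is
a rough sketch of a lookup where the ASI is an input to the match, so
two tasks can keep entries for the same virtual page without any flush.
The bit encoding of RWX, the Entry layout and the AccessTrap exception
are assumptions for illustration only, with fixed 4 KB pages.

```python
from dataclasses import dataclass

R, W, X = 4, 2, 1  # hypothetical encoding of the 3-bit RWX fields
PAGE_BITS = 12     # assumption: fixed 4 KB pages to keep the sketch short


@dataclass
class Entry:
    vpn: int
    pfn: int
    asi: int
    user_rights: int
    super_rights: int
    valid: bool = True


class AccessTrap(Exception):
    """Stands in for the trap raised on a rights violation."""


def lookup(entries, current_asi, vaddr, access, supervisor=False):
    vpn = vaddr >> PAGE_BITS
    for e in entries:
        # The ASI takes part in the match, so entries belonging to other
        # tasks never hit and nothing has to be flushed on a task switch.
        if e.valid and e.vpn == vpn and e.asi == current_asi:
            rights = e.super_rights if supervisor else e.user_rights
            if rights & access != access:
                raise AccessTrap(hex(vaddr))
            return (e.pfn << PAGE_BITS) | (vaddr & ((1 << PAGE_BITS) - 1))
    return None  # TLB miss


entries = [
    Entry(vpn=0x10, pfn=0x80, asi=1, user_rights=R | X, super_rights=R | W | X),
    Entry(vpn=0x10, pfn=0x90, asi=2, user_rights=R | W, super_rights=R | W),
]
task1 = lookup(entries, current_asi=1, vaddr=0x10004, access=R)  # 0x80004
task2 = lookup(entries, current_asi=2, vaddr=0x10004, access=R)  # 0x90004
```

In this sketch a user-mode write by task 1 to that page raises
AccessTrap, while the same write in supervisor mode goes through, which
is one possible use of separate supervisor rights.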

Here is a top extract:

10402 nico       9   0 23516  22M 12628 S     0,0 24,8   0:00 mozilla-bin
10402 nico       9   0 23516  22M 12628 S     0,0 24,8   0:00 mozilla-bin
10402 nico       9   0 23516  22M 12628 S     0,0 24,8   0:00 mozilla-bin
10403 nico       9   0 23516  22M 12628 S     0,0 24,8   0:00 mozilla-bin
10409 nico       9   0 20920  20M  9620 S     0,0 21,9   0:24 netscape-commun
 1439 root       9   0 59432 7608  1216 S     0,0  8,0   2:16 X
10427 nico       8   0  3832 3752  3272 S     0,0  3,9   0:00 netscape-commun
 1351 xfs        9   0  5080 3440   696 S     0,0  3,6   0:13 xfs
 1656 nico       9   0  1812 1480  1016 S     0,0  1,5   0:15 wmaker
 1739 nico       9   0  1476 1280  1080 S     0,0  1,3   0:00 bash
10440 nico      12   0  1056 1056   844 R     0,7  1,1   0:00 top
 1730 nico       9   0  1108  792   628 S     0,0  0,8   0:00 bash
 1731 nico       9   0   940  448   424 S     0,0  0,4   0:00 bash
 8913 root       8   0   760  428   236 S     0,0  0,4   0:00 bash
 1723 nico       9   0   492  412   400 S     0,0  0,4   0:00 aterm
10249 root       8   0   476  412   320 S     0,0  0,4   0:00 pppd
 1724 nico       9   0   496  396   372 R     0,1  0,4   0:00 aterm
 1725 nico       9   0   468  364   364 S     0,0  0,3   0:00 aterm
 1726 nico       9   0   384  288   288 S     0,0  0,3   0:00 aterm
 1727 nico       9   0   412  184   184 S     0,0  0,1   0:00 wmCalClock-Wind
  878 root       9   0   236  180   180 S     0,0  0,1   0:00 syslogd
  886 root       9   0   828  172   172 S     0,0  0,1   0:00 klogd

22 MB for mozilla, 60 MB for X (but only 7.5 MB is currently resident;
almost 90% of its size is swapped out), a few utilities use ~1 MB and a
lot of them use around 512 KB.

So around 1 MB for most of them and many MB for 2 or 3 applications.

The Hurd guys advise having 4 sizes: tiny for message passing (4 KB?
whatever the size, a message is a message and most of them are small,
so each will use one page), medium for code (64 KB was the first idea),
big for the framebuffer and a fat kernel (4-8 MB, 16 MB) and very big
for memory-hungry applications (databases, scientific tools, ...)
(256 MB!).

So 4 sizes! In the IA-64 arch, Intel put 11 sizes!

My current problem is how to manage different page sizes inside the
same CAM. Maybe you could manipulate the bit mask, but I don't see how,
or split the address in 2 or 4.
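
One possible reading of the bit-mask idea, as a sketch only: each entry
carries its own mask derived from the "4K << size" field, and the
comparison ignores the bits that fall inside the page. The names
make_entry and translate and the dict layout are assumptions, and a
real CAM would need maskable compare cells to do this in hardware.

```python
BASE_PAGE_BITS = 12          # 4 KB base page, as in "4K << size"
ADDR_MASK = (1 << 64) - 1    # 64-bit addresses


def make_entry(vaddr, paddr, size):
    offset_mask = (1 << (BASE_PAGE_BITS + size)) - 1
    tag_mask = ADDR_MASK & ~offset_mask
    # Tag and frame are stored already masked, so both are page aligned.
    return {"tag": vaddr & tag_mask, "frame": paddr & tag_mask, "mask": tag_mask}


def translate(entries, vaddr):
    for e in entries:
        # Only the bits above the page offset take part in the compare.
        if (vaddr & e["mask"]) == e["tag"]:
            offset = vaddr & ~e["mask"] & ADDR_MASK
            return e["frame"] | offset
    return None  # TLB miss


entries = [
    make_entry(0x0000_4000, 0x0009_3000, size=0),   # one 4 KB page
    make_entry(0x4000_0000, 0x0040_0000, size=10),  # one 4 MB page (4K << 10)
]
small = translate(entries, 0x4ABC)       # 0x93ABC
big = translate(entries, 0x4012_3456)    # 0x523456
```

The cost is one mask per entry and a wider comparator, but all page
sizes then share a single table.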

Using as many tables as there are sizes doesn't seem too realistic.

Or we can use a faster cache. Instead of full associativity we can use
a 4-way cache. It's smaller and faster, but it adds dependencies
between the addresses that can be put inside the TLB.
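
A small sketch of that dependency, assuming 16 sets of 4 ways and an
oldest-first replacement (both arbitrary choices for illustration): the
low bits of the virtual page number select the set, so pages that share
those bits compete for the same 4 slots even if the rest of the TLB is
empty.

```python
WAYS = 4
SETS = 16  # assumption: 16 sets x 4 ways = 64 entries in total


class SetAssocTLB:
    def __init__(self):
        # Each set holds at most WAYS (vpn, pfn) pairs.
        self.sets = [[] for _ in range(SETS)]

    def insert(self, vpn, pfn):
        ways = self.sets[vpn % SETS]  # low VPN bits pick the set
        if len(ways) == WAYS:
            ways.pop(0)               # evict the oldest entry of this set
        ways.append((vpn, pfn))

    def lookup(self, vpn):
        # Only WAYS comparisons are needed instead of one per entry.
        for tag, pfn in self.sets[vpn % SETS]:
            if tag == vpn:
                return pfn
        return None


tlb = SetAssocTLB()
# Five pages whose VPNs share the same low 4 bits all land in set 0.
for i in range(5):
    tlb.insert(vpn=i * SETS, pfn=100 + i)

evicted = tlb.lookup(0)       # None: pushed out by the fifth page
kept = tlb.lookup(4 * SETS)   # 104
```

A fully associative TLB of the same total size would have kept all five
entries.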

Hope this helps.
nicO

> "
> 
> "present bits" has been removed because, after the discussion, it's only
> needed by a HW TLB. I hope I have made a good summary, but if someone
> wants to add something, please do so before I add this to the manual
> (perhaps we must clarify how to access it before).
> 
> Cedric
> 
> *************************************************************
> To unsubscribe, send an e-mail to majordomo@seul.org with
> unsubscribe f-cpu       in the body. http://f-cpu.seul.org/

Follow-Ups:
- Re: [f-cpu] TLB resume
  - From: Michael Riepe <michael@stud.uni-hannover.de>

References:
- [f-cpu] TLB resume
  - From: cedric <cedric.bail@free.fr>
