Re: Xlib problems (Lol. I've only been programming in it for 10 days...)
> > I'd like to set something straight once and for all: DGA isn't that
> > fast. It is "okay", but not fast. It gives you programmed I/O access to
> > the framebuffer, which doesn't accelerate anything (in fact, XCopyArea
> > of a largish Pixmap to a Window is WAY faster than DGA with XFree86
> > 3.9.x).
> Sorry, I just timed our new engine:
> XCopyArea: 62fps
> DGA: 123fps
> That looks like pure speed for me.
> Maybe I'm the only one without a super-duper-accelerated Matrox Card,
> but it sure is faster on my "cheap" S3 Virge.
There *is* a way to go faster with DGA than XCopyArea or XShm, which is
to interleave computing operations with writes. The thing is, if you
write as fast as possible to the screen in DGA, there will be a lot of
"stalling" because of doing PIO back to back.
When you write data using PIO (no larger than what the bus can carry, 32
bits), it takes some time to actually get written. If you try to send
some more before the first write is finished, the second write will not
start until the first one is finished, which is very bad. Video card
specific drivers can move a whole lot of data without stalling because
they do not use PIO.
If you do something computationally expensive, like a fade or alpha
blending, and you do the computation "as you go", computing the next 32
bits while the bus is busy sending the previous data, this can be nicely
efficient, because the computation time and the data-moving time
overlap instead of adding up. If the video card specific driver isn't
very good and uses PIO itself, it will not do the work as efficiently
(the more efficient trick is specific to the task at hand and cannot be
generalized).
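To make the "as you go" idea concrete, here is a minimal sketch of a fade that computes each 32-bit word right before storing it, so the arithmetic for the next word can run while the previous store is still on the bus. Everything in it is an assumption for illustration: the function names are hypothetical, the pixel format is taken to be 16-bit RGB565 (two pixels per word), and `fb` stands for whatever pointer you got for the framebuffer. Whether the overlap really happens depends on the CPU, the bus and the card.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scale both RGB565 pixels packed in one 32-bit
 * word by level/256, where level runs from 0 (black) to 256 (unchanged). */
static uint32_t fade_pair_rgb565(uint32_t pair, unsigned level)
{
    uint32_t out = 0;
    for (int shift = 0; shift <= 16; shift += 16) {
        uint32_t p = (pair >> shift) & 0xFFFFu;
        uint32_t r = (((p >> 11) & 0x1Fu) * level) >> 8;
        uint32_t g = (((p >> 5) & 0x3Fu) * level) >> 8;
        uint32_t b = ((p & 0x1Fu) * level) >> 8;
        out |= ((r << 11) | (g << 5) | b) << shift;
    }
    return out;
}

/* Fade `words` 32-bit words from src (system memory) into fb.
 * Each iteration does one computation and exactly one 32-bit store,
 * instead of fading into a temporary buffer and copying it afterwards. */
void fade_to_framebuffer(volatile uint32_t *fb, const uint32_t *src,
                         size_t words, unsigned level)
{
    assert(level <= 256);
    for (size_t i = 0; i < words; i++) {
        fb[i] = fade_pair_rgb565(src[i], level);
    }
}
```

Calling `fade_to_framebuffer(fb, image, width * height / 2, 128)` would write the image at half brightness. The alternative this replaces is two passes (fade into a buffer, then copy the buffer), where the copy pass is nothing but back-to-back writes.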
Also, your 62 fps number is rather slow (it amounts to less than 10
megabytes per second), but it also depends heavily on the speed of the
computer, so I cannot tell accurately. It could be that you are doing
too many copies. Do you repeatedly XCopyArea the same pixmap to the
window, or do you XShmPutImage something (or worse, XPutImage it) and
*then* XCopyArea it? If this is a single XShmPutImage, you could just skip the
XCopyArea and do it directly into the window, you'd nearly double the
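For the arithmetic behind a figure like "less than 10 megabytes per second", here is a small sketch that turns a frame size and a frame rate into bytes per second. The function name is hypothetical, and the 320x240 at 16 bits per pixel used in the note below is only an assumed example, since the actual resolution of the engine is not given in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second needed to redraw one full frame `fps` times a second.
 * Assumes whole-byte pixel sizes (8, 16, 24 or 32 bits per pixel). */
uint64_t blit_bytes_per_second(uint32_t width, uint32_t height,
                               uint32_t bits_per_pixel, uint32_t fps)
{
    assert(bits_per_pixel % 8 == 0);
    uint64_t frame_bytes = (uint64_t)width * height * (bits_per_pixel / 8);
    return frame_bytes * fps;
}
```

With the assumed 320x240 at 16 bpp, `blit_bytes_per_second(320, 240, 16, 62)` gives 9,523,200 bytes per second, and the same frame at 123 fps gives 18,892,800. Every extra copy of the frame on the way to the screen adds the same amount of traffic again.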
Older video cards are sometimes better at PIO than newer cards: in the
old days, everyone was using PIO, but today hardware acceleration is
used in most drivers, and applications use the drivers. I was told,
for example, that the equivalent of DGA with DirectDraw (part of DirectX)
was *very* slow on a Riva TNT, as the card doesn't even support direct
framebuffer access in the accelerated modes it is set in! Those calls
are emulated by copying data from/to the video card leading to something
in the 10 fps range!
> > The page flipping is one of the biggest incentive. The video memory
> > organization is useless if you can't use the hardware blitter to move
> > things around (which may or may not be possible using DGA, depending on
> > the version).
> PageFlipping, Hardware-Scrolling and _Syncing on the Retrace_ are important
> for realtime FX.
Sync on retrace is done by the X server anyway. I also run Quake 3 here
without it, and since the framerate is higher than the refresh rate on
my monitor, I do not see it.
Page flipping and hardware scrolling cannot be done in windows anyway,
only in fullscreen (except on some high-end video cards with overlay
support, which can page-flip a window, though I do not know about
scrolling). Page flipping is needed mostly when blitting performance is poor,
like when using PIO. This is why it *used* to be popular, back when
everybody used PIO, but now with accelerated DirectX drivers, you can
blit a complete screen from system memory to the screen before the video
refresh is complete, thanks to DMA busmastering and other goodies.
Ludus Design, http://ludusdesign.com/
"First they ignore you. Then they laugh at you.
Then they fight you. Then you win." -- Gandhi