Design trade-off: windows and XImages - Xwindows

  1. Design trade-off: windows and XImages

    Suppose you're writing an emulator that has a single window
    where the top left 3/4 of the window is for the video output of
    the emulator and the rest is used for status display etc.

    The video emulation is rendered in real time pixel by pixel into
    a buffer. When the buffer is full, then the whole frame is
    XPutImage()'d (50 times per second).

    The status display probably won't be updated constantly, for
    speed reasons. Anyway, it won't be updated more often than the
    video emulation, i.e. 50 times per second. So a single
    XPutImage() for both panes is possible.

    Speed is of the essence.

    What would you do ? Two windows ? One window with two XImages ?
    A single XImage ?

    --
    André Majorel
    Conscience is what hurts when everything else feels so good.
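
For reference, the single-XImage option can be pictured as one frame buffer that holds both panes. The following is a minimal sketch in plain C, not code from the thread: the `Frame` and `Pane` types, the function names, the 32-bit pixel size, and the reading of "top left 3/4" as three quarters of the width and of the height are all assumptions made for illustration. Nothing here calls Xlib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One frame buffer shared by both panes (all names are illustrative). */
typedef struct {
    uint32_t *pixels;   /* width * height pixels, 32 bits each (assumed) */
    int width, height;  /* size of the whole window in pixels */
} Frame;

/* A rectangular pane inside the frame. */
typedef struct { int x, y, w, h; } Pane;

/* Fill one pane with a colour, touching only the pixels inside it. */
static void fill_pane(Frame *f, Pane p, uint32_t colour)
{
    assert(p.x >= 0 && p.y >= 0);
    assert(p.x + p.w <= f->width && p.y + p.h <= f->height);
    for (int row = 0; row < p.h; row++) {
        uint32_t *line = f->pixels + (size_t)(p.y + row) * f->width + p.x;
        for (int col = 0; col < p.w; col++)
            line[col] = colour;
    }
}

/* Video pane: top-left corner, 3/4 of the width and height (assumption). */
static Pane video_pane(const Frame *f)
{
    Pane p = { 0, 0, f->width * 3 / 4, f->height * 3 / 4 };
    return p;
}

/* Status strip: the full-height column to the right of the video pane. */
static Pane status_strip(const Frame *f)
{
    Pane v = video_pane(f);
    Pane p = { v.w, 0, f->width - v.w, f->height };
    return p;
}
```

With this layout the emulator writes video pixels through `video_pane()` and status pixels through `status_strip()`, and the whole buffer can then be sent to the window in one transfer per frame.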

  2. Re: Design trade-off: windows and XImages


    > Speed is of the essence.
    >
    > What would you do ? Two windows ? One window with two XImages ?
    > A single XImage ?


    It does not matter, as long as you use shared memory.
    http://pantransit.reptiles.org/prog/mit-shm.html
    So you can directly access your frame buffer as memory, and
    get server side XImage to Drawable speed;
    it will be like copying from Pixmap to Window, or
    Window to Window.

    XShmPutImage will give you 50% boost over XPutImage in most cases.
    Other than that, organize your interface in such a way
    that it will be easier for the user.

    Best regards,
    Dusan Peterc

  3. Re: Design trade-off: windows and XImages

    arahne writes:

    >> Speed is of the essence.
    >>
    >> What would you do ? Two windows ? One window with two XImages ?
    >> A single XImage ?

    >
    > It does not matter, as long as you use shared memory.
    > http://pantransit.reptiles.org/prog/mit-shm.html
    > So you can directly access your frame buffer as memory, and
    > get server side XImage to Drawable speed;
    > it will be like copying from Pixmap to Window, or
    > Window to Window.


    Not quite. An XImage always lives in system memory, and
    X(Shm)PutImage copies it to the video memory. A Pixmap or Window is
    typically stored in the video memory on the graphics card, and a copy
    from one Pixmap/Window to another is done by the graphics
    accelerator.

    --
    Måns Rullgård
    mru@mru.ath.cx

  4. Re: Design trade-off: windows and XImages

    > > It does not matter, as long as you use shared memory.
    > > http://pantransit.reptiles.org/prog/mit-shm.html
    > > So you can directly access your frame buffer as memory, and
    > > get server side XImage to Drawable speed;
    > > it will be like copying from Pixmap to Window, or
    > > Window to Window.

    >
    > Not quite. An XImage always lives in system memory, and
    > X(Shm)PutImage copies it to the video memory. A Pixmap or Window is
    > typically stored in the video memory on the graphics card, and a copy
    > from one Pixmap/Window to another is done by the graphics
    > accelerator.


    At first thought you are right, but then again, you are not.
    Empirical data appears to support your claim, on my
    Nvidia 1600x1200 @ 32bpp:

    x11perf -putimage500
    800 reps @ 11.3602 msec ( 88.0/sec): PutImage 500x500 square
    x11perf -shmput500
    2000 reps @ 2.9258 msec ( 342.0/sec): ShmPutImage 500x500 square
    x11perf -copypixwin500
    16000 reps @ 0.3230 msec ( 3100.0/sec): Copy 500x500 from pixmap to window
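
To put these x11perf figures in terms of the original question, here is a small arithmetic sketch in C. It assumes 4 bytes per pixel (from the 32bpp mode quoted above) and the 50 frames per second from the first post; the function names are made up for illustration. A 500x500 image at 4 bytes per pixel is 1,000,000 bytes, so the three rates correspond to roughly 88, 342 and 3100 MB/s.

```c
#include <assert.h>

/* Bytes moved per second for a square image drawn `rate` times a second. */
static double bytes_per_second(int side, int bytes_per_pixel, double rate)
{
    assert(side > 0 && bytes_per_pixel > 0 && rate > 0.0);
    return (double)side * side * bytes_per_pixel * rate;
}

/* Fraction of one frame period used by a transfer taking `msec` ms. */
static double budget_fraction(double msec, double frames_per_second)
{
    assert(msec >= 0.0 && frames_per_second > 0.0);
    return msec / (1000.0 / frames_per_second);
}
```

At 50 frames per second the frame period is 20 ms, so `budget_fraction(11.3602, 50.0)` says a plain PutImage of this size would use more than half of each frame, while `budget_fraction(2.9258, 50.0)` puts ShmPutImage at about 15%. The emulator's real window size is not given in the thread, so these numbers are only indicative.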

    But the actual MIT-SHM implementation is very cunning.
    You can make a shared XImage / Pixmap, and you get the pointer
    to the image/pixmap data. If the X server implementation is such
    that Pixmaps reside on the graphics card, it will be there.
    Examine the memory layout of Linux: graphics card memory is
    mapped into CPU memory, so it is directly accessible, given
    proper privileges. MIT-SHM handles these privileges for you.
    If you use a dual XImage / Pixmap, you can write your data
    directly through the image data pointer (observing depth and bpl),
    and then simply XCopyArea(...), bypassing X(Shm)PutImage
    completely. So you can have the blazing speed of
    3100 fps on 500x500, if you can generate your image in no time.

    No need to argue about this, since I have written
    programs which use precisely this architecture.
    The only twist is to call XSync after every copy
    from the shared pixmap to the window. If you don't, you can
    be modifying your pixmap data while the server is still
    copying it (no matter how fast the server is ;-)

    Best regards,

    Dusan Peterc
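
"Observing depth and bpl" when writing through the data pointer can be sketched as follows. This is an illustrative example in plain C, not code from the thread. The masks and the row stride are passed in as parameters; in an Xlib program they would presumably come from the XImage fields `red_mask`, `green_mask`, `blue_mask` and `bytes_per_line`. The sketch assumes 32 bits per pixel, channel masks of at most 8 contiguous bits, and an image byte order equal to the host's. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Position of the lowest set bit in a channel mask (mask must be non-zero). */
static int mask_shift(uint32_t mask)
{
    assert(mask != 0);
    int shift = 0;
    while ((mask & 1u) == 0) { mask >>= 1; shift++; }
    return shift;
}

/* Number of contiguous set bits in a channel mask. */
static int mask_bits(uint32_t mask)
{
    int bits = 0;
    for (mask >>= mask_shift(mask); mask & 1u; mask >>= 1)
        bits++;
    return bits;
}

/* Reduce an 8-bit channel value to the mask's width and move it in place. */
static uint32_t scale_channel(uint8_t value, uint32_t mask)
{
    int bits = mask_bits(mask);
    assert(bits >= 1 && bits <= 8);   /* assumption: no deep-colour visuals */
    return ((uint32_t)value >> (8 - bits)) << mask_shift(mask);
}

/* Pack 8-bit r, g, b into the pixel layout described by the three masks. */
static uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b,
                         uint32_t rmask, uint32_t gmask, uint32_t bmask)
{
    return scale_channel(r, rmask) | scale_channel(g, gmask)
         | scale_channel(b, bmask);
}

/* Store a 32-bit pixel at (x, y); rows are bytes_per_line bytes apart. */
static void put_pixel32(char *data, int bytes_per_line,
                        int x, int y, uint32_t pixel)
{
    assert(x >= 0 && y >= 0 && (x + 1) * 4 <= bytes_per_line);
    memcpy(data + (size_t)y * bytes_per_line + (size_t)x * 4, &pixel, 4);
}
```

The point of using `bytes_per_line` rather than `width * 4` is that the server may pad each scanline, so the two are not guaranteed to be equal.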

  5. Re: Design trade-off: windows and XImages

    arahne writes:

    >> > It does not matter, as long as you use shared memory.
    >> > http://pantransit.reptiles.org/prog/mit-shm.html
    >> > So you can directly access your frame buffer as memory, and
    >> > get server side XImage to Drawable speed;
    >> > it will be like copying from Pixmap to Window, or
    >> > Window to Window.

    >>
    >> Not quite. An XImage always lives in system memory, and
    >> X(Shm)PutImage copies it to the video memory. A Pixmap or Window is
    >> typically stored in the video memory on the graphics card, and a copy
    >> from one Pixmap/Window to another is done by the graphics
    >> accelerator.

    >
    > At first thought you are right, but then again, you are not.
    > Empirical data appears to support your claim, on my
    > Nvidia 1600x1200 @ 32bpp
    >
    > x11perf -putimage500
    > 800 reps @ 11.3602 msec ( 88.0/sec): PutImage 500x500 square
    > x11perf -shmput500
    > 2000 reps @ 2.9258 msec ( 342.0/sec): ShmPutImage 500x500 square
    > x11perf -copypixwin500
    > 16000 reps @ 0.3230 msec ( 3100.0/sec): Copy 500x500 from pixmap to window
    >
    > But the actual MIT-SHM implementation is very cunning.
    > You can make a shared XImage / Pixmap, and you get the pointer
    > to the image/pixmap data. If X server implementation is such that
    > Pixmaps reside on the graphics card, it will be there.
    > Examine memory layout of Linux - graphic card memory is
    > mapped to CPU memory, so it is directly accessible, given
    > proper privileges. MIT-SHM handles these privileges for you.
    > If you use dual XImage / Pixmap, you can write your data
    > directly through the image data pointer (observing depth and bpl),
    > and then simply XCopyArea(...), bypassing X(Shm)PutImage
    > completely. So you can have the blazing speed of
    > 3100 fps on 500x500, if you can generate your image in no time.
    >
    > No need to argue about this, since I have written
    > programs which use precisely this architecture.
    > The only twist is to use XSync after every write
    > from shared pixmap to window. If you don't, you can
    > be modifying your pixmap data, while server is still
    > copying it (no matter how fast the server is ;-)


    What is the method to obtain one of these magic XImages?

    As a side note, the situation can also occur where the graphics memory
    gets full and pixmaps must be stored in system memory. In this case
    the graphics memory will be used as a cache for pixmaps.

    --
    Måns Rullgård
    mru@mru.ath.cx

  6. Re: Design trade-off: windows and XImages

    > What is the method to obtain one of these magic XImages?

    Look at the code below.

    Best regards,

    Dusan Peterc

    /* sample code to allocate dual XImage / Pixmap residing in the */
    /* same memory. Pure Qt can't do this, so KDE paint programs have */
    /* doggy refreshes and like to die on Pixmap allocation failures. */
    ....
    /* The header names were lost from the post; these are the ones */
    /* the code below needs. */
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/XShm.h>

    #define BUF_X 512
    #define BUF_Y 512
    XImage *bufXIM;
    Pixmap bufPix;
    XShmSegmentInfo shmInfo;

    /* display, screen, visual, depth, window, gc,... are from your own program */

    /* allocate */
    if (/* the usual checks for MIT-SHM presence and local display */)
    {
        bufXIM = XShmCreateImage(display, visual, depth, ZPixmap,
                                 0, &shmInfo, BUF_X, BUF_Y);
        if (!bufXIM)
            return;
        shmInfo.shmid = shmget(IPC_PRIVATE,
                               bufXIM->bytes_per_line*bufXIM->height,
                               IPC_CREAT|0777);
        if (shmInfo.shmid < 0)
            return;
        shmInfo.shmaddr = (char *)shmat(shmInfo.shmid, 0, 0);
        if (shmInfo.shmaddr == ((char *)-1))
            return;
        bufXIM->data = shmInfo.shmaddr;
        shmInfo.readOnly = False; /* must be set before XShmAttach */
        XShmAttach(display, &shmInfo);
        bufPix = XShmCreatePixmap(display,
                                  RootWindow(display, screen), shmInfo.shmaddr,
                                  &shmInfo, bufXIM->width, bufXIM->height,
                                  bufXIM->depth);
    }

    /* use */
    {
        /* write your image data to bufXIM->data */
        /* use your pixel pushing pointer skills */
        XCopyArea(display, bufPix, window, gc, srcx, srcy, wid, hei,
                  dstx, dsty);
        XSync(display, False); /* wait until the server has done the copy */
        /* ready to do it again */
    }

    /* don't forget to free shared memory before exiting, */
    /* or it will linger until your X server dies */
    {
        XShmDetach(display, &shmInfo);
        if (shmctl(shmInfo.shmid, IPC_RMID, 0)==0)
            printf("Shared memory free!\n");
        else
            perror("shmctl");
    }
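
One common way to reduce the risk of a lingering segment is to mark it for removal as soon as everyone who needs it has attached, so the kernel frees it when the last user detaches, even if the program crashes. The following is a sketch of that idea using only the System V calls, with no X involved; the function names are hypothetical and the 0600 permission is my choice, not the thread's. In an X program the mark-for-removal step would presumably have to wait until XShmAttach() has been processed by the server, for example after an XSync().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Allocate a private segment, attach it, and mark it for removal at once.
 * Returns the attached address, or NULL on failure. */
static void *shm_alloc_autoremove(size_t size, int *shmid_out)
{
    assert(size > 0 && shmid_out != NULL);

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return NULL;

    void *addr = shmat(id, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);   /* nothing attached: free it now */
        return NULL;
    }

    /* An X client would call XShmAttach() and XSync() here, so that the
     * server is attached before the segment is marked for removal. */
    if (shmctl(id, IPC_RMID, NULL) != 0) {
        shmdt(addr);
        return NULL;
    }

    *shmid_out = id;
    return addr;
}

/* Detach; the kernel destroys the segment after the last detach. */
static int shm_release(void *addr)
{
    return shmdt(addr);
}
```

The memory stays usable between `shm_alloc_autoremove()` and `shm_release()`, because a segment marked with IPC_RMID is only destroyed once no process is attached to it.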

  7. Re: Design trade-off: windows and XImages

    arahne writes:

    > You can make a shared XImage / Pixmap, and you get the pointer
    > to the image/pixmap data. If X server implementation is such that
    > Pixmaps reside on the graphics card, it will be there.


    Your code allocates the memory with shmget(IPC_PRIVATE, ,
    IPC_CREAT|0777). This does not give the OS any hint that the
    graphics card should be used. Your code then attaches to the
    shared memory segment and gives its id to the X server too.
    Are you implying there is a system call the X server can use to
    relocate the already allocated segment into video RAM and update
    memory mappings in all attached processes?

  8. Re: Design trade-off: windows and XImages

    Kalle Olavi Niemitalo wrote:
    >
    > arahne writes:
    >
    > > You can make a shared XImage / Pixmap, and you get the pointer
    > > to the image/pixmap data. If X server implementation is such that
    > > Pixmaps reside on the graphics card, it will be there.

    >
    > Your code allocates the memory with shmget(IPC_PRIVATE, ,
    > IPC_CREAT|0777). This does not give the OS any hint that the
    > graphics card should be used. Your code then attaches to the
    > shared memory segment and gives its id to the X server too.
    > Are you implying there is a system call the X server can use to
    > relocate the already allocated segment into video RAM and update
    > memory mappings in all attached processes?


    Hello Kalle,

    The intention of my wording was that the location of the
    allocated Pixmap memory is implementation dependent. As an
    application programmer, I don't care whether it is on the
    graphics card or in main memory.

    Thank you for pointing out that my shared memory allocation
    gets the memory from the regular memory pool. Your idea of
    relocating the memory would be doable if the memory were always
    accessed through a handle. But it is probably not worth the
    hassle.

    The main points of my argument are:
    1) Using XPutPixel or XPutImage on a local X server introduces
    a huge performance penalty.
    2) This penalty can be overcome by using the MIT-SHM extension.
    3) If programmed properly, you can enjoy both the convenience
    of direct access to the image/pixmap pointer, and the speed
    of server side X graphics operations, like XCopyArea().

    An XImage/Pixmap allocated in this way will have the server's
    native byte order, depth, padding, etc. So putting this data
    on the screen involves mainly memcpy (line by line).
    I agree that the GPU's bitblt is faster than the CPU's memcpy.
    But for most applications, even the "slow" CPU memcpy
    is extremely fast. The application will spend much more time
    rendering the image into the image buffer and adapting it to
    the server's depth. One should concentrate the optimization
    efforts on the areas where the biggest gains can be obtained.

    Dusan Peterc
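
The "memcpy (line by line)" mentioned above can be sketched in plain C as below. This is only an illustration of the idea, not what any particular X server does: the function name and parameters are made up, and it assumes source and destination already share the same pixel format, so each scanline of the rectangle can be copied with a single memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a w x h pixel rectangle between two buffers, one scanline at a
 * time. The two buffers may have different row strides (bytes per line). */
static void copy_rect(char *dst, int dst_bpl, int dst_x, int dst_y,
                      const char *src, int src_bpl, int src_x, int src_y,
                      int w, int h, int bytes_per_pixel)
{
    assert(w >= 0 && h >= 0 && bytes_per_pixel > 0);
    assert(src_x >= 0 && src_y >= 0 && dst_x >= 0 && dst_y >= 0);
    assert((size_t)(src_x + w) * bytes_per_pixel <= (size_t)src_bpl);
    assert((size_t)(dst_x + w) * bytes_per_pixel <= (size_t)dst_bpl);

    size_t line = (size_t)w * bytes_per_pixel;
    for (int row = 0; row < h; row++) {
        const char *s = src + (size_t)(src_y + row) * src_bpl
                            + (size_t)src_x * bytes_per_pixel;
        char *d = dst + (size_t)(dst_y + row) * dst_bpl
                      + (size_t)dst_x * bytes_per_pixel;
        memcpy(d, s, line);
    }
}
```

When the formats differ, each pixel has to be converted instead of copied, which is the "adapting it to the server's depth" cost the post refers to.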
