GB 18030 in DICOM - DICOM


  1. GB 18030 in DICOM

    Hello everybody,

    I wonder how the three component groups in DICOM PN attributes
    (e.g. Patient's Name (0010,0010)) should be handled when the data is
    encoded in Unicode UTF-8 or GB18030. Both of these new character sets,
    introduced by Change Proposal 252, are multi-byte but must not be
    combined with other character sets. Therefore no ISO 2022 escape
    sequences are allowed or needed.

    From my point of view, the question is which component group
    GB18030-encoded text, for example, should be written to. The text may
    consist of single-byte characters only, but it may also contain
    ideographic and phonetic characters. Therefore none of the component
    groups seems to be the right one.

    Does anybody know if the concept of component groups is
    invalid/unnecessary for UTF-8/GB18030 encoded attribute data?

    Many thanks in advance for your help!

    Best regards,
    Thomas


  2. Re: GB 18030 in DICOM

    > Does anybody know if the concept of component groups is
    > invalid/unnecessary for UTF-8/GB18030 encoded attribute data?
    >


    According to PS 3.5, section 6.2.1, the first component group and the group
    delimiter (i.e. "=") should be encoded with an 8-bit character set, as
    defined by value 1 of the Specific Character Set (0008,0005) attribute.
    Component group 2 of the PN can be encoded in any multi-byte character set,
    including UTF-8/GB18030.
    That's what I understand from reading the standard.

    Umberto

  3. Re: GB 18030 in DICOM

    UTF-8 and GB18030 encoding differs from the other multi-byte
    character sets, where the Specific Character Set (0008,0005) attribute
    specifies multiple character sets.

    Because UTF-8 and GB18030 each cover a large repertoire of characters,
    including the default ASCII set, a single value for Specific Character
    Set is sufficient. Switching between character sets with an ESC
    sequence is also not necessary.

    See PS 3.5 - 2004 Annex J for examples of this type of encoding.
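
    As a rough sketch of what this means in practice (Python, not taken from
    the standard): one codec, selected by the single Specific Character Set
    value, encodes and decodes the whole PN value, and the component groups
    are simply the "="-separated parts of the decoded text. The mapping
    table, the helper names and the sample name are made up for illustration;
    only the defined terms "ISO_IR 192" (UTF-8) and "GB18030" and the Python
    codec names are real.

```python
# Illustrative sketch only; CODEC_FOR_CHARSET, encode_pn and decode_pn are
# hypothetical helpers, not part of any DICOM toolkit.

CODEC_FOR_CHARSET = {
    "ISO_IR 192": "utf-8",   # DICOM defined term for Unicode in UTF-8
    "GB18030": "gb18030",    # DICOM defined term for GB18030
}


def encode_pn(value, specific_character_set):
    """Encode a whole PN value with the one codec named by (0008,0005)."""
    return value.encode(CODEC_FOR_CHARSET[specific_character_set])


def decode_pn(raw, specific_character_set):
    """Decode first, then split into exactly three component groups."""
    # Decoding before splitting matters for GB18030, where a byte such as
    # 0x5E ("^") can also occur as the second byte of a two-byte character.
    text = raw.decode(CODEC_FOR_CHARSET[specific_character_set])
    groups = text.split("=")
    groups += [""] * (3 - len(groups))   # missing trailing groups are empty
    return groups


# Sample value (assumed for illustration): alphabetic group, ideographic
# group, empty phonetic group. No escape sequences appear anywhere.
name = "Wang^XiaoDong=王^小东="
for charset in ("ISO_IR 192", "GB18030"):
    raw = encode_pn(name, charset)
    print(charset, len(raw), decode_pn(raw, charset))
```

    The same text goes through either codec unchanged; only the byte
    sequence on the wire differs.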

    In fact these character sets make life much easier, but it will take a
    (long) time before they are well accepted.
    Regards,
    Bas -- Philips Medical Systems


  4. Re: GB 18030 in DICOM

    > See PS 3.5 - 2004 Annex J for examples of this type of encoding.
    >
    > In fact these character sets make life much easier, but it will take a
    > (long) time before they are well accepted.
    > Regards,
    > Bas -- Philips Medical Systems


    All is clear now, thanks.
    I also agree that using UTF-8 or similar would make life much easier
    (especially for developers).
    Unfortunately, these encodings are unlikely to be widely accepted as long
    as other standards such as HL7 do not support them (AFAIK, they don't).
    There is also the problem that a multi-byte, variable-length character set
    such as UTF-8 could affect the integrity of store-and-forward operations
    (PS 3.5, page 21) for applications expecting single-byte character sets.
    I think this is a big problem, since character sets are not agreed upon
    during association negotiation.
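
    To make the length concern concrete, here is a small Python sketch
    (an illustration under my own assumptions, not something from the
    standard): with a variable-length encoding the number of characters and
    the number of stored bytes no longer match, so an application that
    treats one byte as one character will miscount. The helper name
    pn_lengths and the sample values are hypothetical; the padding to an
    even length with a trailing space is the usual rule for string values.

```python
# Illustrative sketch only; pn_lengths is a hypothetical helper.

def pn_lengths(value, codec):
    """Return (character count, stored byte count) for a string value."""
    raw = value.encode(codec)
    if len(raw) % 2:
        raw += b" "          # string values are padded to an even length
    return len(value), len(raw)


for text in ("Wang^XiaoDong", "王^小东"):
    for codec in ("utf-8", "gb18030"):
        chars, stored = pn_lengths(text, codec)
        print(repr(text), codec, "chars:", chars, "bytes:", stored)
```

    For pure ASCII text both codecs give the same byte count, so the
    difference only shows up once non-ASCII characters are present.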

    Is Philips introducing such UTF-8 or GB18030 support on their systems?

    Umberto

  5. Re: GB 18030 in DICOM

    The previous message was sent with a wrong name and email, sorry.

    Umberto

  6. Re: GB 18030 in DICOM

    UC wrote in <15tdhbdx7o3uh.ek7lvrieuqo3.dlg@40tude.net>:

    >Is Philips introducing such UTF-8 or GB18030 support on their systems?


    I would like to see that, too. I would even appreciate a table showing
    whose HIS/RIS and PACS support what. Okay, I could try to google every
    single DICOM conformance statement available, e.g. Kodak DryView 8150
    Systems, and create that table myself, but that's not a task, that's a
    punishment.

    You know, one of the regular sorry excuses for not developing something
    is: "No one else is supporting it, why should we?" As if Bell had declined
    to invent the telephone because there was no one to call yet.

    AFAIK, there is a DICOM connectivity meeting every year; perhaps they
    have a set of results available or something...


    Carsten Witte

    --
    private: http://www.carsti.de
    lurker : http://www.midwinter.de/lurk
