64 bit files - Linux



Thread: 64 bit files

  1. 64 bit files

    Hello,

    I am working on a program that randomly accesses bytes from
    a bunch of large binary files.
    The files average ~30Gig in size each.
    I had 64 bit file reading working fine on Mandrake 10.2
    when compiled with -D_FILE_OFFSET_BITS=64. The system
    has since been updated and the code no longer gives full 64 bit file access.

    I have tried compiling using -D_LARGE_FILE_SOURCE but continue
    to have problems with the 64 bit file routines.

    For example:
    #define _FILE_OFFSET_BITS 64
    #define _LARGE_FILE_SOURCE

    off_t pos;

    fseeko(fp,0,SEEK_END);
    pos=ftello(fp);

    In this case 'pos' always comes back as zero.

    Do I need to define all of these?:

    #define _GNU_SOURCE
    #define _LARGEFILE_SOURCE
    #define _LARGEFILE64_SOURCE
    #define _FILE_OFFSET_BITS 64

    Using the flags supplied by 'getconf LFS_CFLAGS' doesn't seem to work
    properly. Which ones are required?

    Also, I notice that EXT2 needs to be formatted with 4k blocks
    to get over 16 Gigs which made me think that maybe the file system
    may have a few limits.
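    (One way to check the block size of the filesystem in question, a
    sketch assuming GNU coreutils' stat; the path is just an example:)

```shell
# Print the fundamental block size of the filesystem holding a path.
# On EXT2, a 4096-byte block size is what allows files past ~16 Gig.
stat -f -c '%S' /tmp
```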
    I have an NTFS SMB share mounted where the
    files are being read. Is there a file size limit to SMB which
    might make reading from 64 bit files problematic?


    P

  2. Re: 64 bit files

    On Sunday 14 Jan 2007 04:07 in comp.os.linux.development.apps,
    p wrote:

    > #define _FILE_OFFSET_BITS 64
    > #define _LARGE_FILE_SOURCE


    Did you put these before or after your #include directives?

    --
    Regards

    Dave [RLU#314465]
    ======================================================
    dwnoon@spamtrap.ntlworld.com (David W Noon)
    Remove spam trap to reply via e-mail.
    ======================================================


  3. Re: 64 bit files

    p writes:

    > Hello,
    >
    > I am working on a program that randomly accesses bytes from
    > a bunch of large binary files.
    > The files average ~30Gig in size each.
    > I had 64 bit file reading working fine on Mandrake 10.2
    > when compiled with -D_FILE_OFFSET_BITS=64. The system
    > has since been updated and the code no longer gives full 64 bit file access.
    >
    > I have tried compiling using -D_LARGE_FILE_SOURCE but continue
    > to have problems with the 64 bit file routines.
    >
    > For example:
    > #define _FILE_OFFSET_BITS 64
    > #define _LARGE_FILE_SOURCE
    >
    > off_t pos;
    >
    > fseeko(fp,0,SEEK_END);
    > pos=ftello(fp);
    >
    > In this case 'pos' always comes back as zero.
    >
    > Do I need to define all of these?:
    >
    > #define _GNU_SOURCE
    > #define _LARGEFILE_SOURCE
    > #define _LARGEFILE64_SOURCE
    > #define _FILE_OFFSET_BITS 64


    Only #define _FILE_OFFSET_BITS 64 should be required. As someone else
    said though, it must be done before any system headers are included.
    A good idea is to define this on the command line with the -D compiler
    option. That way you won't accidentally forget to add it in some
    file.

    > Using the flags supplied by 'getconf LFS_CFLAGS' doesn't seem to work
    > properly. Which ones are required?
    >
    > Also, I notice that EXT2 needs to be formatted with 4k blocks
    > to get over 16 Gigs which made me think that maybe the file system
    > may have a few limits.


    For very large files, you are better off with a filesystem designed to
    handle them efficiently. XFS and JFS are both good choices.

    > I have an NTFS SMB share mounted where the
    > files are being read. Is there a file size limit to SMB which
    > might make reading from 64 bit files problematic?


    SMB is so slow that I wouldn't even begin to consider it for such
    large volumes of data.

    --
    Måns Rullgård
    mru@inprovide.com

  4. Re: 64 bit files

    David W Noon wrote:
    >
    > Did you put these before or after your #include directives?
    >



    I'll have to check that.
    I'm pretty sure I just put them at the top of the file, above all
    the #include lines.

    I also defined -D_LARGE_FILE_SOURCE on the gcc command line,
    which should apply it to all files.

    p

  5. Re: 64 bit files

    Måns Rullgård wrote:
    [...]
    >
    >
    > Only #define _FILE_OFFSET_BITS 64 should be required. As someone else
    > said though, it must be done before any system headers are included.
    > A good idea is to define this on the command line with the -D compiler
    > option. That way you won't accidentally forget to add it in some
    > file.
    >
    >


    Actually, I used 'getconf LFS_CFLAGS' on the command line. That is
    *supposed* to retrieve the flags required for the system you're
    using.
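    (For reference, the intended use is to splice its output into the
    compile line; on a 64-bit host it may legitimately print nothing,
    since off_t is already 64 bits there:)

```shell
# Ask the C library which flags it recommends for large-file support.
getconf LFS_CFLAGS
# Typical use:
#   gcc $(getconf LFS_CFLAGS) -o myprog myprog.c
```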



    [...]
    >
    >
    > For very large files, you are better off with a filesystem designed to
    > handle them efficiently. XFS and JFS are both good choices.
    >
    >


    Yes, unfortunately I'm working with a bunch of devout Windows zealots
    who are also luddites and wish to be able to swap portable drives...

    There is also a firewire problem with the 2.6 series kernels that
    causes a race condition if you read/write a lot of data to 1394.
    The problem appeared with the 2.6 kernel and is apparently nonexistent
    with the 2.4 kernel.

    I am wondering if I should just mount the NTFS (Win XP SP2) volumes
    directly... I'm not sure how good the NTFS filesystem driver is at
    this time. Can I read/write on a mounted XP NTFS volume
    without screwing it up?



    [...]
    >
    >
    > SMB is so slow that I wouldn't even begin to consider it for such
    > large volumes of data.
    >


    Unfortunately, because of the scale of data that I am reading/writing to
    the drives, the fact that they are firewire, and the fact that I need
    the 2.6 kernel for other things on the system, I'm kind of stuck with SMB
    for now.

    The upside is that the SMB link is direct to a dedicated machine over a
    gigabit connection.

    p

  6. Re: 64 bit files

    In article , p wrote:
    >Hello,
    >
    >I am working on a program that randomly accesses bytes from
    >a bunch of large binary files.
    >The files average ~30Gig in size each.
    >I had 64 bit file reading working fine on Mandrake 10.2
    >when compiled with -D_FILE_OFFSET_BITS=64. The system
    >has since been updated and the code no longer gives full 64 bit file access.


    I wonder what "the system has been updated" means. New kernel? New glibc? New
    compiler? All of the above?

    >
    >off_t pos;
    >
    >fseeko(fp,0,SEEK_END);
    >pos=ftello(fp);
    >
    >In this case 'pos' always comes back as zero.


    Try running this test code with strace to see what syscalls it's making. You
    should see O_LARGEFILE in the open() call if fopen is working correctly.

    --
    Alan Curry
    pacman@world.std.com

  7. Re: 64 bit files

    Alan Curry wrote:
    >
    > I wonder what "the system has been updated" means. New kernel? New glibc? New
    > compiler? All of the above?
    >


    It was an upgrade from Mandriva 10.1 to Mandriva 2007.
    Kernel major version is still 2.6 but different minor.
    I believe the glibc version changed from v5 to v6. The
    compiler changed from gcc 3.x to gcc 4.x.

    I was expecting large file support to be better not
    more unstable.

    Interestingly, I can write a 16 Gig file to a spare EXT2 partition
    as a test. I am trying to figure out whether the partition was simply
    formatted with 2k blocks or it's the same problem.

    My main problem may only be with the remotely mounted
    (NTFS SMB) read writes. Perhaps SMB has limitations...


    >
    >
    > Try running this test code with strace to see what syscalls it's making. You
    > should see O_LARGEFILE in the open() call if fopen is working correctly.
    >


    Thanks, I'll try that. There is also an 'fopen64' call that forces
    the file to open in 64 bit mode. If that one fails does it fall back to
    32 bits or just return NULL? My guess would be the latter.


    P

  8. Re: 64 bit files

    p wrote:
    > Alan Curry wrote:
    > >
    >> I wonder what "the system has been updated" means. New kernel? New
    >> glibc? New
    >> compiler? All of the above?
    >>

    >
    > It was an upgrade from Mandriva 10.1 to Mandriva 2007.
    > Kernel major version is still 2.6 but different minor.
    > I believe the glibc version changed from v5 to v6. The
    > compiler changed from gcc 3.x to gcc 4.x.
    >
    > I was expecting large file support to be better not
    > more unstable.
    >
    > Interestingly, I can write a 16 Gig file to a spare EXT2 partition
    > as a test. I am trying to figure out whether the partition was simply
    > formatted with 2k blocks or it's the same problem.
    >
    > My main problem may only be with the remotely mounted
    > (NTFS SMB) read writes. Perhaps SMB has limitations...
    >


    Yes, the Windows SMB protocol is limited to 2GB if I remember
    correctly...

  9. Re: 64 bit files

    Larry Smith wrote:
    >>My main problem may only be with the remotely mounted
    >>(NTFS SMB) read writes. Perhaps SMB has limitations...
    >>

    >
    >
    > Yes, the Windows SMB protocol is limited to 2GB if I remember
    > correctly...


    Aw, fudge!....


    Oh well, that's 2 days shot to hell.
    So much for interoperability.......

    Which is more stable, JFS or XFS? ~:|

    p

  10. Re: 64 bit files

    p wrote:
    > Larry Smith wrote:
    >>>My main problem may only be with the remotely mounted
    >>>(NTFS SMB) read writes. Perhaps SMB has limitations...
    >>>

    >>
    >>
    >> Yes, the Windows SMB protocol is limited to 2GB if I remember
    >> correctly...

    >
    > Awe fudge!....
    >
    >
    > Oh well, that's 2 days shot to hell.
    > So much for interoperability.......
    >
    > Which is more stable jfs or xfs? ~:|
    >
    > p

    What about CIFS? Since it's more recent than SMB, I'd assume it
    supports files > 2G.

    Jerry

  11. Re: 64 bit files

    Jerry Peters wrote:

    > p wrote:
    >> Larry Smith wrote:
    >>>>My main problem may only be with the remotely mounted
    >>>>(NTFS SMB) read writes. Perhaps SMB has limitations...
    >>>>
    >>>
    >>>
    >>> Yes, the Windows SMB protocol is limited to 2GB if I remember
    >>> correctly...

    >>
    >> Awe fudge!....
    >>
    >>
    >> Oh well, that's 2 days shot to hell.
    >> So much for interoperability.......
    >>
    >> Which is more stable jfs or xfs? ~:|
    >>
    >> p

    > What about CIFS? Since it's more recent than SMB, I'd assume it
    > supports files > 2G.
    >
    > Jerry


    It is SMB by another name.
    --
    Windows isn't unstable. It's spontaneous.


  12. Re: 64 bit files

    Larry Smith wrote:
    >
    > Yes, the Windows SMB protocol is limited to 2GB if I remember
    > correctly...


    I just thought about this.
    After *successfully* copying all my 64-bit-size files of 30 Gig or more
    over an SMB connection to Windows XP, this doesn't seem to be the case.

    p

  13. Re: 64 bit files


    a6c21487 wrote:

    > Larry Smith wrote:


    > > Yes, the Windows SMB protocol is limited to 2GB if I remember
    > > correctly...


    > I just thought about this.
    > After *successfully* copying all my 64-bit-size files of 30 Gig or more
    > over an SMB connection to Windows XP, this doesn't seem to be the case.


    The protocol supports operations with both 32-bit and 64-bit offsets. I
    have seen Windows XP issue 32-bit offset operations in some cases, so
    some things might not work on very large files. For example, Windows
    seems to access the 'desktop.ini' file with 32-bit offset operations,
    though a 'desktop.ini' file over 2GB seems unlikely.

    The main general-purpose code seems to have all migrated to 64-bit
    offsets. Some specialized 'internal' operations may still issue 32-bit
    offset requests. Understanding why Windows issues the SMB operations it
    does when it does is a frustrating affair. It seems to randomly issue
    redundant and strange queries.

    For example, doing a delete of a large group of files, some will be
    deleted with a 'delete' operation and some will be deleted with the
    following operations: open and lock, get file information, set delete
    on close flag, close, check if it's still there (since the 'close'
    operation doesn't really indicate the success of the delete!). The more
    complex mechanism does allow Windows to know exactly what it's
    deleting, but why some one way and some the other with no apparent
    pattern?

    DS

