endian of within a tcp/ip byte tramission - TCP-IP
-
Re: endian of within a tcp/ip byte tramission
David Schwartz wrote:
> Albert Manfredi wrote:
>
> > Similarly, whenever a multi-octet field represents a numeric quantity
> > the left most bit of the whole field is the most significant bit. When
> > a multi-octet quantity is transmitted the most significant octet is
> > transmitted first.
>
> In the TCP payload, there are no multi-octet fields. Please show me
> where the TCP standard defines any multi-octet fields in the payload.
>
> You are completely misreading this, and your misreading leads to some
> very bizarre consequences.
Show me how I'm misreading RFC 791. I won't quote it again. What RFC
791 describes as "data" is the payload of the IP datagram. That
payload, not just the headers of the IP datagram, is described as
being arranged and transmitted in the same order in which it is
written on a page. And to make this absolutely clear, RFC 791 also
gives an example of a multi-byte numerical quantity. In these cases,
bytes are transmitted in big endian order. Very unambiguous.
The payload of the IP datagram can be a TCP segment, all of it, or a
UDP datagram, all of it. To go by the letter of the law (holy writ),
any multi-byte numerical quantity in the header and data of the IP
payload has to be big endian. Surely, you won't argue that there are
multi-byte numerical values transmitted in TCP and UDP headers, right?
Examples are message length, sequence number, checksum. And certainly,
because the TCP and UDP payloads are transmitted in byte sequence,
including the numerical data content, the payload can be any odd number
of bytes you choose. If the payload were instead specified to be little
endian byte order, you'd have to specify the word length and you'd have
to transmit a number of whole words (some multiple of bytes).
> Any standard you are using on top of TCP will define things at the
> byte-level. So there will never be any multi-octet fields in the TCP
> payload, as far as the TCP standard is concerned.
But again, RFC 791 talks about the *IP* payload. Which, to be
completely kosher, consists of the headers and the data parts of the
TCP/UDP packets.
> When you see, for example, BER explaining how to encode a multi-byte
> integer as a series of octets, that means that any protocol that layers
> BER on top of TCP (as, for example, SNMP does) *only* presents octets
> to TCP.
But specifying BER is no different. Is it not ASN.1? As far as I can
tell, this follows the same rules to the letter. Even ASCII sequences
are sent in that same left-to-right order. And of course the
application has to know where a payload starts and ends. That has
nothing to do with this discussion, as far as I can tell.
I'll agree that nothing would break if some genius designed a TCP
application that transmits binary numbers in 64-bit words, little
endian byte order. But that would be really dumb, and besides, it is
not consistent with the holy writ.
Bert
-
Re: endian of within a tcp/ip byte tramission
Barry Margolin wrote:
> FTP doesn't have any headers, and its commands and replies are all ASCII
> text. So the concept of "endianness" doesn't really apply. The only
> thing you might be referring to is the syntax of the PORT command, which
> breaks the address and port number into octets, which are then sent as
> comma-separated decimal numbers in ASCII.
Yes, to me that is basically big endian, as applied to ASCII. Most
significant characters first. And in fact, you'll note that the wording
says "high order 8 bits" when it describes the transmission sequence.
Bert
-
Re: endian of within a tcp/ip byte tramission
Albert Manfredi wrote:
> But specifying BER is no different. Is it not ASN.1? As far as I can
> tell, this follows the same rules to the letter. Even ASCII sequences
> are sent in that same left-to-right order. And of course the
> application has to know where a payload starts and ends. That has
> nothing to do with this discussion, as far as I can tell.
You are missing the point. If you are layering BER on top of TCP, for
example as you would to implement SNMP, then all you are passing to
TCP is octets, even if SNMP specifies you need to pass a 64-bit
integer. The BER protocol converts logical items such as long integers
into a stream of octets.
The same is true for any protocol layered on top of TCP. It defines its
output as a stream of octets because TCP so defines its input.
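A minimal sketch of that conversion, assuming Python's standard struct module. The function name encode_uint64 is hypothetical, and this is a fixed-width stand-in rather than real BER (which adds a tag and a length and uses variable-length content octets). It only illustrates that the byte order is chosen by the encoding layer, and that the result handed to TCP is plain octets either way:

```python
import struct

def encode_uint64(value: int, big_endian: bool = True) -> bytes:
    """Turn a logical 64-bit unsigned integer into eight octets.

    The byte order is a choice made by the (hypothetical) application-layer
    encoding, not by TCP.
    """
    fmt = ">Q" if big_endian else "<Q"
    return struct.pack(fmt, value)

value = 0x0102030405060708
as_big = encode_uint64(value, big_endian=True)
as_little = encode_uint64(value, big_endian=False)

# Either result is just eight octets as far as a socket's send() is concerned.
print(as_big.hex())     # 0102030405060708
print(as_little.hex())  # 0807060504030201
```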
DS
-
Re: endian of within a tcp/ip byte tramission
Barry Margolin wrote:
> FTP doesn't have any headers, and its commands and replies are all ASCII
> text. So the concept of "endianness" doesn't really apply. The only
> thing you might be referring to is the syntax of the PORT command, which
> breaks the address and port number into octets, which are then sent as
> comma-separated decimal numbers in ASCII.
If the TCP specification required that integers be sent in network byte
order, then FTP would be violating the specification because it does
not send port numbers that way.
Because we agree that the idea that FTP violates the TCP specification
because it sends numbers in ASCII is ridiculous, it follows that the
premise that the TCP specification requires integers to be sent that
way is also ridiculous.
A premise that leads inexorably to an absurd conclusion is itself
absurd.
TCP only cares that you provide it a stream of octets. In fact, that is
the *only* thing you can provide to TCP. You cannot provide multi-byte
integers to TCP because TCP provides no way to tell where those
integers begin and end in the payload.
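To make the FTP case concrete, here is a small sketch of how a PORT argument might be built. The helper name format_port_argument and the example address and port are assumptions for illustration. The port is split into its high-order and low-order 8 bits, and each is written as decimal ASCII, so what TCP carries is text, not a two-octet binary integer:

```python
import struct

def format_port_argument(ipv4: str, port: int) -> str:
    """Build the h1,h2,h3,h4,p1,p2 argument of an FTP PORT command."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    host_octets = ipv4.split(".")
    high, low = divmod(port, 256)  # high-order 8 bits first, then low-order
    return ",".join(host_octets + [str(high), str(low)])

arg = format_port_argument("192.168.1.2", 5001)
print(arg)                      # 192,168,1,2,19,137
print(arg.encode("ascii"))      # the octets actually handed to TCP
print(struct.pack("!H", 5001))  # b'\x13\x89': the two-octet binary form
```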
DS
-
Re: endian of within a tcp/ip byte tramission
In article <1157591548.088213.127880@i42g2000cwa.googlegroups.com>, "Albert Manfredi" writes:
> David Schwartz wrote:
>> Albert Manfredi wrote:
>>
>> > Similarly, whenever a multi-octet field represents a numeric quantity
>> > the left most bit of the whole field is the most significant bit. When
>> > a multi-octet quantity is transmitted the most significant octet is
>> > transmitted first.
>>
>> In the TCP payload, there are no multi-octet fields. Please show me
>> where the TCP standard defines any multi-octet fields in the payload.
>>
>> You are completely misreading this, and your misreading leads to some
>> very bizarre consequences.
>
> Show me how I'm misreading RFC 791. I won't quote it again.
"Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit. When
a multi-octet quantity is transmitted the most significant octet is
transmitted first."
In the context of RFC 791 one might successfully argue that the IP
payload is a "multi-octet field". But one cannot successfully argue
that it represents a numeric quantity in the context of RFC 791.
Fields defined within higher layer protocols aren't subject matter
for RFC 791. This paragraph simply fails to apply to them.
RFC 793 does not specify byte transmission order for the TCP header
anywhere that I can see. No biggie -- it's obvious to everyone that
the TCP header diagrams should be interpreted in the same way that
RFC 791 mandates that the IP header diagrams be interpreted.
But as far as I can see, a compliant implementation of TCP could use
little endian sequence numbers. It wouldn't interoperate with anyone
else's TCP. And it'd be pretty stupid. But it would be technically
compliant.
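A rough sketch of why such an implementation would not interoperate, assuming Python's struct module and an arbitrary example sequence number. The little-endian TCP is hypothetical; the conventional peer decodes the 32-bit field as big-endian and sees a different number:

```python
import struct

seq = 1  # example sequence number (arbitrary)

# What a conventional TCP puts in the 32-bit sequence number field.
wire_big = struct.pack(">I", seq)
# What the hypothetical little-endian TCP would put there.
wire_little = struct.pack("<I", seq)

# A conventional peer always decodes the field as big-endian.
seen_by_peer = struct.unpack(">I", wire_little)[0]

print(wire_big.hex(), wire_little.hex())  # 00000001 01000000
print(seen_by_peer)                       # 16777216
```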
> What RFC
> 791 describes as "data" is the payload of the IP datagram. That
> payload, not just the headers of the IP datagram, is described as
> being arranged and transmitted in the same order in which it is
> written on a page.
Um, the data in the IP payload is placed there by the higher layer
protocol or application. If that application chooses to put the
low order byte at top left, it's IP's job to transmit that low
order byte first.
> And to make this absolutely clear, RFC 791 also
> gives an example of a multi-byte numerical quantity. In these cases,
> bytes are transmitted in big endian order. Very unambiguous.
Yes. In the context of the IP header it is unambiguous.
> The payload of the IP datagram can be a TCP segment, all of it, or a
> UDP datagram, all of it. To go by the letter of the law (holy writ),
> any multi-byte numerical quantity in the header and data of the IP
> payload has to be big endian.
Header, yes.
Payload, no.
> Surely, you won't argue that there are
> multi-byte numerical values transmitted in TCP and UDP headers, right?
In the context of RFC 793, those are fields. They are subject matter
for verbiage in RFC 793 prescribing the associated transmission order.
In the context of RFC 791, those are not fields. They are part of an
opaque sequence of octets.
> Examples are message length, sequence number, checksum. And certainly,
> because the TCP and UDP payloads are transmitted in byte sequence,
> including the numerical data content, the payload can be any odd number
> of bytes you choose. If the payload were instead specified to be little
> endian byte order, you'd have to specify the word length and you'd have
> to transmit a number of whole words (some multiple of bytes).
In the context of RFC 791, the IP payload is an opaque sequence of
octets. It has no fields.
>> Any standard you are using on top of TCP will define things at the
>> byte-level. So there will never be any multi-octet fields in the TCP
>> payload, as far as the TCP standard is concerned.
>
> But again, RFC 791 talks about the *IP* payload. Which, to be
> completely kosher, consists of the headers and the data parts of the
> TCP/UDP packets.
Yes. And the notion of endian-ness for an opaque octet string is
meaningless.
> I'll agree that nothing would break if some genius designed a TCP
> application that transmits binary numbers in 64-bit words, little
> endian byte order. But that would be really dumb, and besides, it is
> not consistent with the holy writ.
There is nothing dumb about transferring a file full of little endian
unsigned 64 bit binary numbers via FTP. It is perfectly consistent with
the holy writ.
What would be dumb would be to tell all your users to convert binary
to ASCII before transmitting via FTP so that the RFC police won't
come.
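A small sketch of the same point, assuming Python's socket and struct modules. A connected socket pair stands in for the TCP connection, and the values are made up. The stream delivers the octets unchanged, and only the two endpoints need to agree that they are little-endian 64-bit integers:

```python
import socket
import struct

# Eight-byte little-endian unsigned integers, as they might sit in a file.
values = [1, 2, 2**40]
payload = b"".join(struct.pack("<Q", v) for v in values)

# A connected pair of stream sockets stands in for the TCP connection.
sender, receiver = socket.socketpair()
try:
    sender.sendall(payload)
    sender.shutdown(socket.SHUT_WR)  # signal end of stream

    received = bytearray()
    while True:
        chunk = receiver.recv(4096)
        if not chunk:
            break
        received.extend(chunk)
finally:
    sender.close()
    receiver.close()

# The receiver applies the same little-endian interpretation as the sender.
decoded = [v for (v,) in struct.iter_unpack("<Q", bytes(received))]
print(bytes(received) == payload)  # True
print(decoded == values)           # True
```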
-
Re: endian of within a tcp/ip byte tramission
"David Schwartz" wrote:
> Albert Manfredi wrote:
>
>> But specifying BER is no different. Is it not ASN.1? As far as I can
>> tell, this follows the same rules to the letter. Even ASCII sequences
>> are sent in that same left-to-right order. And of course the
>> application has to know where a payload starts and ends. That has
>> nothing to do with this discussion, as far as I can tell.
>
> You are missing the point. If you are layering BER on top of TCP, for
> example as you would to implement SNMP, then all you are passing to
> TCP is octets, even if SNMP specifies you need to pass a 64-bit
> integer. The BER protocol converts logical items such as long integers
> into a stream of octets.
>
> The same is true for any protocol layered on top of TCP. It defines
> its output as a stream of octets because TCP so defines its input.
The only point I made was that ASN.1 also specifies that bytes are
transmitted in sequence, left to right, and that numerical quantities
are transmitted with the most significant bytes first, or most
significant ASCII characters first. So there's nothing there that
contradicts RFC 791.
There must be something very basic about the meaning of "big endian" or
about the words of RFC 791 that we are not interpreting the same.
Because I find all of this very consistent and very self-explanatory.
Just parenthetically, I think the existence of specs like ASN.1 is why
extra layers between the Transport and the Application layers were
needed. Not to resurrect an old discussion, but this shows how the
7-layer OSI model applies very well indeed to the IP protocol stack.
ASN.1 is at layer 6.
Bert
-
Re: endian of within a tcp/ip byte tramission
wrote:
[From RFC 791]
> "Similarly, whenever a multi-octet field represents a numeric quantity
> the left most bit of the whole field is the most significant bit. When
> a multi-octet quantity is transmitted the most significant octet is
> transmitted first."
>
> In the context of RFC 791 one might successfully argue that the IP
> payload is a "multi-octet field". But one cannot successfully argue
> that it represents a numeric quantity in the context of RFC 791.
The quote you posted applies to both the header of IP and the "data"
portion, i.e. payload. Agreed? What it says is that *if* the byte
sequence represents a numerical quantity, *then* the sequence is big
endian. This also works with ASCII representation of numbers, since they
too are transmitted in the same sequence as written on a page. First the
ones on the left, then the ones to the right of that.
So I don't get the point you're making. RFC 791 does not state that all
bytes represent numerical quantities. But if they do represent numerical
quantities, then big endian is the transmission order.
> Fields defined within higher layer protocols aren't subject matter
> for RFC 791. This paragraph simply fails to apply to them.
How so? Any higher layer protocol encapsulated within an IP datagram is
in the "data" portion of the IP datagram, which is very much a subject
of RFC 791.
Bert
-
Re: endian of within a tcp/ip byte tramission
In article , "Albert Manfredi" writes:
> wrote:
>
> [From RFC 791]
>
>> "Similarly, whenever a multi-octet field represents a numeric quantity
>> the left most bit of the whole field is the most significant bit. When
>> a multi-octet quantity is transmitted the most significant octet is
>> transmitted first."
>>
>> In the context of RFC 791 one might successfully argue that the IP
>> payload is a "multi-octet field". But one cannot successfully argue
>> that it represents a numeric quantity in the context of RFC 791.
>
> The quote you posted applies to both the header of IP and the "data"
> portion, i.e. payload. Agreed? What it says is that *if* the byte
> sequence represents a numerical quantity, *then* the sequence is big
> endian.
It doesn't matter to me whether the quote applies to the data portion
of an IP packet or not since the data portion of an IP packet is not
a multi-octet field that represents a numeric quantity. It also does not
(for the purposes of RFC 791) contain any multi-octet fields that
represent numeric quantities.
Even if we agree that the paragraph applies to the payload, it is
vacuously upheld.
The IP payload is an opaque byte sequence. For the purposes of
RFC 791 it does not represent anything.
> This also works with ASCII representation of numbers, since they
> too are transmitted in the same sequence as written on a page. First the
> ones on the left, then the ones to the right of that.
This is a distraction, but since you are in error on this point as
well, let me correct you.
Numbers written on a page do not have a sequence. They may (if we
are consistent in the way we hold the paper) have something that
we may refer to as "left", "right", "top" and "bottom", but they
do not have a sequence.
The left-to-right, top-to-bottom ordering is a convention (and not
a universal one at that). It is not an inherent property of a
rectangular array of glyphs on a piece of paper.
Further, the numeric string "123" when interpreted as a decimal number
in the standard way does not have a sequence. It is NOT the case
that the one comes first and the three comes last. That is a mere
artifact of the way we were taught to read and write. What IS the case
is that the 1 is in the hundreds column, the 2 is in the tens column and
the 3 is in the ones column. This applies regardless of whether we
read or write "123" from right to left or from left to right.
The written convention is "most significant digit on the left", not "most
significant digit first".
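As a worked illustration, here is a tiny sketch in which each digit carries its place value explicitly. The function name decimal_value and the pair representation are hypothetical. The number comes out the same whichever digit is visited first:

```python
def decimal_value(digits_with_place):
    """Sum digit * 10**place; the order of the pairs is irrelevant."""
    return sum(digit * 10 ** place for digit, place in digits_with_place)

# "123": the 1 is in the hundreds place (10**2), the 2 in the tens, the 3 in the ones.
left_to_right = [(1, 2), (2, 1), (3, 0)]
right_to_left = list(reversed(left_to_right))

print(decimal_value(left_to_right))  # 123
print(decimal_value(right_to_left))  # 123
```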
> So I don't get the point you're making. RFC 791 does not state that all
> bytes represent numerical quantities. But if they do represent numerical
> quantities, then big endian is the transmission order.
The bytes in an IP payload do not represent anything for the purposes
of RFC 791. They are an opaque sequence of octets.
The interpretation of those octets may be suitable subject matter for
RFC 793. It is not suitable subject matter for RFC 791.
>> Fields defined within higher layer protocols aren't subject matter
>> for RFC 791. This paragraph simply fails to apply to them.
>
> How so? Any higher layer protocol encapsulated within an IP datagram is
> in the "data" portion of the IP datagram, which is very much a subject
> of RFC 791.
Only as an opaque sequence of octets.
This is such a basic feature of layered protocol design that it is
difficult to understand why you don't "get it".
What the upper layer does with the data that the lower layer transmits
is NONE OF THE LOWER LAYER'S BUSINESS!!
Just as we don't expect TCP to prescribe a structure in the stream of
bytes that it is in the business of transferring, we shouldn't
expect IP to prescribe a structure in the payload in the datagrams
that it is in the business of transferring.
If the writers of RFC 791 actually meant to say what you claim they
meant to say they wouldn't have buried the endianness convention
in appendix B. That kind of draconian requirement for upper layer
protocols would have been uppercased and re-emphasized multiple times
because it would have been both breathtakingly stupid and breathtakingly
unexpected.
Just to clarify. It's not stupid to have a default big-endian
convention in packet layout throughout a protocol family. And it's not
stupid to have a documentation convention for the presentation of
packet layouts in the documentation for that protocol family.
But it is a mistake to embed the documentation conventions in the
description of the lowest layer member of the protocol family and to
then fail to do so elsewhere. And it WOULD BE extremely stupid to
treat the packet layout documentation conventions in the lowest layer
as being a requirement that applies to all actual field contents in
all headers and all data at all layers.
Yes, I believe that RFC 793 is "mistaken" in this sense. But as I've
also said elsewhere, "no big deal -- we all understand that big-endian
was assumed".
I regard RFC 791, appendix B as a documentation convention rather than
anything of substance. It clarifies the interpretation of the tabular
format packet layout diagrams presented elsewhere in RFC 791. It has
no meaning beyond that.
That other layers assume a left to right, top to bottom, big-endian
convention for the interpretation of their packet layout diagrams
without explicitly making that clear is a problem, but not a significant
one.
-
Re: endian of within a tcp/ip byte tramission
Albert Manfredi wrote:
> The only point I made was that ASN.1 also specifies that bytes are
> transmitted in sequence, left to right, and that numerical quantities
> are transmitted with the most significant bytes first, or most
> significant ASCII characters first. So there's nothing there that
> contradicts RFC 791.
You are being very selective in your interpretation of RFC 791. For
example, RFC 791 says:
"Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit. When
a multi-octet quantity is transmitted the most significant octet is
transmitted first."
How does sending a number in ASCII follow the rule that "the left most
bit of the whole field is the most significant bit"?
If this applied to the payload, sending numbers in ASCII would violate
it. But it is totally obvious that it doesn't apply to the payload
because the payload doesn't represent a numeric quantity.
RFC 791 does not specify the payload as consisting of multi-byte
integers. It is inconceivable that it could do so without specifying
how to tell where they begin or end. So how it says you encode
multi-byte integers doesn't matter. The payload consists only of
octets, so any protocol layered on top of TCP must specify its data
payload to be passed to TCP in terms of octets. This is in fact what
they do.
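One way an upper-layer protocol might do this is sketched below. The length-prefixed framing is made up for illustration and not taken from any RFC, and the function names frame and unframe are hypothetical. The protocol itself decides how fields are delimited and which byte order its length field uses, then hands only octets to the stream. The sketch assumes the whole stream has already been received:

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 2-octet big-endian integer."""
    return struct.pack(">H", len(message)) + message

def unframe(stream: bytes) -> list:
    """Split a complete received octet stream back into messages."""
    messages = []
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">H", stream, offset)
        offset += 2
        messages.append(stream[offset:offset + length])
        offset += length
    return messages

stream = frame(b"hello") + frame(b"") + frame(b"\x00\x01\x02")
print(unframe(stream))  # [b'hello', b'', b'\x00\x01\x02']
```

The big-endian length prefix here is this sketch's own choice; a little-endian prefix would work equally well as long as both ends agree.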
DS
-
Re: endian of within a tcp/ip byte tramission
wrote:
> The written convention is "most significant digit on the left", not
> "most significant digit first".
The *written convention* may not dictate that the MS character is
transmitted first. But *RFC 791* tells us to transmit the characters
that appear on the left first.
And for binary multi-byte quantities, it ties the transmission order to
significance even more clearly.
> The bytes in an IP payload do not represent anything for the purposes
> of RFC 791. They are an opaque sequence of octets.
In principle, what you say could have been written into RFC 791. In
principle, once you have correctly decoded the "protocol" field in the
IP header, you should be able to correctly decode whatever oddball
sequence of bytes the IP payload carries.
But the wording of RFC 791 is not ambiguous in this regard either, like
it or not. And in practice, byte ordering and big endian is followed
throughout the IP stack.
> If the writers of RFC 791 actually meant to say what you claim they
> meant to say they wouldn't have buried the endianness convention
> in appendix B. That kind of draconian requirement for upper layer
> protocols would have been uppercased and re-emphasized multiple times
> because it would have been both breathtakingly stupid and
> breathtakingly unexpected.
Well, come now. If the writers had not meant what they said, they would
not have included Appendix B at all. This "buried in Appendix B" is a
specious argument.
> But it is a mistake to embed the documentation conventions in the
> description of the lowest layer member of the protocol family and to
> then fail to do so elsewhere. And it WOULD BE extremely stupid to
> treat the packet layout documentation conventions in the lowest layer
> as being a requirement that applies to all actual field contents in
> all headers and all data at all layers.
>
> Yes, I believe that RFC 793 is "mistaken" in this sense. But as I've
> also said elsewhere, "no big deal -- we all understand that big-endian
> was assumed".
Hey, whatever. I see a lot of disagreeing with RFCs, but that doesn't
give anyone the right to restate what they say. I'd agree with you when
you say "no big deal, we all assume big endian anyway." And I'd follow
that up with "and that's how it's written in RFC 791 anyway."
Bert
-
Re: endian of within a tcp/ip byte tramission
"David Schwartz" wrote:
> You are being very selective in your interpretation of RFC 791. For
> example, RFC 791 says:
>
> "Similarly, whenever a multi-octet field represents a numeric quantity
> the left most bit of the whole field is the most significant bit. When
> a multi-octet quantity is transmitted the most significant octet is
> transmitted first."
>
> How does sending a number in ASCII follow the rule that "the left most
> bit of the whole field is the most significant bit"?
I was tying your mention of ASN.1 to what RFC 791 says. In the case of
an ASCII representation, ASN.1 says to send the characters in a left to
right sequence, and RFC 791 says to transmit bytes in that same order
(left most byte transmitted first). So when transmitting an ASCII number
that was written using ASN.1, the combined effect of ASN.1 and RFC 791
is that the most significant characters are the ones transmitted first.
(Ditto when ASCII numbers are sent in FTP, DNS, DHCP, etc.)
Obviously, the bits of a number represented as ASCII characters have no
"most significant" connotation to them.
> RFC 791 does not specify the payload as consisting of multi-byte
> integers. It is inconceivable that it could do so without specifying
> how to tell where they begin or end.
RFC 791 doesn't give the boundaries of any *field* the payload might
carry, but it does say very clearly that bytes are sent in a certain
sequence. If sending a numerical quantity, the sequence is big endian.
Here's the deal that so many find hard to swallow:
*If* you accept what RFC 791 says, *then* the fact that RFC 793 and
others do not specify the byte sequence is okay. There is never any
ambiguity. Bytes are always sent in sequence as shown on the page of
whatever RFC, left to right, and any binary number shown in that RFC is
transmitted with MSbyte sent first. Simple, consistent, doesn't have to
be repeated in every RFC.
*If* you do not read what RFC 791 says, *then* you have to make
assumptions on the byte ordering of the higher level protocols.
Perhaps it was wrong to specify the byte ordering at the bottom layer
for all the upper layers. Nevertheless, that's how it is.
Bert
-
Re: endian of within a tcp/ip byte tramission
Albert Manfredi wrote:
> Obviously, the bits of a number represented as ASCII characters have no
> "most significant" connotation to them.
RFC 791 says:
"Similarly, whenever a multi-octet field represents a numeric quantity
the left most bit of the whole field is the most significant bit. When
a multi-octet quantity is transmitted the most significant octet is
transmitted first."
This doesn't say "whenever a multi-octet field represents a numeric
quantity *in* *binary*" or have any similar limitation. Clearly, FTP's
ASCII representation of a port consists of a multi-octet field
representing a numeric quantity.
The second sentence is even more troubling. Suppose I'm sending the
word "hello". This is clearly a multi-octet quantity. However, which
letter is most significant? After all, I am supposed to send that
first.
I only see two options:
1) Argue that every protocol layered on top of TCP that sends numbers
as more than one byte of ASCII is not complying with RFC 791.
2) Argue that this clause only applies to numerical fields defined and
delimited in this RFC.
I think it is completely obvious that RFC 791 specifies the format of
only the fields it defines and delimits. It is nearly impossible to
specify the format of a field without specifying how that field is
located or delimited.
DS
-
Re: endian of within a tcp/ip byte tramission
David Schwartz wrote:
> RFC 791 says:
>
> "Similarly, whenever a multi-octet field represents a numeric quantity
> the left most bit of the whole field is the most significant bit. When
> a multi-octet quantity is transmitted the most significant octet is
> transmitted first."
>
> This doesn't say "whenever a multi-octet field represents a numeric
> quantity *in* *binary*" or have any similar limitation. Clearly, FTP's
> ASCII representation of a port consists of a multi-octet field
> representing a numeric quantity.
It's clear it means "in binary" in this paragraph, because they already
went to great lengths, including with a picture, up at the top of the
appendix, to say:
"Whenever a diagram shows a group of octets, the order of transmission
of those octets is the normal order in which they are read in English."
So this covers your ASCII numbers very well indeed. The sentence you
quote, instead, is at the end of Appendix B, where they just want to
clarify how binary numbers are to be assumed throughout the IP stack.
> The second sentence is even more troubling. Suppose I'm sending the
> word "hello". This is clearly a multi-octet quantity. However, which
> letter is most significant?
"Significant" only has meaning for numerical values. For regular words
in ASCII, "Whenever a diagram shows a group of octets, the order of
transmission of those octets is the normal order in which they are read
in English" takes care of everything already. No ambiguity at all. If
"hello" is written as I show here, in a message spec, e.g. in the data
field of some TCP segment being specified, it must be transmitted with
h first, and o last.
> I only see two options:
>
> 1) Argue that every protocol layered on top of TCP that sends numbers
> as more than one byte of ASCII is not complying with RFC 791.
Any number of ASCII characters can be sent, and isn't it convenient
that numbers written in ASCII happen to have their MSdigits shown on
the left, when written in English. This is supposed to make you feel
good, not confused.
> 2) Argue that this clause only applies to numerical fields defined and
> delimited in this RFC.
This RFC very clearly states, way up top of Appendix B, that the
ordering rules apply to header and data. And of course, the "data"
field is not delimited, except by the 16-bit length field. So there is
no demarcation line implied, where these simple rules don't apply.
> It is nearly impossible to
> specify the format of a field without specifying how that field is
> located or delimited.
RFC 791 isn't trying to specify a format for the data bytes. It is only
specifying some clear, simple, and consistent rules for byte ordering,
rules that apply to any format. This is not only possible, it is done.
When you're taught to write words and numbers in first grade, the
teacher explains that letters are arranged left to right, and numbers arranged
from left to right. I don't think the teacher is specifying entire
books of prose. Only simple, basic rules that allow you to decode a
word or a number in *any* book. Do you get all confused wondering how
the teacher can possibly specify all the books or articles you'll be
writing? I don't think so! Same applies here. General, universally
applied ordering rules for byte transmission sequence. Nothing more.
Bert
-
Re: endian of within a tcp/ip byte tramission
In article , "Albert Manfredi" writes:
> wrote:
>
>> The written convention is "most significant digit on the left", not
>> "most significant digit first".
>
> The *written convention* may not dictate that the MS character is
> transmitted first. But *RFC 791* tells us to transmit the characters
> that appear on the left first.
*RFC 791* tells us to transmit the characters that appear on the
left _IN THE PACKET LAYOUTS APPEARING IN RFC 791_ first.
> And for binary multi-byte quantities, it ties the transmission order to
> significance even more clearly.
Again, this applies only to multi-byte quantities as defined in RFC 791.
i.e. in the IP header.
>> The bytes in an IP payload do not represent anything for the purposes
>> of RFC 791. They are an opaque sequence of octets.
>
> In principle, what you say could have been written into RFC 791. In
> principle, once you have correctly decoded the "protocol" field in the
> IP header, you should be able to correctly decode whatever oddball
> sequence of bytes the IP payload carries.
No. In principle this is not possible.
Suppose I got a file yesterday. It was full of binary data. The
documentation indicates that it is 4 byte unsigned big-endian. I FTP
it from host A to host B confident that I am not violating any RFCs.
Today I look at the documentation and notice a clause I hadn't
seen before. The file is divided into variable length records with
4 byte unsigned little-endian byte counts. If I FTP the file again
today my FTP software will be non-compliant. According to you,
I can't transfer it again unless I receive a dispensation from the
RFC police first.
The state of compliance of my software should not depend on my
state of mind.
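A short sketch of the point, assuming Python's struct module and four made-up octets. The bytes on the wire are identical in both cases; only the reader's interpretation of them changes, which is why the transport cannot be held responsible for it:

```python
import struct

raw = bytes([0x00, 0x00, 0x01, 0x00])  # four octets taken from the file

# The same octets, read under two different conventions.
as_big_endian = struct.unpack(">I", raw)[0]
as_little_endian = struct.unpack("<I", raw)[0]

print(as_big_endian)     # 256
print(as_little_endian)  # 65536
```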
> But the wording of RFC 791 is not ambiguous in this regard either, like
> it or not. And in practice, byte ordering and big endian is followed
> throughout the IP stack.
Like it or not, Appendix B in RFC 791 does not have any useful meaning
as applied to the IP payload because RFC 791 does not interpret the
IP payload as a multibyte numeric field or as a composite field containing
multibyte numeric fields.
In practice, the upper level protocols were designed big-endian because
consistency makes sense, not because it was mandated by RFC 791.
According to you, if I lay an arbitrary protocol on top of IP using
GRE, that protocol is now required to have a little-endian convention
in its header fields.
>> If the writers of RFC 791 actually meant to say what you claim they
>> meant to say they wouldn't have buried the endianness convention
>> in appendix B. That kind of draconian requirement for upper layer
>> protocols would have been uppercased and re-emphasized multiple times
>> because it would have been both breathtakingly stupid and
>> breathtakingly unexpected.
>
> Well, come now. If the writers had not meant what they said, they would
> not have included Appendix B at all. This "buried in Appendix B" is a
> specious argument.
It is your reading of appendix B that is so mind-bogglingly insane
and unexpected that any such meaning would not have been buried in
an appendix.
>> But it is a mistake to embed the documentation conventions in the
>> description of the lowest layer member of the protocol family and to
>> then fail to do so elsewhere. And it WOULD BE extremely stupid to
>> treat the packet layout documentation conventions in the lowest layer
>> as being a requirement that applies to all actual field contents in
>> all headers and all data at all layers.
>>
>> Yes, I believe that RFC 793 is "mistaken" in this sense. But as I've
>> also said elsewhere, "no big deal -- we all understand that big-endian
>> was assumed".
>
> Hey, whatever. I see a lot of disagreeing with RFCs, but that doesn't
> give anyone the right to restate what they say. I'd agree with you when
> you say "no big deal, we all assume big endian anyway." And I'd follow
> that up with "and that's how it's written in RFC 791 anyway."
RFC 793 is silent on the question of endian-ness. That is an unfortunate
oversight.
RFC 791 is not silent on the question of endian-ness. However that
does not mean that RFC 791 imposes a big-endian requirement on
protocols that use IP.
I am not the one reading an improper interpretation into RFC 791.
You are. You are the one restating what it says. Incorrectly.
-
Re: endian of within a tcp/ip byte tramission
wrote:
> *RFC 791* tells us to transmit the characters that appear on the
> left _IN THE PACKET LAYOUTS APPEARING IN RFC 791_ first.
....
> Again, this applies only to multi-byte quantities as defined in RFC 791.
> i.e. in the IP header.
Sorry, no sell. The RFC makes it plain that the rules apply to header
and data. And the "data" field, i.e. the payload, covers the entirety of
whatever upper layer protocol IP is carrying. There is no demarcation in
the IP datagram beyond which RFC 791 doesn't apply.
>> In principle, what you say could have been written into RFC 791. In
>> principle, once you have correctly decoded the "protocol" field in
>> the IP header, you should be able to correctly decode whatever oddball
>> sequence of bytes the IP payload carries.
>
> No. In principle this is not possible.
>
> Suppose I got a file yesterday. It was full of binary data. The
> documentation indicates that it is 4 byte unsigned big-endian. I FTP
> it from host A to host B confident that I am not violating any RFC's.
>
> Tomorrow I look at the documentation and notice a clause I hadn't
> seen before. The file is divided into variable length records with
> 4 byte unsigned little-endian byte counts. If I FTP the file again
> today my FTP software will be non-compliant. According to you,
> I can't transfer it again unless I receive a dispensation from the
> RFC police first.
You are confused. What I said above was to explain that YOUR ideas could
have been written into the standard, but WERE NOT.
If you get a file transferred via TCP over IP, *IN PRINCIPLE* RFC 791
could have said, "the protocol field of the L4 header is used to
determine the byte ordering in the IP payload. The standard describing
that L4 protocol must be used to obtain this information."
Instead, RFC 793 relies on RFC 791, and says NOTHING about byte
ordering. If you downloaded a file containing a bunch of binary data
yesterday, however that file was partitioned, you know a priori that
each binary quantity *must* be big endian. So unless the originator was
totally clueless, you should NEVER have to worry about some fine print
explaining "The file is divided into variable length records with 4 byte
unsigned little-endian byte counts." That should NEVER HAPPEN, if you
read the RFCs.
As to use of ASCII vs binary, integer vs floating point, all those
decisions are not discussed in RFC 791 (or 793). They are left to each
L6 implementation. But the left to right sequence is determined.
> In practice, the upper level protocols were designed big-endian
> because consistency makes sense, not because it was mandated by RFC 791.
That's a nice story, but I'm afraid it's a profession of faith. The RFCs
are consistent. You may choose to ignore what they say and create a
different reality. Lucky that in this case, your reality does not
conflict with the standard.
> According to you, if I lay an arbitrary protocol on top of IP using
> GRE, that protocol is now required to have a little-endian convention
> in its header fields.
According to RFC 791, any protocol layered on top of IP is required to
have *big endian*, not little endian. I'm not saying that knuckleheads
won't violate this simple rule, of course. I happen to know for a fact
that knuckleheads have violated it. That creates a hassle *invariably*,
but usually it doesn't break anything.
> RFC 793 is silent on the question of endian-ness. That is an
> unfortunate oversight.
Yes, you said this before. If you had read RFC 791 before 793, you would
have noticed that it was not an oversight.
But once again, I do agree that things *could* have been done
differently.
Bert
-
Re: endian of within a tcp/ip byte tramission
In article , "Albert Manfredi" writes:
> wrote:
>
>> *RFC 791* tells us to transmit the characters that appear on the
>> left _IN THE PACKET LAYOUTS APPEARING IN RFC 791_ first.
> ...
>> Again, this applies only to multi-byte quantities as defined in RFC 791.
>> i.e. in the IP header.
>
> Sorry, no sell. The RFC makes it plain that the rules apply to header
> and data. And the "data" field, i.e. the payload, covers the entirety of
> whatever upper layer protocol IP is carrying. There is no demarcation in
> the IP datagram beyond which RFC 791 doesn't apply.
Two comments.
First, look at the scope section of the document. TCP is not in scope.
Second, read appendix B as if it referred to only the diagrams appearing
in RFC 791 and you'll find that it reads entirely naturally.
>>> In principle, what you say could have been written into RFC 791. In
>>> principle, once you have correctly decoded the "protocol" field in
>>> the IP header, you should be able to correctly decode whatever oddball
>>> sequence of bytes the IP payload carries.
>>
>> No. In principle this is not possible.
>>
>> Suppose I got a file yesterday. It was full of binary data. The
>> documentation indicates that it is 4 byte unsigned big-endian. I FTP
>> it from host A to host B confident that I am not violating any RFC's.
>>
>> Tomorrow I look at the documentation and notice a clause I hadn't
>> seen before. The file is divided into variable length records with
>> 4 byte unsigned little-endian byte counts. If I FTP the file again
>> today my FTP software will be non-compliant. According to you,
>> I can't transfer it again unless I receive a dispensation from the
>> RFC police first.
>
> You are confused. What I said above was to explain that YOUR ideas could
> have been written into the standard, but WERE NOT.
In principle, looking at the IP protocol is not adequate to correctly
decode whatever oddball sequence of bytes appears in the payload.
You need contextual information. You need the documentation that went
with the binary data file that got transferred at 2:00 am this morning.
That's the "in principle it's possible" that I was objecting to.
You claimed that it's possible to decode an entire data stream
just by looking at it. It's not possible. Out of band context counts.
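A rough Python sketch of why the out-of-band context matters, using the file layout from my example: variable-length records, each led by a 4-byte unsigned little-endian byte count, holding 4-byte unsigned big-endian values. The function name, the sample data and the assumption that the count covers only the bytes after it are mine, purely for illustration.

```python
import struct

def parse_records(data: bytes, count_fmt: str = "<I", value_fmt: str = ">I"):
    """Split data into records using the byte order the documentation gives."""
    records = []
    offset = 0
    while offset < len(data):
        # Read the 4-byte record length (assumed to exclude the count itself).
        (nbytes,) = struct.unpack_from(count_fmt, data, offset)
        offset += 4
        payload = data[offset:offset + nbytes]
        # Decode the record body as a run of 4-byte unsigned values.
        records.append([v for (v,) in struct.iter_unpack(value_fmt, payload)])
        offset += nbytes
    return records

# Two hypothetical records: [7] and [1, 2].
sample = (
    struct.pack("<I", 4) + struct.pack(">I", 7)
    + struct.pack("<I", 8) + struct.pack(">II", 1, 2)
)

print(parse_records(sample))                  # parsed per the documentation
print(parse_records(sample, count_fmt=">I"))  # same bytes, wrong assumption
```

Nothing in the byte stream itself says which of the two parses is the right one. Only the documentation that came with the file does.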
> If you get a file transferred via TCP over IP, *IN PRINCIPLE* RFC 791
> could have said, "the protocol field of the L4 header is used to
> determine the byte ordering in the IP payload. The standard describing
> that L4 protocol must be used to obtain this information."
In practice, everyone knows that the L4 protocol documentation is the
proper place to look for endianness requirements for the L4 packet headers
regardless of what any lower layer protocols may declare.
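For illustration only, here is how an L4 header is typically built in practice, sketched in Python for the four 16-bit UDP header fields (source port, destination port, length, checksum). The function name and the port numbers are hypothetical, and the big-endian layout is the convention everyone follows for these fields, not something this sketch proves from any RFC text.

```python
import struct

def pack_udp_header(src_port: int, dst_port: int, length: int, checksum: int) -> bytes:
    """Pack the four 16-bit UDP header fields, most significant octet first."""
    # "!" selects network (big-endian) byte order with no padding.
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)

# A header for an empty datagram: length 8 covers the header alone.
header = pack_udp_header(53, 1025, 8, 0)
print(header.hex())
```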
On the other hand, in practice, the engineers working on the protocol
layout are seven layers down in the weeds and completely forget that
the endian-ness that they take for granted is not the only possibility,
so the documents that they generate can turn out to be ambiguous as a
result. Somebody apparently noticed that with respect to RFC 791.
> Instead, RFC 793 relies on RFC 791, and says NOTHING about byte
> ordering. If you downloaded a file containing a bunch of binary data
> yesterday, however that file was partitioned, you know a priori that
> each binary quantity *must* be big endian.
ROFLOL
Well, that remark makes it clear that this conversation is a waste of
everyone's time.
People exchange little-endian data using FTP every day regardless of
guarantees you think RFC 791 makes or what requirements you think
it imposes in that regard.
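A small Python sketch of that everyday case. It uses socket.socketpair() as a local stand-in for a stream connection (on many systems that is a Unix-domain pair, not TCP, so treat it as an assumption that real TCP behaves the same for this purpose); the helper name and the value 258 are made up.

```python
import socket
import struct

def send_through_stream(payload: bytes) -> bytes:
    """Write payload into one end of a stream socket, read it from the other."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        sender.close()
        receiver.close()

# Little-endian application data goes in...
little_endian_data = struct.pack("<I", 258)
received = send_through_stream(little_endian_data)

# ...and the same octets come out, in the same order.
print(received == little_endian_data)
```

The stream neither knows nor cares that the four octets encode a little-endian integer.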
> So unless the originator was
> totally clueless, you should NEVER have to worry about some fine print
> explaining "The file is divided into variable length records with 4 byte
> unsigned little-endian byte counts." That should NEVER HAPPEN, if you
> read the RFCs.
Oh yeah. The fellow defining the file layout for an application on
a system that doesn't even have a TCP stack is responsible for reading
RFC 791 and using big-endian field layout on the off chance that
somebody will someday copy the files to a system that does and
some other yahoo will use binary mode FTP in the naive belief
that a byte stream is a byte stream, unaware that the RFC police
will come and get him if that byte stream contains little-endian data.
> As to use of ASCII vs binary, integer vs floating point, all those
> decisions are not discussed in RFC 791 (or 793). They are left to each
> L6 implementation. But the left to right sequence is determined.
No. It was not.
>> In practice, the upper level protocols were designed big-endian
>> because consistency makes sense, not because it was mandated by RFC 791.
>
> That's a nice story, but I'm afraid it's a profession of faith. The RFCs
> are consistent. You may choose to ignore what they say and create a
> different reality. Lucky that in this case, your reality does not
> conflict with the standard.
You are the one reading into the RFC's that which is not there. You
are the one working on faith. You are the one without a clue about
how protocols are designed and how they interoperate.
You are the one claiming that binary mode FTP is a violation of RFC 791
when used on little-endian data files. I doubt you'll find many people
to agree with you on that point.
>> According to you, if I lay an arbitrary protocol on top of IP using
>> GRE, that protocol is now required to have a little-endian convention
>> in its header fields.
>
> According to RFC 791, any protocol layered on top of IP is required to
> have *big endian*, not little endian.
Indeed. I meant to write "big endian".
> I'm not saying that knuckleheads
> won't violate this simple rule, of course.
I'm saying that only knuckleheads would make up such a stupid rule.
And the folks who wrote RFC 791 were not knuckleheads. Unlike you.
>> RFC 793 is silent on the question of endian-ness. That is an
>> unfortunate oversight.
>
> Yes, you said this before. If you had read RFC 791 before 793, you would
> have noticed that it was not an oversight.
I've read them both. RFC 793 is still silent on the question. And
it does not defer to RFC 791 on the question. And RFC 791 does not
assert authority on the question.
But since you are not a rational person, I'll leave it at that.
-
Re: endian of within a tcp/ip byte tramission
wrote:
>> I'm not saying that knuckleheads
>> won't violate this simple rule, of course.
>
> I'm saying that only knuckleheads would make up such a stupid rule.
> And the folks who wrote RFC 791 were not knuckleheads. Unlike you.
Funny contradiction, since they wrote the rule exactly as I said. Funny
how you feel free to flame away, in spite of specific quotes pointed out
to you. I wonder if you think it makes you sound intelligent.
But I did check RFC 2460, to see if it too specified the byte order to
be used in IPv6, and it does not. Not for headers, not for payload.
Never mentioned anywhere, not even in the address architecture document
for IPv6 (RFC 4291).
So I asked the IPv6 working group to see if this was some sort of
oversight, and the response was only that big endian byte order is
always assumed in Internet Protocols unless specifically mentioned
otherwise. Furthermore, this Wikipedia quote was suggested by one
working group member as guidance:
http://en.wikipedia.org/wiki/Network_byte_order
"The Internet Protocol defines a standard "big-endian" network byte
order. This byte order is used for all numeric values in the packet
headers and by many higher level protocols and file formats that are
designed for use over IP."
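As a rough illustration of the "network byte order" convention in that quote, here is a Python sketch that converts a host integer to and from its on-the-wire form. The helper names and the value 258 are assumptions chosen for the example; struct's "!" prefix and socket.htonl()/ntohl() are the standard library's names for this convention.

```python
import socket
import struct
import sys

def to_network_order(value: int) -> bytes:
    """Encode a 32-bit unsigned value most significant octet first."""
    return struct.pack("!I", value)

def from_network_order(octets: bytes) -> int:
    """Decode 4 octets received in network byte order."""
    return struct.unpack("!I", octets)[0]

def htonl_round_trip(value: int) -> int:
    """Convert host order to network order and back with the socket helpers."""
    return socket.ntohl(socket.htonl(value))

wire = to_network_order(258)
# The wire form is the same on every host; sys.byteorder is not.
print(sys.byteorder, wire.hex())
```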
So, it looks like IPv6 leaves some room for knuckleheads.
Bert