Gratuitous CRLF in Subject: line - why is it there, and is it legitimate?
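I have a problem with a NAGIOS system sending emails to a popular email-to-SMS service. The service takes emails with the text in the `Subject:` line and sends them to the mobile number encoded in the `To:` field. So far, so good.

Sadly, sendmail (and postfix before it) seems to insert a gratuitous CRLF into the (necessarily long) `Subject:` line, and this causes my SMS message to be truncated at the CRLF if and only if the `Subject:` line contains one or more colons after the gratuitous CRLF.

I believe the messages are being created correctly, but to be sure, here is me creating a completely noddy test message for myself, with a long `Subject:` line:

`echo "foo" | mail -s "1234567 101234567 201234567 301234567 401234567 501234567 601234567 701234567 801234567 90123456789" reaper@teaparty.net`

Note that there are no extra colons in this `Subject:` line; all I am doing here is showing that an extra CRLF is inserted on the wire. Here is the result of `sudo ngrep -x port 25`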
:`44 61 74 65 3a 20 46 72 69 2c 20 33 31 20 4d 61 Date: Fri, 31 Ma
79 20 32 30 31 33 20 31 30 3a 34 33 3a 35 35 20 y 2013 10:43:55
2b 30 31 30 30 0d 0a 54 6f 3a 20 72 65 61 70 65 +0100..To: reape
72 40 74 65 61 70 61 72 74 79 2e 6e 65 74 0d 0a r@teaparty.net..
53 75 62 6a 65 63 74 3a 20 31 32 33 34 35 36 37 Subject: 1234567
20 31 30 31 32 33 34 35 36 37 20 32 30 31 32 33 101234567 20123
34 35 36 37 20 33 30 31 32 33 34 35 36 37 20 34 4567 301234567 4
30 31 32 33 34 35 36 37 20 35 30 31 32 33 34 35 01234567 5012345
36 37 0d 0a 20 36 30 31 32 33 34 35 36 37 20 37 67***..*** 601234567 7
30 31 32 33 34 35 36 37 20 38 30 31 32 33 34 35 01234567 8012345
36 37 20 39 30 31 32 33 34 35 36 37 38 39 0d 0a 67 90123456789..
55 73 65 72 2d 41 67 65 6e 74 3a 20 48 65 69 72 User-Agent: Heir
6c 6f 6f 6d 20 6d 61 69 6c 78 20 31 32 2e 34 20 loom mailx 12.4
37 2f 32 39 2f 30 38 0d 0a 4d 49 4d 45 2d 56 65 7/29/08..MIME-Ve
72 73 69 6f 6e 3a 20 31 2e 30 0d 0a 43 6f 6e 74 rsion: 1.0..Cont
65 6e 74 2d 54 79 70 65 3a 20 74 65 78 74 2f 70 ent-Type: text/p
6c 61 69 6e 3b 20 63 68 61 72 73 65 74 3d 75 73 lain; charset=us`
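About halfway down (marked in bold+italic), between the `501234567` and the `601234567` of the original `Subject:` header, you can see that a CRLF has been inserted (`0x0d 0x0a` in the hex dump on the left, `..` in the plain text on the right).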
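To double-check where the fold sits, the bytes of the `Subject:` header can be decoded straight from the dump. This is only a minimal Python sketch: the hex is copied from the ngrep output above (without the emphasis markers), and the variable names are purely illustrative.

```python
# Hex bytes of the Subject: header, copied from the ngrep dump above.
HEX = """
53 75 62 6a 65 63 74 3a 20 31 32 33 34 35 36 37
20 31 30 31 32 33 34 35 36 37 20 32 30 31 32 33
34 35 36 37 20 33 30 31 32 33 34 35 36 37 20 34
30 31 32 33 34 35 36 37 20 35 30 31 32 33 34 35
36 37 0d 0a 20 36 30 31 32 33 34 35 36 37 20 37
30 31 32 33 34 35 36 37 20 38 30 31 32 33 34 35
36 37 20 39 30 31 32 33 34 35 36 37 38 39 0d 0a
"""

# Strip all whitespace first so this works on any Python 3 version.
raw = bytes.fromhex("".join(HEX.split()))

# Offset of the first CRLF inside the header.
fold_at = raw.find(b"\r\n")

print("first CRLF at byte offset:", fold_at)
print("byte after the CRLF:", raw[fold_at + 2:fold_at + 3])
print("physical lines on the wire:")
for line in raw.split(b"\r\n"):
    if line:
        print("  ", line)
```

The byte that follows the first CRLF is a space, which is what makes the second physical line a continuation of the `Subject:` header rather than a new header.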
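The receiving MTA seems quite happy to post-process this: when I look at the mail as stored on disk at the receiving end, I see only a single LF (0x0a) in the `Subject:` line, and the line is parsed correctly and displayed in its entirety by, for example, `alpine`.

Nevertheless, the CRLF is there on the wire, and between me and the (excellent) support people at the email-to-SMS service, we have established that it is the cause of the problem. So my question is: is it legitimate for an MTA to insert gratuitous CRLFs on the wire?

If it is, and I can prove it, then it is the email-to-SMS company's problem for being intolerant of them. If it is not, or it is but I cannot prove it, then it is my problem, so answers with references will be the most useful.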
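Nothing in this question shows how the service actually parses headers, so the following Python sketch is purely HYPOTHETICAL. It shows one way a parser could produce exactly this symptom (truncation only when a colon follows the fold): it decides whether a line starts a new header by looking for a colon instead of checking for leading whitespace. The function name and sample subjects are invented for illustration.

```python
def naive_parse(raw: str) -> dict:
    """HYPOTHETICAL buggy parser, not the service's real code.

    It treats any line containing a colon as a new header, even when the
    line starts with whitespace and is really a folded continuation.
    """
    headers = {}
    current = None
    for line in raw.split("\r\n"):
        if not line:
            break  # a blank line ends the header block
        if ":" in line:
            name, _, value = line.partition(":")
            current = name.strip()
            headers[current] = value.strip()
        elif current is not None:
            # Lines without a colon are appended to the previous header.
            headers[current] += " " + line.strip()
    return headers


no_colon = "Subject: one two\r\n three four\r\n\r\n"
with_colon = "Subject: one two\r\n three: four\r\n\r\n"

print(naive_parse(no_colon)["Subject"])    # one two three four
print(naive_parse(with_colon)["Subject"])  # one two
```

In the second case the continuation line is mistaken for a separate header named `three`, so the `Subject:` value stops at the fold.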
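Edit: I am now at liberty to say that the email-to-SMS service in question was kapow. Once the problem was explained to them, they understood it, worked with me to develop and test a fix, and deployed it. My long subject lines with colons in them are now relayed correctly into SMSes. I do not usually plug individual companies, particularly on SF, but I think it is worth noting that kapow did the right thing. (Disclaimer: I have no connection with kapow except as a paying customer who is happy with the way they handled the problem.)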
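Well, if I understand RFC 822 correctly, they are legitimate in certain situations; I assume this is a product of the era of small screens with 24x80 resolution.

These sections seem fairly clear that the Subject can be folded, and that a fold is a CRLF plus an LWSP (linear white space) character. They may have been superseded; Wietse (on the postfix list) knows his RFCs inside out, if you want a definitive answer.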
3.1.1. LONG HEADER FIELDS

Each header field can be viewed as a single, logical line of ASCII characters, comprising a field-name and a field-body. For convenience, the field-body portion of this conceptual entity can be split into a multiple-line representation; this is called "folding". The general rule is that wherever there may be linear-white-space (NOT simply LWSP-chars), a CRLF immediately followed by AT LEAST one LWSP-char may instead be inserted. Thus, the single line

    To: "Joe & J. Harvey" <ddd @Org>, JJV @ BBN

can be represented as:

    To: "Joe & J. Harvey" <ddd @ Org>,
            JJV@BBN

and

    To: "Joe & J. Harvey"
                    <ddd@ Org>, JJV
     @BBN

and

    To: "Joe
     & J. Harvey" <ddd @ Org>, JJV @ BBN

The process of moving from this folded multiple-line representation of a header field to its single line representation is called "unfolding". Unfolding is accomplished by regarding CRLF immediately followed by a LWSP-char as equivalent to the LWSP-char.

Note: While the standard permits folding wherever linear-white-space is permitted, it is recommended that structured fields, such as those containing addresses, limit folding to higher-level syntactic breaks. For address fields, it is recommended that such folding occur between addresses, after the separating comma.

3.1.2. STRUCTURE OF HEADER FIELDS

Once a field has been unfolded, it may be viewed as being composed of a field-name followed by a colon (":"), followed by a field-body, and terminated by a carriage-return/line-feed. The field-name must be composed of printable ASCII characters (i.e., characters that have values between 33. and 126., decimal, except colon). The field-body may be composed of any ASCII characters, except CR or LF. (While CR and/or LF may be present in the actual text, they are removed by the action of unfolding the field.)

Certain field-bodies of headers may be interpreted according to an internal syntax that some systems may wish to parse. These fields are called "structured fields". Examples include fields containing dates and addresses. Other fields, such as "Subject" and "Comments", are regarded simply as strings of text.

Note: Any field which has a field-body that is defined as other than simply <text> is to be treated as a structured field. Field-names, unstructured field bodies and structured field bodies each are scanned by their own, independent "lexical" analyzers.

3.1.3. UNSTRUCTURED FIELD BODIES

For some fields, such as "Subject" and "Comments", no structuring is assumed, and they are treated simply as <text>s, as in the message body. Rules of folding apply to these fields, so that such field bodies which occupy several lines must therefore have the second and successive lines indented by at least one LWSP-char.
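The unfolding rule quoted above (a CRLF immediately followed by a LWSP-char is equivalent to that LWSP-char) can be sketched in a few lines of Python. This is an illustrative sketch rather than a full header parser; the `unfold` name is an assumption, and LWSP-char is taken to mean space or horizontal tab.

```python
import re

# A CRLF that is immediately followed by a LWSP-char (space or tab).
# The lookahead keeps the whitespace character itself in place.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(header: str) -> str:
    """Drop every CRLF that introduces a continuation line."""
    return _FOLD.sub("", header)


folded = (
    "Subject: 1234567 101234567 201234567 301234567 401234567 501234567\r\n"
    " 601234567 701234567 801234567 90123456789\r\n"
)

print(repr(unfold(folded)))
```

The CRLF that terminates the header is left alone, because it is not followed by whitespace; only the fold in the middle disappears.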
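Edit by questioner: I hope NickW will forgive me for adding a note that RFC 822 has been obsoleted by RFC 2822, but the new RFC says much the same thing in its section 2.2.3, and explicitly confirms that such folding should be removed before any further processing is done: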
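Each header field is logically a single line of characters comprising the field name, the colon, and the field body. For convenience however, and to deal with the 998/78 character limitations per line, the field body portion of a header field can be split into a multiple line representation; this is called "folding". The general rule is that wherever this standard allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP. For example, the header field:

    Subject: This is a test

can be represented as:

    Subject: This
     is a test

Note: Though structured field bodies are defined in such a way that folding can take place between many of the lexical tokens (and even within some of the lexical tokens), folding SHOULD be limited to placing the CRLF at higher-level syntactic breaks. For instance, if a field body is defined as comma-separated values, it is recommended that folding occur after the comma separating the structured items in preference to other places where the field could be folded, even if it is allowed elsewhere.

The process of moving from this folded multiple-line representation of a header field to its single line representation is called "unfolding". Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP. Each header field should be treated in its unfolded form for further syntactic and semantic evaluation.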
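For the sending side, here is a rough Python sketch of folding an unstructured header such as `Subject:` so that no physical line exceeds the 78-character limit mentioned in the quote. It is an illustration under simple assumptions (fold only at spaces, give up if there is no space to fold at); it is not how sendmail, postfix or mailx actually choose their fold points, and the `fold` name is hypothetical.

```python
def fold(header: str, limit: int = 78) -> str:
    """Insert CRLF before a space so each physical line fits in `limit`."""
    lines = []
    rest = header
    while len(rest) > limit:
        # Last space at or before the limit; index 0 is skipped so that a
        # continuation line's own leading space is never chosen again.
        cut = rest.rfind(" ", 1, limit + 1)
        if cut == -1:
            break  # no space to fold at; leave the long line as it is
        lines.append(rest[:cut])
        rest = rest[cut:]  # keep the space: it marks the continuation line
    lines.append(rest)
    return "\r\n".join(lines)


subject = (
    "Subject: 1234567 101234567 201234567 301234567 401234567 "
    "501234567 601234567 701234567 801234567 90123456789"
)

folded = fold(subject)
for line in folded.split("\r\n"):
    print(len(line), repr(line))
```

Because the CRLF is inserted before an existing space and nothing else is changed, removing the inserted CRLFs gives back the original header, which is exactly what the unfolding rule requires of a receiver.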
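This is not meant to take anything away from NickW, who pointed out exactly what I needed to know; it is only to help keep this answer relevant for anyone who may stumble across it in the future.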