2012-04-27 72 views
2

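I have a Ruby on Rails application that acts as the server for Java and .NET client applications. I use a custom header to send some data, but when that data reaches the Rails application, Rails reads the value as UTF-8 and then complains that it is not a valid UTF-8 string. What text encoding is used for HTTP request header values?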

For example, if I send JÜRGENELITE-HP I get:

#<ActiveRecord::StatementInvalid: PGError: ERROR: invalid byte sequence for encoding "UTF8": 0xdc52 
: SELECT * FROM "replicas" WHERE ("replicas"."identification" = 'J?RGENELITE-HP') AND ("replicas".user_id = 121) LIMIT 1> 

The Java HTTP client library clearly prints the data correctly in the console:

DEBUG [main] (DefaultClientConnection.java:268) - >> POST /ze/api/files.json HTTP/1.1 
DEBUG [main] (DefaultClientConnection.java:271) - >> X-Replica: JÜRGENELITE-HP 
DEBUG [main] (DefaultClientConnection.java:271) - >> Authorization: Basic bWxpbmhhcmVzOjEyMzQ1Njc4 

DEBUG [main] (DefaultClientConnection.java:271) - >> Content-Length: 0 
DEBUG [main] (DefaultClientConnection.java:271) - >> Host: localhost:3000 
DEBUG [main] (DefaultClientConnection.java:271) - >> Connection: Keep-Alive 
DEBUG [main] (DefaultClientConnection.java:271) - >> User-Agent: Apache-HttpClient/4.1.2 (java 1.5) 

But it breaks when it reaches Rails. What encoding does HTTP use for header values?

+1

You can use [the solution here](http://stackoverflow.com/questions/1361604/how-to-encode-utf8-filename-for-http-headers-python-django) to do what you are trying to do. – 2012-04-27 19:43:15

+1

[There are also some good answers here.](http://stackoverflow.com/questions/324470/http-headers-encoding-decoding-in-java) – 2012-04-27 20:11:17

Answers

2

US-ASCII

If you look at RFC 2616, section 2.2:

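2.2 Basic Rules

The following rules are used throughout this specification to
describe basic parsing constructs. The US-ASCII coded character set
is defined by ANSI X3.4-1986 [21].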

    OCTET          = <any 8-bit sequence of data>
    CHAR           = <any US-ASCII character (octets 0 - 127)>
    UPALPHA        = <any US-ASCII uppercase letter "A".."Z">
    LOALPHA        = <any US-ASCII lowercase letter "a".."z">
    ALPHA          = UPALPHA | LOALPHA
    DIGIT          = <any US-ASCII digit "0".."9">
    CTL            = <any US-ASCII control character
                     (octets 0 - 31) and DEL (127)>
    CR             = <US-ASCII CR, carriage return (13)>
    LF             = <US-ASCII LF, linefeed (10)>
    SP             = <US-ASCII SP, space (32)>
    HT             = <US-ASCII HT, horizontal-tab (9)>
    <">            = <US-ASCII double-quote mark (34)>

The rest of that section has more specific information about headers and other elements of the protocol.
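
To relate the CHAR rule to the value from the question, here is a minimal Ruby sketch. It assumes the header arrived as the bytes hinted at by the PostgreSQL error (0xDC followed by `R`), which is how ISO-8859-1 encodes "ÜR". The helper names `non_char_octets` and `latin1_to_utf8` are made up for this illustration and are not part of Rails.

```ruby
# Octets outside CHAR (0..127) as defined in RFC 2616 section 2.2.
def non_char_octets(value)
  value.bytes.reject { |b| b <= 127 }
end

# Reinterpret raw header bytes as ISO-8859-1 and transcode them to UTF-8.
# Assumption: the client really sent ISO-8859-1; the header itself does not say so.
def latin1_to_utf8(value)
  value.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Hypothetical reconstruction of the bytes Rails received.
raw = "J\xDCRGENELITE-HP".b

puts non_char_octets(raw).map { |b| format("0x%02X", b) }.join(", ")  # 0xDC
puts raw.dup.force_encoding(Encoding::UTF_8).valid_encoding?           # false
puts latin1_to_utf8(raw)                                               # JÜRGENELITE-HP
```

The lone 0xDC octet is why the string fails UTF-8 validation. Transcoding only helps if every client is known to send ISO-8859-1, so treat it as a diagnostic rather than a general fix.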

You have to jump around the spec to find all the right BNF definitions. Section 4.2 contains the definition of headers, though:

    message-header = field-name ":" [ field-value ]
    field-name     = token
    field-value    = *( field-content | LWS )
    field-content  = <the OCTETs making up the field-value
                     and consisting of either *TEXT or combinations
                     of token, separators, and quoted-string>

TEXT is defined back in section 2.2:

    TEXT           = <any OCTET except CTLs,
                     but including LWS>
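
Read literally, the TEXT rule is a statement about octets, not characters. The Ruby sketch below expresses it as a predicate. It is a simplification: `ctl?` and `text_octets?` are hypothetical helper names, and of the LWS characters only SP and HT are let through (folded CRLF sequences are not handled).

```ruby
# CTL: octets 0..31 and DEL (127), per the rule quoted from section 2.2.
def ctl?(octet)
  octet <= 31 || octet == 127
end

# Simplified TEXT check: any octet except CTLs. HT (9) is a CTL but is
# allowed because it can appear in LWS; SP (32) is not a CTL to begin with.
def text_octets?(value)
  value.bytes.all? { |b| !ctl?(b) || b == 9 }
end

puts text_octets?("J\xDCRGENELITE-HP".b)  # true: 0xDC is not a control octet
puts text_octets?("bad\x00value".b)       # false: NUL is a CTL
```

Passing this check only means the octets are allowed on the wire. It says nothing about which character set the receiver should use to interpret them, which is why keeping header values within US-ASCII is the safe choice.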