What you get back from recv is a bytes string:
Receive data from the socket. The return value is a bytes object representing the data received.
In Python 3.x, to convert a bytes string into a Unicode text str string, you have to know what character set the string is encoded with, so you can call decode. For example, if it's UTF-8:
stringdata = data.decode('utf-8')
(In Python 2.x, bytes is the same thing as str, so you've already got a string. But if you want to get a Unicode text unicode string, it's the same as in 3.x.)
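As a minimal sketch of that decode step in Python 3, here is what it looks like on a hard-coded value. The bytes literal is only a stand-in for whatever recv returned, and UTF-8 is an assumption about what the sender used:

```python
# Stand-in for the result of sock.recv(...): the UTF-8 encoding of "héllo".
data = b"h\xc3\xa9llo"

# bytes -> str; this only works if you pick the encoding the sender used.
stringdata = data.decode("utf-8")
print(type(data).__name__, len(data))              # length counts bytes
print(type(stringdata).__name__, len(stringdata))  # length counts characters

# Guessing the wrong encoding fails (or, for some codecs, silently
# produces the wrong text), which is why you have to know it up front.
try:
    data.decode("ascii")
except UnicodeDecodeError as exc:
    print("not valid ASCII:", exc.reason)
```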
The reason people often use struct is that the data isn't just 8-bit or Unicode text, but some other format. For example, you might send each message as a "netstring": a length (as a string of ASCII digits) followed by a : separator, then that many bytes of UTF-8, then a , terminator, such as b"3:Abc,". (There are variants on the format, but this is the Bernstein standard netstring.)
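The netstring layout described above can be sketched roughly as follows. The function names encode_netstring and decode_netstring are made up for this illustration, and the payload is assumed to be UTF-8 text; a real implementation would probably also want a maximum length check:

```python
def encode_netstring(text):
    """Encode a str as a netstring: b"<length>:<payload>,"."""
    payload = text.encode("utf-8")
    # The length counts payload bytes, not characters.
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstring(buffer):
    """Try to pull one netstring off the front of buffer.

    Returns (text, rest) on success, or (None, buffer) if the buffer
    does not yet hold a complete message.
    """
    colon = buffer.find(b":")
    if colon == -1:
        return None, buffer              # length prefix not complete yet
    # Raises ValueError if the prefix is not a number.
    length = int(buffer[:colon].decode("ascii"))
    end = colon + 1 + length             # index of the trailing comma
    if len(buffer) <= end:
        return None, buffer              # payload or comma not here yet
    if buffer[end:end + 1] != b",":
        raise ValueError("netstring is missing its trailing comma")
    return buffer[colon + 1:end].decode("utf-8"), buffer[end + 1:]


print(encode_netstring("Abc"))
print(decode_netstring(b"3:Abc,5:he"))
```

The decoder returns the unconsumed remainder so the caller can keep appending newly received bytes to it and try again.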
The reason people use netstrings, or other similar techniques, is that you need some way to delimit messages when you're using TCP. Each recv could give you half of what the other side passed with send, or it could give you 3 sends and part of the 4th. So, you have to accumulate a buffer of recv data, and then pull the messages out of it. And you need some way to tell when one message ends and the next begins. If you're just sending plain text messages that never contain newlines, you can just use newlines as a delimiter. Otherwise, you'll have to come up with something else: maybe netstrings, or using some other character as a delimiter, or using newlines as a delimiter but escaping actual newlines within the data, or using some self-delimited structured format like JSON.
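Here is one way the accumulate-and-split loop could look for the simplest case, newline-delimited UTF-8 text. The helper name iter_lines is hypothetical, and socket.socketpair is used only so the demo runs without a real server; the tiny bufsize is there to force messages to arrive in pieces:

```python
import socket


def iter_lines(sock, bufsize=4096):
    """Yield each newline-terminated message from sock as a str."""
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:                    # empty bytes: peer closed the socket
            break
        buffer += chunk
        # One recv may complete zero, one, or several messages.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8")
    # Whatever is left in buffer here is an incomplete message; this
    # sketch simply drops it.


# Demo: two sends that do not line up with message boundaries.
left, right = socket.socketpair()
left.sendall(b"first\nsec")
left.sendall(b"ond\nthird\n")
left.close()

messages = list(iter_lines(right, bufsize=4))
right.close()
print(messages)
```

Decoding happens only after a complete line has been split off, so a multi-byte character that straddles two recv calls is never decoded in halves.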