Introduction to Encoding and Decoding

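Encoding and decoding are typically used for format conversion (for example, turning Chinese text into binary and binary back into Chinese text), compression, encryption, and similar operations.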
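As a minimal sketch of the format-conversion case, the Python snippet below turns a Chinese character into a string of bits and back again. The helper names `text_to_bits` and `bits_to_text` are made up for illustration, and UTF-8 is assumed as the intermediate byte encoding.

```python
def text_to_bits(text: str) -> str:
    """Encode text as UTF-8, then render each byte as 8 binary digits."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Parse space-separated 8-bit groups back into bytes and decode them."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


bits = text_to_bits("哈")
print(bits)                # 11100101 10010011 10001000
print(bits_to_text(bits))  # 哈
```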

Serialization and Deserialization

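Serialization converts data into a format that can be stored or transmitted. Deserialization converts the serialized data back into the original data structure or object.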
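One way to picture this round trip is with JSON, sketched below using Python's standard `json` module. The `user` record is a hypothetical example, and JSON is only one of many possible serialization formats.

```python
import json

# Hypothetical record used only for illustration.
user = {"name": "哈", "age": 3, "tags": ["a", "b"]}

# Serialize: Python object -> JSON text -> UTF-8 bytes ready to store or send.
payload = json.dumps(user, ensure_ascii=False).encode("utf-8")

# Deserialize: bytes -> JSON text -> the original data structure.
restored = json.loads(payload.decode("utf-8"))

print(restored == user)  # True
```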

UTF-8 Encoding

Further reading: https://blog.csdn.net/csdndn/article/details/79580019
Common use cases:

Text file storage and transmission
Web page content
Email and communication
APIs and data exchange
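To make the first use case concrete, here is a small sketch that stores text in a file as UTF-8 and reads it back. The sample string and the temporary file are assumptions for the demo. The point is that the encoding is stated explicitly on both the write and the read.

```python
import os
import tempfile

text = "hello, 哈"

# Write the text to a temporary file, stating the encoding explicitly
# rather than relying on the platform default.
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt", delete=False) as f:
    f.write(text)
    path = f.name

# The raw bytes on disk: each ASCII character takes 1 byte, '哈' takes 3.
with open(path, "rb") as f:
    raw = f.read()

# Reading with the same encoding recovers the original text.
with open(path, encoding="utf-8") as f:
    restored = f.read()

os.remove(path)
print(len(text), len(raw))  # 8 10
print(restored == text)     # True
```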

Implementing UTF-8 in JS

// Encode
function encodeUtf8(str) {
  const bytes = []
  // for...of iterates by code point, so a character stored as a UTF-16
  // surrogate pair (two 16-bit units) is handled as one character.
  for (const ch of str) {
    const code = ch.codePointAt(0)
    if (code >= 65536 && code <= 1114111) {
      // U+10000..U+10FFFF, 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      bytes.push((code >> 18) | 0xf0)
      bytes.push(((code >> 12) & 0x3f) | 0x80)
      bytes.push(((code >> 6) & 0x3f) | 0x80)
      bytes.push((code & 0x3f) | 0x80)
    } else if (code >= 2048 && code <= 65535) {
      // U+0800..U+FFFF, 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      bytes.push((code >> 12) | 0xe0)
      bytes.push(((code >> 6) & 0x3f) | 0x80)
      bytes.push((code & 0x3f) | 0x80)
    } else if (code >= 128 && code <= 2047) {
      // U+0080..U+07FF, 2 bytes: 110xxxxx 10xxxxxx
      bytes.push((code >> 6) | 0xc0)
      bytes.push((code & 0x3f) | 0x80)
    } else {
      // U+0000..U+007F, 1 byte: the ASCII value itself
      bytes.push(code)
    }
  }
  return bytes
}

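When reverse engineering, spotting .codePointAt in JS code is usually a reasonable first hint that you are looking at UTF-8 encoding logic. Treat it as a working assumption and confirm it from the surrounding shifts and masks (0xf0, 0xe0, 0xc0, 0x80, 0x3f).

Test code: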

Implementing UTF-8 in Python

Python code for UTF-8 encoding:

text = '哈'
# Encode: str -> bytes
text_encode = text.encode(encoding='utf-8')
print(text_encode)
# Decode: bytes -> str
text_decode = text_encode.decode(encoding='utf-8')
print(text_decode)

Output:

b'\xe5\x93\x88'
哈
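For comparison with the JS `encodeUtf8` function earlier, the sketch below hand-rolls the same bit manipulation in Python and checks it against the built-in `str.encode`. The name `encode_utf8` is an assumption for illustration. In real code the built-in is the better choice, and unlike the built-in this sketch does not reject lone surrogate code points.

```python
def encode_utf8(text: str) -> bytes:
    """Hand-rolled UTF-8 encoder using the same bit ranges as the JS version."""
    out = bytearray()
    for ch in text:
        code = ord(ch)  # Python strings iterate by code point
        if code >= 0x10000:
            # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.append((code >> 18) | 0xF0)
            out.append(((code >> 12) & 0x3F) | 0x80)
            out.append(((code >> 6) & 0x3F) | 0x80)
            out.append((code & 0x3F) | 0x80)
        elif code >= 0x800:
            # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append((code >> 12) | 0xE0)
            out.append(((code >> 6) & 0x3F) | 0x80)
            out.append((code & 0x3F) | 0x80)
        elif code >= 0x80:
            # 2 bytes: 110xxxxx 10xxxxxx
            out.append((code >> 6) | 0xC0)
            out.append((code & 0x3F) | 0x80)
        else:
            # 1 byte: plain ASCII
            out.append(code)
    return bytes(out)


print(encode_utf8("哈"))                         # b'\xe5\x93\x88'
print(encode_utf8("哈") == "哈".encode("utf-8"))  # True
```

Each branch corresponds to one of the UTF-8 byte lengths, so samples such as "A", "é", "哈", and "😀" exercise the 1-, 2-, 3-, and 4-byte paths respectively.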