VBA解压缩ZIP文件05——Huffman树

时间:2022-07-22
本文章向大家介绍VBA解压缩ZIP文件05——Huffman树,主要内容包括其使用实例、应用技巧、基本知识点总结和需要注意事项,具有一定的参考价值,需要的朋友可以参考一下。

ZIP压缩使用的最重要的一个数据结构应该就是这个Huffman树,在压缩过程的介绍中,提到了h1(编码literal和length)、h2(编码distance)、h3(编码SQ1和SQ2)3颗Huffman,另外还有一颗静态的Huffman树(编码literal和length),一共是4颗。

01

正常Huffman树的创建

正常的Huffman树在创建过程需要2个参数:

  • 记录权值的WeightValues
  • 记录数据的Keys

创建的步骤:

  • 1、根据WeightValues和Keys创建一个节点数组ArrNodes
  • 2、将ArrNodes根据权值WeightValue进行从小到大的排序
  • 3、取前2个数据,也就是最小的2个,分别作为左节点和右节点,生成一个父节点,父节点的权值WeightValue等于他们的权值之和
  • 4、然后将父节点放入ArrNodes
  • 5、重复2-4,直到ArrNodes中剩下一个节点

有兴趣的可以到网上找些资料看看,这里不细说了,因为在ZIP的解压过程中,Huffman树的创建比这个还要简单。

02

ZIP中Huffman的创建

在ZIP中,Huffman树被记录的信息是树的码长Code Length(WeightValues),以及数组下标所对应的数字(Keys)。所以,4颗树在创建的时候,都是传入一个记录Code Length的数组:

'根据码字长度数组,构建Huffman
'CodeLen记录的是每一个下标对应的码长
Private Function CreateHuffman(CodeLen() As Long) As CHuffmanTree
    '构建HuffmanTree的时候,0是不需要的
    Dim arrIndex() As Long
    ReDim arrIndex(UBound(CodeLen)) As Long
    
    '清除0值
    Dim inum As Long
    Dim i As Long
    For i = 0 To UBound(CodeLen)
        If CodeLen(i) > 0 Then
            CodeLen(inum) = CodeLen(i)
            arrIndex(inum) = i
            inum = inum + 1
        End If
    Next
    
    ReDim Preserve CodeLen(inum - 1) As Long
    ReDim Preserve arrIndex(inum - 1) As Long
    
    Dim h As CHuffmanTree
    Set h = New CHuffmanTree
    h.Create CodeLen, arrIndex
    
    Set CreateHuffman = h
    Erase arrIndex
End Function

ZIP中创建Huffman树的特殊之处在于:

  • 这颗树是特殊的,叶子节点都在左边,在压缩过程中已经介绍
  • 这棵树的码长Code Length已知了

根据这2个重要的信息,创建这颗Huffman树的方法就简化了许多:

  • 节点n的Weight初始=2
  • 当节点n的Weight等于2,扩展左子树,节点n的Weight-1,扩展出来的节点Key等于n的Key+1,这时的Key记录的是码长。n指向左子树
  • 当节点n的Weight等于1,扩展右子树,节点n的Weight-1,扩展出来的的节点Key等于n的Key+1,这时的Key记录的是码长。n指向右子树
  • 当节点n的Weight等于0,n指向Parent
  • 当n的Key等于WeightValues中当前最小的那个,说明找到了对应的码长的叶子节点
  • WeightValues和Keys指针往前移动,节点指向parent
  • 重复

下面的动图是一个模拟创建的过程,注意想象节点扩展左子树和右子树的时候,节点的Weight和Key的改变。

类模块CHuffmanTree实现Create:

'创建树结构
Public Function Create(WeightValues() As Long, Keys() As Long) As Long
    Dim inum As Long
    inum = UBound(Keys)
    
    InsertSort WeightValues, Keys, 0, inum
    
    Set root = NewCNode(0, Nothing, Nothing, Nothing)
    
    Dim parr As Long
    Dim tmp As CNode
    
    Dim n As CNode
    Set n = root
    
    Do Until parr = inum + 1
    
        Do Until n.Key = WeightValues(parr)
            If n.Weight = 2 Then
                '新建左子树
                Set tmp = NewCNode(n.Key + 1, Nothing, Nothing, n)
                Set n.Left = tmp
                n.Weight = n.Weight - 1
                Set n = tmp
                
            ElseIf n.Weight = 1 Then
                '新建右子树
                Set tmp = NewCNode(n.Key + 1, Nothing, Nothing, n)
                Set n.Right = tmp
                n.Weight = n.Weight - 1
                Set n = tmp
                
            Else '= 0
                Set n = n.Parent
            End If
            

        Loop
        n.Key = Keys(parr)
        parr = parr + 1
        Set n = n.Parent
    Loop
    
End Function

Private Function InsertSort(WeightValues() As Long, Keys() As Long, Low As Long, High As Long)
    Dim i As Long, j As Long
    Dim ShaoBing As Long, ShaoBing_tmp As Long
    
    '先按arr_code_len排序
    For i = Low + 1 To High
        If WeightValues(i) < WeightValues(i - 1) Then
            ShaoBing = WeightValues(i)             '设置哨兵
            ShaoBing_tmp = Keys(i)
            
            j = i - 1
            Do While WeightValues(j) > ShaoBing
                WeightValues(j + 1) = WeightValues(j)
                Keys(j + 1) = Keys(j)
                j = j - 1
                If j = Low - 1 Then Exit Do
            Loop
            
            WeightValues(j + 1) = ShaoBing
            Keys(j + 1) = ShaoBing_tmp
        End If
    Next i

End Function

Private Function NewCNode(Key As Long, Left As CNode, Right As CNode, Parent As CNode) As CNode
    Set NewCNode = New CNode
    NewCNode.Weight = 2
    NewCNode.Key = Key
    
    Set NewCNode.Left = Left
    Set NewCNode.Right = Right
    Set NewCNode.Parent = Parent
End Function

节点Node也作为了一个类模块,这样添加起来方便一点:

CNode:

Private Type Node
    Value As Long
    Left As CNode
    Right As CNode
    Key As Long
    Parent As CNode
End Type

Private n As Node

Property Let Weight(V As Long)
    n.Value = V
End Property
Property Get Weight() As Long
    Weight = n.Value
End Property

Property Let Key(V As Long)
    n.Key = V
End Property
Property Get Key() As Long
    Key = n.Key
End Property

Property Set Left(V As CNode)
    Set n.Left = V
End Property
Property Get Left() As CNode
    Set Left = n.Left
End Property

Property Set Right(V As CNode)
    Set n.Right = V
End Property
Property Get Right() As CNode
    Set Right = n.Right
End Property

Property Set Parent(V As CNode)
    Set n.Parent = V
End Property
Property Get Parent() As CNode
    Set Parent = n.Parent
End Property