Vedastr: A PyTorch-Based Scene Text Recognition Toolbox

1. Features of Vedastr

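The task of STR (Scene Text Recognition) is to recognize the content inside a text box. While working on STR projects, we found that few open-source toolboxes were available, and after surveying them we found that the existing toolboxes have the following problems: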

For this reason, after surveying a range of STR papers, we built Vedastr, a PyTorch-based STR toolbox with the following features:

Vedastr project link:

https://github.com/Media-Smart/vedastr

2. How to Run Vedastr

2.1 Configuration file

Vedastr exposes an open interface: the relevant parameters can be set in a config file. For example, the optimizer and the learning rate scheduler are configured like this:

optimizer = dict(type='Adam', lr=0.001)
lr_scheduler = dict(type='StepLR', max_epochs=3, milestones=[100000, 200000])
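Config entries of the form dict(type=..., ...) are commonly resolved by treating the `type` string as the name of a registered class and passing the remaining keys to its constructor. The following is a minimal sketch of that idea, not Vedastr's actual code: `Registry`, `build_from_cfg`, and the stand-in `Adam` class are hypothetical names introduced for illustration (a real setup would presumably resolve to `torch.optim.Adam`).

```python
# Sketch of a "type"-keyed config builder. All names here are hypothetical
# and chosen for illustration; this is not Vedastr's implementation.


class Registry:
    """Maps class names to classes so configs can refer to them by string."""

    def __init__(self):
        self._classes = {}

    def register(self, cls):
        # Usable as a decorator: store the class under its own name.
        self._classes[cls.__name__] = cls
        return cls

    def get(self, name):
        if name not in self._classes:
            raise KeyError(f"{name!r} is not registered")
        return self._classes[name]


OPTIMIZERS = Registry()


@OPTIMIZERS.register
class Adam:
    """Stand-in for torch.optim.Adam so the sketch runs without PyTorch."""

    def __init__(self, params, lr=0.001):
        self.params = list(params)
        self.lr = lr


def build_from_cfg(cfg, registry, **extra_kwargs):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)  # copy so the caller's config dict is not mutated
    cls = registry.get(args.pop('type'))
    return cls(**extra_kwargs, **args)


optimizer_cfg = dict(type='Adam', lr=0.001)
# Model parameters are supplied at build time, not stored in the config.
optimizer = build_from_cfg(optimizer_cfg, OPTIMIZERS, params=[])
print(type(optimizer).__name__, optimizer.lr)  # prints: Adam 0.001
```

Keeping the class name in the config means a different optimizer can be selected by editing one string, without touching the training script.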

2.2 Train, test, and demo

python tools/train.py config-path

python tools/test.py config-path checkpoint-path

python tools/demo.py config-path checkpoint-path img-path

3. Pretrained Models

3.1 Performance of models reproduced with Vedastr

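Vedastr currently supports STR methods based on attention, CTC, FC, and transformer. We have reproduced several STR models, which you can find in Benchmark and model zoo. Here are some of our reproduction metrics: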

TPS-ResNet-BiLSTM-Attention: What Is Wrong With Scene Text Recognition Model Comparisons?
Small-SATRN: On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
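STR results are commonly reported as word-level accuracy, i.e. the fraction of images whose predicted string exactly matches the ground truth. The sketch below assumes that definition with case-insensitive matching; Vedastr's own evaluation may normalize text differently, and `word_accuracy` is a hypothetical helper, not part of the toolbox.

```python
def word_accuracy(predictions, targets, case_sensitive=False):
    """Fraction of predictions that exactly match their target string.

    Assumption: exact string match, optionally ignoring case. Real
    benchmarks may also filter characters or strip punctuation.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0  # avoid division by zero on an empty evaluation set

    correct = 0
    for pred, target in zip(predictions, targets):
        if not case_sensitive:
            pred, target = pred.lower(), target.lower()
        if pred == target:
            correct += 1
    return correct / len(targets)


preds = ["hello", "World", "abc"]
gts = ["hello", "world", "abd"]
acc = word_accuracy(preds, gts)  # 2 of 3 match when case is ignored
print(f"{acc:.4f}")  # prints: 0.6667
```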

3.2 Running a demo with a pretrained model

Here is a simple example:

python tools/demo.py configs/tps-resnet-bilstm-attention TPS-ResNet-BiLSTM-Attention.pth input-img

input-img:

Terminal window:

The Vedastr project link is below. You are welcome to use it and give it a star!

https://github.com/Media-Smart/vedastr