CTC
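- Pattern Recognition-2020, citations: 3: Reinterpreting CTC training as iterative fitting
- Examines the mathematics behind CTC and reinterprets the CTC loss as a cross-entropy loss; fairly theoretical
- ECCV-2020, citations: 2: Variational Connectionist Temporal Classification
- Proposes a variational CTC to strengthen the network's learning of non-blank symbols
- AAAI-2020, citations: 27: GTC: Guided Training of CTC towards Efficient and Accurate Scene Text Recognition
- Uses an attention branch to guide CTC during training; only the CTC branch is used at test time
- IEEE Access-2019, citations: 30: Natural Scene Text Recognition Based on Encoder-Decoder Framework
- Alignment during attention decoding is unconstrained, so CTC is introduced to supervise the attention alignment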
- ECCV-2018, citations: 69: Synthetically supervised feature learning for scene text recognition
- NIPS-2018, citations: 25: Connectionist Temporal Classification with Maximum Entropy Regularization
- Addresses the spiky distribution problem in CTC by using maximum entropy to constrain CTC training; fairly theoretical
- NIPS-2017, citations: 105: Gated recurrent convolution neural network for OCR
- GRCNN
- Pattern Recognition-2017, citations: 99: Accurate recognition of words in scenes without character segmentation using recurrent neural network
- BMVC-2016, citations: 136: STAR-Net: A spatial attention residue network for scene text recognition
- TPAMI-2016, citations: 1497: An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- A seminal work in scene text recognition; introduces CTC to cast recognition as a sequence-to-sequence problem.
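The CTC-based recognizers in this section share one decoding rule at test time: take a per-frame prediction, merge consecutive repeats, then drop the blank symbol. A minimal sketch of that greedy (best-path) decoding follows. The function name, the blank index of 0, and the list-of-lists input format are assumptions for illustration and are not taken from any of the listed papers.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_probs, alphabet, blank=BLANK):
    """Best-path CTC decoding.

    frame_probs: list of per-frame probability lists, shape [T][num_classes].
    alphabet:    sequence mapping class index -> character (entry at `blank` is unused).
    """
    # 1. Pick the most likely class at every frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    # 2. Merge runs of identical consecutive predictions.
    collapsed = [k for k, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal labels keeps them distinct.
    return "".join(alphabet[k] for k in collapsed if k != blank)
```

For example, the frame-level path `a a _ a b b _` (with `_` as blank) collapses to `a _ a b _` and decodes to `aab`; the blank between the two `a` runs is what allows a doubled character to survive.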
Attention
- IJCAI-2022: SVTR: Scene Text Recognition with a Single Visual Model
- ICDAR-2021: Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition
- Electronics-2021: TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance
- Pattern Recognition-2021, citations: 23: Master: Multi-aspect non-local network for scene text recognition
- ECCV-2020, citations: 27: Robustscanner: Dynamically enhancing positional clues for robust text recognition
- CVPR-2020, citations: 42: SCATTER: selective context attentional scene text recognizer
- Multi-stage
- CVPRWorkshop-2020, citations: 28: On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
- Transformer encoder + decoder; introduces an adaptive 2D positional encoding, giving good robustness to rotated and multi-line text
- AAAI-2020, citations: 28: Textscanner: Reading characters in order for robust scene text recognition
- Instance segmentation
- AAAI-2020, citations: 67: Decoupled attention network for text recognition
- Because attention drifts on long text, a UNet architecture generates the attention maps directly, decoupling attention-map generation from decoding
- Neurocomputing-2020, citations: 17: Adaptive embedding gate for attention-based scene text recognition
- ICCV-2019, citations: 204: What is wrong with scene text recognition model comparisons? dataset and model analysis
- A framework-style paper; worth reading
- AAAI-2019, citations: 128: Show, attend and read: A simple and strong baseline for irregular text recognition
- 2D attention
- Introduces a gating mechanism into the attention computation
- ICDAR-2019, citations: 45: NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
- Uses a Transformer encoder and decoder
- ACM MM-2018, citations: 44: Attention and language ensemble for scene text recognition with convolutional sequence modeling
- CVPR-2018, citations: 93: Edit probability for scene text recognition
- Existing attention methods use a maximum-likelihood loss; this paper examines the relationship between the output probability distribution and the predictions
- AAAI-2018, citations: 103: Char-Net: A character-aware neural network for distorted scene text recognition
- Adds a character detection branch.
- AAAI-2018, citations: 50: SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
- Semi-supervised end-to-end recognition.
- TPAMI-2018, citations: 329: ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
- A classic of attention + rectification
- Neurocomputing-2018, citations: 41: Reading scene text with fully convolutional sequence modeling
- Attention methods model sequences with RNNs, which are computationally expensive and hard to train; this paper uses a fully convolutional network to capture global information and reports it to be more effective than BiLSTM
- CVPR-2018, citations: 196: AON: Towards arbitrarily-oriented text recognition
- Focuses on recognizing irregular text
- ICCV-2017, citations: 290: Focusing attention: Towards accurate text recognition in natural images
- Attention suffers from attention drift; a Focusing Network is introduced to pull the drifted attention back
- IJCAI-2017, citations: 124: Learning to read irregular text with attention mechanisms
- CVPR-2016, citations: 370: Recursive recurrent nets with attention modeling for OCR in the wild
Transformer
Rectification Model
- BMVC-2021: An Adaptive Rectification Model for Arbitrary-Shaped Scene Text Recognition
- Proposes a new rectification method that outperforms TPS and MORAN on curved text
- ICCV-2019, citations: 77: Symmetry-constrained Rectification Network for Scene Text Recognition
- A constrained rectification network
- CVPR-2019, citations: 165: ESIR: End-to-end scene text recognition via iterative image rectification
- Iterative rectification
- TPAMI-2018, citations: 329: ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
- Introduces the TPS transformation for rectification
- Pattern Recognition-2018, citations: 161: MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition
- Rectification in arbitrary directions; reported to outperform ASTER
- CVPR-2016, citations: 415: Robust Scene Text Recognition With Automatic Rectification
- TPS rectification + attention
Language Model
- BMVC-2022: Visual-semantic transformer for scene text recognition
- BMVC-2022: Parallel and Robust Text Rectifier for Scene Text Recognition
- ICFHR-2022: A Vision Transformer Based Scene Text Recognizer with Multi-grained Encoding and Decoding
- TIP-2022: PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition
- ACCESS-2022: Scene Text Recognition with Semantics
- ECCV-2022: Levenshtein OCR
- ECCV-2022: Multi-Granularity Prediction for Scene Text Recognition
- ECCV-2022: Scene Text Recognition with Permuted Autoregressive Sequence Models
- AAAI-2022: Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition
- Improves the language model's ability to recognize arbitrarily shaped text
- arXiv-2021/12/1: Visual-Semantic Transformer for Scene Text Recognition
- arXiv-2021/11/30: Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features
- Explores how to better combine the language model and the vision model; on par with ABINet and reaches SOTA
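- ICCV-2021, citations: 1: From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network
- VisionLAN
- Proposes a new occluded-text dataset
- Integrates the language model into the vision model in a weakly supervised manner
- ICCV-2021, citations: 1: Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition
- Multi-stage + transformer recognizer
- Introduces Gumbel-softmax to address the non-differentiable step from vision to semantics
- CVPR-2021, Oral, citations: 1: Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
- ABINet
- Topped the benchmark leaderboards; improves on SRN, motivated by how humans read
- CVPR-2020, citations: 58: Towards accurate scene text recognition with semantic reasoning networks
- SRN
- Introduces a Transformer for language modeling; vision + language model, achieving SOTA results
- CVPR-2020, citations: 59: Seed: Semantics enhanced encoder-decoder framework for scene text recognition
- SEED
- Feeds semantic information in before the first BiLSTM unit
- The first attempt to bring a language model into scene text recognition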
Dataset
- CVPR-2020, citations: 21: UnrealText: Synthesizing realistic scene text images from the unreal world
- Renders text images with Unreal Engine
- CVPR-2016, citations: 979: Synthetic data for text localisation in natural images
- The SynthText dataset
- NIPS-2014, citations: 737: Synthetic data and artificial neural networks for natural scene text recognition
- The MJ dataset
Data Augmentation
- arXiv-2021/11/17: TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
- Uses Permuted AdaIN for data augmentation
- CVPR-2020, citations: 32: Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
- Augments images with affine transformations, effectively improving accuracy
Survey
- IJCV-2021, citations: 156: Scene text detection and recognition: The deep learning era
- ACM Computing Surveys-2020, citations: 28: Text Recognition in the Wild: A Survey
- TPAMI-2015, citations: 682: Text detection and recognition in imagery: A survey
- Frontiers of Computer Science-2016, citations: 299: Scene text detection and recognition: Recent advances and future trends
Self Supervise
- ICCV-2023: Self-supervised Character-to-Character Distillation for Text Recognition
- ICCV-2023: Revisiting Scene Text Recognition: A Data Perspective
- WACV-2023: Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition
- AAAI-2022: Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition
- arXiv-2022: MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining
- ACM MM-2022: Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition
- arXiv-2022: Multimodal Semi-Supervised Learning for Text Recognition
- CVPR-2022: SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization
- Generative contrastive learning
- AAAI-2022: Context-based Contrastive Learning for Scene Text Recognition
- Contrastive learning for scene text recognition
- CVPR-2021, citations: 5: Sequence-to-Sequence Contrastive Learning for Text Recognition
- The first to introduce contrastive learning into STR
Super Resolution
- ICCV-2023: A Benchmark for Chinese-English Scene Text Image Super-resolution
- TIP-2023: Text prior guided scene text image super-resolution
- CVPR-2022: A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution
- AAAI-2022: Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution
- Scene text super-resolution with stroke-level supervision
- arXiv-2021/12/16: TRIG: Transformer-Based Text Recognizer with Initial Embedding Guidance
- A combination of TPS + Transformer encoder + attention decoder.
- CVPR-2021: Scene Text Telescope: Text-Focused Scene Image Super-Resolution
- ECCV-2020: PlugNet: Degradation Aware Scene Text Recognition Supervised by a Pluggable Super-Resolution Unit
- A pluggable super-resolution module
Diffusion models
- arXiv-2023: GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation
- arXiv-2023: TextDiffuser: Diffusion Models as Text Painters
- arXiv-2023: DiffUTE: Universal Text Editing Diffusion Model
- arXiv-2023: GlyphControl: Glyph Conditional Control for Visual Text Generation
Others
- PR-2023: Towards open-set text recognition via label-to-prototype learning
- ECCV-2022: TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
- TMM-2022: Dual Relation Network for Scene Text Recognition
- ECCV-2022: Background-Insensitive Scene Text Recognition with Text Semantic Segmentation
- arXiv-2022: A Scene-Text Synthesis Engine Achieved Through Learning from Decomposed Real-World Data
- arXiv-2022: Scene Text Recognition with Single-Point Decoding Network
- ECCV-2022: Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
- arXiv-2022: Invariant Autoencoders for Text Recognition and Document Enhancement
- arXiv-2022: Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching
- Improves existing models by searching over training parameters
- arXiv-2022: Text-DIAE: Degradation Invariant Autoencoders for Text Recognition and Document Enhancement
- arXiv-2022: Towards Open-Set Text Recognition via Label-to-Prototype Learning
- How to handle characters at test time that never appeared in the training set: the open-set problem in scene text recognition
- AAAI-2022: FedOCR: Efficient and Secure Federated Learning for Scene Text Recognition
- Federated learning for scene text recognition
- AAAI-2021: SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition
- Rectifies the image in color space before it is fed to the network
- arXiv-2021: Revisiting Classification Perspective on Scene Text Recognition
- Treats text recognition as an image classification task
- arXiv-2020: Hamming OCR: A Locality Sensitive Hashing Neural Network for Scene Text Recognition
- As the number of classes grows, the softmax embedding layer grows and so does the computation. This paper decodes with Hamming codes instead of one-hot codes
- ICCV Workshop-2021: Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark
- Builds a multi-domain Chinese dataset and frames the task as a domain adaptation problem
- ICCV-2021, citations: 3: Towards the Unseen: Iterative Text Recognition by Distilling from Errors
- Repeatedly learns from wrongly predicted samples
- CVPR-2021, citations: 1: Primitive Representation Learning for Scene Text Recognition
- Representation learning
- CVPR-2021, citations: 1: What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
- What happens if the recognizer is trained only on real data?
- ACM MM-2020, citations: 6: Exploring Font-independent Features for Scene Text Recognition
- Addresses font style in STR; uses a GAN to normalize fonts before recognition
- CVPR-2020, citations: 15: What Machines See Is Not What They Get: Fooling Scene Text Recognition Models with Adversarial Text Images
- Studies adversarial attacks in STR
- CVPR-2020, citations: 11: On Vocabulary Reliance in Scene Text Recognition
- Recognizers trained on synthetic datasets suffer from vocabulary reliance; this paper explores remedies
- IJCV-2020, citations: 6: Separating content from style using adversarial learning for recognizing text in the wild
- Uses a generative adversarial network to separate text from background for recognition
- CVPR-2019, citations: 56: Sequence-to-sequence domain adaptation network for robust text image recognition
- Explores domain adaptation in STR
- CVPR-2019, citations: 49: Aggregation Cross-Entropy for Sequence Recognition
- Proposes a new aggregation cross-entropy loss that performs sequence recognition by counting; very fast
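The "recognition by counting" idea behind aggregation cross-entropy can be illustrated with a small sketch. It assumes the loss takes the normalized-count form: average each class's probability over all T frames, then take the cross-entropy against the label's per-class character counts divided by T, with the blank class assigned the remaining T - |label| frames. The function name, the blank index of 0, and the `eps` smoothing term are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def ace_loss(frame_probs, label, blank=0, eps=1e-10):
    """Aggregation cross-entropy sketch (normalized-count form, assumed).

    frame_probs: per-frame class probabilities, shape [T][num_classes].
    label:       list of class indices (no blanks), with len(label) <= T.
    """
    T = len(frame_probs)
    num_classes = len(frame_probs[0])
    # Target: how many times each class should occur; blank fills the rest.
    counts = Counter(label)
    counts[blank] = T - len(label)
    loss = 0.0
    for k in range(num_classes):
        mean_prob = sum(p[k] for p in frame_probs) / T  # aggregated prediction
        target = counts[k] / T                          # normalized count
        loss -= target * math.log(mean_prob + eps)
    return loss
```

Because only counts enter the target, the loss is the same for `[1, 2]` and `[2, 1]`: character order is not supervised directly, which is why no frame-level alignment or dynamic programming is needed.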
CVPR
- CVPR-2022: Pushing the Performance Limit of Scene Text Recognizer without Human Annotation
- CVPR-2022: Syntax-Aware Network for Handwritten Mathematical Expression Recognition
- CVPR-2022: Open-set Text Recognition via Character-Context Decoupling
- CVPR-2022: A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution
- Scene text super-resolution
- CVPR-2022: SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Normalization
- Generative contrastive learning
- CVPR-2021: Scene Text Telescope: Text-Focused Scene Image Super-Resolution
- CVPR-2021, citations: 5: Sequence-to-Sequence Contrastive Learning for Text Recognition
- The first to introduce contrastive learning into STR
- CVPR-2021, citations: 1: Primitive Representation Learning for Scene Text Recognition
- Representation learning
- CVPR-2021, citations: 1: What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
- What happens if the recognizer is trained only on real data?
- CVPR-2021, Oral, citations: 1: Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
- ABINet
- Topped the benchmark leaderboards; improves on SRN, motivated by how humans read
- CVPR-2020, citations: 42: SCATTER: selective context attentional scene text recognizer
- Multi-stage
- CVPR-2020, citations: 58: Towards accurate scene text recognition with semantic reasoning networks
- SRN
- Introduces a Transformer for language modeling; vision + language model, achieving SOTA results
- CVPR-2020, citations: 59: Seed: Semantics enhanced encoder-decoder framework for scene text recognition
- SEED
- Feeds semantic information in before the first BiLSTM unit
- The first attempt to bring a language model into scene text recognition
- CVPR-2020, citations: 21: UnrealText: Synthesizing realistic scene text images from the unreal world
- Renders text images with Unreal Engine
- CVPR-2020, citations: 32: Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition
- Augments images with affine transformations, effectively improving accuracy
- CVPR-2020, citations: 15: What Machines See Is Not What They Get: Fooling Scene Text Recognition Models with Adversarial Text Images
- Studies adversarial attacks in STR
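- CVPR-2020, citations: 11: On Vocabulary Reliance in Scene Text Recognition
- Recognizers trained on synthetic datasets suffer from vocabulary reliance; this paper explores remedies
- CVPRWorkshop-2020, citations: 28: On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
- Transformer encoder + decoder; introduces an adaptive 2D positional encoding, giving good robustness to rotated and multi-line text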
- CVPR-2019, citations: 165: ESIR: End-to-end scene text recognition via iterative image rectification
- Iterative rectification
- CVPR-2019, citations: 56: Sequence-to-sequence domain adaptation network for robust text image recognition
- Explores domain adaptation in STR
- CVPR-2019, citations: 49: Aggregation Cross-Entropy for Sequence Recognition
- Proposes a new aggregation cross-entropy loss that performs sequence recognition by counting; very fast
- CVPR-2018, citations: 93: Edit probability for scene text recognition
- Existing attention methods use a maximum-likelihood loss; this paper examines the relationship between the output probability distribution and the predictions
- CVPR-2018, citations: 196: AON: Towards arbitrarily-oriented text recognition
- Focuses on recognizing irregular text
- CVPR-2016, citations: 370: Recursive recurrent nets with attention modeling for OCR in the wild
- CVPR-2016, citations: 415: Robust Scene Text Recognition With Automatic Rectification
- TPS rectification + attention
- CVPR-2016, citations: 979: Synthetic data for text localisation in natural images
- The SynthText dataset
ICCV
- ICCV-2023: Self-supervised Character-to-Character Distillation for Text Recognition
- ICCV-2023: A Benchmark for Chinese-English Scene Text Image Super-resolution
- ICCV-2023: MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition
- ICCV-2023: Revisiting Scene Text Recognition: A Data Perspective
- ICCV Workshop-2021: Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark
- Builds a multi-domain Chinese dataset and frames the task as a domain adaptation problem
- ICCV-2021, citations: 3: Towards the Unseen: Iterative Text Recognition by Distilling from Errors
- Repeatedly learns from wrongly predicted samples
- ICCV-2021, citations: 1: From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network
- VisionLAN
- Proposes a new occluded-text dataset
- Integrates the language model into the vision model in a weakly supervised manner
- ICCV-2021, citations: 1: Joint Visual Semantic Reasoning: Multi-Stage Decoder for Text Recognition
- Multi-stage + transformer recognizer
- Introduces Gumbel-softmax to address the non-differentiable step from vision to semantics
- ICCV-2019, citations: 204: What is wrong with scene text recognition model comparisons? dataset and model analysis
- A framework-style paper; worth reading
- ICCV-2019, citations: 77: Symmetry-constrained Rectification Network for Scene Text Recognition
- A constrained rectification network
- ICCV-2017, citations: 290: Focusing attention: Towards accurate text recognition in natural images
- Attention suffers from attention drift; a Focusing Network is introduced to pull the drifted attention back
ECCV
- ECCV-2022: TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers
- ECCV-2022: Pure Transformer with Integrated Experts for Scene Text Recognition
- ECCV-2022: Background-Insensitive Scene Text Recognition with Text Semantic Segmentation
- ECCV-2022: Levenshtein OCR
- ECCV-2022: Multi-Granularity Prediction for Scene Text Recognition
- ECCV-2022: SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition
- ECCV-2022: Scene Text Recognition with Permuted Autoregressive Sequence Models
- ECCV-2020, citations: 27: Robustscanner: Dynamically enhancing positional clues for robust text recognition
- ECCV-2020, citations: 2: Variational Connectionist Temporal Classification
- Proposes a variational CTC to strengthen the network's learning of non-blank symbols
- ECCV-2018, citations: 69: Synthetically supervised feature learning for scene text recognition
AAAI
- AAAI-2022: Perceiving Stroke-Semantic Context: Hierarchical Contrastive Learning for Robust Scene Text Recognition
- AAAI-2022: Context-based Contrastive Learning for Scene Text Recognition
- Contrastive learning for scene text recognition
- AAAI-2022: FedOCR: Efficient and Secure Federated Learning for Scene Text Recognition
- AAAI-2022: Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition
- Improves the language model's ability to recognize arbitrarily shaped text
- AAAI-2022: Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution
- Scene text super-resolution with stroke-level supervision
- AAAI-2021: SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition
- Rectifies the image in color space before it is fed to the network
- AAAI-2020, citations: 27: GTC: Guided Training of CTC towards Efficient and Accurate Scene Text Recognition
- Uses an attention branch to guide CTC during training; only the CTC branch is used at test time
- AAAI-2020, citations: 28: Textscanner: Reading characters in order for robust scene text recognition
- Instance segmentation
- AAAI-2020, citations: 67: Decoupled attention network for text recognition
- Because attention drifts on long text, a UNet architecture generates the attention maps directly, decoupling attention-map generation from decoding
- AAAI-2019, citations: 128: Show, attend and read: A simple and strong baseline for irregular text recognition
- 2D attention
- Introduces a gating mechanism into the attention computation
- AAAI-2018, citations: 103: Char-Net: A character-aware neural network for distorted scene text recognition
- Adds a character detection branch.
- AAAI-2018, citations: 50: SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
- Semi-supervised end-to-end recognition.
NIPS
- NIPS-2018, citations: 25: Connectionist Temporal Classification with Maximum Entropy Regularization
- NIPS-2017, citations: 105: Gated recurrent convolution neural network for OCR
- GRCNN
- NIPS-2014, citations: 737: Synthetic data and artificial neural networks for natural scene text recognition
- The MJ dataset
Others
- BMVC-2022: Visual-semantic transformer for scene text recognition
- BMVC-2022: Parallel and Robust Text Rectifier for Scene Text Recognition
- ICFHR-2022: A Vision Transformer Based Scene Text Recognizer with Multi-grained Encoding and Decoding
- ECCV-2022: Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
- ECCV-2022: SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition
- ACM MM-2018, citations: 44: Attention and language ensemble for scene text recognition with convolutional sequence modeling
- IJCAI-2017, citations: 124: Learning to read irregular text with attention mechanisms
- BMVC-2016, citations: 136: STAR-Net: A spatial attention residue network for scene text recognition
TPAMI
- TPAMI-2018, citations: 329: ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
- A classic of attention + rectification
- TPAMI-2016, citations: 1497: An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- A seminal work in scene text recognition; introduces CTC to cast recognition as a sequence-to-sequence problem.
Pattern Recognition
- PR-2023: Towards open-set text recognition via label-to-prototype learning
- Pattern Recognition-2021, citations: 23: Master: Multi-aspect non-local network for scene text recognition
- Pattern Recognition-2020, citations: 3: Reinterpreting CTC training as iterative fitting
- Examines the mathematics behind CTC and reinterprets the CTC loss as a cross-entropy loss; fairly theoretical
- Pattern Recognition-2018, citations: 161: MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition
- Rectification in arbitrary directions; reported to outperform ASTER
- Pattern Recognition-2017, citations: 99: Accurate recognition of words in scenes without character segmentation using recurrent neural network
- ICCV-2023: MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition
- IJCAI-2023: TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition
- IJCAI-2022: Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition
- ICDAR-2023: Scene Text Recognition with Image-Text Matching-guided Dictionary
- arXiv-2023: Improving Scene Text Recognition for Character-Level Long-Tailed Distribution
- TIP-2023: Text prior guided scene text image super-resolution
- Neurocomputing-2023: DPF-S2S: A novel dual-pathway-fusion-based sequence-to-sequence text recognition model
- WACV-2023: Seq-UPS: Sequential Uncertainty-aware Pseudo-label Selection for Semi-Supervised Text Recognition
- arXiv-2023: CLIPTER: Looking at the Bigger Picture in Scene Text Recognition
- ECCVW-2022: On calibration of scene-text recognition models
- Others: STR transformer: a cross-domain transformer for scene text recognition
- BMVC-2022: Masked Vision-Language Transformers for Scene Text Recognition
- Applied Intelligence: Scene text recognition based on two-stage attention and multi-branch feature fusion module
- ICPR-2022: Portmanteauing Features for Scene Text Recognition
- ACCESS-2022: Scene Text Recognition with Semantics
- TIP-2022: PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition
- TMM-2022: Dual Relation Network for Scene Text Recognition
- IJCV-2021, citations: 156: Scene text detection and recognition: The deep learning era
- Neurocomputing-2020, citations: 17: Adaptive embedding gate for attention-based scene text recognition
- IEEE Access-2019, citations: 30: Natural Scene Text Recognition Based on Encoder-Decoder Framework
- Alignment during attention decoding is unconstrained, so CTC is introduced to supervise the attention alignment
- Neurocomputing-2018, citations: 41: Reading scene text with fully convolutional sequence modeling
- Attention methods model sequences with RNNs, which are computationally expensive and hard to train; this paper uses a fully convolutional network to capture global information and reports it to be more effective than BiLSTM