Text in Natural Images

Text in natural images carries rich, high-level information. Examples include text on product labels, receipts, and traffic signs. Reading text in images enables many real-world applications, e.g. retrieving images, parsing product labels, and translating foreign text (as in the Google Translate app).

The challenges of reading such text include handling large variations in text and noisy backgrounds, detecting long lines of oriented text, and handling multilingual text.

At SE(3), we are interested in two of the most fundamental problems in this area: 1) building large-scale datasets for training and evaluating algorithms; 2) developing robust and flexible text detection and recognition algorithms.


  • Detecting Oriented Text in Natural Images by Linking Segments
    Most state-of-the-art text detection methods are specific to horizontal Latin text and are not fast enough for real-time applications. We introduce Segment Linking (SegLink), an oriented text detection method. The main idea is to decompose text into two locally detectable elements, namely segments and links. A segment is an oriented box covering a part of ...
  • COCO-Text: Dataset for Text Detection and Recognition
    The COCO-Text V2 dataset is out. Check out our brand new website! Check out the ICDAR2017 Robust Reading Challenge on COCO-Text! COCO-Text is a new large-scale dataset for text detection and recognition in natural images. Version 1.3 of the dataset is out! 63,686 images, 145,859 text instances, 3 fine-grained text attributes. This dataset is based on the MSCOCO dataset. Text localizations as bounding ...
  • GrOCR
    GrOCR is an ongoing research project for word recognition in unconstrained images. The name derives from the project's original impetus: OCR for reading text on products found in grocery stores. While the focus of the project has moved beyond that domain, the name has remained the same. More info: http://vision.ucsd.edu/~kai/grocr/
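
To make the segments-and-links idea from the SegLink entry above more concrete, here is a minimal sketch of one way detected segments could be grouped once links between them are known. It is an illustration under stated assumptions, not the SegLink implementation: the function name group_segments, the representation of segments as integer indices, and the use of union-find to collect linked segments are all choices made for this example.

```python
from collections import defaultdict


def group_segments(num_segments, links):
    """Group segment indices into connected components.

    Hypothetical helper for illustration only: `links` is a list of
    (i, j) index pairs meaning segments i and j were predicted to
    belong to the same word or text line.
    """
    parent = list(range(num_segments))

    def find(i):
        # Follow parent pointers to the root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for i in range(num_segments):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Five segments; links join 0-1-2 into one group and 3-4 into another.
print(group_segments(5, [(0, 1), (1, 2), (3, 4)]))
```

Each resulting group would then correspond to one candidate word or text line; how the segments in a group are merged into a single oriented box is left out here, since the teaser above does not describe it.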