Project description

Textual content in human environments conveys important high-level semantic information that is not available in any other form in the scene. Interpreting written information in human environments is essential for performing most everyday tasks, such as making a purchase, using public transportation, or finding a place in the city.

This research project will focus on the detection, extraction and understanding of textual information in real scene images under unconstrained conditions. In particular, the candidate will investigate novel methods for tracking text in videos and for improving recognition rates by exploiting temporal information.

The PhD candidate will address the problem in both an end-to-end recognition scenario and a word-spotting scenario. Lightweight, real-time solutions will be of particular interest.