Deep Learning methods for checkbox detection and classification

An example of the checkbox section we want to analyze. The model has to predict a state for each checkbox: is it ticked or not?

The “heading straight to disaster” approach

For each textual zone, compute Levenshtein ratios to identify the number it is associated with. Then, use coordinates (X, Y) and dimensions (dx, dy) of the bounding box to compute some relative offset and locate the zones that contain the checkboxes associated with the given label.
Classify them all!
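The first two steps can be sketched as follows. This is only an illustration, not the original implementation: the `Box` type, the `best_label` and `checkbox_zone` helpers, the normalization used for the ratio, and the default offset values are all assumptions introduced here.

```python
from dataclasses import dataclass


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; one possible normalization (an assumption)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


@dataclass
class Box:
    """Hypothetical bounding box: top-left corner (x, y) and size (dx, dy)."""
    x: float
    y: float
    dx: float
    dy: float


def best_label(ocr_text: str, labels: list[str]) -> str:
    """Pick the known label that is closest to the (noisy) OCR text."""
    return max(labels, key=lambda label: ratio(ocr_text.lower(), label.lower()))


def checkbox_zone(text_box: Box, rel_offset_x: float = -1.5,
                  size_factor: float = 1.0) -> Box:
    """Guess the checkbox zone from the text box.

    The offset is expressed relative to the text height so that it scales
    with the resolution; the default values are made up for illustration.
    """
    side = text_box.dy * size_factor
    return Box(x=text_box.x + rel_offset_x * text_box.dy,
               y=text_box.y, dx=side, dy=side)
```

For instance, `best_label("1. Singel", ["1. Single", "2. Married"])` tolerates the OCR typo and picks the first label, and `checkbox_zone` then returns a square crop to the left of the text, which is the zone that would be passed to the classifier.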
Sometimes, you just didn’t pick the right tool for the job!
Various kinds of input documents: straight, rotated, with perspective, scanned, physically deformed. The offset method described above would not do well with such inputs.
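To get a feel for why a fixed offset breaks down, here is a small sketch covering rotation only (perspective and physical deformation make things worse). The 40-pixel offset and the coordinates are made-up values for illustration.

```python
import math


def rotate(point, angle_deg):
    """Rotate a 2D point around the origin by angle_deg degrees."""
    theta = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


def offset_error(angle_deg, label=(300.0, 100.0), offset=(-40.0, 0.0)):
    """Distance between the predicted and the actual checkbox position.

    On the straight document the checkbox sits at label + offset.
    After the page is rotated, we still apply the same fixed offset
    to the rotated label and compare with the rotated checkbox.
    """
    true_checkbox = (label[0] + offset[0], label[1] + offset[1])
    rotated_label = rotate(label, angle_deg)
    rotated_checkbox = rotate(true_checkbox, angle_deg)
    predicted = (rotated_label[0] + offset[0], rotated_label[1] + offset[1])
    return math.dist(predicted, rotated_checkbox)


for angle in (0, 5, 15):
    print(f"{angle:>2} degrees: error = {offset_error(angle):.1f} px")
```

The error grows as 2·|offset|·sin(θ/2), so with a 40-pixel offset a 15-degree rotation already puts the predicted zone roughly 10 pixels away from the actual checkbox, which is comparable to the size of a small checkbox.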

The “we should have done it from the beginning” approach

An input image for the new architecture
