NAME

Image::Leptonica::Func::pageseg

VERSION

version 0.04

pageseg.c

   Top level page segmentation
       l_int32   pixGetRegionsBinary()

   Halftone region extraction
       PIX      *pixGenHalftoneMask()

   Textline extraction
       PIX      *pixGenTextlineMask()

   Textblock extraction
       PIX      *pixGenTextblockMask()

   Location of page foreground
       BOX      *pixFindPageForeground()

   Extraction of characters from image with only text
       l_int32   pixSplitIntoCharacters()
       BOXA     *pixSplitComponentWithProfile()

FUNCTIONS

pixFindPageForeground

BOX * pixFindPageForeground ( PIX *pixs, l_int32 threshold, l_int32 mindist, l_int32 erasedist, l_int32 pagenum, l_int32 showmorph, l_int32 display, const char *pdfdir )

pixFindPageForeground()

    Input:  pixs (full resolution; any type or depth)
            threshold (for binarization; typically about 128)
            mindist (min distance of text from border to allow
                     cleaning near border; at 2x reduction, this
                     should be larger than 50; typically about 70)
            erasedist (when conditions are satisfied, erase anything
                       within this distance of the edge;
                       typically 30 at 2x reduction)
            pagenum (use for debugging when called repeatedly; labels
                     debug images that are assembled into pdfdir)
            showmorph (set to a negative integer to show steps in
                       generating masks; this is typically used
                       for debugging region extraction)
            display (set to 1 to display mask and selected region
                     for debugging a single page)
            pdfdir (subdirectory of /tmp where images showing the
                    result are placed when called repeatedly; use
                    null if no output requested)
    Return: box (region including foreground, with some pixel noise
                 removed), or null if not found

Notes:
    (1) This doesn't simply crop to the fg.  It attempts to remove
        pixel noise and junk at the edge of the image before cropping.
        The input @threshold is used if pixs is not 1 bpp.
    (2) There are several debugging options, determined by the
        last 4 arguments.
    (3) If you want pdf output of results when called repeatedly,
        the pagenum arg labels the images written, which go into
        /tmp/<pdfdir>/<pagenum>.png.  In that case, clean out the
        /tmp/<pdfdir> subdirectory once, before calling this
        function on each page:
            lept_rmdir(pdfdir);
            lept_mkdir(pdfdir);
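
As an illustration of the naming pattern in note (3), here is a minimal sketch that formats the path of one debug image. The helper debug_image_path() is hypothetical and not part of Leptonica; the plain "%d" formatting of the page number is an assumption, since the note gives only the general shape /tmp/<pdfdir>/<pagenum>.png.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a Leptonica function): build the debug image
 * path "/tmp/<pdfdir>/<pagenum>.png" into buf.
 * Returns 0 if OK, 1 if the arguments are invalid or buf is too small. */
int debug_image_path(char *buf, size_t size, const char *pdfdir, int pagenum)
{
    if (buf == NULL || pdfdir == NULL || size == 0)
        return 1;

    /* Assumption: the page number is written without zero padding. */
    int n = snprintf(buf, size, "/tmp/%s/%d.png", pdfdir, pagenum);

    /* snprintf returns the length it wanted to write; a value >= size
     * means the result was truncated. */
    return (n < 0 || (size_t)n >= size) ? 1 : 0;
}
```

With pdfdir "pageseg" and pagenum 7, the buffer would hold /tmp/pageseg/7.png. The return convention (0 if OK, 1 on error) simply mirrors the one used by the functions documented here.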

pixGenHalftoneMask

PIX * pixGenHalftoneMask ( PIX *pixs, PIX **ppixtext, l_int32 *phtfound, l_int32 debug )

pixGenHalftoneMask()

    Input:  pixs (1 bpp, assumed to be 150 to 200 ppi)
            &pixtext (<optional return> text part of pixs)
            &htfound (<optional return> 1 if the mask is not empty)
            debug (flag: 1 for debug output)
    Return: pixd (halftone mask), or null on error

pixGenTextblockMask

PIX * pixGenTextblockMask ( PIX *pixs, PIX *pixvws, l_int32 debug )

pixGenTextblockMask()

    Input:  pixs (1 bpp, textline mask, assumed to be 150 to 200 ppi)
            pixvws (vertical white space mask)
            debug (flag: 1 for debug output)
    Return: pixd (textblock mask), or null on error

Notes:
    (1) Both the input masks (textline and vertical white space) and
        the returned textblock mask are at the same resolution.
    (2) The result is somewhat noisy, in that small "blocks" of
        text may be included.  These can be removed by post-processing,
        using, e.g.,
           pixSelectBySize(pix, 60, 60, 4, L_SELECT_IF_EITHER,
                           L_SELECT_IF_GTE, NULL);
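
To make the selection rule in note (2) concrete, the sketch below applies the same criterion to a plain array of block sizes: a block is kept if either its width or its height is at least the threshold. The BlockSize type and the filter_blocks() helper are hypothetical stand-ins for illustration; the real pixSelectBySize() works on the connected components of a PIX, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>

/* Hypothetical stand-in for the bounding-box size of one text block. */
typedef struct {
    int w;
    int h;
} BlockSize;

/* Keep a block if EITHER dimension is >= its threshold, which is the
 * combination requested by L_SELECT_IF_EITHER with L_SELECT_IF_GTE. */
static int keep_block(BlockSize b, int minw, int minh)
{
    return b.w >= minw || b.h >= minh;
}

/* Compact the array in place, preserving order.
 * Returns the number of blocks kept. */
size_t filter_blocks(BlockSize *blocks, size_t n, int minw, int minh)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (keep_block(blocks[i], minw, minh))
            blocks[kept++] = blocks[i];
    }
    return kept;
}
```

With thresholds of 60 and 60, a 10 x 10 block is dropped, while 80 x 5 and 5 x 70 blocks are both kept because one dimension passes.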

pixGenTextlineMask

PIX * pixGenTextlineMask ( PIX *pixs, PIX **ppixvws, l_int32 *ptlfound, l_int32 debug )

pixGenTextlineMask()

    Input:  pixs (1 bpp, assumed to be 150 to 200 ppi)
            &pixvws (<return> vertical whitespace mask)
            &tlfound (<optional return> 1 if the mask is not empty)
            debug (flag: 1 for debug output)
    Return: pixd (textline mask), or null on error

Notes:
    (1) The input pixs should be deskewed.
    (2) pixs should have no halftone pixels.
    (3) Both the input image and the returned textline mask
        are at the same resolution.

pixGetRegionsBinary

l_int32 pixGetRegionsBinary ( PIX *pixs, PIX **ppixhm, PIX **ppixtm, PIX **ppixtb, l_int32 debug )

pixGetRegionsBinary()

    Input:  pixs (1 bpp, assumed to be 300 to 400 ppi)
            &pixhm (<optional return> halftone mask)
            &pixtm (<optional return> textline mask)
            &pixtb (<optional return> textblock mask)
            debug (flag: set to 1 for debug output)
    Return: 0 if OK, 1 on error

Notes:
    (1) It is best to deskew the image before segmenting.
    (2) The debug flag enables a number of outputs.  These
        are included to show how to generate and save/display
        these results.

pixSplitComponentWithProfile

BOXA * pixSplitComponentWithProfile ( PIX *pixs, l_int32 delta, l_int32 mindel, PIX **ppixdebug )

pixSplitComponentWithProfile()

    Input:  pixs (1 bpp, exactly one connected component)
            delta (distance used in extrema finding in a numa; typ. 10)
            mindel (minimum required difference between profile minimum
                    and profile values +2 and -2 away; typ. 7)
            &pixdebug (<optional return> debug image of splitting)
    Return: boxa (of c.c. after splitting), or null on error

Notes:
    (1) This will split the most obvious cases of touching characters.
        The split points it is searching for are narrow and deep
        minima in the vertical pixel projection profile, after a
        large vertical closing has been applied to the component.
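
The idea of a narrow, deep minimum in the projection profile can be sketched in plain C as follows. This is a simplified illustration under stated assumptions, not the library's implementation: column_profile() and find_split_columns() are hypothetical helpers, the image is a row-major byte array rather than a PIX, the vertical closing step is omitted, and the delta-based extrema search is replaced by a simple local-minimum check. Only the mindel test (the minimum must lie at least mindel below the profile values 2 columns to either side) follows the parameter description above.

```c
#include <stddef.h>

/* Vertical projection profile: profile[x] is the number of foreground
 * (nonzero) pixels in column x of a row-major w x h image. */
void column_profile(const unsigned char *img, int w, int h, int *profile)
{
    for (int x = 0; x < w; x++) {
        profile[x] = 0;
        for (int y = 0; y < h; y++) {
            if (img[y * w + x])
                profile[x]++;
        }
    }
}

/* Hypothetical helper: report columns that are local minima of the
 * profile and lie at least mindel below the values at x - 2 and x + 2.
 * Writes at most maxout column indices to out; returns the count. */
int find_split_columns(const int *profile, int n, int mindel,
                       int *out, int maxout)
{
    int count = 0;
    for (int x = 2; x < n - 2 && count < maxout; x++) {
        int v = profile[x];
        if (v > profile[x - 1] || v > profile[x + 1])
            continue;  /* not a local minimum */
        if (profile[x - 2] - v >= mindel && profile[x + 2] - v >= mindel)
            out[count++] = x;
    }
    return count;
}
```

For a profile such as 10 10 9 2 9 10 10 with mindel 7, only column 3 qualifies: it is a local minimum and sits 8 below the values two columns away. A flat-bottomed minimum would report each of its columns, so a fuller version would merge adjacent candidates.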

pixSplitIntoCharacters

l_int32 pixSplitIntoCharacters ( PIX *pixs, l_int32 minw, l_int32 minh, BOXA **pboxa, PIXA **ppixa, PIX **ppixdebug )

pixSplitIntoCharacters()

    Input:  pixs (1 bpp, contains only deskewed text)
            minw (minimum component width for initial filtering; typ. 4)
            minh (minimum component height for initial filtering; typ. 4)
            &boxa (<optional return> character bounding boxes)
            &pixa (<optional return> character images)
            &pixdebug (<optional return> showing splittings)

    Return: 0 if OK, 1 on error

Notes:
    (1) This is a simple function that attempts to find split points
        based on vertical pixel profiles.
    (2) It should be given an image that has an arbitrary number
        of text characters.
    (3) The returned pixa includes the boxes from which the
        (possibly split) components are extracted.

AUTHOR

Zakariyya Mughal <zmughal@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Zakariyya Mughal.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.