public class PdfParser extends FileParser
FileParser for "application/pdf" files using PDFBox.
Fields inherited from class FileParser: context, EMPTY, maxFileSize
Constructor and Description |
---|
PdfParser() |
Modifier and Type | Method and Description |
---|---|
Iterable<Reader> | doParse(File file) Template method to parse a File into manageable chunks. |
Methods inherited from class FileParser: parse, setApplicationContext, setMaxFileSize, wrap, wrap
public Iterable<Reader> doParse(File file) throws Exception
Description copied from class: FileParser
Template method to parse a File into manageable chunks.
The default implementation reads from the file lazily with chunks overlapping on the final white space. For example, a file with "The quick brown fox jumps over the lazy dog" might be parsed to "The quick brown fox jumps" and "jumps over the lazy dog".
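The overlapping behaviour described above can be illustrated with a small, self-contained sketch. It is not the FileParser implementation: the class name `OverlapChunker`, the method `chunk`, and the `maxChars` parameter are hypothetical, and the sketch works eagerly on an in-memory String rather than lazily on a file. It only shows how a chunk can end on white space and how the last word can be repeated at the start of the next chunk.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical helper, not part of the documented API. */
final class OverlapChunker {

    /**
     * Splits text into chunks of roughly maxChars characters, breaking only at
     * white space and repeating the last word of each chunk at the start of
     * the next one.
     */
    static List<String> chunk(String text, int maxChars) {
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        List<String> current = new ArrayList<>();
        int length = 0; // characters in the current chunk, including spaces
        int fresh = 0;  // words in the current chunk that are not overlap
        for (String word : trimmed.split("\\s+")) {
            boolean overflow = length + 1 + word.length() > maxChars;
            // Only close a chunk that holds at least one new word, so the
            // loop always makes progress even when a single word is too long.
            if (fresh > 0 && overflow) {
                chunks.add(String.join(" ", current));
                String last = current.get(current.size() - 1);
                current.clear();
                current.add(last); // carry the final word into the next chunk
                length = last.length();
                fresh = 0;
            }
            length += current.isEmpty() ? word.length() : word.length() + 1;
            current.add(word);
            fresh++;
        }
        chunks.add(String.join(" ", current));
        return chunks;
    }

    public static void main(String[] args) {
        // Prints "The quick brown fox jumps" and then "jumps over the lazy dog".
        for (String c : chunk("The quick brown fox jumps over the lazy dog", 25)) {
            System.out.println(c);
        }
    }
}
```

With a limit of 25 characters the sketch reproduces the two chunks from the example. A chunk may exceed the limit when the carried-over word plus the next word is longer than `maxChars`; that trade-off is a choice made for this sketch, not something the documentation specifies.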
Receives a non-null, readable File instance from FileParser.parse(File) and can return a possibly null Iterable or throw an Exception.
In any of the non-successful cases, the FileParser.EMPTY Iterable will be returned to the consumer.
Overrides: doParse in class FileParser
Throws: Exception
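The contract between parse and doParse can be sketched as a template method. The class below is a hypothetical stand-in (`SketchParser` is an invented name), written only from the description above: it assumes that parse checks the file, delegates to doParse, and maps a null result or a thrown Exception to an EMPTY Iterable. The real FileParser may differ in details such as logging or size limits.

```java
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.util.Collections;

/** Hypothetical stand-in for the documented contract, not the real class. */
abstract class SketchParser {

    /** Shared empty result handed to consumers when parsing does not succeed. */
    static final Iterable<Reader> EMPTY = Collections.emptyList();

    /** Guards doParse so that consumers never see null or an exception. */
    final Iterable<Reader> parse(File file) {
        if (file == null || !file.canRead()) {
            return EMPTY; // doParse only ever receives a non-null, readable File
        }
        try {
            Iterable<Reader> readers = doParse(file);
            return readers == null ? EMPTY : readers;
        } catch (Exception e) {
            return EMPTY; // assumption: failures are swallowed, not rethrown
        }
    }

    /** Subclass hook: may return null or throw. */
    abstract Iterable<Reader> doParse(File file) throws Exception;

    public static void main(String[] args) {
        SketchParser failing = new SketchParser() {
            @Override
            Iterable<Reader> doParse(File file) throws Exception {
                throw new IOException("simulated parse failure");
            }
        };
        // "." is used only because the current directory is normally readable.
        System.out.println(failing.parse(new File(".")) == EMPTY);
    }
}
```

Under this reading, a subclass such as PdfParser only needs to implement doParse and can let errors propagate, since the caller-facing method handles the fallback.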
Version: 5.3.3-ice35-b63
Copyright © 2017 The University of Dundee & Open Microscopy Environment. All Rights Reserved.