Apache PDFBox is an open source Java library for creating and manipulating PDF documents. PDFBox is developed entirely by a group of volunteers and is published under the Apache License 2.0. Its simple design makes it relatively easy to use. PDF creation and text extraction are among the library's most basic functions. With the right instructions, you can compile your own information into PDF files that are easy to share online.
- Skill level:
- Moderately Easy
PDF Creation - Blank Page
Create an empty PDF document by typing the following code on one line: "PDDocument document = new PDDocument();" (do not include the outer quotation marks when typing any of the code).
Add a page to the empty PDF document by typing the following on a separate line: "PDPage blankPage = new PDPage();". Press "Enter" on the keyboard and write the next statement on its own line: "document.addPage( blankPage );".
Save the blank PDF file so you can use it as a template for future PDF files by typing the following on its own line: "document.save("BlankPage.pdf");".
Close the document to release its resources by typing the following line: "document.close();".
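The blank-page steps can be put together as one small program. This is a minimal sketch, assuming PDFBox 2.0.x is on the classpath; the class name BlankPdfExample and the method createBlankPdf are hypothetical names chosen for illustration, not part of PDFBox.

```java
import java.io.File;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

// Hypothetical example class; only PDDocument and PDPage come from PDFBox.
public class BlankPdfExample {

    // Creates a PDF with a single blank page at the given path.
    public static void createBlankPdf(String path) throws Exception {
        PDDocument document = new PDDocument();
        try {
            PDPage blankPage = new PDPage();
            document.addPage(blankPage);
            document.save(path);
        } finally {
            // Closing releases resources; the file is written by save().
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        createBlankPdf("BlankPage.pdf");
        System.out.println("Wrote " + new File("BlankPage.pdf").length() + " bytes");
    }
}
```

Wrapping the work in try/finally means the document is closed even if saving fails.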
PDF Creation - File With Text
Start from the blank-page code you just wrote and fill the page with text by changing and adding a few lines. On the second line of code, "PDPage blankPage = new PDPage();", change "blankPage" to "page".
Make sure the next line of code uses the new name: "document.addPage( page );".
Press "Enter" on the keyboard and select one of the standard PDF fonts by writing the following line of code: "PDFont font = PDType1Font.HELVETICA_BOLD;".
Press "Enter" on the keyboard and create a content stream for the page by typing the following line: "PDPageContentStream contentStream = new PDPageContentStream(document, page);".
Press "Enter" on the keyboard and define the content, font and position of the text by typing the following lines: "contentStream.beginText(); contentStream.setFont( font, 12 ); contentStream.moveTextPositionByAmount( 100, 700 ); contentStream.drawString( "Type in your text here" ); contentStream.endText();". Press "Enter" on the keyboard after every semicolon.
Close the content stream, before the document is saved, by typing the following line: "contentStream.close();".
Set the name of your PDF file in the save statement, "document.save("BlankPage.pdf");", by replacing "BlankPage" with your own file name.
Keep the "document.close();" line as it is so the document's resources are released.
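The text-file steps can be sketched as a complete program. This sketch assumes PDFBox 2.0.x, where moveTextPositionByAmount and drawString are deprecated in favour of newLineAtOffset and showText, so the newer names are used here. The class name TextPdfExample, the method createTextPdf and the output file name are hypothetical.

```java
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// Hypothetical example class, assuming PDFBox 2.0.x.
public class TextPdfExample {

    // Creates a one-page PDF containing a single line of text.
    public static void createTextPdf(String path, String text) throws Exception {
        PDDocument document = new PDDocument();
        try {
            PDPage page = new PDPage();
            document.addPage(page);

            PDFont font = PDType1Font.HELVETICA_BOLD;

            PDPageContentStream contentStream = new PDPageContentStream(document, page);
            try {
                contentStream.beginText();
                contentStream.setFont(font, 12);
                // 2.0.x name for moveTextPositionByAmount: offset from the page's lower-left corner.
                contentStream.newLineAtOffset(100, 700);
                // 2.0.x name for drawString.
                contentStream.showText(text);
                contentStream.endText();
            } finally {
                // The stream must be closed before the document is saved.
                contentStream.close();
            }

            document.save(path);
        } finally {
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        createTextPdf("HelloWorld.pdf", "Type in your text here");
    }
}
```

On PDFBox 1.x the two renamed calls would be the ones given in the steps instead.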
PDF Text Extraction From Existing PDF File
Convert an existing PDF file into a Lucene document that holds its text by typing the following line: "Document luceneDocument = LucenePDFDocument.getDocument( insert PDF file name here );".
To reuse text from the converted output by hand, highlight it, click the right mouse button and select "Copy" from the menu. Paste the extracted text into a document by clicking the right mouse button and selecting "Paste" from the menu.
Extract the text of a specific range of pages directly from an existing PDF file by typing the following lines: "PDFTextStripper stripper = new PDFTextStripper(); stripper.setStartPage( 16 ); stripper.setEndPage( 23 ); stripper.writeText( ... );". Replace the example page numbers 16 and 23 with your own start and end pages, and press "Enter" on the keyboard after each semicolon.
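The page-range extraction can be sketched as follows. This assumes PDFBox 2.0.x, where PDFTextStripper lives in the org.apache.pdfbox.text package (in 1.x it is in org.apache.pdfbox.util). The class name ExtractTextExample and the method extractPages are hypothetical. Passing the loaded document and a StringWriter to writeText is one possible choice for the arguments, not necessarily the one the steps intend.

```java
import java.io.File;
import java.io.StringWriter;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical example class, assuming PDFBox 2.0.x.
public class ExtractTextExample {

    // Returns the text of pages startPage..endPage (1-based, inclusive).
    public static String extractPages(String pdfPath, int startPage, int endPage) throws Exception {
        PDDocument document = PDDocument.load(new File(pdfPath));
        try {
            PDFTextStripper stripper = new PDFTextStripper();
            stripper.setStartPage(startPage);
            stripper.setEndPage(endPage);

            // Assumption: collect the text in memory; any java.io.Writer would work.
            StringWriter writer = new StringWriter();
            stripper.writeText(document, writer);
            return writer.toString();
        } finally {
            document.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: ExtractTextExample <file.pdf> <startPage> <endPage>");
            return;
        }
        System.out.println(extractPages(args[0], Integer.parseInt(args[1]), Integer.parseInt(args[2])));
    }
}
```

Running it with a file name and the page numbers 16 and 23 would print the text of those pages to the console.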
Tips and warnings
- See the Resources section for information about the more advanced and complex functions of PDFBox.