Along with Microsoft Excel integration, ColdFusion 9 now also gives you the ability to convert Word documents to PDF and extract the data from these files to be used elsewhere. In this video I am going to show you how to take a Word document, convert it to a PDF, and then pull all the text of those files so that we can manipulate it as we see fit. So to start, in your Chapter7 folder you see there is a word.docx file. I want to go and open that up and this is just the default file I generated from Microsoft's new Project Gallery.
It's a simple resume and inside a table we have got Objectives, Experience, Education, et cetera. It has a header, a bunch of text inside of it, and we are going to take this Word document and convert it to a PDF. So I'll go and close this and open up word.cfm. So we are going to use the cfdocument tag to read in the content of that Word file. So we'll say cfdocument and we are going to generate the PDF. We are going to save the PDF as convertedWordDoc.pdf.
And we are going to specify the file that we want to read in, which in this case is going to be word.docx in the current folder. We'll save our file and preview it and we don't get any feedback, but if I go over here to Chapter7 and refresh, you can now see I have a convertedWordDoc.pdf. If I open that up, it looks awfully close to the Word document. I am, however, missing the header. If I scroll down here, it did pull in my footer, which has got page numbers.
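As a rough sketch of what the finished tag in word.cfm might look like: the video names the tag, the output file, and the source file, but the exact attribute values are not shown, so the expandPath() calls and the overwrite setting here are assumptions rather than the instructor's code.

```cfml
<!--- word.cfm: convert the Word document in the current folder to a PDF --->
<!--- Assumption: expandPath() is used to turn the file names into absolute paths --->
<!--- Assumption: overwrite="yes" so the page can be previewed more than once --->
<cfdocument
    format="PDF"
    srcfile="#expandPath('word.docx')#"
    filename="#expandPath('convertedWordDoc.pdf')#"
    overwrite="yes">
</cfdocument>
```

Because the output goes to a file rather than the browser, the page itself renders nothing, which matches the "we don't get any feedback" preview described above.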
And on the second page I have some bad bookmarks and other kind of special stuff that was in my Word document. So, this is not perfect. You are not going to get a 100% copy of your Word document into the PDF. Bookmarks and other special Word formatting may not come through perfectly. So make sure that you test all of your integration before you push something like this out to production. But now that we have our PDF, I can now read in all the text from this using some new actions in the cfpdf tag.
So go back to CF Builder, go to our source code and let's just comment out that section. And we are now going to use our cfpdf tag with action="extracttext". And I will specify the source that I want to pull that content from. So we are going to use our convertedWordDoc.pdf. I will tell it how I want that data returned. I can choose string or xml, we'll choose string to start with, tell it which variable I want to place that content in and then specify which pages I want to read.
And in this case let's just do page 1. Because if you noticed before, page 2 in our PDF just had a whole bunch of gobbledygook and missing bookmarks. So I'll choose page 1 and we'll dump out myWordContent. So if I preview this, you can now see I have all of the content of page 1 as plain text. If I go back over to my source code, I can say xml and then we'll use the XMLParse function here to turn this into a proper xml object, and preview that.
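The two extractions just described might be sketched like this. The attribute values come from the narration (extracttext, string or xml, myWordContent, page 1); using expandPath() for the source path is an assumption, since the video only names the file.

```cfml
<!--- Pull the text of page 1 out of the converted PDF as a plain string --->
<cfpdf
    action="extracttext"
    source="#expandPath('convertedWordDoc.pdf')#"
    type="string"
    name="myWordContent"
    pages="1">
<cfdump var="#myWordContent#">

<!--- Same extraction, but returned as XML text and parsed into an XML object --->
<cfpdf
    action="extracttext"
    source="#expandPath('convertedWordDoc.pdf')#"
    type="xml"
    name="myWordContent"
    pages="1">
<cfdump var="#XMLParse(myWordContent)#">
```

In practice you would keep only one of the two calls at a time, as the video does by switching the type value between previews.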
And now we can see we have our TextPerPage, the individual pageNumber, and all the content for that page. If you were going to read in more than one page, go back to our source code and say * to read every page. We can do 1 through 2 or 1, 3, 5 just to read in the odd-numbered pages and you can specify the pages however you want in there. We'll save that, preview it again and this time we get a node for each of the individual pages in our document.
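A sketch of reading every page and walking the resulting nodes follows. The pages values are the ones mentioned in the narration; the XPath expression is an assumption based on the dump described above, where each page node carries a pageNumber attribute, so check it against your own cfdump output before relying on it.

```cfml
<!--- pages="*" reads every page; "1-2" or "1,3,5" select a range or a list --->
<cfpdf
    action="extracttext"
    source="#expandPath('convertedWordDoc.pdf')#"
    type="xml"
    name="myWordContent"
    pages="*">

<cfset doc = XMLParse(myWordContent)>

<!--- Assumption: every per-page node has a pageNumber attribute, so match on that --->
<cfset pageNodes = xmlSearch(doc, "//*[@pageNumber]")>

<cfoutput>
<cfloop array="#pageNodes#" index="pageNode">
    Page #pageNode.XmlAttributes.pageNumber#: #len(pageNode.XmlText)# characters<br>
</cfloop>
</cfoutput>
```

Matching on the attribute rather than on an element name keeps the loop from depending on the exact node names in the returned XML.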
While we still don't have the same type of deep integration in Word documents as we do in Excel spreadsheets, the ability to pull in content from Word and repurpose it in your applications is a huge advantage in ColdFusion 9.