<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Indexing PDF Documents with Zend_Search_Lucene</title>
	<atom:link href="http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/</link>
	<description>Come and read thoughts about everything and anything, from politics to web development. Shaun Farrell has been the founder and creator of kapustabrothers.com since it was registered some time ago. As a web developer and programmer, Shaun brings thought to his posts and takes a stance on a lot of issues. Come and read his thoughts and the comments of others.</description>
	<lastBuildDate>Sun, 10 Jan 2010 09:02:42 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Slavi</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-16852</link>
		<dc:creator>Slavi</dc:creator>
		<pubDate>Fri, 12 Dec 2008 20:41:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-16852</guid>
		<description>Hi,

Thanks for sharing your thoughts on Zend_Search_Lucene.
I want to point out this:

...
# $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword(&#039;id&#039;, $i)); //Stores the ID  
...

src: http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html 
------ quote ---
 Nevertheless it&#039;s a good idea not to use &#039;id&#039; and &#039;score&#039; names to avoid ambiguity in QueryHit properties names.

The Zend_Search_Lucene_Search_QueryHit id and score properties always refer to internal Lucene document id and hit score. If the indexed document has the same stored fields, you have to use the getDocument() method to access them.
------ quote ---</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>Thanks for sharing your thoughts on Zend_Search_Lucene.<br />
I want to point out this:</p>
<p>&#8230;<br />
# $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword('id', $i)); //Stores the ID<br />
&#8230;</p>
<p>src: <a href="http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html" rel="nofollow">http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html</a><br />
&#8212;&#8212; quote &#8212;<br />
 Nevertheless it&#8217;s a good idea not to use &#8216;id&#8217; and &#8216;score&#8217; names to avoid ambiguity in QueryHit properties names.</p>
<p>The Zend_Search_Lucene_Search_QueryHit id and score properties always refer to internal Lucene document id and hit score. If the indexed document has the same stored fields, you have to use the getDocument() method to access them.<br />
&#8212;&#8212; quote &#8212;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: esca</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-13549</link>
		<dc:creator>esca</dc:creator>
		<pubDate>Fri, 15 Aug 2008 10:31:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-13549</guid>
		<description>Hi,

I have already read this article and run it with my own PDF files,
but what I want to know is this:
does Zend Lucene already have a Zend_Pdf_FileParser subclass?
Is it the same as the approach in your article?

Can you explain?

Thanks in advance.</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I have already read this article and run it with my own PDF files,<br />
but what I want to know is this:<br />
does Zend Lucene already have a Zend_Pdf_FileParser subclass?<br />
Is it the same as the approach in your article?</p>
<p>Can you explain?</p>
<p>Thanks in advance.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10533</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Tue, 25 Mar 2008 16:00:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10533</guid>
		<description>@amin2u: Yes, you have to put the .exe in the exec statement. You can&#039;t just copy the code that I posted, because it was written for a Unix install. You need to get the correct .exe files for pdftotext and use the correct paths and executable names.</description>
		<content:encoded><![CDATA[<p>@amin2u: Yes, you have to put the .exe in the exec statement. You can&#8217;t just copy the code that I posted, because it was written for a Unix install. You need to get the correct .exe files for pdftotext and use the correct paths and executable names.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amin2u</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10519</link>
		<dc:creator>amin2u</dc:creator>
		<pubDate>Tue, 25 Mar 2008 02:04:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10519</guid>
		<description>Do I have to specify the .exe in the exec statement?</description>
		<content:encoded><![CDATA[<p>Do I have to specify the .exe in the exec statement?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10227</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Thu, 13 Mar 2008 11:16:44 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10227</guid>
		<description>That&#039;s correct. You need to download the .exe files from the site I posted above. If you downloaded the source that I included with the project, you will only get the Linux versions of pdftotext.</description>
		<content:encoded><![CDATA[<p>That&#8217;s correct. You need to download the .exe files from the site I posted above. If you downloaded the source that I included with the project, you will only get the Linux versions of pdftotext.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amin2u</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10223</link>
		<dc:creator>amin2u</dc:creator>
		<pubDate>Thu, 13 Mar 2008 08:03:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10223</guid>
		<description>I just have problems with the path... and today it still doesn&#039;t work. I suppose pdftotext will have an .exe extension, right?</description>
		<content:encoded><![CDATA[<p>I just have problems with the path&#8230; and today it still doesn&#8217;t work. I suppose pdftotext will have an .exe extension, right?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10153</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Mon, 10 Mar 2008 02:45:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10153</guid>
		<description>Yes, I have used it on Windows. I will have to look at the code again and get back to you.</description>
		<content:encoded><![CDATA[<p>Yes, I have used it on Windows. I will have to look at the code again and get back to you.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amin2u</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10152</link>
		<dc:creator>amin2u</dc:creator>
		<pubDate>Mon, 10 Mar 2008 01:42:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10152</guid>
		<description>Have you tested searchtxt.php on Windows? Does it work? I still have a problem with fopen.</description>
		<content:encoded><![CDATA[<p>Have you tested searchtxt.php on Windows? Does it work? I still have a problem with fopen.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10126</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Fri, 07 Mar 2008 11:16:23 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10126</guid>
		<description>amin2u: I would index your files with Zend and tag them with keywords. You can save each keyword in a document field for that PDF in the index.

As for Xpdf on Windows: go to http://www.foolabs.com/xpdf/download.html and download the win32 version. Your PHP scripts will have to call it with a function such as system() or exec(), because Xpdf has to be run as a DOS command.

As for your error, it looks like it comes from the fopen() function call. Make sure your path is correct.

Good luck!</description>
		<content:encoded><![CDATA[<p>amin2u: I would index your files with Zend and tag them with keywords. You can save each keyword in a document field for that PDF in the index.</p>
<p>As for Xpdf on Windows: go to <a href="http://www.foolabs.com/xpdf/download.html" rel="nofollow">http://www.foolabs.com/xpdf/download.html</a> and download the win32 version. Your PHP scripts will have to call it with a function such as system() or exec(), because Xpdf has to be run as a DOS command.</p>
<p>As for your error, it looks like it comes from the fopen() function call. Make sure your path is correct.</p>
<p>Good luck!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amin2u</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10122</link>
		<dc:creator>amin2u</dc:creator>
		<pubDate>Fri, 07 Mar 2008 09:15:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10122</guid>
		<description>Fatal error: Uncaught exception &#039;Zend_Search_Lucene_Exception&#039; with message &#039;fopen(C:/wamp/www//pdf_index//segments) [&lt;a href=&#039;function.fopen&#039; rel=&quot;nofollow&quot;&gt;function.fopen&lt;/a&gt;]: failed to open stream: No such file or directory&#039; in C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\File\Filesystem.php:64 Stack trace: #0 C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\Directory\Filesystem.php(338): Zend_Search_Lucene_Storage_File_Filesystem-&gt;__construct(&#039;C:/wamp/www//pd...&#039;) #1 C:\wamp\www\pdf_index\Zend\Search\Lucene.php(235): Zend_Search_Lucene_Storage_Directory_Filesystem-&gt;getFileObject(&#039;segments&#039;) #2 C:\wamp\www\pdf_index\Zend\Search\Lucene.php(182): Zend_Search_Lucene-&gt;__construct(&#039;C:/wamp/www//pd...&#039;, false) #3 C:\wamp\www\pdf_index\searchtxt.php(11): Zend_Search_Lucene::open(&#039;C:/wamp/www//pd...&#039;) #4 {main} thrown in C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\File\Filesystem.php on line 64

I got the error above. Where do I fix it?</description>
		<content:encoded><![CDATA[<p>Fatal error: Uncaught exception 'Zend_Search_Lucene_Exception' with message 'fopen(C:/wamp/www//pdf_index//segments) [<a href='function.fopen' rel="nofollow">function.fopen</a>]: failed to open stream: No such file or directory' in C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\File\Filesystem.php:64 Stack trace: #0 C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\Directory\Filesystem.php(338): Zend_Search_Lucene_Storage_File_Filesystem-&gt;__construct('C:/wamp/www//pd...') #1 C:\wamp\www\pdf_index\Zend\Search\Lucene.php(235): Zend_Search_Lucene_Storage_Directory_Filesystem-&gt;getFileObject('segments') #2 C:\wamp\www\pdf_index\Zend\Search\Lucene.php(182): Zend_Search_Lucene-&gt;__construct('C:/wamp/www//pd...', false) #3 C:\wamp\www\pdf_index\searchtxt.php(11): Zend_Search_Lucene::open('C:/wamp/www//pd...') #4 {main} thrown in C:\wamp\www\pdf_index\Zend\Search\Lucene\Storage\File\Filesystem.php on line 64</p>
<p>I got the error above. Where do I fix it?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
