<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Indexing PDF Documents with Zend_Search_Lucene</title>
	<atom:link href="http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/</link>
	<description></description>
	<lastBuildDate>Tue, 15 Nov 2011 15:00:31 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Save</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-95966</link>
		<dc:creator>Save</dc:creator>
		<pubDate>Thu, 26 May 2011 07:54:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-95966</guid>
		<description>Hi,
First of all, thanks for this wonderful article.

How can I convert a PDF to plain text? Please help me.</description>
		<content:encoded><![CDATA[<p>Hi,<br />
First of all, thanks for this wonderful article.</p>
<p>How can I convert a PDF to plain text? Please help me.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Felix Capdeville</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-94586</link>
		<dc:creator>Felix Capdeville</dc:creator>
		<pubDate>Wed, 18 May 2011 12:24:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-94586</guid>
		<description>Great info, thanks! What language is close to PHP in functionality and ease of use?</description>
		<content:encoded><![CDATA[<p>Great info, thanks! What language is close to PHP in functionality and ease of use?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sheira</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-67754</link>
		<dc:creator>Sheira</dc:creator>
		<pubDate>Mon, 13 Dec 2010 16:45:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-67754</guid>
		<description>Hi,

I&#039;m just trying to test your code, but I have a problem with xpdf.
I&#039;m not sure I set it up correctly. I downloaded the Win32 version of xpdf and copied the folder into my document path, but it doesn&#039;t work.


$pdf_filename = &quot;celia&quot;;
// get pdf information
$output = shell_exec(&quot;pdfinfo &quot;.$pdf_filename.&quot;.pdf&quot;);
//Gets the metadata
$data = explode(&quot;\n&quot;, $output); //puts it into an array
print_r($data);

print_r($data) returns nothing.

Can you help me please?</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I&#8217;m just trying to test your code, but I have a problem with xpdf.<br />
I&#8217;m not sure I set it up correctly. I downloaded the Win32 version of xpdf and copied the folder into my document path, but it doesn&#8217;t work.</p>
<p>$pdf_filename = "celia";<br />
// get pdf information<br />
$output = shell_exec("pdfinfo ".$pdf_filename.".pdf");<br />
//Gets the metadata<br />
$data = explode("\n", $output); //puts it into an array<br />
print_r($data);</p>
<p>print_r($data) returns nothing.</p>
<p>Can you help me please?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: hind14</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-49248</link>
		<dc:creator>hind14</dc:creator>
		<pubDate>Sun, 04 Jul 2010 02:20:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-49248</guid>
		<description>Can someone tell me where we should put the code (in a controller, a view, ...)? And when you talk about &#039;path&#039;, do you mean the xpdf installation path or something else? I need your suggestions, please.</description>
		<content:encoded><![CDATA[<p>Can someone tell me where we should put the code (in a controller, a view, &#8230;)? And when you talk about &#8216;path&#8217;, do you mean the xpdf installation path or something else? I need your suggestions, please.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ZEMZEMI</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-45915</link>
		<dc:creator>ZEMZEMI</dc:creator>
		<pubDate>Mon, 10 May 2010 19:05:07 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-45915</guid>
		<description>@farrelley - thank you for your help. I want to know if I can retrieve the keywords and abstract with XPDF.</description>
		<content:encoded><![CDATA[<p>@farrelley &#8211; thank you for your help. I want to know if I can retrieve the keywords and abstract with XPDF.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-44546</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Wed, 28 Apr 2010 15:12:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-44546</guid>
		<description>@ZEMZEMI - I&#039;m not sure whether XPDF extracts that metadata. If it does, you can just store it in the index and retrieve it when you loop through the hits.</description>
		<content:encoded><![CDATA[<p>@ZEMZEMI &#8211; I&#8217;m not sure whether XPDF extracts that metadata. If it does, you can just store it in the index and retrieve it when you loop through the hits.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ZEMZEMI</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-44463</link>
		<dc:creator>ZEMZEMI</dc:creator>
		<pubDate>Mon, 26 Apr 2010 23:35:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-44463</guid>
		<description>Hi,
Thank you for your source code; it was very helpful. But I have a question: does XPDF retrieve the Keywords metadata? I don&#039;t think so, because I tested your code with the XPDF library and it shows the author, title, date...
I want to know if I can retrieve keywords with Zend_Search_Lucene.

thanks</description>
		<content:encoded><![CDATA[<p>Hi,<br />
Thank you for your source code; it was very helpful. But I have a question: does XPDF retrieve the Keywords metadata? I don&#8217;t think so, because I tested your code with the XPDF library and it shows the author, title, date&#8230;<br />
I want to know if I can retrieve keywords with Zend_Search_Lucene.</p>
<p>thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Slavi</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-16852</link>
		<dc:creator>Slavi</dc:creator>
		<pubDate>Fri, 12 Dec 2008 20:41:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-16852</guid>
		<description>Hi,

Thanks for sharing your thoughts on Zend_Search_Lucene.
I want to point to this:

...
# $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword(&#039;id&#039;, $i)); //Stores the ID  
...

src: http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html 
------ quote ---
 Nevertheless it&#039;s a good idea not to use &#039;id&#039; and &#039;score&#039; names to avoid ambiguity in QueryHit properties names.

The Zend_Search_Lucene_Search_QueryHit id and score properties always refer to internal Lucene document id and hit score. If the indexed document has the same stored fields, you have to use the getDocument() method to access them.
------ quote ---</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>Thanks for sharing your thoughts on Zend_Search_Lucene.<br />
I want to point to this:</p>
<p>&#8230;<br />
# $doc-&gt;addField(Zend_Search_Lucene_Field::Keyword(&#8216;id&#8217;, $i)); //Stores the ID<br />
&#8230;</p>
<p>src: <a href="http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html" rel="nofollow">http://framework.zend.com/manual/en/zend.search.lucene.best-practice.html</a><br />
&#8212;&#8212; quote &#8212;<br />
 Nevertheless it&#8217;s a good idea not to use &#8216;id&#8217; and &#8216;score&#8217; names to avoid ambiguity in QueryHit properties names.</p>
<p>The Zend_Search_Lucene_Search_QueryHit id and score properties always refer to internal Lucene document id and hit score. If the indexed document has the same stored fields, you have to use the getDocument() method to access them.<br />
&#8212;&#8212; quote &#8212;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: esca</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-13549</link>
		<dc:creator>esca</dc:creator>
		<pubDate>Fri, 15 Aug 2008 10:31:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-13549</guid>
		<description>Hi,

I have already read this article and run the code with my own PDF files,
but what I want to know is this:
does Zend Lucene already have a Zend_Pdf_FileParser subclass?
Is it the same as what your article does?

Can you explain?

Thanks in advance.</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>I have already read this article and run the code with my own PDF files,<br />
but what I want to know is this:<br />
does Zend Lucene already have a Zend_Pdf_FileParser subclass?<br />
Is it the same as what your article does?</p>
<p>Can you explain?</p>
<p>Thanks in advance.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: farrelley</title>
		<link>http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/comment-page-1/#comment-10533</link>
		<dc:creator>farrelley</dc:creator>
		<pubDate>Tue, 25 Mar 2008 16:00:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.kapustabrothers.com/2008/01/20/indexing-pdf-documents-with-zend_search_lucene/#comment-10533</guid>
		<description>@amin2u: Yes, you have to put the .exe in the exec statement. You can&#039;t just copy the code that I posted, because it was written for a Unix install. You need to get the correct .exe files for pdftotext and use the correct paths and executables.</description>
		<content:encoded><![CDATA[<p>@amin2u: Yes, you have to put the .exe in the exec statement. You can&#8217;t just copy the code that I posted, because it was written for a Unix install. You need to get the correct .exe files for pdftotext and use the correct paths and executables.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

