Lucene is a gem in the open-source world: a highly scalable, fast search engine. It delivers performance and is disarmingly easy to use. Lucene in Action is the authoritative guide to Lucene. It describes how to index your data, including types you definitely need to know such as MS Word, PDF, HTML, and XML. It introduces you to searching, sorting, filtering, and highlighting search results.

Lucene powers search in surprising places: in discussion groups at Fortune 100 companies, in commercial issue trackers, in email search from Microsoft, in the Nutch web search engine (that scales to billions of pages). It is used by diverse companies including Akamai, Overture, Technorati, HotJobs, Epiphany, FedEx, Mayo Clinic, MIT, New Scientist Magazine, and many others.

Adding search to your application can be easy. With many reusable examples and good advice on best practices, Lucene in Action shows you how.

What's Inside
- How to integrate Lucene into your applications
- Ready-to-use framework for rich document handling
- Case studies including Nutch, TheServerSide, jGuru, etc.
- Lucene ports to Perl, Python, C#/.Net, and C++
- Sorting, filtering, term vectors, multiple, and remote index searching
- The new SpanQuery family, extending query parser, hit collecting
- Performance testing and tuning
- Lucene add-ons (hit highlighting, synonym lookup, and others)

7.5.1 Using POI
POI is a Jakarta project; you can find it at http://jakarta.apache.org/poi. It's a highly active project whose goal is to provide a Java API for manipulation of various file formats based on Microsoft's OLE 2 Compound Document ...
|Title|Lucene in Action|
|Author|Otis Gospodnetić, Erik Hatcher|
|Publisher|Manning Publications - 2005|