public class LuceneIndexToDict
extends Object
SpellWritingAnalyzer or SpellWritingFilter) since that will
grab non-stored as well as stored fields. Still, if that isn't an option or
if you simply want to test out spelling correction, after-the-fact dictionary
creation may be useful.

| Constructor and Description |
|---|
| LuceneIndexToDict() |
| Modifier and Type | Method and Description |
|---|---|
| static void | createDict(Directory indexDir, File dictDir)<br>Read a Lucene index and make a spelling dictionary from it. |
| static void | createDict(Directory indexDir, File dictDir, ProgressTracker prog)<br>Read a Lucene index and make a spelling dictionary from it. |
| static void | createDict(IndexReader indexReader, Analyzer analyzer, SpellWriter spellWriter, ProgressTracker prog)<br>Read a Lucene index and make a spelling dictionary from it. |
| static void | main(String[] args)<br>Command-line interface for building a dictionary directly from a Lucene index without writing any code. |
| static void | queueWords(IndexReader reader, Analyzer analyzer, SpellWriter writer, ProgressTracker prog)<br>Re-tokenize all the words in stored fields within a Lucene index, and queue them to a spelling dictionary. |
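
The re-tokenize-and-queue flow described for queueWords can be pictured with a small, library-free sketch. Everything in it is a hypothetical stand-in rather than this class's code: nested lists play the role of the stored field values in an index, `tokenize` plays the role of a minimal analyzer, and a map with occurrence counts plays the role of the spelling dictionary (whether the real dictionary keeps counts is an assumption here).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the re-tokenize-and-queue flow; not the library's code. */
class QueueWordsSketch {

    /** Minimal tokenization: split on non-letter/non-digit runs and lower-case. No stemming. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /**
     * Walk every stored field value of every document and count each word.
     * Each inner list stands for one document's stored field values.
     */
    static Map<String, Integer> queueWords(List<List<String>> storedFieldsPerDoc) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (List<String> doc : storedFieldsPerDoc) {
            for (String fieldValue : doc) {
                for (String word : tokenize(fieldValue)) {
                    dictionary.merge(word, 1, Integer::sum);
                }
            }
        }
        return dictionary;
    }

    public static void main(String[] args) {
        List<List<String>> docs = new ArrayList<>();
        docs.add(List.of("Spelling correction", "A spelling dictionary"));
        docs.add(List.of("Dictionary creation"));
        System.out.println(queueWords(docs));
    }
}
```

Only stored field values can be walked this way, which matches the class description's point that building the dictionary at index time also captures non-stored fields.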
public static void createDict(Directory indexDir,
File dictDir)
throws IOException
StopAnalyzer.ENGLISH_STOP_WORDS).

Parameters:
indexDir - directory containing the Lucene index
dictDir - directory to receive the spelling dictionary

Throws:
IOException

public static void createDict(Directory indexDir,
File dictDir,
ProgressTracker prog)
throws IOException
StopAnalyzer.ENGLISH_STOP_WORDS).

Parameters:
indexDir - directory containing the Lucene index
dictDir - directory to receive the spelling dictionary
prog - tracker called periodically to display progress

Throws:
IOException

public static void createDict(IndexReader indexReader,
Analyzer analyzer,
SpellWriter spellWriter,
ProgressTracker prog)
throws IOException
StopAnalyzer.ENGLISH_STOP_WORDS).

Parameters:
indexReader - used to read fields from a Lucene index
analyzer - used to tokenize fields from the index; generally, this should do minimal filtering, taking care to avoid substantive token modification (such as stemming or depluralization). A good choice is MinimalAnalyzer.
spellWriter - receives words to be added to the dictionary
prog - tracker called periodically to display progress

Throws:
IOException

public static void queueWords(IndexReader reader,
Analyzer analyzer,
SpellWriter writer,
ProgressTracker prog)
throws IOException
Parameters:
reader - used to read fields from a Lucene index
analyzer - used to tokenize fields from the index; generally, this should do minimal filtering, taking care to avoid substantive token modification (such as stemming or depluralization). A good choice is MinimalAnalyzer.
writer - receives words to be added to the dictionary
prog - tracker called periodically to display progress

Throws:
IOException

public static void main(String[] args)
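
The analyzer parameter notes above advise against stemming or depluralization. A minimal sketch of why, using plain Java rather than any Lucene analyzer: `naiveDepluralize` is a deliberately crude, hypothetical rule invented for this illustration, and the sets stand in for the words a dictionary would receive.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical comparison of minimal tokenization versus a crude depluralizer. */
class AnalyzerChoiceSketch {

    /** Lower-case and split only, so the surface forms users actually type are preserved. */
    static Set<String> minimalTokens(String text) {
        Set<String> out = new LinkedHashSet<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty()) {
                out.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    /** Crude illustrative rule (an assumption, not a real stemmer): trim plural-looking endings. */
    static String naiveDepluralize(String word) {
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 3) + "i";
        }
        if (word.endsWith("s") && word.length() > 3) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    /** Same tokens as minimalTokens, but passed through the depluralizer first. */
    static Set<String> stemmedTokens(String text) {
        Set<String> out = new LinkedHashSet<>();
        for (String word : minimalTokens(text)) {
            out.add(naiveDepluralize(word));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Libraries index analysis";
        System.out.println("minimal: " + minimalTokens(text));
        System.out.println("stemmed: " + stemmedTokens(text));
    }
}
```

With the modified tokens, fragments such as "librari" and "analysi" would enter the dictionary in place of the real words, so a correctly typed "libraries" could be flagged or "corrected" to a non-word.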