public final class StandardAnalyzer extends StopwordAnalyzerBase
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
You may specify the Version compatibility when creating StandardAnalyzer.
ClassicTokenizer and ClassicAnalyzer are the pre-3.1 implementations of StandardTokenizer and StandardAnalyzer.
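The order of the chain described above (tokenize, lowercase, then remove stop words) can be illustrated with a minimal sketch that uses only the JDK. It is not Lucene code: the class name `AnalysisChainSketch`, the abbreviated stop list, and the regular-expression split are assumptions made for illustration, and the split is only a rough stand-in for the segmentation performed by StandardTokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class AnalysisChainSketch {
    // Hypothetical, abbreviated stop list; the real STOP_WORDS_SET is longer.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "is", "to"));

    // Tokenize, lowercase, then drop stop words, in that order.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        // Split on runs of characters that are neither letters nor digits
        // (a simplification, not Unicode text segmentation).
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator yields an empty first element
            }
            String lower = token.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(lower)) {
                terms.add(lower);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [quick, brown, fox, friend, dog]
        System.out.println(analyze("The Quick Brown Fox is a Friend of the Dog", STOP_WORDS));
    }
}
```

Lowercasing before the stop-word check is what lets a capitalized "The" match a lowercase entry in the stop list.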
Nested classes/interfaces inherited from class Analyzer: Analyzer.GlobalReuseStrategy, Analyzer.PerFieldReuseStrategy, Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_TOKEN_LENGTH: Default maximum allowed token length. |
static CharArraySet | STOP_WORDS_SET: An unmodifiable set containing some common English words that are usually not useful for searching. |
Fields inherited from class StopwordAnalyzerBase: stopwords
Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor and Description |
---|
StandardAnalyzer(): Builds an analyzer with the default stop words (STOP_WORDS_SET). |
StandardAnalyzer(CharArraySet stopWords): Builds an analyzer with the given stop words. |
StandardAnalyzer(Reader stopwords): Builds an analyzer with the stop words from the given reader. |
StandardAnalyzer(Version matchVersion): Deprecated. |
StandardAnalyzer(Version matchVersion, CharArraySet stopWords): Deprecated. |
StandardAnalyzer(Version matchVersion, Reader stopwords): Deprecated. |
Modifier and Type | Method and Description |
---|---|
protected Analyzer.TokenStreamComponents | createComponents(String fieldName, Reader reader): Creates a new Analyzer.TokenStreamComponents instance for this analyzer. |
int | getMaxTokenLength() |
void | setMaxTokenLength(int length): Sets the maximum allowed token length. |
Methods inherited from class StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet
Methods inherited from class Analyzer: close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, setVersion, tokenStream, tokenStream
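public static final int DEFAULT_MAX_TOKEN_LENGTH
Default maximum allowed token length.

public static final CharArraySet STOP_WORDS_SET
An unmodifiable set containing some common English words that are usually not useful for searching.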

public StandardAnalyzer(CharArraySet stopWords)
Builds an analyzer with the given stop words.
Parameters:
stopWords - stop words

@Deprecated
public StandardAnalyzer(Version matchVersion, CharArraySet stopWords)
Deprecated. Use StandardAnalyzer(CharArraySet) instead.

public StandardAnalyzer()
Builds an analyzer with the default stop words (STOP_WORDS_SET).

@Deprecated
public StandardAnalyzer(Version matchVersion)
Deprecated. Use StandardAnalyzer() instead.

public StandardAnalyzer(Reader stopwords) throws IOException
Builds an analyzer with the stop words from the given reader.
Parameters:
stopwords - Reader to read stop words from
Throws:
IOException
See Also:
WordlistLoader.getWordSet(Reader)

@Deprecated
public StandardAnalyzer(Version matchVersion, Reader stopwords) throws IOException
Deprecated. Use StandardAnalyzer() instead.
Throws:
IOException
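
To make the Reader-based constructor more concrete, the sketch below shows one way a word list could be read from a Reader, using only the JDK. It assumes one stop word per line, with surrounding whitespace trimmed and blank lines skipped. That format is an assumption for illustration and may differ in detail from what WordlistLoader.getWordSet(Reader) does. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class StopWordReaderSketch {
    // Reads one stop word per line (assumed format) into an unmodifiable set.
    static Set<String> readStopWords(Reader reader) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word); // duplicates collapse because this is a set
                }
            }
        }
        return Collections.unmodifiableSet(words);
    }

    public static void main(String[] args) throws IOException {
        // prints [the, and, of]
        System.out.println(readStopWords(new StringReader("the\n  and \n\nof\n")));
    }
}
```

Like the constructor it illustrates, the helper declares IOException rather than swallowing it, so a failed read surfaces to the caller.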

public void setMaxTokenLength(int length)
Sets the maximum allowed token length.

public int getMaxTokenLength()
See Also:
setMaxTokenLength(int)
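
The effect of a maximum token length can be sketched with the JDK alone. The page does not say what happens to over-long tokens or what the default is, so this sketch rests on two assumptions: tokens longer than the limit are discarded, and the default limit is 255. The class `MaxTokenLengthSketch` and its whitespace tokenizer are hypothetical and are not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class MaxTokenLengthSketch {
    // Assumed default; this page does not state the constant's value.
    static final int DEFAULT_MAX_TOKEN_LENGTH = 255;

    private int maxTokenLength = DEFAULT_MAX_TOKEN_LENGTH;

    void setMaxTokenLength(int length) {
        this.maxTokenLength = length;
    }

    int getMaxTokenLength() {
        return maxTokenLength;
    }

    // Splits on whitespace and keeps only tokens within the limit.
    // Dropping over-long tokens is an assumption of this sketch.
    List<String> tokenize(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && token.length() <= maxTokenLength) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        MaxTokenLengthSketch sketch = new MaxTokenLengthSketch();
        sketch.setMaxTokenLength(5);
        // prints [short, tiny]
        System.out.println(sketch.tokenize("short extraordinarily tiny"));
    }
}
```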

protected Analyzer.TokenStreamComponents createComponents(String fieldName, Reader reader)
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
Specified by:
createComponents in class Analyzer
Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
reader - the reader passed to the Tokenizer constructor
Returns:
the Analyzer.TokenStreamComponents for this analyzer.

Copyright © 2000–2022 The Apache Software Foundation. All rights reserved.