Module elipdotter::index
The index (lookup table of words) lives here.
The trait Provider (and OccurenceProvider) enables multiple types of indices to be defined.
The only one (for now) is Simple. It stores, for each word in the input data (e.g. web pages), a list of all the documents which contain that word. It then fetches those documents again and finds the occurrences within them.
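The two-step lookup described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: SketchIndex, DocId, digest, documents_with, words and occurrences_in are hypothetical names, matching is case-sensitive, and a word is assumed to be any run of alphanumeric characters.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a document id; not the crate's `Id` type.
type DocId = u32;

/// Maps each word to the set of documents containing it.
/// No positions are stored, only document membership.
#[derive(Default)]
struct SketchIndex {
    words: BTreeMap<String, BTreeSet<DocId>>,
}

/// Splits `text` into alphanumeric words, paired with their byte offsets.
fn words<'a>(text: &'a str) -> impl Iterator<Item = (usize, &'a str)> + 'a {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        // Each word is a subslice of `text`, so the pointer difference
        // is its byte offset within `text`.
        .map(move |w| (w.as_ptr() as usize - text.as_ptr() as usize, w))
}

impl SketchIndex {
    /// Records every word of `text` as present in `doc`.
    fn digest(&mut self, doc: DocId, text: &str) {
        for (_, word) in words(text) {
            self.words.entry(word.to_owned()).or_default().insert(doc);
        }
    }

    /// Step 1: which documents contain `word`? Returned in ascending id order.
    fn documents_with(&self, word: &str) -> Vec<DocId> {
        self.words
            .get(word)
            .map(|docs| docs.iter().copied().collect())
            .unwrap_or_default()
    }
}

/// Step 2: re-read a candidate document and return the byte offsets
/// at which `word` occurs as a whole word.
fn occurrences_in(text: &str, word: &str) -> Vec<usize> {
    words(text)
        .filter(|(_, w)| *w == word)
        .map(|(offset, _)| offset)
        .collect()
}
```

The trade-off this sketch is meant to show: the index stays small because it holds no positions, at the cost of reading each candidate document a second time when occurrences are needed.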
The DocumentMap makes it fast to get a document's ID from its name and vice versa.
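One common way to get fast lookups in both directions is to keep two maps in sync. The sketch below assumes that approach. SketchDocumentMap and its methods are hypothetical and do not reflect the actual layout or API of DocumentMap.

```rust
use std::collections::HashMap;

/// Hypothetical bidirectional map between document names and numeric ids.
#[derive(Default)]
struct SketchDocumentMap {
    name_to_id: HashMap<String, u64>,
    id_to_name: HashMap<u64, String>,
    next_id: u64,
}

impl SketchDocumentMap {
    /// Returns the id for `name`, allocating a fresh one on first sight.
    fn insert(&mut self, name: &str) -> u64 {
        if let Some(&id) = self.name_to_id.get(name) {
            return id;
        }
        let id = self.next_id;
        self.next_id += 1;
        // Both directions are updated together so they never disagree.
        self.name_to_id.insert(name.to_owned(), id);
        self.id_to_name.insert(id, name.to_owned());
        id
    }

    /// Name -> id lookup.
    fn id(&self, name: &str) -> Option<u64> {
        self.name_to_id.get(name).copied()
    }

    /// Id -> name lookup.
    fn name(&self, id: u64) -> Option<&str> {
        self.id_to_name.get(&id).map(String::as_str)
    }
}
```

Storing the name twice costs memory, but both lookups become a single hash-map access.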
Structs
- Wrapper for representing T as only containing alphanumeric characters.
- If Occurence is part of an AND, these can be associated to tell where the other parts of the AND chain are.
- Map of documents and their Ids to quickly get name from id and vice versa.
- Id of a document.
- Index which keeps track of all occurrences of all words.
- The docs this word exists in. Each doc has an associated LosslessDocOccurrences which keeps track of all the occurrences in that document.
- The occurrences of a word in this document.
- Get occurrences of a word (or similar words) from this Lossless index.
- A list of missing occurrences collected when searching for occurrences using SimpleOccurences.
- An occurrence of crate::Query.
- Needed to index a custom struct in maps. We have to have the same type, so this acts as both the borrowed and owned. Eq isn’t implemented as you’d probably want to check which document it belongs to as well.
Traits
- Allows inserting words and removing occurrences from documents.
Functions
- Returns the next valid UTF-8 character.
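
One plausible reading of "next valid UTF-8 character" is the next Unicode scalar value, skipping the surrogate range that char cannot represent. The sketch below assumes that reading. The name next_char_sketch, its signature, and returning None at the end of the range are illustrative choices, not the documented behaviour of the crate's function.

```rust
/// Returns the Unicode scalar value following `c`, or `None` if `c` is `char::MAX`.
/// Hypothetical sketch; the crate's own function may behave differently.
fn next_char_sketch(c: char) -> Option<char> {
    let mut code = c as u32 + 1;
    // U+D800..=U+DFFF are surrogates and are not valid `char` values,
    // so jump straight past them.
    if (0xD800..=0xDFFF).contains(&code) {
        code = 0xE000;
    }
    // `from_u32` rejects anything above U+10FFFF, which handles `char::MAX`.
    char::from_u32(code)
}
```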