Module elipdotter::source
The index (lookup table of words) lives here.
enables multiple types of indices to be defined.
The only one (for now) is Simple, which stores a list of all the documents
that contain each word in the input data (e.g. web pages). It then fetches those documents
again and finds the occurrences within them.
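The two-step lookup described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `SketchIndex`, `DocId`, `digest`, `documents_with`, and `occurrences_in` are hypothetical names, and lowercasing words is an assumption made only for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a document id; not the crate's own id type.
type DocId = u32;

/// Maps each word to the set of documents that contain it.
/// No positions are stored, only document membership.
#[derive(Default)]
struct SketchIndex {
    words: BTreeMap<String, BTreeSet<DocId>>,
}

impl SketchIndex {
    /// Record every alphanumeric word of `text` as present in `doc`.
    fn digest(&mut self, doc: DocId, text: &str) {
        let words = text
            .split(|c: char| !c.is_alphanumeric())
            .filter(|w| !w.is_empty());
        for word in words {
            self.words
                .entry(word.to_lowercase())
                .or_default()
                .insert(doc);
        }
    }

    /// Step 1: which documents contain `word`? Returned in ascending id order.
    fn documents_with(&self, word: &str) -> Vec<DocId> {
        self.words
            .get(&word.to_lowercase())
            .map(|docs| docs.iter().copied().collect())
            .unwrap_or_default()
    }
}

/// Step 2: re-read a document's text and return the byte offset of
/// every whole-word match of `word`.
fn occurrences_in(text: &str, word: &str) -> Vec<usize> {
    let needle = word.to_lowercase();
    let mut offsets = Vec::new();
    let mut word_start: Option<usize> = None;
    // A trailing separator is chained on so the last word is flushed too.
    let chars = text
        .char_indices()
        .chain(std::iter::once((text.len(), ' ')));
    for (i, c) in chars {
        if c.is_alphanumeric() {
            if word_start.is_none() {
                word_start = Some(i);
            }
        } else if let Some(start) = word_start.take() {
            if text[start..i].to_lowercase() == needle {
                offsets.push(start);
            }
        }
    }
    offsets
}
```

The trade-off in this kind of design is that the index stays small, since it holds no positions, at the cost of reading the matching documents a second time at query time.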
DocumentMap makes it performant to get the document ID from name and vice versa.
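One common way to make lookups cheap in both directions is to keep two maps in sync. The sketch below assumes that approach; it is not taken from the crate, and `SketchDocumentMap`, `SketchId`, and the method names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical id type used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SketchId(u64);

/// Two maps kept in sync so both directions are a single hash lookup.
#[derive(Default)]
struct SketchDocumentMap {
    name_to_id: HashMap<String, SketchId>,
    id_to_name: HashMap<SketchId, String>,
    next: u64,
}

impl SketchDocumentMap {
    /// Returns the existing id for `name`, or assigns the next free one.
    fn insert(&mut self, name: &str) -> SketchId {
        if let Some(id) = self.name_to_id.get(name) {
            return *id;
        }
        let id = SketchId(self.next);
        self.next += 1;
        self.name_to_id.insert(name.to_owned(), id);
        self.id_to_name.insert(id, name.to_owned());
        id
    }

    /// Name -> id.
    fn id(&self, name: &str) -> Option<SketchId> {
        self.name_to_id.get(name).copied()
    }

    /// Id -> name.
    fn name(&self, id: SketchId) -> Option<&str> {
        self.id_to_name.get(&id).map(String::as_str)
    }
}
```

Storing each name twice costs memory, but it avoids a linear scan in either direction.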
- Wrapper for representing T as only containing alphanumeric characters.
- If an Occurence is part of an AND, these can be associated to tell where the other parts of the AND chain are.
- Map of documents and their Ids to quickly get name from id and vice versa.
- Id of a document.
- Index which keeps track of all occurrences of all words.
- The docs this word exists in. Each doc has an associated LosslessDocOccurrences, which keeps track of all the occurrences in that document.
- The occurrences of a word in this document.
- Get occurrences of a word (or similar words) from this
- A list of missing occurrences collected when searching for occurrences using
- An occurrence of
- Needed to index a custom struct in maps. We have to have the same type, so this acts as both the borrowed and owned.
Eq isn’t implemented, as you’d probably want to check which document it belongs to as well.
- Allows inserting words and removing occurrences from documents.
- Returns the next valid UTF-8 character.
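To illustrate the idea of a wrapper that represents T as containing only alphanumeric characters, here is a small sketch. It assumes the wrapper simply filters characters when iterating and comparing; the name `AlphanumOnly` and its methods are hypothetical and are not the crate's API.

```rust
/// Hypothetical wrapper: views the inner string-like value as if it
/// contained only alphanumeric characters.
struct AlphanumOnly<T>(T);

impl<T: AsRef<str>> AlphanumOnly<T> {
    /// Iterate over the alphanumeric characters, skipping everything else.
    fn chars(&self) -> impl Iterator<Item = char> + '_ {
        self.0.as_ref().chars().filter(|c| c.is_alphanumeric())
    }
}

/// Two wrapped values are equal when their filtered characters match,
/// regardless of the underlying string types.
impl<T: AsRef<str>, U: AsRef<str>> PartialEq<AlphanumOnly<U>> for AlphanumOnly<T> {
    fn eq(&self, other: &AlphanumOnly<U>) -> bool {
        self.chars().eq(other.chars())
    }
}
```

With a wrapper like this, `AlphanumOnly("it's")` compares equal to `AlphanumOnly(String::from("its"))`, because punctuation is ignored without allocating a cleaned copy of either string.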