Docfd

TUI multiline fuzzy document finder

Think of it as an interactive grep for text files, PDFs, DOCXs, etc., but word/token based instead of regex and line based, so you can easily search across lines.

Docfd aims to provide good UX via integration with common text editors and PDF viewers, so you can jump directly to a search result with a single key press.


Navigating repo:


Quick search with non-interactive mode:


Navigating PDF and opening it to the closest location to the selected search result via PDF viewer integration:

Features

Text editor integration

Docfd uses the text editor specified by $VISUAL (this is checked first) or $EDITOR.
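The lookup order above can be sketched as follows. This is an illustration only, not Docfd's actual code: the function name pick_editor is hypothetical, and treating an empty variable as unset is an assumption.

```python
import os


def pick_editor(env=None):
    """Return the editor command, checking $VISUAL first and then $EDITOR.

    Hypothetical helper for illustration. Returns None when neither
    variable holds a non-empty value (treating empty as unset is an
    assumption, not documented Docfd behaviour).
    """
    env = os.environ if env is None else env
    for name in ("VISUAL", "EDITOR"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None
```

Passing a plain dictionary instead of the real environment makes the precedence easy to check by hand.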

Docfd opens the file at the first line of the search result for the following editors:

PDF viewer integration

Docfd guesses the default PDF viewer based on the output of xdg-mime query default application/pdf, and invokes the viewer either directly or via flatpak depending on where the desktop file can be first found in the list of directories specified by $XDG_DATA_DIRS.
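A rough sketch of the desktop file lookup described above, assuming the desktop file ID has already been obtained from xdg-mime. The function name find_desktop_file, the example ID org.example.Viewer.desktop, and the rule "a path component named flatpak means the viewer is launched via flatpak" are all assumptions made for illustration; Docfd's real detection logic may differ.

```python
import os
import posixpath


def find_desktop_file(desktop_id, xdg_data_dirs, is_file=os.path.isfile):
    """Return (path, via_flatpak) for the first matching desktop file.

    xdg_data_dirs is the colon-separated value of $XDG_DATA_DIRS.
    Directories are tried in order, so the first hit wins.
    is_file is injectable so the lookup can be tried without touching disk.
    """
    for data_dir in xdg_data_dirs.split(":"):
        if not data_dir:
            continue
        # Desktop entries live in the "applications" subdirectory.
        candidate = posixpath.join(data_dir, "applications", desktop_id)
        if is_file(candidate):
            # Assumption: a flatpak export directory has "flatpak" in its path.
            via_flatpak = "flatpak" in candidate.split("/")
            return candidate, via_flatpak
    return None
```

Because the directories are tried in order, swapping two entries of $XDG_DATA_DIRS can change whether the viewer is invoked directly or via flatpak.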

Docfd opens the file at the first page of the search result and starts a text search for the most unique word of the matched phrase within that page for the following viewers:

Docfd opens the file at the first page of the search result for the following viewers:

Installation

Statically linked binaries are available via GitHub releases.

Docfd is also packaged on:

Notes for packagers: Apart from the OCaml toolchain needed for building (if you are packaging from source), Docfd also requires the following external tools at run time for full functionality:

Launching

Read from piped stdin
command | docfd

Docfd uses single file view when the document source is piped stdin.

No paths should be supplied as arguments in this case. If any paths are specified, then stdin is ignored.

Scan for files
docfd [PATH]...

The list of paths can contain directories. Each directory in the list is scanned recursively for files with one of the following extensions by default:

You can change the file extensions to use via --exts, or add onto the list of extensions via --add-exts.
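The difference between replacing and extending the extension list can be sketched as below. The helper names effective_exts and is_candidate are hypothetical, the default list in the usage note is a placeholder rather than Docfd's real defaults, and applying --add-exts on top of --exts when both are given is an assumption.

```python
def effective_exts(default_exts, exts=None, add_exts=None):
    """Compute the extension list used for scanning.

    exts models --exts (replaces the defaults) and add_exts models
    --add-exts (appends to whatever list is in effect).
    """
    chosen = list(default_exts if exts is None else exts)
    for ext in add_exts or []:
        if ext not in chosen:  # avoid duplicate entries
            chosen.append(ext)
    return chosen


def is_candidate(path, exts):
    """Return True if path ends with one of the given extensions."""
    return path.endswith(tuple(exts))
```

For example, with placeholder defaults [".txt", ".md"], effective_exts(defaults, add_exts=[".log"]) keeps both defaults and adds .log, while effective_exts(defaults, exts=[".log"]) leaves only .log.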

If the list PATHs is empty, then Docfd defaults to scanning the current directory (.).

If any of the files ends with .pdf, then pdftotext is required to continue.

If exactly one file is specified in the list of paths, then Docfd uses single file view. Otherwise, Docfd uses multi-file view.

Scan for files then select with fzf
docfd [PATH]... ?

The ? can be in any position in the path list. If any of the paths is ?, then the discovered files are passed to fzf for selection.

Use list of paths from file
docfd [PATH]... --paths-from paths.txt

The final list of paths used is then the concatenation of PATHs and paths listed in paths.txt, which has one path per line.

The list PATHs does not default to . when --paths-from is used.
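The two rules above (concatenation, and no implicit . when --paths-from is used) can be sketched as follows. The function name resolve_paths is hypothetical, and skipping blank lines in the paths file is an assumption.

```python
def resolve_paths(cli_paths, paths_from_lines=None):
    """Combine command-line PATHs with lines read from a paths file.

    paths_from_lines is None when --paths-from is not used; otherwise
    it is an iterable of lines, one path per line.
    """
    if paths_from_lines is None:
        # Without --paths-from, an empty list defaults to the current directory.
        return list(cli_paths) if cli_paths else ["."]
    listed = [line.rstrip("\n") for line in paths_from_lines]
    listed = [p for p in listed if p]  # assumption: blank lines are ignored
    # With --paths-from, there is no implicit "." default.
    return list(cli_paths) + listed
```

Note that resolve_paths([], []) returns an empty list rather than ["."], mirroring the rule that the default does not apply once --paths-from is given.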

Searching

The search field takes a search expression as input. A search expression is one of:

To use literal ?, (, ) or |, a backslash (\) needs to be placed in front of the character.
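When a search expression is built programmatically, the escaping rule above can be applied with a small helper. This sketch uses a hypothetical name, escape_literal, and only handles the four characters listed; how a literal backslash in the input should be treated is not covered here.

```python
import re


def escape_literal(text):
    """Prefix each literal ?, (, ) or | with a backslash."""
    return re.sub(r"([?()|])", r"\\\1", text)
```

For instance, escape_literal("(a|b)") produces the five special characters each preceded by a backslash, so the result is searched for literally instead of being parsed as a group with an alternative.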

Optional operator handling specifics

For a phrase with an optional operator, such as ?word0 word1 ..., the first word is grouped implicitly, i.e. it is treated as ?(word0) word1 ....

Search phrase and search procedure

Document content and user input in the search field are tokenized/segmented in the same way, based on:

A search phrase is a list of said tokens.

The search procedure is a depth-first search (DFS) through the document index, where the search range for a word is fixed to a configured range surrounding the previous word (when applicable).
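The idea of a range-limited DFS can be illustrated with a toy version. This is a simplified sketch and not Docfd's implementation: the index is a plain dictionary from token to positions, tokens match only by exact equality, and the names search and max_dist are hypothetical.

```python
def search(doc_tokens, phrase, max_dist=2):
    """Return every list of token positions that matches the phrase.

    The first phrase token may match anywhere. Each later token must
    lie within max_dist positions (in either direction) of the
    position chosen for the previous token.
    """
    # Toy index: token -> list of positions in the document.
    positions = {}
    for i, tok in enumerate(doc_tokens):
        positions.setdefault(tok, []).append(i)

    results = []

    def dfs(k, chosen):
        if k == len(phrase):
            results.append(list(chosen))
            return
        for pos in positions.get(phrase[k], []):
            if chosen and abs(pos - chosen[-1]) > max_dist:
                continue  # outside the range around the previous word
            if pos in chosen:
                continue  # do not reuse a position already matched
            chosen.append(pos)
            dfs(k + 1, chosen)
            chosen.pop()  # backtrack

    if phrase:
        dfs(0, [])
    return results
```

Because positions index tokens and not lines, a phrase whose words sit on different lines matches just as easily as one on a single line, which is the point of the token-based design.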

A token in the index matches a token in the search phrase if they fall into one of the following cases:

Search results are then ranked using heuristics.

Common controls between multi-file view and single file view

Navigation mode

Search mode

Multi-file view

The default TUI is divided into four sections:

Controls

Docfd operates in modes. The initial mode is navigation mode.

Navigation mode

Single file view

If the path given to Docfd is not a directory, then single file view is used.

In this view, the TUI is divided into only two sections:

Controls

The controls are simplified in single file view: Shift is optional when scrolling through the search result list.

Navigation mode

Limitations

Acknowledgement