
matrix-paperless-ingest

Monitors a Matrix room for PDF and JPEG files and uploads them to Paperless-ngx. Designed for rooms bridged from WhatsApp via mautrix-whatsapp.
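
Because bridged files may arrive without a usable filename, the file type is best decided from the content itself. The following is a minimal sketch of such a check using the well-known leading byte signatures of PDF and JPEG. The function name `detect_file_type` and its return values are hypothetical and are not taken from ingest.py.

```python
from typing import Optional

# Leading byte signatures ("magic bytes") of the two accepted formats.
PDF_MAGIC = b"%PDF-"
JPEG_MAGIC = b"\xff\xd8\xff"


def detect_file_type(data: bytes) -> Optional[str]:
    """Return a MIME type for PDF or JPEG content, or None for anything else.

    Hypothetical helper: the real bot may structure this check differently.
    """
    if data.startswith(PDF_MAGIC):
        return "application/pdf"
    if data.startswith(JPEG_MAGIC):
        return "image/jpeg"
    return None
```

Only the first few bytes are inspected, so the check can run on the downloaded payload before anything is sent to Paperless-ngx.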

  • Processes the full room history on startup (skips files already in Paperless)
  • Listens for new files indefinitely
  • Retries failed uploads with exponential backoff
  • Tracks state in a local SQLite database
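
To illustrate the retry behaviour, here is a minimal sketch of an exponential backoff schedule. The base delay, the cap, and the function name `next_retry_at` are assumptions chosen for illustration; the values actually used by ingest.py may differ.

```python
import time
from typing import Optional

BASE_DELAY_SECONDS = 60          # assumed delay before the first retry
MAX_DELAY_SECONDS = 24 * 3600    # assumed upper bound on any single delay


def next_retry_at(retry_count: int, now: Optional[float] = None) -> float:
    """Return the Unix timestamp at which the next attempt should happen.

    The delay doubles with every failed attempt and is capped at
    MAX_DELAY_SECONDS so that a long outage does not push retries out forever.
    """
    if now is None:
        now = time.time()
    delay = min(BASE_DELAY_SECONDS * 2 ** retry_count, MAX_DELAY_SECONDS)
    return now + delay
```

With these assumed constants, the first retry happens after 1 minute, the fourth after 8 minutes, and from the eleventh failure onward the delay stays at 24 hours.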

Requirements

  • Python 3.11+
  • libolm + python-olm — must be installed via the system package manager because python-olm's build system is incompatible with modern CMake

Arch Linux:

sudo pacman -S libolm python-olm

Ubuntu:

sudo apt install libolm3 python3-olm

Setup

1. Clone and install dependencies

git clone <repo>
cd matrix-paperless-ingest
python3 -m venv .venv --system-site-packages
.venv/bin/pip install -r requirements.txt

2. Create a Matrix bot account

Create a new Matrix account for the bot on your homeserver (e.g. via Element), then invite it to the room you want to monitor and accept the invite.

3. Generate a Matrix access token

Log in with the bot account to obtain an access token and device ID:

curl -XPOST 'https://jeena.net/_matrix/client/v3/login' \
  -H 'Content-Type: application/json' \
  -d '{
    "type": "m.login.password",
    "user": "@yourbot:jeena.net",
    "password": "yourpassword"
  }'

Copy access_token and device_id from the response. You can then delete the password from your notes — it is not needed again.
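
If you prefer Python over curl, the same login can be sketched with the standard library. The helper names `build_login_request` and `extract_credentials` are hypothetical. The request body mirrors the curl example, and the response is assumed to contain the `access_token` and `device_id` fields mentioned above.

```python
import json
import urllib.request


def build_login_request(homeserver: str, user: str, password: str) -> urllib.request.Request:
    """Build the POST request for a password login (same payload as the curl call)."""
    payload = {"type": "m.login.password", "user": user, "password": password}
    return urllib.request.Request(
        f"{homeserver}/_matrix/client/v3/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_credentials(response_body: bytes) -> dict[str, str]:
    """Pick the two values needed for .env out of the login response."""
    data = json.loads(response_body)
    return {
        "MATRIX_ACCESS_TOKEN": data["access_token"],
        "MATRIX_DEVICE_ID": data["device_id"],
    }
```

To run it for real, pass the request to `urllib.request.urlopen` and feed the bytes returned by `read()` into `extract_credentials`.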

4. Find your Matrix room ID

In Element: open the room → Settings → Advanced → Internal room ID. It looks like !abc123:jeena.net.

5. Find your Paperless inbox tag ID

In Paperless-ngx, go to Tags and note the ID of your inbox tag, or look it up via the API:

curl -H 'Authorization: Token YOUR_TOKEN' https://paperless.jeena.net/api/tags/
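
The API returns JSON, so the tag ID can also be picked out programmatically. The sketch below assumes the response has the usual paginated shape, with a `results` list whose entries carry `id` and `name`. The function name `find_tag_id` is hypothetical, and only the page passed in is searched.

```python
from typing import Any, Optional


def find_tag_id(tags_response: dict[str, Any], name: str) -> Optional[int]:
    """Return the ID of the tag called `name` (case-insensitive), or None.

    Assumes `tags_response` is the parsed JSON of one page of /api/tags/.
    """
    wanted = name.lower()
    for tag in tags_response.get("results", []):
        if str(tag.get("name", "")).lower() == wanted:
            return tag["id"]
    return None
```

If the tag is not on the first page, follow the pagination links in the response and call the function again on each page.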

6. Configure

cp .env.example .env
$EDITOR .env

Fill in all values:

MATRIX_HOMESERVER=https://jeena.net
MATRIX_USER=@yourbot:jeena.net
MATRIX_ACCESS_TOKEN=syt_...
MATRIX_DEVICE_ID=ABCDEFGH
MATRIX_ROOM_ID=!abc123:jeena.net

PAPERLESS_URL=https://paperless.jeena.net
PAPERLESS_TOKEN=your_paperless_api_token
PAPERLESS_INBOX_TAG_ID=1
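
A missing value is easier to diagnose before the bot connects than after. As a minimal sketch, the settings could be checked like this, assuming they have already been loaded into the process environment. How ingest.py actually reads .env is not shown here, and `load_config` is a hypothetical name.

```python
import os
from typing import Mapping

# The eight settings listed in .env.example.
REQUIRED_KEYS = (
    "MATRIX_HOMESERVER",
    "MATRIX_USER",
    "MATRIX_ACCESS_TOKEN",
    "MATRIX_DEVICE_ID",
    "MATRIX_ROOM_ID",
    "PAPERLESS_URL",
    "PAPERLESS_TOKEN",
    "PAPERLESS_INBOX_TAG_ID",
)


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, raising ValueError if any are unusable."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    config = {key: env[key] for key in REQUIRED_KEYS}
    if not config["PAPERLESS_INBOX_TAG_ID"].isdigit():
        raise ValueError("PAPERLESS_INBOX_TAG_ID must be a number")
    return config
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.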

7. Test

.venv/bin/python ingest.py

Watch the logs. The bot first processes all historical messages in the room, then listens for new ones. Press Ctrl-C to stop.

Install as a systemd user service

# Enable lingering so the service starts at boot without requiring login
loginctl enable-linger

# Install the service
mkdir -p ~/.config/systemd/user
cp matrix-paperless-ingest.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now matrix-paperless-ingest

# Check logs
journalctl --user -u matrix-paperless-ingest -f

Updating dependencies

If you need to add or update a dependency, edit pyproject.toml and regenerate the locked requirements.txt:

.venv/bin/pip install pip-tools
.venv/bin/pip-compile pyproject.toml
.venv/bin/pip install -r requirements.txt

Viewing retry queue

sqlite3 state.db "SELECT filename, status, retry_count, datetime(next_retry, 'unixepoch') FROM processed_events WHERE status = 'failed';"
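
The same query can be run from Python, for example in a small monitoring script. This sketch relies only on the table and column names that appear in the command above. The function name `failed_uploads` and the ordering by `next_retry` are additions for illustration.

```python
import sqlite3


def failed_uploads(conn: sqlite3.Connection) -> list[tuple]:
    """Return (filename, status, retry_count, next_retry as UTC text) for failed rows."""
    query = """
        SELECT filename, status, retry_count, datetime(next_retry, 'unixepoch')
        FROM processed_events
        WHERE status = 'failed'
        ORDER BY next_retry
    """
    return conn.execute(query).fetchall()
```

Open the database with `sqlite3.connect("state.db")` from the project directory and pass the connection to the function.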

Moving to a new server

  1. Copy the project directory to ~/matrix-paperless-ingest (including .env and state.db)
  2. Install libolm and python-olm via the system package manager (libolm3 and python3-olm on Ubuntu, see Requirements)
  3. Run python3 -m venv .venv --system-site-packages && .venv/bin/pip install -r requirements.txt
  4. Install the systemd user service as above