Bot that monitors a Matrix room for PDF and JPEG files and uploads them to Paperless-ngx. Supports E2E encrypted attachments via inline AES keys, historical catchup on startup, exponential backoff retries with a permanent give-up after max attempts, file format validation via magic bytes, Uptime Kuma heartbeat monitoring, and email alerts on errors via SMTP SSL.
matrix-paperless-ingest
Monitors a Matrix room for PDF and JPEG files and uploads them to Paperless-ngx. Designed for rooms bridged from WhatsApp via mautrix-whatsapp.
- Processes the full room history on startup (skips files already in Paperless)
- Listens for new files indefinitely
- Retries failed uploads with exponential backoff
- State is tracked in a local SQLite database
Requirements
- Python 3.11+
- uv (curl -LsSf https://astral.sh/uv/install.sh | sh)
- libolm + python-olm: must be installed via the system package manager because python-olm's build system is incompatible with modern CMake
Arch Linux:
sudo pacman -S libolm python-olm
Ubuntu:
sudo apt install libolm3 python3-olm
Setup
1. Clone and install dependencies
git clone <repo>
cd matrix-paperless-ingest
uv venv --system-site-packages
uv sync --no-install-package python-olm
2. Create a Matrix bot account
Create a new Matrix account for the bot on your homeserver (e.g. via Element), then invite it to the room you want to monitor and accept the invite.
3. Generate a Matrix access token
Log in with the bot account to obtain an access token and device ID:
curl -XPOST 'https://jeena.net/_matrix/client/v3/login' \
-H 'Content-Type: application/json' \
-d '{
"type": "m.login.password",
"user": "@yourbot:jeena.net",
"password": "yourpassword"
}'
Copy access_token and device_id from the response. You can then delete the
password from your notes — it is not needed again.
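If curl is not available, the same login can be sketched in Python with only the standard library. The helper names build_login_request and extract_credentials are made up for this example and are not part of the project; the endpoint and the payload fields mirror the curl command above.

```python
import json
import urllib.request


def build_login_request(homeserver: str, user: str, password: str) -> urllib.request.Request:
    """Build the POST request for the Matrix password login endpoint."""
    payload = {"type": "m.login.password", "user": user, "password": password}
    return urllib.request.Request(
        f"{homeserver.rstrip('/')}/_matrix/client/v3/login",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_credentials(response_body: bytes) -> tuple[str, str]:
    """Pull access_token and device_id out of the JSON login response."""
    data = json.loads(response_body)
    return data["access_token"], data["device_id"]


# Example use (performs a real network call, so it is left commented out):
# request = build_login_request("https://jeena.net", "@yourbot:jeena.net", "yourpassword")
# with urllib.request.urlopen(request) as response:
#     access_token, device_id = extract_credentials(response.read())
```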
4. Find your Matrix room ID
In Element: open the room → Settings → Advanced → Internal room ID.
It looks like !abc123:jeena.net.
5. Find your Paperless inbox tag ID
In Paperless-ngx, go to Tags and note the ID of your inbox tag, or look it up via the API:
curl -H 'Authorization: Token YOUR_TOKEN' https://paperless.jeena.net/api/tags/
6. Configure
cp .env.example .env
$EDITOR .env
Fill in all values:
MATRIX_HOMESERVER=https://jeena.net
MATRIX_USER=@yourbot:jeena.net
MATRIX_ACCESS_TOKEN=syt_...
MATRIX_DEVICE_ID=ABCDEFGH
MATRIX_ROOM_ID=!abc123:jeena.net
PAPERLESS_URL=https://paperless.jeena.net
PAPERLESS_TOKEN=your_paperless_api_token
PAPERLESS_INBOX_TAG_ID=1
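Before the first run it can help to confirm that no value was left empty. The following is a minimal sketch of such a check, not something the project ships: parse_env and missing_keys are hypothetical helpers, and the parser only handles plain KEY=value lines without quoting.

```python
# The eight settings listed above; all of them must be filled in.
REQUIRED_KEYS = (
    "MATRIX_HOMESERVER",
    "MATRIX_USER",
    "MATRIX_ACCESS_TOKEN",
    "MATRIX_DEVICE_ID",
    "MATRIX_ROOM_ID",
    "PAPERLESS_URL",
    "PAPERLESS_TOKEN",
    "PAPERLESS_INBOX_TAG_ID",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Example use:
# with open(".env", encoding="utf-8") as handle:
#     print(missing_keys(parse_env(handle.read())))
```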
7. Test
uv run --no-sync python ingest.py
Watch the logs. It will process all historical messages, then listen for new ones. Press Ctrl-C to stop.
Install as a systemd service
# Create a dedicated user
sudo useradd -r -s /bin/false matrix-paperless-ingest
# Copy the project
sudo cp -r . /opt/matrix-paperless-ingest
sudo chown -R matrix-paperless-ingest:matrix-paperless-ingest /opt/matrix-paperless-ingest
# Install and start the service
sudo cp paperless-ingest.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now paperless-ingest
# Check logs
sudo journalctl -u paperless-ingest -f
Viewing retry queue
sqlite3 state.db "SELECT filename, status, retry_count, datetime(next_retry, 'unixepoch') FROM processed_events WHERE status = 'failed';"
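The same query can be run from Python, for example in a monitoring script. This sketch relies only on the table and column names that appear in the command above; the failed_uploads helper and the ordering by next_retry are additions made for this example.

```python
import sqlite3

# Same columns and filter as the sqlite3 command above, sorted by next retry time.
FAILED_QUERY = """
SELECT filename, status, retry_count, datetime(next_retry, 'unixepoch')
FROM processed_events
WHERE status = 'failed'
ORDER BY next_retry
"""


def failed_uploads(conn: sqlite3.Connection) -> list[tuple]:
    """Return one row per failed upload that is still tracked in the database."""
    return conn.execute(FAILED_QUERY).fetchall()


# Example use:
# conn = sqlite3.connect("state.db")
# for filename, status, retry_count, next_retry in failed_uploads(conn):
#     print(filename, retry_count, next_retry)
# conn.close()
```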
Moving to a new server
- Copy the project directory (including .env and state.db)
- Install uv, libolm3, and python3-olm on the new server
- Run uv venv --system-site-packages && uv sync --no-install-package python-olm
- Install the systemd service as above