Projects

Open Source Tools


These are tools that I originally built for personal use in my home compute lab, where I find them very helpful. I'm open sourcing them because I think others may benefit from them as well, and I've decided to put in the work to polish them up for public release!

mattstash

Inspired by the great Credstash package, this is a CLI and Python package that uses KeePass on the backend. It is intended for use as a sidecar in containerized Kubernetes pod deployments, without Credstash's DynamoDB dependency. It can also be used interactively and in Python scripts.

forklift

A tool for loading messy source data into Parquet files with an opinionated schema, plus many convenience tools for typical data engineering tasks. It uses PyArrow for fast, memory-efficient processing.

Overland Listener

Overland is a popular iOS app for location tracking. Several backends exist for it, but I've never had much success with them, and they're designed for different purposes than my use case: using location data for automation (e.g., sending a Pushover notification when I arrive somewhere). To solve this, I built a tool that streams Overland data directly into an S3 bucket.
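The core idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual Overland Listener code: it assumes the app POSTs a JSON body with a `locations` array and expects `{"result": "ok"}` in response (my understanding of Overland's documented convention), and the key layout, the `handle_batch` and `object_key` names, and the `put_object` callback are all hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Optional


def object_key(received_at: datetime, prefix: str = "overland") -> str:
    """Build a date-partitioned object key (hypothetical layout)."""
    when = received_at.astimezone(timezone.utc)
    return f"{prefix}/{when:%Y/%m/%d}/{when:%H%M%S}.json"


def handle_batch(
    body: bytes,
    put_object: Callable[[str, bytes], None],
    received_at: Optional[datetime] = None,
) -> dict:
    """Store one POSTed batch via `put_object` and build the reply.

    `put_object(key, data)` is an injected uploader so that the storage
    backend (S3 or anything else) stays out of the parsing logic.
    """
    payload = json.loads(body)
    locations = payload.get("locations", [])
    if locations:  # skip empty batches instead of writing empty objects
        when = received_at or datetime.now(timezone.utc)
        put_object(object_key(when), json.dumps(payload).encode("utf-8"))
    # Assumed acknowledgement format expected by the app.
    return {"result": "ok"}


# Usage with an in-memory stand-in for the bucket.
bucket = {}
reply = handle_batch(
    b'{"locations": [{"type": "Feature"}]}',
    lambda key, data: bucket.__setitem__(key, data),
    received_at=datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc),
)
```

With boto3, the uploader could be a small lambda around `s3.put_object(Bucket=..., Key=key, Body=data)`, where the bucket name is whatever you have configured. Keeping the upload behind a callback also makes the handler easy to test without touching AWS.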

Name TBD – DE/DS Dev Environment

“Batteries Included” Data Engineering environment
As a self-hosting hobbyist, I’ve developed numerous packages and containers for data analysis. I plan to package and release these in a standard Docker container for cloud-agnostic deployment.