A list of tools that generally make life easier when working with data.
Things I use and recommend
- jq is a lightweight and flexible command-line JSON processor.
- csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
- gron transforms JSON into discrete assignments to make it easier to grep for what you want and see the absolute ‘path’ to it. It eases the exploration of APIs that return large blobs of JSON but have terrible documentation. Its primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don’t already know the structure; much of jq’s power is unlocked only once you know that structure.
- yq is a lightweight and portable command-line YAML processor. It uses jq-like syntax and works with YAML files as well as JSON.
I find it useful for working with the output of terraform show -json ~/tmp/tfplan.
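To make the "discrete assignments" idea behind gron concrete, here is a minimal Python sketch of the same kind of flattening. It is not gron's implementation and makes no claim to match its output in every case; the function names `flatten` and `to_assignments` and the sample document are hypothetical, chosen only for illustration.

```python
import json


def flatten(value, path="json"):
    """Yield (path, value) pairs for every node in a parsed JSON document.

    Containers are yielded as empty placeholders ({} or []) so that each
    output line is a self-contained assignment, similar in spirit to gron.
    """
    if isinstance(value, dict):
        yield path, {}
        for key, child in value.items():
            # Use dot notation for identifier-like keys, brackets otherwise.
            if key.isidentifier():
                child_path = f"{path}.{key}"
            else:
                child_path = f"{path}[{json.dumps(key)}]"
            yield from flatten(child, child_path)
    elif isinstance(value, list):
        yield path, []
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars: strings, numbers, booleans, null.
        yield path, value


def to_assignments(document):
    """Return one 'path = value;' line per node, ready to be grepped."""
    return [f"{path} = {json.dumps(value)};" for path, value in flatten(document)]


# Hypothetical sample input, not taken from any real API or Terraform plan.
sample = json.loads('{"plan": {"resources": [{"type": "bucket", "tags": {"env": "dev"}}]}}')
for line in to_assignments(sample):
    print(line)
```

Because every line carries its full path, a plain text search is enough to locate a value: filtering the lines above for "env" leaves only `json.plan.resources[0].tags.env = "dev";`, which shows where the value lives without knowing the structure in advance.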
Things I haven’t tried yet
- Data Retriever automates the first steps in the data analysis pipeline by downloading, cleaning, and standardizing datasets, and importing them into relational databases, flat files, or programming languages.
- VisiData is a terminal interface for exploring and arranging tabular data.
- SQLFluff is a dialect-flexible and configurable SQL linter.
- angle-grinder allows you to parse, aggregate, sum, average, min/max, percentile, and sort your data. You can see it, live-updating, in your terminal.
- immudb is a database with built-in cryptographic proof and verification. It can track changes in sensitive data, and the integrity of the history is protected by the clients, without the need to trust the server. It can operate as a key-value store or as a relational database (SQL).
- lux is a Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process. By simply printing out a dataframe in a Jupyter notebook, Lux recommends a set of visualizations highlighting interesting trends and patterns in the dataset. Visualizations are displayed via an interactive widget that enables users to quickly browse through large collections of visualizations and make sense of their data.