I was staring at a 500MB JSON log file from our API gateway. My manager needed a list of all unique user agents from the last week. Simple request, right?
I fired up VS Code, created a new Python file, and started typing:
import json

with open('api-logs.json', 'r') as f:
    data = json.load(f)  # This is where my laptop froze
The Python process consumed 4GB of RAM and crashed. I tried streaming the JSON with ijson. Then I tried pandas. Thirty minutes later, I had a working script, but I couldn’t shake the feeling that I was using a sledgehammer to crack a nut.
That’s when my coworker walked by and said, “Why not just use jq?”
The Python Reflex: A Familiar Trap
Like most developers, my default response to “process some JSON” was to reach for Python. It’s what I know. It’s what’s comfortable.
But here’s the pattern I kept falling into:
# Step 1: Write the boilerplate (5 minutes)
import json
from collections import Counter

# Step 2: Handle the file size issues (15 minutes)
import ijson  # Wait, do I have this installed?
# pip install ijson
# Okay, how does ijson work again?

# Step 3: Write the actual logic (10 minutes)
user_agents = []
with open('api-logs.json', 'rb') as f:
    parser = ijson.items(f, 'logs.item.user_agent')
    user_agents = list(set(parser))

# Step 4: Debug why it's not working (another 15 minutes)
# Oh right, the JSON structure is nested differently...
This was my routine for every single JSON exploration task: small asks ballooning into 30-45 minute detours.
The One-Liner That Changed Everything
Here’s what my coworker showed me:
cat api-logs.json | jq -r '.logs[].user_agent' | sort -u
Three seconds. One line. Done.
I sat there feeling like I’d been carving wood with a butter knife while a chainsaw sat on my desk the whole time.
What Makes jq Perfect for JSON Exploration
jq is a command-line JSON processor. Think of it like sed or awk, but designed specifically for JSON. And it has three superpowers:
Memory Efficiency: jq's C implementation is far lighter than Python's object model, and for files too big to parse whole it has a --stream mode. That 500MB file? No problem. I’ve used it on multi-gigabyte files without breaking a sweat.
Speed: What took my Python script 45 seconds runs in under 3 seconds with jq.
No Boilerplate: No imports, no file handling, no error checking. Just query and go.
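To be precise, plain jq still parses each input document into memory; it's just vastly more frugal about it than Python. For the truly huge cases, --stream is the escape hatch. A minimal sketch, assuming the same user_agent field from my log file:
# --stream emits [path, value] event pairs instead of building the
# whole document; this pulls every user_agent leaf out of the file
jq -r --stream 'select(length == 2 and .[0][-1] == "user_agent") | .[1]' \
  api-logs.json | sort -u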
The Problem: jq’s Syntax Is… Weird
Here’s where I hit the wall. jq’s syntax feels alien if you’re coming from Python or JavaScript:
# What does this even mean?
jq '[.items[] | select(.status == "active") | {id, name}]'
# Or this?
jq 'group_by(.category) | map({key: .[0].category, count: length})'
It’s powerful, but it has a learning curve. I’d spend 10 minutes searching Stack Overflow for “jq filter array by value” and get lost in documentation.
This is exactly where Claude Code shines.
The Breakthrough: Claude Code as Your jq Translator
I started using Claude Code as my jq query generator. I’d describe what I wanted in plain English, and Claude would write the jq command.
Here’s a real conversation I had last week:
Me: I have a JSON file with API response times. Each entry looks like {"endpoint": "/api/users", "response_time_ms": 145, "timestamp": "2025-01-10T14:23:11Z"}. I want to find the average response time for each endpoint.
Claude Code:
jq -r 'group_by(.endpoint) |
  map({
    endpoint: .[0].endpoint,
    avg_response_time: (map(.response_time_ms) | add / length)
  })' api-logs.json
Ten seconds. Perfect query. Worked on the first try.
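One wrinkle worth knowing: group_by only works on arrays. If your logs are newline-delimited JSON (one object per line) rather than one big array, slurp them first with -s. A sketch, assuming the same field names and a hypothetical api-logs.ndjson:
# -s wraps the whole input stream in an array before grouping
jq -s 'group_by(.endpoint) |
  map({
    endpoint: .[0].endpoint,
    avg_response_time: (map(.response_time_ms) | add / length)
  })' api-logs.ndjson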
My Real-World jq + Claude Code Workflow
Here’s how I actually use this in my daily work:
Use Case 1: Quick Data Exploration
When I get an unfamiliar JSON file, I start with:
Me to Claude: “Show me the structure of this JSON file. I want to understand what fields are available.”
jq -r 'paths(scalars) | map(tostring) | join(".")' data.json | head -20
This shows me all the paths to actual values in the JSON, not the full data. It’s like an instant schema viewer.
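A variant I've since started asking for also prints each value's type, which helps spot fields that are sometimes null and sometimes objects (my own extension of the same idea):
# Each leaf path plus the type of the value sitting there
jq -r 'paths(scalars) as $p |
  "\($p | map(tostring) | join(".")): \(getpath($p) | type)"' data.json | head -20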
Use Case 2: Extracting Nested Data
I had a deployment manifest with nested service configurations. I needed all image tags being deployed:
Me: “Extract all Docker image tags from this Kubernetes deployment JSON. They’re nested under spec.template.spec.containers[].image”
Claude:
jq -r '.items[].spec.template.spec.containers[].image |
  split(":")[1]' deployments.json | sort -u
Perfect. Got my list of unique tags in 2 seconds.
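One caveat I hit later: split(":")[1] breaks on registries that include a port, like registry:5000/app:v1. A more defensive variant strips everything up to the last colon (images without an explicit tag pass through unchanged):
# The greedy ^.*: matches up to the LAST colon, leaving only the tag
jq -r '.items[].spec.template.spec.containers[].image |
  sub("^.*:"; "")' deployments.json | sort -u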
Use Case 3: Transforming JSON Structure
Our monitoring system exports metrics in one format, but our dashboard needs them in another:
Me: “I have an array of metrics like [{name: 'cpu', value: 78.5}]. I need to convert it to an object like {cpu: 78.5, memory: 12.3}”
Claude:
jq 'map({(.name): .value}) | add' metrics.json
I didn’t even know jq could do this. Claude taught me the map({(.name): .value}) pattern that I now use weekly.
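It's easy to convince yourself it works with inline data:
echo '[{"name":"cpu","value":78.5},{"name":"memory","value":12.3}]' |
  jq 'map({(.name): .value}) | add'
# => {"cpu":78.5,"memory":12.3}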
The Scenarios Where This Shines
After a few months of this workflow, here’s where jq + Claude Code became my go-to:
Large log files: Anything over 100MB. Python chokes; jq flows.
Quick exploratory queries: “Show me all unique error codes in this file.” Done in seconds instead of writing a script.
Pipeline operations: Combining jq with other CLI tools (grep, sort, uniq, wc) creates incredibly powerful one-liners (see the sketch after this list).
API response debugging: When an API returns a complex nested JSON response, I pipe it through jq to extract just what I need.
CI/CD scripts: Our deployment pipelines use jq to extract config values from JSON files. It’s faster and more reliable than Python scripts.
AWS/Cloud infrastructure auditing: AWS CLI outputs massive JSON responses. jq makes it trivial to extract exactly what you need without boto3 scripts.
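To make the pipeline point concrete, here's the shape these one-liners usually take; the field names and file name are invented for illustration:
# Top 10 error codes in a newline-delimited log (hypothetical fields)
jq -r 'select(.level == "error") | .code' app-log.ndjson |
  sort | uniq -c | sort -rn | head -10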
Real Examples from My Recent Work
Example 1: Database Migration Analysis
I needed to analyze 10,000 JSON documents from MongoDB to see which ones had a specific deprecated field:
# What Claude Code generated for me
cat mongo-export.json |
  jq -r 'select(.legacy_user_id != null) |
    "\(.id),\(.legacy_user_id),\(.created_at)"' |
  wc -l
Found 2,847 documents with the deprecated field in 4 seconds (mongoexport writes one document per line, which is why the per-object select works without slurping). My initial Python approach would have taken minutes to write and run.
Example 2: GitHub API Exploration
I was using GitHub’s API to audit our repositories. The response was deeply nested JSON with tons of fields I didn’t care about:
Me: “From this GitHub API response, I only want repo name, star count, and last push date”
gh api /user/repos |
  jq -r '.[] |
    "\(.name),\(.stargazers_count),\(.pushed_at)"'
Clean, readable CSV output. Ready to paste into a spreadsheet.
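Claude later pointed me at jq's built-in @csv filter, which quotes fields properly in case a repo name ever contains a comma:
gh api /user/repos |
  jq -r '.[] | [.name, .stargazers_count, .pushed_at] | @csv'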
Example 3: AWS CloudFormation Output Parsing
Our infrastructure team exports CloudFormation stack outputs as JSON. I needed specific resource IDs:
# Claude wrote this after I described what I needed
aws cloudformation describe-stacks --stack-name prod |
  jq -r '.Stacks[0].Outputs[] |
    select(.OutputKey | contains("DatabaseEndpoint")) |
    .OutputValue'
Got the exact database endpoint without manually parsing through 300 lines of JSON.
The Really Cool Stuff: Advanced jq Patterns I Never Would Have Found
This is where Claude Code gets interesting. Once I started asking for more complex transformations, Claude showed me jq patterns that felt like magic.
Pattern 1: Flattening Nested JSON to Dot Notation
I needed to convert a deeply nested configuration object into flat key-value pairs for a dashboard. This one blew my mind:
Me: “I have nested JSON like {a:{b:{c:1},d:[2,3]},e:4}. I want to flatten it to dot notation like a.b.c: 1, a.d.0: 2, etc.”
Claude Code:
jq -n '{a:{b:{c:1},d:[2,3]},e:4}' |
  jq '[paths(scalars) as $p |
    {($p | map(tostring) | join(".")): getpath($p)}] |
    add'
Output:
{"a.b.c":1,"a.d.0":2,"a.d.1":3,"e":4}
This was perfect for migrating config files between systems with different nesting requirements. I would have spent an hour writing Python code to recursively walk the JSON tree. Instead: one line.
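For completeness, the reverse direction (dot notation back to nested JSON) also turns out to be a one-liner. This is my own reconstruction, not from that session, so treat it as a sketch:
# Rebuild nesting from dot-notation keys; numeric segments become array indices
jq 'reduce to_entries[] as $e ({};
  setpath($e.key | split(".") | map(tonumber? // .); $e.value))'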
Pattern 2: AWS EC2 Instance Exploration
Working with AWS CLI output is painful. The JSON is massive and deeply nested. Here are the patterns Claude taught me:
Finding all running instances with their tags:
aws ec2 describe-instances |
  jq -r '.Reservations[].Instances[] |
    select(.State.Name == "running") |
    {
      id: .InstanceId,
      type: .InstanceType,
      name: ((.Tags[]? | select(.Key == "Name") | .Value) // "no-name"),
      ip: .PrivateIpAddress
    }'
The // "no-name" fallback matters: without it, instances missing a Name tag silently vanish from the output.
Listing instances by cost (approximating with instance type):
aws ec2 describe-instances |
  jq -r '.Reservations[].Instances[] |
    select(.State.Name == "running") |
    .InstanceType' |
  sort | uniq -c | sort -rn
Counting requires emitting only the type; with a unique instance ID in each line, uniq -c would count every line exactly once.
This showed me we had 12 m5.2xlarge instances running that nobody knew about. Saved us $3,000/month.
Pattern 3: AWS S3 Bucket Analysis
Finding largest objects across buckets:
aws s3api list-objects-v2 --bucket my-bucket |
  jq -r '.Contents[] |
    select(.Size > 1000000000) |
    "\(.Size/1000000000 | floor)GB,\(.Key),\(.LastModified)"' |
  sort -rn
Counting objects by file extension:
aws s3api list-objects-v2 --bucket my-bucket |
  jq -r '.Contents[].Key |
    match("\\.[^.]+$") |
    .string' |
  sort | uniq -c | sort -rn
Discovered we had 45,000 .tmp files that should have been cleaned up. Another easy win.
Pattern 4: AWS Lambda Function Audit
List all Lambda functions with their memory and last modified date:
aws lambda list-functions |
  jq -r '.Functions[] |
    "\(.FunctionName),\(.MemorySize)MB,\(.Runtime),\(.LastModified)"' |
  column -t -s,
Find Lambda functions that haven’t been updated in over a year:
# date -d is GNU-specific; on macOS use: date -v-1y +%Y-%m-%d
aws lambda list-functions |
  jq --arg cutoff "$(date -d '1 year ago' -I)" -r '
    .Functions[] |
    select(.LastModified < $cutoff) |
    "\(.FunctionName),\(.LastModified)"'
We found 23 Lambda functions from 2021 that were still running (and costing money) but no longer used.
Pattern 5: Comparing Two JSON Files
This one is brilliant for config management:
Me: “I need to see what changed between two JSON config files”
Claude Code:
# Create a merged object showing differences
jq -n --slurpfile old config-old.json \
      --slurpfile new config-new.json '
  $old[0] as $o |
  $new[0] as $n |
  ($o | keys_unsorted) + ($n | keys_unsorted) |
  unique |
  map(. as $key |
    if $o[$key] != $n[$key]
    then {key: $key, old: $o[$key], new: $n[$key]}
    else empty
    end)'
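A tiny worked example, with files invented for illustration:
echo '{"timeout": 30, "retries": 3}' > config-old.json
echo '{"timeout": 60, "retries": 3}' > config-new.json
# Running the query above prints:
# [{"key":"timeout","old":30,"new":60}]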
This saved me during a production incident when I needed to quickly see what config values changed between deployments.
Pattern 6: Complex AWS IAM Policy Analysis
Finding all IAM policies granting S3 full access:
aws iam list-policies --scope Local |
  jq -r '.Policies[].Arn' |
  while read -r policy; do
    version=$(aws iam get-policy --policy-arn "$policy" |
      jq -r '.Policy.DefaultVersionId')
    aws iam get-policy-version \
      --policy-arn "$policy" \
      --version-id "$version" |
      jq --arg policy "$policy" -r '
        # Statement and Action may each be a single value or an array
        [.PolicyVersion.Document.Statement] | flatten[] |
        select(.Effect == "Allow") |
        [.Action] | flatten[] |
        select(contains("s3:*") or contains("s3:Full")) |
        $policy'
  done | sort -u
This is the kind of query that would take a full Python script with boto3. With jq and Claude Code, it’s a powerful one-liner (okay, a few-liner).
Pattern 7: Creating Lookup Tables on the Fly
Me: “I have a list of user IDs and need to create a quick lookup map”
Claude Code:
jq -r 'map({(.user_id): .}) | add' users.json
This converts an array of user objects into a single object keyed by user_id, perfect for fast lookups without loading into Python.
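Another one that's easy to sanity-check inline:
echo '[{"user_id":"u1","name":"Ada"},{"user_id":"u2","name":"Lin"}]' |
  jq 'map({(.user_id): .}) | add | .["u2"].name'
# => "Lin"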
Pattern 8: Recursive Deep Merging
When dealing with multiple config files that need to be merged:
jq -s '.[0] * .[1] * .[2]' base.json override-dev.json override-local.json
This does a deep merge of multiple JSON files in order, with later files overriding earlier ones. Perfect for environment-specific config management.
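When the number of files varies, a reduce generalizes the same idea without hardcoding indices (my own variant):
# Deep-merge however many files are passed, left to right
jq -s 'reduce .[] as $cfg ({}; . * $cfg)' base.json override-*.json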
When Claude Code Teaches You jq Patterns
The unexpected benefit: I’m actually learning jq by using it with Claude Code.
After Claude generated similar queries a few times, I started recognizing patterns:
- [] unpacks arrays
- | pipes data (just like bash)
- select() filters
- map() transforms
- -r for raw output (no quotes)
Now I can write simple jq queries myself. But for complex operations, I still lean on Claude. Why memorize obscure syntax when I can describe what I want in English?
The Performance Difference Is Staggering
I compared my Python script approach vs jq on that original 500MB log file:
Python script (with ijson):
- Development time: 30 minutes
- Execution time: 45 seconds
- Memory usage: ~800MB
- Lines of code: 23
jq one-liner:
- Development time: 30 seconds (asking Claude)
- Execution time: 2.8 seconds
- Memory usage: ~20MB
- Lines of code: 1
That’s not an optimization. That’s a different universe of productivity.
What Makes This Different from Using ChatGPT
I tried this workflow with ChatGPT first. It worked, but there were friction points:
Context switching: Copy question → paste in ChatGPT → copy answer → paste in terminal → test → repeat.
No file access: I’d have to describe my JSON structure. Claude Code can just read the file.
No execution: ChatGPT gives you the command; Claude Code can run it and iterate if it doesn’t work.
With Claude Code, it’s conversational. I can say “that didn’t work, here’s the error” and Claude adjusts the query immediately. It’s like pair programming with someone who actually knows jq.
The Limitations (Because Nothing’s Perfect)
jq Isn’t Always the Answer
When the JSON structure is wildly inconsistent or I need complex business logic, Python is still the right tool. jq is for querying and transforming, not for complex algorithms.
Some Operations Are Still Awkward
Math operations in jq feel clunky. Calculating percentages or doing statistical analysis? I still reach for Python.
The Learning Curve Is Real
Even with Claude Code helping, some advanced jq features (recursive descent, complex conditionals) still make my brain hurt. I’m relying on Claude more than I’d like for those.
You Still Need to Understand JSON Structure
jq won’t magically understand your data. You need to know roughly what structure you’re working with. Claude Code helps, but you can’t be completely clueless about your JSON schema.
What I’m Still Figuring Out
When to stop using jq and switch to Python: There’s a blurry line where jq becomes harder than just writing a quick Python script. I don’t have a firm rule yet.
How to debug complex jq queries: When Claude generates a 5-line jq query that doesn’t work, debugging it is hard. I usually just ask Claude to fix it, but I’d like to understand what’s broken (see the debug tip after this list).
Building reusable jq query libraries: I’ve saved a few common queries in my shell aliases, but I haven’t figured out a good way to organize and share them with my team.
Integrating jq into automated workflows: Some of our CI/CD pipelines would benefit from jq, but I’m not confident enough to refactor them yet.
Mastering AWS cost optimization patterns: Those EC2 and Lambda queries saved us real money, but I know there are more patterns to discover. I want to build a library of jq queries specifically for AWS cost analysis.
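On the debugging point specifically, one builtin that has helped is debug: it echoes whatever flows through it to stderr without disturbing the pipeline, so you can watch intermediate values. A minimal example with assumed field names:
# Watch each element pass through, then filter as usual
jq '.items[] | debug | select(.status == "active")' data.json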
Try This Today
Next time you’re about to write a Python script to process JSON, try this instead:
- Open Claude Code
- Say: “I have a JSON file with [describe structure]. I want to [describe what you need]. Write me a jq command.”
- Run the command
- If it doesn’t work, paste the error back to Claude
You’ll solve your problem faster and learn jq in the process.
Your Turn
How are you handling JSON exploration in your workflow? Are you still writing scripts for everything, or have you found CLI tools that changed your process?
I’m particularly curious if anyone has great jq patterns for working with streaming JSON APIs, or tips for debugging complex jq queries.
Drop me an email at hello@ashishacharya.com. I’m collecting command-line productivity workflows and always looking for better patterns.
And if you have a favorite jq one-liner that feels like magic, please share it. I’m building a collection.