duck_hunt

Search Shortcut cmd + k | ctrl + k

Documentation

duck_hunt

Downloads 486this week

GitHub stars 3

Extension repository on GitHub

Extension descriptor (YAML)

Parse and analyze test results, build outputs, and CI/CD pipeline logs from 90+ formats with severity filtering and format auto-detection

Maintainer(s): teaguesterling

Installing and Loading

INSTALL duck_hunt FROM community;
LOAD duck_hunt;

Example

-- Parse build errors (warnings and above)
SELECT ref_file, ref_line, message
FROM read_duck_hunt_log('build.log', 'auto', severity_threshold := 'warning')
WHERE status = 'ERROR';

-- Parse test results with all events
SELECT test_name, status, execution_time
FROM read_duck_hunt_log('pytest.json', 'pytest_json', severity_threshold := 'all')
WHERE status = 'FAIL';

-- Process multiple files, skip failures (new in v1.4.0)
SELECT * FROM read_duck_hunt_log('logs/*.json', ignore_errors := true);

-- Control log_content size for memory efficiency (new in v1.4.0)
SELECT * FROM read_duck_hunt_log('huge.log', content := 200);
SELECT * FROM read_duck_hunt_log('huge.log', content := 'smart');

-- Parse GitHub Actions ZIP archives (new in v1.4.0, requires zipfs)
-- INSTALL zipfs FROM community; LOAD zipfs;
SELECT job_name, severity, message
FROM read_duck_hunt_workflow_log('workflow_run.zip', 'github_actions_zip')
WHERE severity = 'error';

-- Debug format detection
SELECT format, can_parse, events_produced, is_selected
FROM duck_hunt_diagnose_read('build.log');

-- Build health badge
SELECT status_badge(
  COUNT(*) FILTER (WHERE status = 'ERROR'),
  COUNT(*) FILTER (WHERE status = 'WARNING')
) FROM read_duck_hunt_log('build.log', 'auto');

Duck Hunt is a comprehensive DuckDB extension for parsing and analyzing development tool outputs. It provides a unified SQL interface to query test results, build logs, linting output, and CI/CD pipeline data from 90+ tools and formats.

Documentation: https://duck-hunt.readthedocs.io/

Core Table Functions:

read_duck_hunt_log(source, format, severity_threshold, ignore_errors, content) - Parse tool outputs from files
parse_duck_hunt_log(text, format, severity_threshold, content) - Parse tool outputs from strings
read_duck_hunt_workflow_log(source, format, severity_threshold, ignore_errors) - Parse CI/CD workflow logs
parse_duck_hunt_workflow_log(text, format, severity_threshold) - Parse workflow strings

New in v1.4.0:

ignore_errors parameter - Continue processing when individual files fail
content parameter - Control log_content size: integer limit, 'smart', 'none', or 'full'
github_actions_zip format - Parse ZIP archives from GitHub Actions (requires zipfs extension)
Workflow delegation - Workflow parsers auto-delegate to tool parsers (make, pytest, etc.)

Diagnostic Functions:

duck_hunt_detect_format(content) - Auto-detect format, returns format name or NULL
duck_hunt_diagnose_read(path) - Debug which parsers match a file
duck_hunt_diagnose_parse(content) - Debug which parsers match content
duck_hunt_formats() - List all supported formats with priorities

Scalar Functions:

status_badge(status) - Convert status to badge: [ OK ], [FAIL], [WARN], [ .. ], [ ?? ]
status_badge(errors, warnings) - Compute badge from counts
status_badge(errors, warnings, is_running) - Badge with running state

Severity Filtering: Filter events by severity level: 'all' < 'info' < 'warning' < 'error' < 'critical'

Supported Formats (90+): See: https://duck-hunt.readthedocs.io/en/latest/formats/

Dynamic: regexp:<pattern> - Custom patterns with named capture groups
Test Frameworks: pytest, Go test, Cargo test, JUnit, RSpec, Mocha/Chai, Google Test, NUnit/xUnit
Linting Tools: ESLint, RuboCop, Pylint, Flake8, MyPy, Clippy, SwiftLint, PHPStan, isort, and more
Build Systems: CMake, Make, Maven, Gradle, Cargo, MSBuild, Node.js, Python, Spack
CI/CD Engines: GitHub Actions, GitLab CI, Jenkins, Docker, Drone CI, Terraform
Distributed Systems: HDFS, Spark, Zookeeper, OpenStack, Android, Blue Gene/L
Infrastructure: Kubernetes, AWS CloudTrail, VPC Flow, S3 Access, Syslog
Debugging: Valgrind, GDB/LLDB, strace

Schema Fields (39): See: https://duck-hunt.readthedocs.io/en/latest/schema/

Reference context: ref_file, ref_line, ref_column (locations mentioned in logs)
Log source: log_file, log_content, log_line_start, log_line_end
Core: event_id, tool_name, event_type, status, severity, message
Error Analysis: fingerprint, similarity_score, pattern_id
Hierarchy: scope, group, unit, subunit (and corresponding _id, _status fields)

Key Features:

Automatic format detection with diagnostic debugging
Transparent compression support (GZIP built-in, ZSTD via parquet)
GitHub Actions ZIP archive support (via zipfs extension)
Severity-based event filtering
Error pattern clustering and fingerprinting
Multi-file glob processing with ignore_errors for robust batch processing
Memory-efficient content modes (limit, smart truncation, omit)
Workflow delegation to tool-specific parsers
Pipeline integration with stdin support
Hierarchical CI/CD workflow parsing

Perfect for CI/CD analysis, automated debugging, test aggregation, quality gates, and agent-driven development workflows.

Added Functions

function_name	function_type	description	comment
duck_hunt_detect_format	scalar	NULL	NULL
duck_hunt_diagnose_parse	table	NULL	NULL
duck_hunt_diagnose_read	table	NULL	NULL
duck_hunt_formats	table	NULL	NULL
duck_hunt_match_command_patterns	table_macro	NULL	NULL
parse_duck_hunt_log	table	NULL	NULL
parse_duck_hunt_workflow_log	table	NULL	NULL
read_duck_hunt_log	table	NULL	NULL
read_duck_hunt_workflow_log	table	NULL	NULL
status_badge	scalar	NULL	NULL

Installing and Loading

Example

About duck_hunt

Added Functions

In this article