# Parse and Upload Actions
Files:

- `.github/actions/parse/action.yml` - Parse benchmark logs
- `.github/actions/upload/action.yml` - Upload to LogStash/Kibana
These custom composite actions handle parsing benchmark logs and uploading results to Elasticsearch/Kibana for visualization and analysis.
## Purpose
After each benchmark job completes, these actions:
- **Parse action**: Parses the log file to extract timing and performance metrics, generating a JSON payload
- **Upload action**: Sends the JSON payload to LogStash for routing and storage in Elasticsearch/Kibana
## Architecture
The parsing and upload workflow uses two separate actions to process and send benchmark data:
Key architectural details:
- **LogStash endpoint**: LogStash routes requests to the appropriate Kibana instance based on the `token`+`kind` combination
- **Authentication/Routing**: The `token` and `kind` fields are sent as part of the JSON document body (NOT as Bearer authentication in HTTP headers)
- **No traditional credentials**: This design eliminates the need for username/password authentication; routing is handled via the `token`+`kind` fields in the data payload
This routing approach allows different benchmark types and projects to be automatically directed to their respective Kibana instances while maintaining a single, simple integration point for GitHub Actions workflows.
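The body-based routing described above can be sketched in Python. This is illustrative only: `build_routed_payload` and the sample metric fields are hypothetical names, not part of the actual actions.

```python
import json


def build_routed_payload(metrics: dict, token: str, kind: str) -> str:
    """Embed the routing fields directly in the JSON document body.

    LogStash inspects the `token` and `kind` fields inside the payload
    to choose the target Kibana instance, so the HTTP request itself
    needs no Authorization header or stored credentials.
    """
    doc = dict(metrics)   # copy the parsed benchmark metrics
    doc["token"] = token  # benchmark identifier AND routing key
    doc["kind"] = kind    # benchmark type AND routing kind
    return json.dumps(doc)


# The resulting request carries only a Content-Type header:
#   POST https://<kibana-uri>
#   Content-Type: application/json
#   <body: the JSON document produced above>
```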
## Inputs

### Parse Action Inputs
| Input | Description | Required | Example |
|---|---|---|---|
| `job` | Job name | Yes | `rucio`, `eventloop-columnar`, `evnt-native` |
| `log-file` | Path to log file | Yes | `rucio.log` |
| `log-type` | Type of log parser to use | Yes | `rucio`, `athena`, `coffea`, `fastframes` |
| `cluster` | Cluster name | Yes | `UC-AF`, `SLAC-AF`, `BNL-AF` |
| `kibana-token` | Token for benchmark ID | Yes | From secrets |
| `kibana-kind` | Kind for benchmark ID | Yes | From secrets |
| `host` | Hostname to identify the machine | Yes | `${NODE_NAME}` |
| `payload-file` | Path to payload file for size calculation | No | Empty string (default; `payloadSize = -1`) |
| `output-file` | Output JSON file path | No | `payload.json` (default) |
### Upload Action Inputs
| Input | Description | Required | Example |
|---|---|---|---|
| `payload-file` | Path to JSON payload file | Yes | `payload.json` |
| `kibana-uri` | URI endpoint for LogStash | Yes | From secrets |
## Implementation Steps

### Parse Action
The parse action performs these steps:
- **Setup pixi**: Sets up the `kibana` pixi environment with Python 3.13 and required dependencies
- **Parse log file**: Runs the parsing script to generate the JSON payload:
```shell
pixi run -e kibana python parsing/scripts/ci_parse.py \
  --job <job> \
  --log-file <log-file> \
  --log-type <log-type> \
  --cluster <cluster> \
  --token <token> \
  --kind <kind> \
  --host <host> \
  --output payload.json
```
- **Output**: Generates a `payload.json` file in the workspace
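The steps above can be sketched in Python. This is a simplified illustration, not the actual `parsing/scripts/ci_parse.py`; the log-line patterns and helper names are hypothetical.

```python
import json
import re
from pathlib import Path


def parse_log(log_file: str) -> dict:
    """Extract timing/performance metrics from a benchmark log.

    Assumes hypothetical log lines like "runTime: 3600"; the real
    patterns depend on the selected log-type parser.
    """
    text = Path(log_file).read_text()
    metrics = {}
    for key in ("submitTime", "queueTime", "runTime", "status"):
        m = re.search(rf"{key}:\s*(\d+)", text)
        metrics[key] = int(m.group(1)) if m else 0
    return metrics


def write_payload(metrics, job, cluster, token, kind, host, out="payload.json"):
    """Merge workflow inputs with parsed metrics into the JSON payload."""
    doc = {"job": job, "cluster": cluster, "host": host,
           "token": token, "kind": kind, **metrics}
    Path(out).write_text(json.dumps(doc))
```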
### Upload Action
The upload action performs these steps:
- **Validate payload file**: Checks that the payload file exists and displays its contents for debugging
- **Upload to LogStash**: POSTs the JSON payload using curl:
```shell
curl -X POST "https://<kibana-uri>" \
  -H "Content-Type: application/json" \
  -d @payload.json \
  -w "%{http_code}" \
  -s -o /tmp/response.txt
```
- **Verify response**: Checks the HTTP status code and fails if it is not 2xx
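The upload-and-verify logic can be approximated in Python using only the standard library. The action itself uses curl; `upload_payload` and `is_success` are illustrative names, and the sketch assumes (as in the curl command) that the URI secret omits the `https://` scheme.

```python
import urllib.request


def upload_payload(uri: str, payload_file: str) -> int:
    """POST the JSON payload to LogStash and return the HTTP status code."""
    with open(payload_file, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        f"https://{uri}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def is_success(status: int) -> bool:
    """Mirror the action's check: any 2xx status counts as success."""
    return 200 <= status < 300
```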
## Data Structure
The parsed data sent to LogStash (which routes it to Kibana) is validated against a JSON schema (`parsing/schema/payload.schema.json`) to ensure correctness before upload.
Required structure:
```json
{
  "job": "rucio",
  "cluster": "UC-AF",
  "submitTime": 1234567890000,
  "queueTime": 0,
  "runTime": 3600,
  "payloadSize": 1073741824,
  "status": 0,
  "host": "hostname.example.com",
  "token": "<TOKEN>",
  "kind": "benchmark"
}
```
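A lightweight approximation of that validation is shown below. The action validates against the schema file itself; this sketch only checks the required fields and their types with the standard library, and `validate_payload` is an illustrative name.

```python
# Required fields and their Python types, matching the example payload.
REQUIRED_FIELDS = {
    "job": str, "cluster": str, "submitTime": int, "queueTime": int,
    "runTime": int, "payloadSize": int, "status": int,
    "host": str, "token": str, "kind": str,
}


def validate_payload(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the document is valid."""
    errors = []
    for field, typ in REQUIRED_FIELDS.items():
        if field not in doc:
            errors.append(f"missing field: {field}")
        elif not isinstance(doc[field], typ):
            errors.append(f"wrong type for {field}: expected {typ.__name__}")
    return errors
```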
### Field Descriptions
| Field | Type | Description | Source |
|---|---|---|---|
| `job` | String | Job name (e.g., `rucio`, `eventloop-columnar`) | Passed from workflow via `${{ github.job }}` |
| `cluster` | String | AF cluster name (`UC-AF`, `SLAC-AF`, `BNL-AF`) | Passed from workflow |
| `submitTime` | Integer | UTC timestamp (ms since epoch) | Parsed from log |
| `queueTime` | Integer | Queue time (seconds) | Parsed from log |
| `runTime` | Integer | Execution time (seconds) | Parsed from log |
| `payloadSize` | Integer | Output file size (bytes) | Calculated from the `payload-file` input using `Path().stat().st_size` (`-1` if not provided, `0` for an empty file) |
| `status` | Integer | Exit code (`0` = success, non-zero = failure) | Parsed from log |
| `host` | String | Hostname where the job executed (`idn-hostname` format) | Passed from workflow via `${NODE_NAME}` |
| `token` | String | Benchmark identifier AND LogStash routing key | Passed from workflow (secrets) |
| `kind` | String | Benchmark type AND LogStash routing kind | Passed from workflow (secrets) |
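Since `submitTime` is expressed in milliseconds since the Unix epoch, a parser might convert a UTC timestamp string from the log as follows. The timestamp format is an assumption for illustration; the real format depends on the log type.

```python
from datetime import datetime, timezone


def to_epoch_ms(timestamp: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> int:
    """Convert a UTC timestamp string from the log to ms since epoch."""
    dt = datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)
```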
### Static vs Parsed Fields
Static fields (from workflow configuration):
- `cluster` - Set per site (UC-AF, SLAC-AF, etc.)
- `token` - Benchmark identifier token AND LogStash routing key
- `kind` - Benchmark kind/category AND LogStash routing kind
- `host` - Hostname from workflow environment (`${NODE_NAME}`)
Parsed fields (extracted from logs):
- `testType` - Determined from job type or log content
- `submitTime` - Start timestamp from log
- `queueTime` - Time waiting before execution
- `runTime` - Total execution duration
- `payloadSize` - Size of output files
- `status` - Job exit code
Note: The `token` and `kind` fields are included in the JSON document body sent to LogStash, where they serve the dual purpose of identifying the benchmark and routing the data to the appropriate Kibana instance.
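The documented `payloadSize` behaviour can be expressed directly, mirroring the defaults stated above (`-1` when no `payload-file` is given, `0` for an empty file). `payload_size` is an illustrative helper name.

```python
from pathlib import Path


def payload_size(payload_file: str) -> int:
    """Size of the output file in bytes, per the documented convention."""
    if not payload_file:  # empty-string default input -> size unknown
        return -1
    return Path(payload_file).stat().st_size
```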
## Failure Handling
Workflows invoke these actions with `continue-on-error: true`, which means:
- Parsing failures don't fail the benchmark job
- Logs are always uploaded as artifacts
- Parsing errors are visible in workflow logs
- Benchmarks complete successfully even if Kibana upload fails
This design ensures benchmark execution is never blocked by parsing/upload issues.
## Usage Example
```yaml
- name: parse benchmark log
  if: always()  # Run even if benchmark failed
  uses: ./.github/actions/parse
  with:
    job: ${{ github.job }}
    log-file: rucio.log
    log-type: rucio
    cluster: UC-AF
    kibana-token: ${{ secrets.KIBANA_TOKEN }}
    kibana-kind: ${{ secrets.KIBANA_KIND }}
    host: ${NODE_NAME}
  continue-on-error: true  # Don't fail job if parsing fails

- name: upload to kibana
  if: always()  # Run even if parsing failed
  uses: ./.github/actions/upload
  with:
    payload-file: payload.json
    kibana-uri: ${{ secrets.KIBANA_URI }}
  continue-on-error: true  # Don't fail job if upload fails
```
## LogStash/Elasticsearch Configuration
Data is sent to LogStash, which routes it to Elasticsearch:
- **LogStash endpoint**: Acts as the routing service
- **Elasticsearch index**: `af_benchmarks`
- **Protocol**: HTTPS
- **Routing/Authentication**: Via `token`+`kind` fields in the JSON document body
The `token` and `kind` fields serve a dual purpose:
- Benchmark identification: Uniquely identify and categorize the benchmark run
- LogStash routing: Direct the data to the appropriate Kibana instance
This body-based routing mechanism eliminates the need for traditional HTTP authentication headers or stored credentials.
## Debugging

### Viewing Parsing Logs
- Go to the workflow run in GitHub Actions
- Click on the specific job
- Expand the "parse and upload to kibana" step
- Review the output for parsing errors
### Common Parsing Issues
Log file not found:
- Verify the log file path matches the actual output
- Check that the benchmark script completed
- Look for the file in workflow artifacts
Parsing errors:
- Check log file format matches expected structure
- Verify parsing script handles this job type
- Review error messages in workflow logs
Upload failures:
- Check network connectivity to the LogStash endpoint
- Verify `token` and `kind` values are correctly set
- Ensure the `af_benchmarks` index exists in Elasticsearch
### Testing Locally
Test parsing and upload separately:
Test parsing:
```shell
pixi shell -e kibana

python parsing/scripts/ci_parse.py \
  --job rucio \
  --log-file path/to/rucio.log \
  --log-type rucio \
  --cluster UC-AF \
  --token $KIBANA_TOKEN \
  --kind $KIBANA_KIND \
  --host $HOSTNAME \
  --output payload.json
```
Test upload:
```shell
curl -X POST "https://$KIBANA_URI" \
  -H "Content-Type: application/json" \
  -d @payload.json \
  -w "\nHTTP Status: %{http_code}\n"
```
## Integration with Other Workflows
These actions are currently used by:
- UChicago Benchmark Workflow - All 10 benchmark jobs
Can be extended to:
- SLAC benchmark workflows
- BNL benchmark workflows
- NERSC benchmark workflows
## Next Steps
- Review the UChicago benchmark workflow
- Learn about local development
- Understand the pixi environments