Mapping Zscaler Internet Access (ZIA) Threats to ENISA Framework Using Splunk


Objective

This task aims to standardize the categorization and reporting of threat events detected by Zscaler Internet Access (ZIA) to ensure alignment with the ENISA Threat Landscape framework. Currently, threat reports from multiple data sources lack consistency. Mapping threats detected by ZIA to ENISA's categories facilitates:

  • Establishment of a unified threat taxonomy across systems.

  • Consistent reporting practices aligned with the GRIT team's ENISA-based framework.

  • Enhanced accuracy in dashboards and analysis of risk trends.



βš™οΈ Background

ZIA logs often contain fields such as urlcategory and urlclass; however, these fields are frequently unpopulated. Therefore, classification primarily depends on the threatname field, utilizing pattern matching and conditional logic implemented through eval and case functions.

The output generated includes:

  • Standardized threat categories consistent with ENISA guidelines.

  • Associated MITRE ATT&CK techniques.

  • Calculated risk metrics including average, maximum, minimum scores, and block percentages.

  • Detection confidence scoring derived via regex applied to the threat name, assigning low confidence to generic or heuristic-based detections.

  • Data enrichment with hyperlinks to the Zscaler Threat Library for comprehensive threat information.

  • Visualization of data as a pie chart representing threat mind map categorization.


🧩 Approach overview

The Splunk query executes the following key steps to analyze threat data:

  1. Data source selection
    Filters ZIA events for security threats using:

    index="proxy_zia" (urlclass IN ("Security Risk", "Advanced Security Risk") OR threatname!="None")

  2. Field normalization
    Renames and standardizes fields to enhance readability and consistency (e.g., time to timestamp, urlclass to threat_classification, urlcategory to threat_category).

  3. Threat mapping (regex-based categorization)
    Applies eval with case statements and regular expressions to categorize common threat names (e.g., Trojan, Worm, Phish) according to ENISA categories.

  4. Fallback to ZIA URL categories
    If threatname is missing or set to "None", the query uses the threat_category value from ZIA as a secondary classification source.

  5. MITRE technique association
    Associates each threat type with a corresponding MITRE ATT&CK technique (e.g., Trojan to T1204 User Execution).

  6. Confidence scoring
    Assigns a detection confidence level (High, Medium, or Low) based on patterns identified in the threat name; names that match no pattern are labeled Unknown.

  7. Threat intelligence enrichment
    Incorporates direct links to the Zscaler Threat Library for detailed threat information:

    eval "threat_library_link" = "https://threatlibrary.zscaler.com/?threatname=" . threat_name
    
  8. Summary statistics
    Aggregates metrics using the stats command, including:

    • Total and blocked request counts.

    • Distinct counts of users and geographic locations.

    • Average, maximum, and minimum risk scores.

    • Mapped ENISA and MITRE threat categories.


💻 Splunk Query

index="proxy_zia" (urlclass IN ("Security Risk", "Advanced Security Risk") OR threatname!="None")
| rename time AS timestamp,
         user AS user,
         url AS accessed_url,
         serverip AS server_ip,
         clientpublicip AS client_public_ip,
         host AS host_name,
         action AS action_taken,
         reason AS reason_for_action_taken,
         urlclass AS threat_classification,
         urlcategory AS threat_category,
         threatname AS threat_name,
         location AS user_location,
         pagerisk AS webpage_risk_score
| eval webpage_risk_score = coalesce(tonumber(webpage_risk_score), 0)
| eval threat_regex_map = case(
    match(threat_name, "(?i)trojan"), "Malware and Viruses → Trojan",
    match(threat_name, "(?i)worm"), "Malware and Viruses → Worm",
    match(threat_name, "(?i)virus"), "Malware and Viruses → Virus",
    match(threat_name, "(?i)ransom"), "Malware and Viruses → Ransomware",
    match(threat_name, "(?i)adware"), "Potentially Unwanted Software → Adware",
    match(threat_name, "(?i)spyware"), "Potentially Unwanted Software → Spyware",
    match(threat_name, "(?i)exploit"), "Nefarious Activity / Abuse → Exploitation of Software Bugs",
    match(threat_name, "(?i)phish"), "Social Engineering → Phishing",
    match(threat_name, "(?i)miner|cryptomining|crypto"), "Nefarious Activity → Cryptomining",
    match(threat_name, "(?i)JS/Redir|Redirector"), "Malware and Viruses → Trojan → Redirector / FakeUpdate",
    match(threat_name, "(?i)JS/Refresh"), "Malware and Viruses → Trojan → JavaScript Trojan",
    match(threat_name, "(?i)Cryxos"), "Malware and Viruses → Trojan → Call Support Scam",
    match(threat_name, "(?i)Iframe"), "Malware and Viruses → Trojan → Malicious Iframe Injection",
    match(threat_name, "(?i)Zbot|Formbook|Rustystealer"), "Malware and Viruses → Trojan → Spyware / Password Stealer",
    match(threat_name, "(?i)Android.Banker"), "Malware and Viruses → Trojan → Banking Trojan",
    match(threat_name, "(?i)Gen.Mal|Heur.Malicious"), "Malware and Viruses → Generic Malware Detection",
    match(threat_name, "(?i)HTML.Hacked.Site|Injection.SEO-SPAM|Injection.Wordpress"), "Manipulation of information → HTML Injection / Site Compromise",
    match(threat_name, "(?i)HTML.MalURL|Html.Malurl"), "Malware and Viruses → Malicious URL",
    match(threat_name, "(?i)Spammer"), "Nefarious Activity → Unsolicited email / WebSpam",
    match(threat_name, "(?i)Downloader.Balada|Downloader.SocGholish|ABDownloader"), "Malware and Viruses → Trojan → Downloader",
    match(threat_name, "(?i)ClearFake"), "Malware and Viruses → Trojan → FakeUpdate / Drive-by Download",
    match(threat_name, "(?i)Magecart"), "Malware and Viruses → Trojan → POS Skimmer / Web Injection",
    match(threat_name, "(?i)Vipersoftx"), "Malware and Viruses → Backdoor / InfoStealer",
    match(threat_name, "(?i)Malicious.Extension"), "Malware and Viruses → Trojan → Malicious Extension",
    match(threat_name, "(?i)Agent.CKQ"), "Malware and Viruses → Trojan → JavaScript Agent",
    match(threat_name, "(?i)Dataexfil"), "Abuse of information leakages",
    match(threat_name, "(?i)hacktool"), "Nefarious Activity → Exploitation of software bugs / unauthorised activity",
    true(), "Uncategorized (Regex Fallback)"
)

| eval threat_mind_map_category = case(
    isnotnull(threat_name) AND threat_name!="None", threat_regex_map,
    threat_category=="Adware/Spyware Sites", "Potentially Unwanted Software → Adware / Spyware",
    threat_category=="Botnet Callback", "Malware and Viruses → Botnets",
    threat_category=="Browser Exploit", "Nefarious Activity / Abuse → Exploitation of Software Bugs",
    threat_category=="Cross-site Scripting", "Input → Cross-Site Scripting (XSS)",
    threat_category=="Cryptomining & Blockchain", "Nefarious Activity → Malware and Viruses (General)",
    threat_category=="Custom Encrypted Content", "Generation and use of rogue certificates → Improperly issued SSL certificates",
    threat_category=="Domain Generation Algorithm Domains", "Intended Similarity Of Identifiers → Domain name collision",
    threat_category=="Dynamic DNS Host", "Manipulation of information → DNS Manipulations",
    threat_category=="End-to-End Encrypted Content", "Generation and use of rogue certificates → Improperly issued SSL certificates",
    threat_category=="Newly Revived Domains", "Intended Similarity Of Identifiers → Domain name collision",
    threat_category=="Other Security", "Nefarious Activity",
    threat_category=="Peer-to-Peer", "Nefarious Activity → Remote activities(execution)",
    threat_category=="Phishing", "Social Engineering → Phishing",
    threat_category=="Spyware Callback", "Malware and Viruses → Spyware",
    threat_category=="Spyware/Adware", "Potentially Unwanted Software → Adware/Spyware",
    threat_category=="Suspicious Destination", "Nefarious Activity",
    threat_category=="Unauthorized Communication", "Unauthorised activities → Unauthorised access to information systems/networks",
    threat_category=="WebSpam", "Nefarious Activity → Unsolicited email",
    true(), "Uncategorized"
)

| eval classification_source = if(isnotnull(threat_name) AND threat_name!="None", "Threat Name", "Threat Category")

| eval detection_confidence = case(
    match(threat_name, "(?i)Heur|Gen"), "Low (Heuristic/Generic Signature)",
    match(threat_name, "(?i)Eldorado"), "Medium",
    match(threat_name, "(?i)Downloader|Trojan|Backdoor|POS|Spyware|Banker|Virus|Worm|Ransom|Adware|Exploit|Phish|Miner|Crypto"), "High (Behavioral/Confirmed)",
    true(), "Unknown"
)
| eval mitre_technique = case(
    match(threat_name, "(?i)Zbot|Formbook|Rustystealer|Spyware"), "Credential Access → Input Capture (T1056)",
    match(threat_name, "(?i)Magecart"), "Credential Access → Input Capture: Web Portal Capture (T1056.003)",
    match(threat_name, "(?i)Trojan|Virus|Worm|Downloader|ABDownloader|Balada|SocGholish|Gen.Mal|Heur.Malicious"), "Execution → User Execution (T1204)",
    match(threat_name, "(?i)Cryxos|FakeUpdate|ClearFake|JS/Redir|Redirector"), "Initial Access → Drive-by Compromise (T1189)",
    match(threat_name, "(?i)JS/Refresh|Agent.CKQ"), "Execution → Command and Scripting Interpreter (T1059)",
    match(threat_name, "(?i)Iframe|Injection|XSS|HTML.Hacked.Site|HTML.MalURL|SEO-SPAM|Injection.Wordpress"), "Defense Evasion → Obfuscated Files or Information (T1027)",
    match(threat_name, "(?i)Vipersoftx|Backdoor"), "Command and Control → Application Layer Protocol (T1071)",
    match(threat_name, "(?i)Phish|Spammer|Cryxos"), "Initial Access → Phishing (T1566)",
    match(threat_name, "(?i)Dataexfil"), "Exfiltration → Exfiltration Over Web Service (T1567)",
    match(threat_name, "(?i)Malicious.Extension"), "Execution → User Execution: Malicious File (T1204.002)",
    match(threat_name, "(?i)Ransom"), "Impact → Data Encrypted for Impact (T1486)",
    match(threat_name, "(?i)Miner|Cryptomining|Crypto"), "Impact → Resource Hijacking (T1496)",
    match(threat_name, "(?i)Exploit|Hacktool"), "Initial Access → Exploit Public-Facing Application (T1190)",
    true(), "Uncategorized → Unknown Technique"
)
| eval "threat_library_link" = "https://threatlibrary.zscaler.com/?threatname=" . threat_name
| eventstats count as Total_Events
| stats 
    count AS "Total Requests",
    dc(user) AS "Unique Users",
    dc(user_location) AS "Unique Locations",
    avg(webpage_risk_score) AS "Average Risk Score",
    max(webpage_risk_score) AS "Max Risk Score",
    min(webpage_risk_score) AS "Min Risk Score",
    values(action_taken) AS "Actions Taken",
    count(eval(action_taken=="Blocked")) AS "Blocked Requests",
    values(threat_classification) AS "Threat Classification",
    values(threat_name) AS "Threat Names",
    values(threat_library_link) AS "Threat Detail Link",
    values(accessed_url) AS "Accessed Url",
    values(severity_tier) AS "Severity Tier",
    values(detection_confidence) AS "Detection Confidence",
    values(mitre_technique) AS "MITRE Technique",
    values(classification_source) AS "Classification Source"
by threat_mind_map_category
| eval "Average Risk Score" = round('Average Risk Score', 2)
| eval "Blocked Percentage" = round(('Blocked Requests'/'Total Requests')*100, 2)
| fields  threat_mind_map_category, "Total Requests", "Unique Users", "Unique Locations", "Average Risk Score", "Max Risk Score", "Min Risk Score", "Actions Taken", "Blocked Percentage", "Detection Confidence", "MITRE Technique", "Threat Classification", "Threat Names", "Threat Detail Link", "Accessed Url",threat_category ,"Classification Source"



📊 Sample Output Fields

| Field Name | Description |
| --- | --- |
| threat_mind_map_category | The category assigned to the identified threat according to ENISA definitions. |
| Total Requests | The total number of detections associated with the threat. |
| Blocked Percentage | The percentage of requests that were successfully blocked. |
| Detection Confidence | Indicates the confidence level in the accuracy and reliability of the detection. |
| MITRE Technique | The specific MITRE ATT&CK technique associated with the threat. |
| Threat Names | The original threat names as designated within the ZIA system. |
| Threat Detail Link | A direct hyperlink to the Zscaler Threat Library for additional information. |
| Classification Source | Indicates whether the categorization is based on the threat name or the threat category. |
| Unique Users | The count of distinct users affected or involved. |
| Unique Locations | The number of different geographic or network locations identified. |
| Average Risk Score | The mean risk score calculated based on the risk assigned by Zscaler to the associated web pages. |
| Max Risk Score | The highest risk score assigned by Zscaler to any related web page. |
| Min Risk Score | The lowest risk score assigned by Zscaler to any related web page. |
| Actions Taken | The specific actions executed by Zscaler in response to the threat. |
| Accessed URL | The URL or web link that was accessed and associated with the threat. |
| Threat Category | Indicates the threat category as classified by the Hydro system. |

Step-by-step explanation of the Splunk query

The following is a detailed, sequential explanation of the query's operations, the rationale behind each step, and practical advice for testing, optimizing performance, and enhancing robustness. The explanation follows the query's flow from top to bottom to facilitate integration into Confluence or Jira tasks.


1) Search & initial filter

index="proxy_zia" (urlclass IN ("Security Risk", "Advanced Security Risk") OR threatname!="None")

What it does

  • Restricts the search to the proxy_zia index. Filtering in the base search, before the first pipe, is the most efficient place to narrow the data.

  • Keeps events where urlclass marks the URL as a security risk, or where threatname has any value other than "None".

Why

  • Early filtering reduces data volume before computationally intensive operations such as regex matching, evaluations, and statistical aggregations, thereby improving performance and relevance.
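
The filter can also be read as a plain predicate. The Python sketch below is an offline illustration only, not part of the Splunk query; the function name is_threat_event and the dict-based event shape are assumptions made for the example. It mirrors one Splunk detail worth knowing: a field!="value" comparison is only true for events where the field exists.

```python
RISKY_CLASSES = {"Security Risk", "Advanced Security Risk"}


def is_threat_event(event: dict) -> bool:
    """Mirror of: urlclass IN (...) OR threatname!="None" (hypothetical helper)."""
    risky_class = event.get("urlclass") in RISKY_CLASSES
    # In SPL, threatname!="None" does not match events that lack the field.
    threat = event.get("threatname")
    named_threat = threat is not None and threat != "None"
    return risky_class or named_threat


# Hypothetical sample events.
print(is_threat_event({"urlclass": "Security Risk", "threatname": "None"}))    # True
print(is_threat_event({"urlclass": "Business Use", "threatname": "None"}))     # False
print(is_threat_event({"urlclass": "Business Use"}))                           # False
```

The Splunk search command compares field values case-insensitively, while this sketch compares them exactly for simplicity.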

2) Field renaming / normalization

| rename time AS timestamp,
         user AS user,
         url AS accessed_url,
         serverip AS server_ip,
         clientpublicip AS client_public_ip,
         host AS host_name,
         action AS action_taken,
         reason AS reason_for_action_taken,
         urlclass AS threat_classification,
         urlcategory AS threat_category,
         threatname AS threat_name,
         protocol AS protocol_used,
         location AS user_location,
         devicehostname AS device_hostname,
         deviceowner AS device_owner,
         userAgent AS user_agent,
         pagerisk AS webpage_risk_score,
         Country AS country,
         City AS city,
         lat AS latitude,
         lon AS longitude

What it does

  • Renames fields to consistent, descriptive names that are used throughout the query and in output results.

Why

  • Enhances clarity for team members and prevents confusion when binding data to dashboards, lookups, or saving results.

Tip

  • Rename only fields that are utilized downstream; unnecessary renaming increases query length without benefit.

3) Coerce risk score to number

| eval webpage_risk_score = coalesce(tonumber(webpage_risk_score), 0)

What it does

  • Converts the webpage_risk_score field to a numeric type; if conversion fails or the field is null, it defaults to 0.

Why

  • Ensures that subsequent numeric aggregations such as average, maximum, and minimum calculations operate correctly.
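
To see what the default of 0 means for the later statistics, here is a small Python sketch of the same coercion. It is an illustration under assumptions: to_risk_score is a hypothetical helper, and the raw values are made-up sample data.

```python
def to_risk_score(raw) -> float:
    """Rough equivalent of coalesce(tonumber(x), 0): bad or missing input becomes 0."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return 0.0


# Hypothetical raw pagerisk values: two usable, one empty, one missing.
raw_scores = ["35", "", None, "80"]
scores = [to_risk_score(v) for v in raw_scores]

print(scores)                     # [35.0, 0.0, 0.0, 80.0]
print(min(scores))                # 0.0
print(sum(scores) / len(scores))  # 28.75
```

Because missing scores count as 0, they pull the minimum and the average down. If that is not desired, one option is to leave such values null so that the aggregations skip them.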

4) Regex-based mapping (primary mapping by threat_name)

| eval threat_regex_map = case(
    match(threat_name, "(?i)trojan"), "Malware and Viruses → Trojan",
    ...
    true(), "Uncategorized (Regex Fallback)"
)

What it does

  • Utilizes the case() function with match() regular expressions to analyze the threat_name field and assign a standardized category string consistent with ENISA classifications (stored in threat_regex_map).

Key points

  • The (?i) prefix in the regex ensures case-insensitive matching.

  • Rules are evaluated sequentially; the first match returns the assigned category.

  • A final true() condition provides a default label if no prior matches occur.

Why

  • The threat_name field contains the most detailed information from ZIA; regex matching enables effective capture of diverse vendor naming conventions.

Tip

  • To maintain classification accuracy and reduce reliance on fallback labels, regularly update the regex map to incorporate new patterns reflecting evolving threat names.
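
One way to try out new patterns before editing the query is to replay the first-match-wins logic offline. The Python sketch below copies a small subset of the rules from the query in their original order; the function name map_threat_name and the sample threat names are assumptions for illustration.

```python
import re

# Subset of the query's rules, kept in the same order (first match wins).
RULES = [
    (r"trojan", "Malware and Viruses → Trojan"),
    (r"ransom", "Malware and Viruses → Ransomware"),
    (r"phish", "Social Engineering → Phishing"),
    (r"Zbot|Formbook|Rustystealer",
     "Malware and Viruses → Trojan → Spyware / Password Stealer"),
]
FALLBACK = "Uncategorized (Regex Fallback)"


def map_threat_name(threat_name: str) -> str:
    """Return the category of the first rule whose pattern occurs in the name."""
    for pattern, category in RULES:
        # re.search matches anywhere in the string, like SPL match().
        if re.search(pattern, threat_name, re.IGNORECASE):
            return category
    return FALLBACK


# Hypothetical threat names.
print(map_threat_name("HTML.Phish.Gen"))     # Social Engineering → Phishing
print(map_threat_name("Win32.Trojan.Zbot"))  # Malware and Viruses → Trojan
print(map_threat_name("Some.New.Family"))    # Uncategorized (Regex Fallback)
```

The second example shows the effect of rule order: the generic trojan rule comes first, so the more specific Zbot rule is never reached for that name.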

5) Secondary mapping using ZIA threat_category

| eval threat_mind_map_category = case(
    isnotnull(threat_name) AND threat_name!="None", threat_regex_map,
    threat_category=="Adware/Spyware Sites", "Potentially Unwanted Software → Adware / Spyware",
    ...
    true(), "Uncategorized"
)

What it does

  • If threat_name is present and not "None", uses threat_regex_map for classification. This includes the "Uncategorized (Regex Fallback)" label when no regex rule matched.

  • Otherwise, maps based on threat_category values provided by ZIA.

  • Defaults to "Uncategorized" if threat_category is not in the list either.

Why

  • This strategy enhances robustness by prioritizing the more detailed threat_name field and using threat_category as a fallback.
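
The precedence between the two sources can be sketched in a few lines of Python. This is an illustration only: mind_map_category is a hypothetical helper, and CATEGORY_MAP holds just two of the query's threat_category entries.

```python
from typing import Optional

# Two entries taken from the query's threat_category mapping.
CATEGORY_MAP = {
    "Botnet Callback": "Malware and Viruses → Botnets",
    "Phishing": "Social Engineering → Phishing",
}


def mind_map_category(threat_name: Optional[str],
                      regex_category: str,
                      threat_category: Optional[str]) -> str:
    """Prefer the regex result whenever a real threat name is present."""
    if threat_name is not None and threat_name != "None":
        return regex_category
    return CATEGORY_MAP.get(threat_category, "Uncategorized")


# No threat name: the ZIA category decides.
print(mind_map_category("None", "Uncategorized (Regex Fallback)", "Phishing"))
# Social Engineering → Phishing

# Threat name present but unmatched: the regex fallback label wins.
print(mind_map_category("Some.New.Family", "Uncategorized (Regex Fallback)", "Phishing"))
# Uncategorized (Regex Fallback)
```

The second call shows a consequence of this ordering: an event with an unrecognized threat name keeps the regex fallback label even when its threat_category could have been mapped.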


6) Classification source flag

| eval classification_source = if(isnotnull(threat_name) AND threat_name!="None", "Threat Name", "Threat Category")

What it does

  • Creates a field indicating whether classification was derived from threat_name or threat_category.

7) Detection confidence heuristic

| eval detection_confidence = case(
    match(threat_name, "(?i)Heur|Gen"), "Low (Heuristic/Generic Signature)",
    match(threat_name, "(?i)Eldorado"), "Medium",
    match(threat_name, "(?i)Downloader|Trojan|Backdoor|POS|Spyware|Banker|Virus|Worm|Ransom|Adware|Exploit|Phish|Miner|Crypto"), "High (Behavioral/Confirmed)",
    true(), "Unknown"
)

What it does

  • Assigns a detection confidence level (Low, Medium, or High) based on textual patterns within threat_name; names that match no pattern are labeled Unknown.

Why

  • Provides users with an indication of the reliability of the detection.

Tip

  • Regularly tune the regex patterns to accurately reflect the confidence level of different threat detections.
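
When tuning, it helps to check how short tokens behave inside longer names. The Python sketch below replays the three confidence rules from the query; detection_confidence is a hypothetical helper and the threat names are made-up samples.

```python
import re

# The three rules from the query, in the same order.
CONFIDENCE_RULES = [
    (r"Heur|Gen", "Low (Heuristic/Generic Signature)"),
    (r"Eldorado", "Medium"),
    (r"Downloader|Trojan|Backdoor|POS|Spyware|Banker|Virus|Worm|Ransom"
     r"|Adware|Exploit|Phish|Miner|Crypto", "High (Behavioral/Confirmed)"),
]


def detection_confidence(threat_name: str) -> str:
    """Return the label of the first rule that matches anywhere in the name."""
    for pattern, label in CONFIDENCE_RULES:
        if re.search(pattern, threat_name, re.IGNORECASE):
            return label
    return "Unknown"


print(detection_confidence("Win32.Trojan.Zbot"))    # High (Behavioral/Confirmed)
print(detection_confidence("Gen.Mal.X"))            # Low (Heuristic/Generic Signature)
print(detection_confidence("JS.Trojan.Agent.CKQ"))  # Low (Heuristic/Generic Signature)
```

The last result is worth noting: the case-insensitive token Gen also matches inside "Agent", so that name is rated Low before the Trojan rule is reached. In Python, a word-boundary form such as `\bGen\b` avoids this; whether a similar change suits the Splunk query is something to test against real threat names first.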

8) MITRE technique mapping

| eval mitre_technique = case(
    match(threat_name, "(?i)Zbot|Formbook|Rustystealer|Spyware"), "Credential Access → Input Capture (T1056)",
    ...
    true(), "Uncategorized → Unknown Technique"
)

What it does

  • Assigns a likely MITRE ATT&CK technique string to each threat based on the threat name.

Why

  • Facilitates correlation of detections with the ATT&CK framework for analyst use.

Tip

  • Update this mapping regularly to maintain accuracy and minimize the number of unknown classifications.
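
Because case() stops at the first match, a token that appears in two rules only ever takes effect in the earlier one. The Python sketch below is one possible offline check for such repeats, using three rules copied from the query. The helper shadowed_tokens is hypothetical, and it only detects identical tokens in simple alternations, not overlapping patterns in general.

```python
# Three rules copied from the query, in the same order.
MITRE_RULES = [
    (r"Zbot|Formbook|Rustystealer|Spyware",
     "Credential Access → Input Capture (T1056)"),
    (r"Cryxos|FakeUpdate|ClearFake|JS/Redir|Redirector",
     "Initial Access → Drive-by Compromise (T1189)"),
    (r"Phish|Spammer|Cryxos",
     "Initial Access → Phishing (T1566)"),
]


def shadowed_tokens(rules):
    """List (token, earlier_label, later_label) for tokens reused in a later rule."""
    seen = {}
    shadowed = []
    for pattern, label in rules:
        for token in pattern.split("|"):
            key = token.lower()  # the query matches case-insensitively
            if key in seen and seen[key] != label:
                shadowed.append((token, seen[key], label))
            seen.setdefault(key, label)
    return shadowed


for token, earlier, later in shadowed_tokens(MITRE_RULES):
    print(f"{token}: always mapped to '{earlier}', never to '{later}'")
```

For these rules the check reports Cryxos, which the Drive-by Compromise rule claims before the Phishing rule is evaluated.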

9) Threat library link and event count

| eval "threat_library_link" = "https://threatlibrary.zscaler.com/?threatname=" . threat_name
| eventstats count as Total_Events

What it does

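  • Builds a Zscaler Threat Library URL for each event by appending threat_name to the base address. The result is a plain string; making it clickable is a dashboard concern.

  • Adds the overall event count to every event as Total_Events. The stats command in the next step does not carry this field forward, so it is only visible when the search is run up to this point.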

Why

  • Provides analysts with rapid access to detailed threat information and overall event volume.
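
The query appends threat_name to the URL as-is. Threat names can contain characters such as "/" or spaces, so a consumer of the link may want to percent-encode the value. The Python sketch below shows that idea; threat_library_link is a hypothetical helper, the base URL is the one used in the query, and whether the Threat Library resolves encoded names the same way is an assumption to verify.

```python
from urllib.parse import quote

# Base address taken from the query.
BASE = "https://threatlibrary.zscaler.com/?threatname="


def threat_library_link(threat_name: str) -> str:
    """Build the link with the threat name percent-encoded."""
    return BASE + quote(threat_name, safe="")


print(threat_library_link("JS/Redirector.Gen"))
# https://threatlibrary.zscaler.com/?threatname=JS%2FRedirector.Gen
```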

10) Aggregation / statistics

| stats 
    count AS "Total Requests",
    dc(user) AS "Unique Users",
    dc(user_location) AS "Unique Locations",
    avg(webpage_risk_score) AS "Average Risk Score",
    max(webpage_risk_score) AS "Max Risk Score",
    min(webpage_risk_score) AS "Min Risk Score",
    values(action_taken) AS "Actions Taken",
    count(eval(action_taken=="Blocked")) AS "Blocked Requests",
    values(threat_classification) AS "Threat Classification",
    values(threat_name) AS "Threat Names",
    values(threat_library_link) AS "Threat Detail Link",
    values(protocol_used) AS "Protocols Used",
    values(accessed_url) AS "Accessed Url",
    values(country) AS "Countries",
    values(city) AS "Cities",
    values(latitude) AS "Latitude",
    values(longitude) AS "Longitude",
    values(severity_tier) AS "Severity Tier",
    values(detection_confidence) AS "Detection Confidence",
    values(mitre_technique) AS "MITRE Technique",
    values(classification_source) AS "Classification Source"
by threat_mind_map_category

What it does

  • Generates summary statistics grouped by threat_mind_map_category (the ENISA mapping).

  • The values() function returns the distinct values of a field within each group as a multivalue field.

  • The count(eval(action_taken=="Blocked")) expression calculates the number of blocked requests.

Why

  • This aggregated data is suitable for presentation on dashboards or for export to reporting teams.
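
For readers who think more easily in general-purpose code, the grouping can be sketched in Python. This is a simplified illustration covering only some of the aggregations; the summarize function and the sample events are assumptions made for the example.

```python
from collections import defaultdict


def summarize(events):
    """Group events by ENISA category and compute a few of the query's aggregates."""
    groups = defaultdict(list)
    for event in events:
        groups[event["threat_mind_map_category"]].append(event)

    summary = {}
    for category, rows in groups.items():
        scores = [r["webpage_risk_score"] for r in rows]
        summary[category] = {
            "Total Requests": len(rows),                                   # count
            "Unique Users": len({r["user"] for r in rows}),                # dc(user)
            "Average Risk Score": sum(scores) / len(scores),               # avg()
            "Max Risk Score": max(scores),                                 # max()
            "Min Risk Score": min(scores),                                 # min()
            "Blocked Requests": sum(r["action_taken"] == "Blocked" for r in rows),
            "Threat Names": sorted({r["threat_name"] for r in rows}),      # values()
        }
    return summary


# Hypothetical events, already classified and normalized.
CATEGORY = "Social Engineering → Phishing"
events = [
    {"threat_mind_map_category": CATEGORY, "user": "alice",
     "webpage_risk_score": 80.0, "action_taken": "Blocked", "threat_name": "HTML.Phish.A"},
    {"threat_mind_map_category": CATEGORY, "user": "bob",
     "webpage_risk_score": 60.0, "action_taken": "Allowed", "threat_name": "HTML.Phish.B"},
    {"threat_mind_map_category": CATEGORY, "user": "alice",
     "webpage_risk_score": 100.0, "action_taken": "Blocked", "threat_name": "HTML.Phish.A"},
]

result = summarize(events)
print(result[CATEGORY]["Total Requests"])    # 3
print(result[CATEGORY]["Unique Users"])      # 2
print(result[CATEGORY]["Blocked Requests"])  # 2
```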


11) Post-processing: rounding and block percentage

| eval "Average Risk Score" = round('Average Risk Score', 2)
| eval "Blocked Percentage" = round(('Blocked Requests'/'Total Requests')*100, 2)

What it does

  • Rounds the average risk score to two decimal places.

  • Calculates and rounds the percentage of requests that were blocked.

Why

  • Provides cleaner, more readable metrics for dashboards and reports.
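
As a worked example of the percentage formula, the Python sketch below applies the same arithmetic to made-up counts. The helper blocked_percentage is hypothetical; the zero guard is only there for standalone use, since every group produced by stats contains at least one request.

```python
def blocked_percentage(blocked: int, total: int) -> float:
    """round((blocked / total) * 100, 2), with a guard for an empty group."""
    if total == 0:
        return 0.0
    return round(blocked / total * 100, 2)


print(blocked_percentage(3, 4))  # 75.0
print(blocked_percentage(7, 9))  # 77.78
```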

12) Output field selection

| fields threat_mind_map_category, "Total Requests", "Unique Users", "Unique Locations", "Average Risk Score", "Max Risk Score", "Min Risk Score", "Actions Taken", "Blocked Percentage", "Detection Confidence", "MITRE Technique", "Threat Classification", "Threat Names", "Threat Detail Link", "Accessed Url", threat_category ,"Classification Source"

What it does

  • Restricts the final output to only the specified columns of interest.

Why

  • Maintains dashboard clarity and reduces data payload when forwarding results or exporting to CSV.

Output of the Splunk query

Fig. Splunk Statistics tab


Fig. Splunk Visualization tab


🎓 Learning Takeaways

For colleagues new to Splunk:

  • Employ eval combined with case for dynamic data classification.

  • Utilize match() with regular expressions for flexible string matching.

  • Use stats and eventstats commands to summarize datasets effectively.

  • Integrate ENISA, MITRE, and ZIA data to achieve standardized threat reporting across systems.


🧩 Next Steps

  • Implement regular monitoring and review of generated reports.

  • Refine reports to minimize unknown or fallback classifications and enhance accuracy of threat mind map mappings.

  • Utilize dashboard features to make Zscaler Threat Library links clickable, facilitating direct user access to threat information.