Comprehensive Explanation: What is a SIEM (in 2020 and beyond.)


[I have not had the time to proof read nor correct grammatical errors, spelling mistakes and typos. ]

SIEM unifies Threat Detection and Hunting.

This is an old topic worth revisiting and level-setting with the latest advancements, concepts and lessons from decades of unsuccessful SIEM deployments! It is worth revisiting because a lot of people don’t understand the value of a SIEM, and even fewer understand how to operationalise one effectively and achieve business outcomes with it.

After reading this you will gain enough insight into the basics of SIEM.

I am continually asked the same questions around SIEM design, so I am glad to finally brain dump this knowledge and share it with the community.

(SIEM in Public Cloud is beyond the scope of this article. While all the information here is relevant, I will write another article focusing specifically on Threat Detection for Public Cloud environments.)

Security Information and Event Management

A SIEM seeks to provide a holistic approach to an organisation’s IT security. A SIEM represents a combination of services, appliances, and software products. It performs real-time collection of log data from devices, applications and hosts. It also processes the collected log data, enabling real-time analysis of security alerts generated by network hardware and applications, advanced correlation of security and operational events, as well as real-time alarming and scheduled reporting.

SIEM technology is used in many enterprise organizations to provide real time reporting and long term analysis of security events. SIEM products evolved from two previously distinct product categories, namely security information management (SIM) and security event management (SEM).

Table 1 shows this evolution.

Table 1. SIM and SEM Product Features Incorporated into SIEM

Separate SIM and SEM Products

  • Security Information Management: log collection, archiving, historical reporting, forensics
  • Security Event Management: real-time reporting, log collection, normalization, correlation, aggregation

Combined SIEM Product

  • Log collection

SIEM combines the essential functions of SIM and SEM products to provide a comprehensive view of the enterprise network using the following functions:

  • Log collection of event records from sources throughout the organization provides important forensic tools and helps to address compliance reporting requirements.
  • Normalization maps log messages from different systems into a common data model, enabling the organization to connect and analyze related events, even if they are initially logged in different source formats.
  • Correlation links logs and events from disparate systems or applications, speeding detection of and reaction to security threats.
  • Aggregation reduces the volume of event data by consolidating duplicate event records.
  • Reporting presents the correlated aggregated event data in real-time monitoring and long-term summaries.
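
To make normalization and aggregation a little more concrete, here is a minimal Python sketch. It is an illustration only, not how any particular SIEM does it: the record shapes, field names and the `login_failure` label are assumptions made up for the example (Windows Event ID 4625 is the standard failed-logon event).

```python
from collections import Counter

# Hypothetical raw records: two sources log the same kind of event
# in different shapes. Field names are illustrative assumptions.
RAW_EVENTS = [
    {"source": "linux", "msg": "Failed password", "user": "alice", "rhost": "10.0.0.5"},
    {"source": "windows", "EventID": 4625, "TargetUserName": "alice", "IpAddress": "10.0.0.5"},
    {"source": "linux", "msg": "Failed password", "user": "alice", "rhost": "10.0.0.5"},
]


def normalize(record):
    """Map a source-specific record onto one common data model (a tuple here)."""
    if record["source"] == "linux" and record["msg"] == "Failed password":
        return ("login_failure", record["user"], record["rhost"])
    if record["source"] == "windows" and record["EventID"] == 4625:
        return ("login_failure", record["TargetUserName"], record["IpAddress"])
    return ("unknown", None, None)


def aggregate(records):
    """Consolidate duplicate normalized events into one entry with a count."""
    return Counter(normalize(record) for record in records)


summary = aggregate(RAW_EVENTS)
for (event_type, user, src_ip), count in summary.items():
    print(event_type, user, src_ip, count)
```

Once both sources are mapped to the same shape, the three raw records collapse into a single aggregated entry, which is the point of doing normalization before correlation.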

An internal IT environment consists of services, networking equipment, applications, and components that the organisation wants to protect from intrusion. To protect these assets and data, you can deploy protection in the form of firewalls, antivirus, IPS/IDS and authentication. Examples of protection include:

  • Firewalls
  • Antivirus
  • IPS
  • IDS
  • Authentication
  • Web Security
  • Email Security
  • Traffic Capture
  • WAF
  • DLP
  • FIM
  • Secure Access Service Edge
  • MFA
  • EDR

Despite all of the systems and effort put into these solutions, those trying to breach that environment will get in. Once they are in, detecting and responding to their attack is time critical. 

A SIEM taps into all of this activity, as it continually receives thousands of logs per second from all of these devices and systems within the environment. The SIEM processes log data to make sense of what is actually happening on a device (aka Detection), and analytics are used to analyse activity in the data, providing more insight into what is actually happening.

SIEM solutions also provide the ability to analyse historic log data and generate reports for compliance purposes, as well as supporting digital forensics and fulfilling additional parts of the overall information security strategy.

SIEM solutions centralise log data within IT environments, augmenting security measures and enabling real-time analysis. A SIEM is constantly watching, monitoring and analysing events and alerts within the environment in an effort to detect attacks and intrusions.

Fourth Wave of SIEM

SIEMs sometimes get a bad name because they are incredibly powerful and yet take an enormous amount of skill and effort to get working. This is not because of the SIEM itself, but because it requires data from all of your IT environment, and that in particular causes massive delays in successful SIEM deployments. (This can be easily solved. Keep reading.) SIEMs have evolved into very mature platforms, e.g. ArcSight, with 20+ years of evolution. Read ArcSight History here

  • First Wave
    • PCI-DSS really drove the first phase of SIEM deployments, with compliance as the business outcome.
  • Second Wave
    • Then people started to detect bad things in network activity.
  • Third Wave
    • This phase was when customers started to build SOCs.
  • Fourth Wave
    • This is about SOCs developing Threat Hunting utilising NDR, EDR, SIEM and SOAR.

Machine Data

SIEM processes all types of machine data produced by devices in an IT environment.

Machine data is one of the most underused and undervalued assets of any organization. But some of the most important insights that you can gain—across IT and the business—are hidden in this data: where things went wrong, how to optimize the customer experience, the fingerprints of fraud. All of these insights can be found in the machine data that’s generated by the normal operations of your organization.

Machine data is valuable because it contains a definitive record of all the activity and behavior of your customers, users, transactions, applications, servers, networks and mobile devices. It includes configurations, data from APIs, message queues, change events, the output of diagnostic commands, call detail records and sensor data from industrial systems, and more.

The challenge with leveraging machine data is that it comes in a dizzying array of unpredictable formats, and traditional monitoring and analysis tools weren’t designed for the variety, velocity, volume or variability of this data.


In computing, syslog /ˈsɪslɒɡ/ is a standard for message logging. It allows separation of the software that generates messages, the system that stores them, and the software that reports and analyzes them. Each message is labeled with a facility code, indicating the software type generating the message, and assigned a severity level.

The syslog protocol, originally documented in RFC 3164, provides a transport to allow a device to send event notification messages across IP networks to event message collectors, also known as syslog servers. The protocol is simply designed to transport these event messages from the generating device to the collector. The collector doesn’t send back an acknowledgment of the receipt of the messages.

Syslog uses the User Datagram Protocol (UDP), port 514, for communication. Being a connectionless protocol, UDP does not provide acknowledgments. Additionally, at the application layer, syslog servers do not send acknowledgments back to the sender for receipt of syslog messages. Consequently, the sending device generates syslog messages without knowing whether the syslog server has received the messages. In fact, the sending devices send messages even if the syslog server does not exist.

Under RFC 3164, the syslog packet size is limited to 1024 bytes and carries the following information:

  • Facility
  • Severity
  • Hostname
  • Timestamp
  • Message
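
As a rough illustration of how those fields come out of a message, here is a small Python sketch for the BSD (RFC 3164) layout. Facility and severity are packed into the leading `<PRI>` value as facility × 8 + severity. The regular expression and the function name `parse_bsd_syslog` are my own assumptions for the example and will not cover every device’s quirks.

```python
import re

SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

# <PRI>TIMESTAMP HOSTNAME MESSAGE in the BSD (RFC 3164) layout.
BSD_SYSLOG = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<message>.*)$"
)


def parse_bsd_syslog(line):
    """Split one syslog line into facility, severity, timestamp, hostname, message."""
    match = BSD_SYSLOG.match(line)
    if match is None:
        return None  # not in the expected layout
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(match.group("pri")), 8)
    return {
        "facility": facility,
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "timestamp": match.group("timestamp"),
        "hostname": match.group("hostname"),
        "message": match.group("message"),
    }


# Example message taken from RFC 3164.
sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(parse_bsd_syslog(sample))
```

Note that the BSD timestamp carries no year and no time zone, which is one reason a SIEM has to normalise time itself.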

Computer system designers may use syslog for system management and security auditing as well as general informational, analysis, and debugging messages. A wide variety of devices, such as printers, routers, and message receivers across many platforms use the syslog standard. This permits the consolidation of logging data from different types of systems in a central repository. Implementations of syslog exist for many operating systems.

When operating over a network, syslog uses a client-server architecture where a syslog server listens for and logs messages coming from clients.

The Syslog protocol is defined by Request for Comments (RFC) documents published by the Internet Engineering Task Force (Internet standards). The following is a list of RFCs that define the syslog protocol:

  • The BSD syslog Protocol, RFC 3164 (obsoleted by The Syslog Protocol, RFC 5424).
  • Reliable Delivery for syslog, RFC 3195.
  • The Syslog Protocol, RFC 5424.
  • TLS Transport Mapping for Syslog, RFC 5425.
  • Transmission of Syslog Messages over UDP, RFC 5426.
  • Textual Conventions for Syslog Management, RFC 5427.
  • Signed Syslog Messages, RFC 5848.
  • Datagram Transport Layer Security (DTLS) Transport Mapping for Syslog, RFC 6012.
  • Transmission of Syslog Messages over TCP, RFC 6587.



Centralised log collection and monitoring, typically delivered by a SIEM, is a requirement for compliance audits such as PCI-DSS, ISO 27001, the Sarbanes–Oxley Act of 2002 (thanks Enron), and other standards.

The Payment Card Industry  (PCI) Security Standards Council was founded by five global payment brands: American Express, Discover Financial Services, JCB International, MasterCard, and Visa. These five payment brands had a common vision of strengthening  security policies across the industry to prevent data breaches for businesses that accept and process payment cards. Together they drafted and released the first version of PCI Data Security Standard (PCI DSS 1.0) on December 15, 2004.

PCI DSS is an industry standard with twelve requirements that serve as a security baseline to secure payment card data.

  • PCI-DSS v3.2.1 requirements relevant to a SIEM:
    • Requirement 10: Track and monitor all access to network resources and cardholder data.
    • Requirement 11.5: Deploy a change-detection mechanism (for example, file integrity monitoring tools) to alert personnel to unauthorized modification (including changes, additions, and deletions) of critical system files, configuration files or content files. Configure the software to perform critical file comparisons at least weekly. Implement a process to respond to any alerts generated by the change-detection solution.
    • Source: PCI DSS v3.2.1 Quick Reference Guide.

Depending on your PCI-DSS merchant level and number of Credit Card transactions you process, you will need to adhere to different levels of PCI-Auditing.

Cyber Threat Intelligence

Threat intelligence, or cyber threat intelligence, is information an organization uses to understand the threats that have, will, or are currently targeting the organization. This info is used to prepare, prevent, and identify cyber threats looking to take advantage of valuable resources.

Cyber Threat Intelligence consists of many kinds of information, including Indicators of Compromise (IoCs) and Indicators of Attack (IoAs).

Indicators of compromise (IOCs) are “pieces of forensic data, such as data found in system log entries or files, that identify potentially malicious activity on a system or network.” Indicators of compromise aid information security and IT professionals in detecting data breaches, malware infections, or other threat activity. By monitoring for indicators of compromise, organizations can detect attacks and act quickly to prevent breaches from occurring or limit damages by stopping attacks in earlier stages.

Indicators of compromise act as breadcrumbs that lead infosec and IT pros to detect malicious activity early in the attack sequence. These unusual activities are the red flags that indicate a potential or in-progress attack that could lead to a data breach or systems compromise.

Indicators of attack are similar to IOCs, but rather than focusing on forensic analysis of a compromise that has already taken place, indicators of attack focus on identifying attacker activity while an attack is in process. Indicators of compromise help answer the question “What happened?” while indicators of attack can help answer questions like “What is happening and why?” A proactive approach to detection uses both IOAs and IOCs to discover security incidents or threats in as close to real time as possible.

Example IoCs;

  • Unusual Outbound Network Traffic
  • Anomalies in Privileged User Account Activity
  • Geographical Irregularities
  • Log-In Red Flags
  • Increases in Database Read Volume
  • HTML Response Sizes
  • Large Numbers of Requests for the Same File
  • Mismatched Port-Application Traffic
  • Suspicious Registry or System File Changes
  • Unusual DNS Requests
  • Unexpected Patching of Systems
  • Mobile Device Profile Changes
  • Bundles of Data in the Wrong Place
  • Web Traffic with Unhuman Behavior
  • Signs of DDoS Activity

APTs and Tactics, Techniques and Procedures (TTPs)

A SIEM can utilise cyber threat intelligence (IoCs, IoAs and TTPs) and correlate it with log data from the IT environment to detect threats both in real time and in historical log data.
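
A minimal Python sketch of that idea, assuming events have already been normalised into dictionaries. The indicator values are documentation-range placeholders, and the field names `dst_ip` and `dns_query` are hypothetical, not taken from any real feed or product.

```python
# Hypothetical indicator feed. Addresses and domains are documentation values.
IOC_IPS = {"203.0.113.7", "198.51.100.23"}
IOC_DOMAINS = {"bad.example.com"}


def match_iocs(event):
    """Return the list of (indicator type, value) pairs that an event matches."""
    hits = []
    if event.get("dst_ip") in IOC_IPS:
        hits.append(("ip", event["dst_ip"]))
    # Compare domains case-insensitively and ignore a trailing dot.
    domain = event.get("dns_query", "").lower().rstrip(".")
    if domain in IOC_DOMAINS:
        hits.append(("domain", domain))
    return hits


events = [
    {"dst_ip": "203.0.113.7", "dns_query": "Bad.Example.com."},
    {"dst_ip": "192.0.2.1"},
]
for event in events:
    print(event, "->", match_iocs(event))
```

The same function can be run over a live stream or replayed over archived logs, which is how new intelligence gets checked against historical data.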

Correlation Rules, Behaviour patterns, Pattern matching, Anomaly detection, Conditions, Thresholds, Network Modelling and Machine learning (Phew give me a pay rise. )

Correlation is one of the key components of any effective SIEM tool. As information from across your digital environment feeds into a SIEM, it uses correlation to identify any possible issues. It does so by comparing sequences of activity against preset rules, conditions and thresholds. SIEMs allow sophisticated ways to implement risk based rules.
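
Here is a simplified Python sketch of a threshold-style correlation rule: alert when one source produces several failed logins inside a short time window. The threshold of 5 failures in 60 seconds and the tuple layout are assumptions chosen for the example. Real SIEM rule engines are far richer than this.

```python
from collections import defaultdict, deque


def detect_brute_force(events, threshold=5, window_seconds=60):
    """Return alerts for sources with `threshold` failures inside the window.

    `events` is an iterable of (epoch_seconds, src_ip, outcome) tuples,
    assumed to be sorted by time.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for ts, src_ip, outcome in events:
        if outcome != "failure":
            continue
        window = recent[src_ip]
        window.append(ts)
        # Drop failures that have fallen out of the sliding window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append({"src_ip": src_ip, "count": len(window), "last_seen": ts})
            window.clear()  # avoid re-alerting on the same burst
    return alerts


# One noisy source (5 failures in 40 seconds) and one slow source.
sample_events = sorted(
    [(t, "10.0.0.5", "failure") for t in range(0, 50, 10)]
    + [(t, "10.0.0.9", "failure") for t in range(0, 500, 100)]
)
print(detect_brute_force(sample_events))
```

The slow source never trips the rule because its failures are spread out, which shows how the condition, the threshold and the time window work together.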

The latest SIEMs can now also implement anomaly detection via machine learning.

All of this is integrated with Threat Intelligence information.

The brains inside a SIEM are based on correlation rules, pattern matching, conditions and thresholds, and now on machine learning via unsupervised and supervised models.

  • Correlation Rules
  • Pattern Matching
  • Conditions
  • Thresholds
  • Supervised Machine Learning
  • Unsupervised Machine Learning
  • Network Modelling and Risk Scoring
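
To give a feel for anomaly detection, here is a deliberately simple Python sketch that flags a value far from its historical baseline using a z-score. This is a plain statistical baseline standing in for the idea, not a machine learning model and not what any vendor ships. The history values and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, z_threshold=3.0):
    """Flag `observed` if it is more than `z_threshold` deviations from the baseline.

    `history` needs at least two values so a standard deviation exists.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return observed != baseline
    return abs(observed - baseline) / spread > z_threshold


# Hypothetical daily outbound traffic for one user, in MB.
daily_outbound_mb = [100, 110, 95, 105, 98, 102, 107]
print(is_anomalous(daily_outbound_mb, 104))
print(is_anomalous(daily_outbound_mb, 400))
```

A rule with a fixed threshold needs someone to pick the number up front. A baseline like this adapts per user or per device, which is the appeal of the anomaly-based approach.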

Use Case

Use case is the term for a Threat Detection scenario expressed in business context. It combines value and context in the SIEM platform.

Leading SIEM platforms such as ArcSight have built-in ESM Default Content use cases for roughly 80% of your Threat Detection requirements. There are also 3rd-party use case libraries, including SOC Prime ATT&CK® content and SIGMA, a generic SIEM rule format.

You can catch just about everything with ArcSight Default Content and SIGMA Rules! For the rest, you need to pay someone like me to workshop and write them.

Machine Data Sources

  • Amazon Web Services
    • Use cases: Security & Compliance, IT Operations
    • Examples: Data from AWS can support service monitoring, alarms and dashboards for metrics, and can also track security-relevant activities, such as login and logout events.
  • APM Tool Logs
    • Use cases: Security & Compliance, IT Operations
    • Examples: APM tool logs can provide end-to-end measurement of complex, multi-tier applications, and be used to perform post-hoc forensic analytics on security incidents that span multiple systems.
  • Authentication
    • Use cases: Security & Compliance, IT Operations, Application Delivery
    • Examples: Authentication data can help identify users that are struggling to log in to applications and provide insight into potentially anomalous behaviors, such as activities from different locations within a specified time period.
  • Firewall
    • Use cases: Security & Compliance, IT Operations
    • Examples: Firewall data can provide visibility into blocked traffic in case an application is having communication problems. It can also be used to help identify traffic to malicious and unknown domains.
  • Industrial Control Systems (ICS)
    • Use cases: Security & Compliance, Internet of Things, Business Analytics
    • Examples: ICS data provides visibility into the uptime and availability of critical assets, and can play a major role in identifying when these systems have fallen victim to malicious activity.
  • Medical Devices
    • Use cases: Security & Compliance, Internet of Things, Business Analytics
    • Examples: Medical device data can support patient monitoring and provide insights to optimize patient care. It can also help identify compromised protected health information.
  • Network Protocols
    • Use cases: Security & Compliance, IT Operations
    • Examples: Network protocol data can provide visibility into the network’s role in overall availability and performance of critical services. It’s also an important source for identifying advanced persistent threats.
  • Sensor Data
    • Use cases: Security & Compliance, IT Operations, Internet of Things
    • Examples: Sensor data can provide visibility into system performance and support compliance reporting of devices. It can also be used to proactively identify systems that require maintenance.
  • System Logs
    • Use cases: Security & Compliance, IT Operations
    • Examples: System logs are key to troubleshooting system problems and can be used to alert security teams to network attacks, a security breach or compromised software.
  • Web Server
    • Use cases: Security & Compliance, IT Operations, Business Analytics
    • Examples: Web logs are critical in debugging web application and server problems, and can also be used to detect attacks, such as SQL injections.

SIEM Data Formats

Typical formats and transports supported by SIEM platforms to ingest log data:

Syslog, secure syslog, SNMP, SMTP, SCP, FTP, flat file, SQL query, Database Reader, JDBC, cloud APIs, REST API, XML, Cisco FireSIGHT and SDEE, Check Point LEA, AWS GuardDuty, AWS CloudWatch, AWS S3, etc.

Common Event Format (CEF)

In the realm of security event management, a myriad of event formats streaming from disparate devices makes for complex integration. The Common Event Format from ArcSight promotes interoperability between various event- or log-generating devices.

Although each vendor has its own format for reporting event information, these event formats often lack the key information necessary to integrate the events from their devices.
The ArcSight standard attempts to improve the interoperability of infrastructure devices by aligning the logging output from various technology vendors.
Common Event Format (CEF) is a logging and auditing format from ArcSight. It is an extensible, text-based format designed to support multiple device types by offering the most relevant information.
Message syntaxes are reduced to work with ArcSight normalization. Specifically, CEF defines a syntax for log records comprised of a standard header and a variable extension, formatted as key-value pairs. The format can be readily adopted by vendors of both security and non-security devices.
This format contains the most relevant event information, making it easy for event consumers to parse and use them. To simplify integration, the syslog message format is used as a transport mechanism.
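
A CEF record looks like `CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension`. The following Python sketch splits the pipe-delimited header and the key-value extension. It is a simplified parser written for this article, not ArcSight code: the function name `parse_cef` is my own, and it does not handle escaped `=` characters inside extension values.

```python
import re

HEADER_FIELDS = ["version", "device_vendor", "device_product", "device_version",
                 "signature_id", "name", "severity"]

# Split on pipes that are not escaped with a backslash.
UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")
# A key is a word followed by "="; the value runs until the next " key=" or the end.
EXTENSION_PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line):
    """Parse one CEF record, optionally prefixed by a syslog header."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF record")
    # Seven header fields, then everything else is the extension.
    parts = UNESCAPED_PIPE.split(line[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    record = dict(zip(HEADER_FIELDS, (p.replace("\\|", "|") for p in parts[:7])))
    extension = parts[7] if len(parts) > 7 else ""
    record["extension"] = dict(EXTENSION_PAIR.findall(extension))
    return record


sample_cef = ("Sep 19 08:26:10 host CEF:0|Security|threatmanager|1.0|100|"
              "worm successfully stopped|10|src=10.0.0.1 dst=2.1.2.2 spt=1232")
print(parse_cef(sample_cef))
```

Because the header positions are fixed and the extension is self-describing, a consumer can read the core fields without knowing anything about the device that produced the event.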


  • Time Normalisation
    • Ensures all timestamps reflect the same time zone, so that events from different time zones can be correlated.
    • Time is an important piece of threat detection. Some time zones around the world don’t observe Daylight Saving Time (DST), and some time zones are offset by a half hour from others. In addition to time zone issues, some devices don’t include a time in the log message. A SIEM needs to timestamp every log in a single time zone.
  • Data Enrichment (metadata extraction, tagging and enrichment)
    • The SIEM parses and breaks down each log message into its core components and adds context, e.g. a customer tag.
    • Log data is not uniform. Logs follow a standard protocol, but the information inside them is not standardised across log source providers, so a SIEM has to process each log into a unified threat detection taxonomy and universal schema in order to run mathematical rules.
    • Log information needs to be assigned to a common schema so that a [User Log on] message from various systems (Unix, Windows, Active Directory, AWS, etc.) will always be tagged as User Log on, to assist threat detection search rules.
  • Threat and Risk Contextualisation
    • Evaluate each log and provide a risk-based priority value, e.g. for information from edge services / DMZ, or from authentication and directory services such as Active Directory, DNS, etc.
May 11 10:00:39 scrooge SG_child[808]: [ID 748625] m:WR-SG-SUMMARY c:X (http) GET / => http://bali/ , status:200 , redirection URL: , referer: , mapping:bali , request size: 421 , backend response size: 12960 , audit token:- , time statistics (microseconds): [request total 16617 , allow/deny filters 1290 , backend responsiveness 11845 , response processing 1643 , ICAP reqmod  , ICAP respmod  ] timestamp: [2012-05-11 10:00:39] [ rid:T6zHJ38AAAEAAAo2BCwAAAMk sid:910e5dd02df49434d0db9b445ebba975 ip: ]
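
A small Python sketch of time normalisation and taxonomy tagging, under a few assumptions: timestamps arrive as ISO 8601 strings with a UTC offset, and the `TAXONOMY` table, the `user_logon` label and the `customer` field are hypothetical names invented for the example.

```python
from datetime import datetime, timezone

# Hypothetical mapping from source-specific event names to one taxonomy label.
TAXONOMY = {
    ("windows", "4624"): "user_logon",
    ("linux", "Accepted password"): "user_logon",
    ("aws", "ConsoleLogin"): "user_logon",
}


def normalise(source, event_name, timestamp_iso, customer=None):
    """Convert the timestamp to UTC and tag the event with a common category."""
    local_time = datetime.fromisoformat(timestamp_iso)
    if local_time.tzinfo is None:
        raise ValueError("timestamp needs a UTC offset")
    return {
        "time_utc": local_time.astimezone(timezone.utc).isoformat(),
        "category": TAXONOMY.get((source, event_name), "uncategorised"),
        "customer": customer,
    }


# The same instant, logged in a +09:30 zone and in a -04:00 zone.
a = normalise("windows", "4624", "2020-05-11T10:00:39+09:30", customer="acme")
b = normalise("aws", "ConsoleLogin", "2020-05-10T20:30:39-04:00", customer="acme")
print(a)
print(b)
```

After normalisation both events carry the same UTC time and the same category, so a single rule can match them regardless of where they came from.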

Security Schema


Events are collections of logs created after processing with Threat Intelligence and/or correlation rules. An Event is an actionable item sent to human analysts for further triage, investigation and reporting.

Sizing SIEM solutions

Sizing a SIEM solution begins with the basic list of devices that you want to monitor. See this example device list collection tool:

Device List
Device Type | Vendor | Model | Location | Quantity
Windows Server (Active Directory) | Microsoft | | | 1
Windows Server (DNS) | Microsoft | | | 1
AWS (CloudTrail) | AWS | | | 1
Fortinet Firewall (IDS/IPS/VPN) | Fortinet | | | 1
Citrix Access Gateway | Citrix | | | 1

SIEM Sizing (Events Per Second)

Critical to the sizing and design of a SIEM platform is determining the Events Per Second (EPS) produced by the devices in your device list.

You need to determine and estimate the following SIEM fundamentals:

  • Events Per Second
  • Events Per Day
  • Online Retention Period and required Storage in GB
  • Retention Period and required Storage in GB
  • Network Bandwidth Peak requirements (GB per second for all devices)
  • EPS Peak
  • EPS average (Day, Week, Month, etc.)
  • Estimated Device Growth over 3 years
  • EPS Headroom (Allow 10-30%)
  • Recovery Point Objective
  • Recovery Time Objective
  • Uptime requirement
  • Event / Alert Size (512 bytes per event is a rough estimate.)

SIEM Sizing Rosetta Stone

GB (1 GB = 1,000,000,000 bytes) | EPS (1 event = 600 bytes)

Storage and Archival are critical for any Security Logging platform

  • Raw Event Size
  • Normalised Event Size
  • Retention Time
  • Online Retention Period
  • Events Per Day
  • Compression Ratio
  • GB Storage per day/Retention time.
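
The arithmetic behind these sizing inputs can be sketched in a few lines of Python. The 600 bytes per event and 1 GB = 1,000,000,000 bytes come from the Rosetta Stone above. The 90-day retention, 10:1 compression ratio and 20% headroom are assumptions picked for the example, so substitute your own numbers.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1_000_000_000


def size_siem(avg_eps, event_bytes=600, retention_days=90,
              compression_ratio=10.0, headroom=0.2):
    """Estimate daily volume and retained storage from an average EPS figure."""
    events_per_day = avg_eps * SECONDS_PER_DAY
    raw_gb_per_day = events_per_day * event_bytes / BYTES_PER_GB
    stored_gb = raw_gb_per_day * retention_days / compression_ratio
    return {
        "design_eps": avg_eps * (1 + headroom),  # average EPS plus headroom
        "events_per_day": events_per_day,
        "raw_gb_per_day": raw_gb_per_day,
        "stored_gb_for_retention": stored_gb,
    }


estimate = size_siem(avg_eps=1000)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

At 1,000 EPS and 600 bytes per event this works out to 86,400,000 events and roughly 51.84 GB of raw data per day. Peak EPS, device growth and normalised event size all push the real number higher, so treat this as a starting point only.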


It is vital to understand the way your SIEM platform receives and processes data: what the schema format is, whether it is Schema on Read or Schema on Write, and whether it uses distributed search or in-memory real-time processing, etc. The last thing you want to do is hoard data without understanding what you are collecting, be scared of getting rid of it, and not even be able to get any value from it. Don’t turn into a data hoarder, because the Finance department will start knocking on your door, and the day will come when you will have to provide justification and prove business results. If you ever get breached and can’t even find useful information after you have stored tons of data, you might need to find another job.

An overwhelming number of log sources without proper sanitisation and normalisation can lead to a massive amount of useless information in the SIEM, leading to alert fatigue.

False Positives and False Negatives

A false positive is when the SIEM identifies an activity as an attack but the activity is acceptable behavior. A false positive is a false alarm.

A false negative is the most serious and dangerous state. This is when the SIEM identifies an activity as acceptable when the activity is actually an attack. That is, a false negative is when the SIEM fails to catch an attack. This is the most dangerous state since the security professional has no idea that an attack took place.

False positives, on the other hand, are an inconvenience at best and can cause significant issues. However, with the right amount of overhead, false positives can be successfully adjudicated; false negatives cannot.

  • Airport Security: a “false positive” is when ordinary items such as keys or coins get mistaken for weapons (machine goes “beep”)
  • Medical screening: low-cost tests given to a large group can give many false positives (saying you have a disease when you don’t), and then ask you to get more accurate tests.
  • Antivirus software: a “false positive” is when a normal file is thought to be a virus
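
If you want to put numbers on this for your own SOC, a rough Python sketch follows. Precision tells you how many alerts were real, and recall tells you how many real attacks were caught. The counts in the example are invented, and in practice the number of false negatives is only known after the fact, for instance from incident reviews or red team exercises.

```python
def detection_metrics(true_pos, false_pos, false_neg):
    """Summarise alert quality from counts of triaged outcomes."""
    return {
        "precision": true_pos / (true_pos + false_pos),  # share of alerts that were real
        "recall": true_pos / (true_pos + false_neg),     # share of attacks that were caught
        "missed_attacks": false_neg,
    }


# Hypothetical month: 100 alerts, 8 of them real, plus 2 attacks found later.
print(detection_metrics(true_pos=8, false_pos=92, false_neg=2))
```

Low precision is what analysts feel as alert fatigue, while low recall is the dangerous one because nobody is looking at the attacks that were missed.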

Popular SYSLOG Servers

  • ArcSight Logger
  • Nagios
  • Zabbix
  • Logstash
  • NXLog

Log Sources Categories

  • Operating Systems
    • Windows
    • Linux
    • macOS
  • Mobile
    • iOS
    • Android
    • Microsoft Windows Phone
  • OT/IoT
    • err no clue
  • APIs
  • Databases
  • Policy Devices
    • Firewalls
    • IDS/IPS
    • Authentication
    • Antivirus
  • Network Devices
    • Switches
    • Firewalls
    • Routers
  • Applications
  • Entities/Users
  • Public Cloud

SIEM – Real-Time vs Search

As the volume of data keeps increasing, it becomes increasingly difficult for SIEMs and other data analytics platforms to gain critical insights from it. SIEMs need to detect threats in real time and search years of log source archives at the same time. So you are trying to solve two critical problems at once:

  1. Security Event Management 
    1. Real-Time Streaming Data Analytics
  2. Security Information Management
    1. Searching Large Data sets at scale and speed

These two requirements are incredibly difficult to solve at scale.  So, lo and behold, Open source to the rescue; Apache Kafka and Apache Hadoop provide solutions for both of these requirements.

Apache Kafka

A streaming platform has three key capabilities:

  • Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
  • Store streams of records in a fault-tolerant durable way.
  • Process streams of records as they occur.

Kafka is generally used for two broad classes of applications:

  • Building real-time streaming data pipelines that reliably get data between systems or applications
  • Building real-time streaming applications that transform or react to the streams of data

Apache Hadoop (aka Data Lake)

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

Security Orchestration, Automation and Response (SOAR)

This subject is beyond the scope of this article. I will dive into this in the near future.

Leading SIEM Vendor Solutions

  • ArcSight Data Platform
    • ArcSight pretty much invented the SIEM industry, with a 20+ year product portfolio, and invented the CEF format for cyber security. It now supports Apache Kafka and Apache Hadoop, and integrates unsupervised machine learning via Vertica, IDOL and Interset.
  • Splunk
    • While Splunk gained popularity for general-purpose IT monitoring, it does have some capability in security and big data analytics. Splunk Enterprise is the base solution, with Splunk Enterprise Security, Splunk UBA, Splunk Cloud, Splunk Phantom and the Splunk Machine Learning Toolkit. Splunk uses the Common Information Model.
  • IBM QRadar
    • Another original SIEM vendor.
    • I don’t have any experience with QRadar.
  • ELK Security Onion / HELK
    • The fastest growing open source search stack. Elastic is a very powerful open source platform, and the company recently acquired Endgame. The ELK stack consists of Elasticsearch, Kibana, Logstash and Beats. It uses ECS, the Elastic Common Schema.
  • McAfee Nitro
    • Popular due to McAfee Enterprise license agreements.
  • LogRhythm
    • 100% Windows Server based, with no Linux edition. Very complex to deploy, and requires high resources and application administration. Does have SYSMON, FIM, NETMON, UEBA and SOAR as part of the solution.
  • FireEye / Mandiant
    • Premium products for banking and defence-grade technology, combined with 24/7 DFIR SOC services. So this is a product solution plus arguably the best DFIR team (Mandiant). Very expensive. HX, NX and MX product lines for Endpoint, Network and Cloud SIEM.

Thank you for reading this article. Please support my sharing. In the next article, I will look at log collection and SIEM design patterns in the cloud.

If you would like to sponsor my next article or this blog, please get in touch.

InterSET UEBA – Unsupervised Machine Learning for SOC operations.



1. Start with Your Use Case

You can’t find a solution without understanding the problem. Before buying or implementing new machine learning technology, identify the security use cases that are most critical for your organization. Once you understand and can articulate the problem you’re trying to solve, you are then ready to select the technology that is best suited for your needs.

2. Avoid the Buzzword Trap

AI and machine learning are ubiquitous terms in cybersecurity, but there is plenty of snake oil among vendors who claim to use these technologies. Do your homework to understand what type of machine learning is behind a vendor’s solution and whether or not that type of machine learning meets your security team’s needs. You don’t need to be a data scientist, but knowing just a little bit about how machine learning works can help you ask better questions when evaluating a vendor, like “What threats are not covered with existing tools and techniques?” or “Which data feeds contain valuable information but are currently underused?”

3. Don’t Treat Machine Learning Like a Silver Bullet

Your best defense comes from covering as many bases as possible. Machine learning alone will not find and stop a bad actor. Pairing a powerful UEBA with a next-gen SIEM provides a layered approach to security analytics that enables more visibility, better detection, and easier, quicker avenues for responding to both known and unknown threats. Real-time correlation quickly and effectively finds the known threats, and UEBA detects the subtle threats that will otherwise escape detection. The truth is that real-world threat scenarios often require a combination of both of these approaches.

4. Create a Human-Machine Team

The humans in your SOC are more valuable than ever, but they are facing formidable challenges. SOC teams consistently struggle to deal with snowballing feeds of data and constantly evolving threats. A proactive security posture comes from a human-machine team that leverages the strengths of each: faster-than-human analysis by machines to identify leads for investigation and the contextual understanding of SOC analysts and threat hunters.

Different Types of Machine Learning

[Figure: types of machine learning, from machine-learning-in-the-soc-wp.pdf, page 4 of 7]

UEBA MITRE – Machine learning Use Case Examples

[Figure: UEBA and MITRE use case examples, from machine-learning-in-the-soc-wp.pdf, page 5 of 7]

A sample of MITRE ATT&CK threat tactics and associated behavioral indicators detected by anomaly detection powered by unsupervised machine learning, such as Interset UEBA.


ArcSight Multi-Tenancy Design for MSSPs


Customer tagging is a feature developed mainly to support MSSP environments, although private organizations can use the technique to denote cost centers, internal groups, or business units.

A Customer is not a source or target of an event, but it can be thought of as the owner of an event. Content developers can also use the Customer tag to develop customer-aware content.

Why is customer tagging critical in MSSP environments? The Customer designation identifies who owns the events. This ensures each customer (tenant) can view only its own events.

Consider this scenario: The customer tag is usually assigned based on the reporting device IP address. In an MSSP environment, different customers can have overlapping networks. This requires an elaborate mechanism for assigning a customer attribute to events.

Since most organizations use private address spaces (see network), events from different customers may contain identical addresses that refer to different assets. For example, two tenants may both use the address range 192.0.2.x, so the same address may refer to a different local system in each tenant's network.

Make sure you have the proper network information model, which includes zone information, and the asset model, which requires correct zone information. When a connector enriches an event with asset information derived from the ESM asset model, the event uses the asset address as key for locating asset information. The ESM asset model would therefore need a mechanism to differentiate between assets with the same address but belonging to different customers.
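To make the overlapping-address problem concrete, here is a minimal sketch of an asset lookup keyed by customer and address rather than by address alone. It is not how ESM implements its asset model (ESM relies on zones and networks); the class names, the `device_to_customer` mapping and the sample addresses are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Asset:
    customer: str   # tenant that owns the asset
    address: str    # IP address as it appears in events
    name: str       # human-readable asset name


class AssetModel:
    """Toy asset model keyed by (customer, address) instead of address alone."""

    def __init__(self) -> None:
        self._assets: Dict[Tuple[str, str], Asset] = {}

    def add(self, asset: Asset) -> None:
        self._assets[(asset.customer, asset.address)] = asset

    def lookup(self, customer: str, address: str) -> Optional[Asset]:
        return self._assets.get((customer, address))


def enrich(event: dict, device_to_customer: Dict[str, str], model: AssetModel) -> dict:
    """Tag the event with its customer, then resolve the target asset for that customer."""
    customer = device_to_customer.get(event["reporting_device"], "unknown")
    asset = model.lookup(customer, event["target_address"])
    return {**event, "customer": customer, "target_asset": asset.name if asset else None}


model = AssetModel()
# Two tenants use the same address for different systems.
model.add(Asset("tenant-a", "192.0.2.10", "tenant-a-fileserver"))
model.add(Asset("tenant-b", "192.0.2.10", "tenant-b-printer"))

# Hypothetical mapping from reporting device address to customer.
device_to_customer = {"198.51.100.1": "tenant-a", "198.51.100.2": "tenant-b"}

event_a = enrich({"reporting_device": "198.51.100.1", "target_address": "192.0.2.10"},
                 device_to_customer, model)
event_b = enrich({"reporting_device": "198.51.100.2", "target_address": "192.0.2.10"},
                 device_to_customer, model)
print(event_a["target_asset"], event_b["target_asset"])
```

The same target address resolves to a different asset for each tenant because the customer tag is part of the lookup key. This sketch assumes the reporting devices themselves do not overlap; when they do, the customer has to be derived some other way (for example per connector).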

  • SMART Connector CEF Customer URI field
  • Isolation of Resources
  • Logger Storage Groups Separation
  • ESM Network Modeling
  • ESM Best Practices: MSSP
  • Logger Storage Groups
    • Allow for separation of events by retention periods or by event type and categorisation URI field.
    • Segregate Customers
  • Voltage SecureData data encryption and masking for Data Sovereignty
  • Global ID will assign a unique ID to each security event coming into ArcSight. This ID is globally unique and can be used to facilitate easier cross-portfolio analysis across multiple ESM installations as well as other ArcSight solutions.
  • ArcSight – Network Model in an MSSP Environment



Micro Focus Security ArcSight ESM

Micro Focus / ArcSight Data Platform / ArcSight ESM

[Figure: Content Player]

[Figure: arcsight_securedata_add_on_for_adp_enabling_privacy_compliance_flyer.pdf]

[Figure: arcsight_enterprise_security_manager_ds.pdf]

Reference Architecture

  1. Connector -> Logger -> ESM (Ideally.)
  2. Connector -> ESM -> Logger
  3. Connector -> Logger & ESM


Here’s the video showing what is possible with the CIRCL MISP integration

How ArcSight, CIRCL MISP and the MITRE ATT&CK matrix can be used to provide real-time protection against attacks capitalizing on Corona/COVID-19 fears.

Achieving True Zero-Day Protection with ArcSight, MITRE ATT&CK, and MISP CIRCL

How To: Configure MISP & ESM to address COVID-19 & Coronavirus threats

ArcSight Family

Difference between a Smart Connector and Smart Collector

To understand Collectors vs. Connectors, we need to step back and look at what the SmartConnectors do.

Conceptually, the standard SmartConnectors have two main responsibilities: “Collect” raw data from various sources, and “Process” the collected data into enriched security events and post them to a destination.

Since ADP 2.30, customers can take advantage of the massive scalability and robustness of the Event Broker infrastructure by moving the computationally intensive “Process” step to the Event Broker streaming infrastructure.

This is done by using syslog Collectors and syslog CEBs: Collectors are standalone components very similar to the SmartConnectors, but they only “Collect” raw syslog data as the syslog SmartConnectors do, wrap it up, and post it to a dedicated eb-con-syslog topic in Event Broker.

At that point, the Event Broker’s CEB stream processors (CEB stands for Connector in Event Broker) read the data from the eb-con-syslog topic, do the parsing/normalization/enrichment/filtering processing (as the standalone SmartConnectors destination pipelines do) and post the security events on the EB topics for consumption.

In other words: as their name suggests, the syslog Collectors are lightweight components responsible for collecting raw syslog data and passing it to Event Broker for processing.
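The split between "Collect" and "Process" can be sketched with an in-memory stand-in for the Event Broker topics. This is only an illustration of the data flow, not ArcSight code: the `topics` dictionary replaces the real streaming platform, the output topic name `parsed-events` is hypothetical (the text only names eb-con-syslog), and the parsing is reduced to pulling the priority value out of the syslog header.

```python
import re
from collections import defaultdict, deque

# In-memory stand-in for Event Broker topics (assumption: the real system is a streaming platform).
topics = defaultdict(deque)

RAW_TOPIC = "eb-con-syslog"      # topic name taken from the text
PARSED_TOPIC = "parsed-events"   # hypothetical name for the output topic

SYSLOG_RE = re.compile(r"^<(?P<pri>\d{1,3})>(?P<rest>.*)$")


def collector_receive(raw_line: str, received_from: str) -> None:
    """Collector role: wrap the raw syslog line and post it. No parsing happens here."""
    topics[RAW_TOPIC].append({"raw": raw_line, "received_from": received_from})


def ceb_process_once() -> int:
    """CEB role: drain the raw topic, parse each message, post the result."""
    processed = 0
    while topics[RAW_TOPIC]:
        wrapped = topics[RAW_TOPIC].popleft()
        match = SYSLOG_RE.match(wrapped["raw"])
        if not match:
            continue  # a real pipeline would keep unparsed data visible somewhere
        topics[PARSED_TOPIC].append({
            "priority": int(match.group("pri")),
            "message": match.group("rest").strip(),
            "received_from": wrapped["received_from"],
        })
        processed += 1
    return processed


collector_receive("<34>Oct 11 22:14:15 host1 su: 'su root' failed", "192.0.2.5")
count = ceb_process_once()
print(count, topics[PARSED_TOPIC][0]["priority"])
```

The point of the design is that `collector_receive` stays cheap, so the collection layer needs little hardware, while the expensive work in `ceb_process_once` can be scaled out independently.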

Main advantages of the new architecture:

  1. Potential for hardware consolidation and data throughput increase in the data collection layer where the Collectors are deployed: due to moving the processing to the EB streaming infrastructure.
  2. Improved stability and easy horizontal scalability as the data flows increase with time, or fluctuate during operations: CEBs are deployed or undeployed on the EB nodes with a single click in the ArcMC UI.
  3. Reduced network traffic due to a single data feed to Event Broker, instead of having multiple destinations coming from SmartConnectors.
  4. The raw syslog data is now available on the EB topic for any system that the customer would like to share it with.

Note that at this time Collectors and CEBs are only available for syslog data.

SmartConnector formats:

  • Log File Readers (including text and log file)
  • Syslog
  • SNMP
  • Database
  • XML
  • Proprietary protocols, such as OPSEC

Connector Types

  • API Connectors
  • Database Connectors
    • Database connectors use SQL queries to periodically poll for events. Connectors support major database types, including MS SQL, MS Access, MySQL, Oracle, DB2, Postgres, and Sybase.
    • IBM DB2 connectors: DB2 drivers are no longer provided in the connector installation due to licensing requirements.
    • Microsoft SQL Server Multiple Instance DB connector
    • McAfee Vulnerability Manager DB.
    • Time-Based Queries use a time field to retrieve events found since the most recent query time until the current time.
    • ID-Based Queries use a numerically increasing ID field to retrieve events from the last checked ID until the maximum ID.
    • Job ID-Based Queries use Job IDs that are not required to increase numerically. Processed Job IDs are filed in such a way that only new Job IDs are added. Unlike the other two types of database connector, Job IDs can run in either Interactive mode or Automatic mode
  • FlexConnectors
  • File Connectors
    • Real Time
    • Folder Follower:
  • Microsoft Windows Event Log Connectors
    • SmartConnector for Microsoft Windows Event Log
    • SmartConnector for Microsoft Windows Event Log – Native
    • SmartConnector for Microsoft Windows Event Log – Unified
  • Model Import Connectors
    • Rather than collecting and forwarding events from devices, Model Import Connectors import user data from an Identity Management system into ArcSight ESM. See individual configuration guides for Model Import Connectors on Protect724 for information about how these connectors are used
    • Model Import Connectors extract the user identity information from the database and populate the following lists in ESM with the data: Identity Roles Session List, Identity Information Session List, and Account-to-Identity Map Active List.
  • Scanner Connectors
  • SNMP Connectors
    • SNMP Traps contain variable bindings, each of which holds a different piece of information for the event. They are usually sent over UDP to port 162, although the port can be changed. SNMP connectors listen on port 162 (or any other configured port) and process the received traps. They can process traps only from one device with a unique Enterprise OID, but can receive multiple trap types from this device. SNMP is based upon UDP, so there is a slight chance of events being lost over the network. Although there are still some SNMP connectors for individual connectors, most SNMP support is provided by the SmartConnector for SNMP Unified. Parsers use the knowledge of the MIB to map the event fields, but, unlike some other SNMP-based applications, the connector itself does not require the MIB to be loaded
  • Syslog Connectors
    • Syslog messages are free-form log messages prefixed with a syslog header consisting of a numerical code (facility + severity), timestamp, and host name. Syslog connectors can be installed as a syslog daemon, pipe, or file connector. Unlike other file connectors, a syslog connector can receive and process events from multiple devices; a unique regular expression identifies each device.
    • Syslog Daemon connectors listen for syslog messages on a configurable port, using port 514 as a default. The default protocol is UDP, but other protocols such as Raw TCP are also supported. It is the only syslog option supported for Windows platforms.
    • Syslog Pipe connectors require syslog configuration to send messages with a certain syslog facility and severity. The Solaris platform tends to underperform when using Syslog Pipe connectors. The operating system requires that the connector (reader) open the connection to the pipe file before the syslog daemon (writer) writes the messages to it. When using Solaris and running the connector as a non-root user, using a Syslog Pipe connector is not recommended, because a non-root user does not have permission to send an HUP signal to the syslog daemon.
    • Syslog File connectors require syslog configuration to send messages with a certain syslog facility and severity. For high-throughput connectors, Syslog File connectors perform better than Syslog Pipe connectors because of operating system buffer limitations on pipe transmissions.
    • Raw Syslog connectors generally do no parsing; they take the syslog string and put it in the rawEvent field as-is. The Raw Syslog destination type takes the rawEvent field and sends it as-is using whichever protocol is chosen (UDP, Raw TCP, or TLS). The Raw Syslog connector is always used with the Raw Syslog destination. The event flow is streamlined to eliminate components that do not add value (for example, with the Raw Syslog transport the category fields in the event are ignored, so the categorization components are skipped). If you are transporting data to ArcSight Logger, you can use specific configuration parameters to provide minimal normalization of the syslog data (for source and timestamp).
    • Syslog NG Daemon connectors support Syslog NG version 3.0 for BSD syslog format. Support is provided for collection of IETF standard events. This connector is capable of receiving events over a secure (encrypted) TLS channel from another connector (whose destination is configured as CEF Syslog over TLS), and can also receive events from devices
    • CEF Encrypted Syslog (UDP) connectors allow connector-to-connector communication through an encrypted channel by decrypting events previously encrypted through the CEF Encrypted Syslog (UDP) destination. The CEF connector lets ESM connect to, aggregate, filter, correlate, and analyze events from applications and devices that deliver their logs in the CEF standard, using the syslog transport protocol.
    • UNIX supports all types of syslog connector. If a syslog process is already running, you can end the process or run the connector on a different port. Because UDP is not a reliable protocol, there is a slight chance of missing syslog messages over the network. Generally, TCP is a supported protocol for syslog connectors. There is a basic syslog connector, the connector for UNIX OS Syslog, which provides the base parser for all syslog sub-connectors. For syslog connector deployment information, see the connector Configuration Guide for UNIX OS Syslog. For device-specific configuration information and field mappings, see the connector configuration guide for the specific device. Each syslog sub-connector has its own configuration guide. During connector installation, for all syslog connectors, choose Syslog Daemon, Syslog Pipe, or Syslog File. The names of the syslog sub-connectors are not listed
  • IP NetFlow (NetFlow/J-Flow) Retrieves data over TCP in a Cisco-defined binary format.
  • ArcSight Streaming Connector Retrieves data over TCP from Logger in an ArcSight-proprietary format
  • Connectors for Transformation Hub
    • Connectors in Transformation Hub support ArcSight customers who want to have large-scale distributed ingestion pipelines with 100% availability, where data from any existing or new source at any scale can be ingested while maintaining enterprise-level robustness. Transformation Hub can take messages with raw data collected from any source the ArcSight connector framework understands and automatically perform the data ingestion processing currently done by connectors, but deployed and managed at scale as Transformation Hub processing engines. Users deploy the Transformation Hub using the ArcSight Installer and Management Center to achieve the desired layout. New topics can be created in Management Center and designated to process raw data from a particular technology framework with output into a specific format.
    • The connector technology in Transformation Hub performs all processing a connector would normally do: parser selection, normalization, main flow, destination specific flows, and categorization, as well as applying network zoning and Agent Name resolution. For more information, see the ArcSight Transformation Hub Administrator’s Guide and the ArcSight Management Center Administrator’s Guide.
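The ID-Based Queries described under Database Connectors above boil down to remembering the last ID seen and asking only for rows beyond it. The following is a minimal sketch of that polling pattern using an in-memory SQLite database; the table name `events`, its columns and the function name are hypothetical, and a real connector adds batching, error handling and persistence of the last ID.

```python
import sqlite3


def poll_new_events(conn: sqlite3.Connection, last_id: int):
    """Return rows with id > last_id, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, message FROM events WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    new_last_id = rows[-1][0] if rows else last_id
    return rows, new_last_id


# Hypothetical source table standing in for a device's event database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany("INSERT INTO events (message) VALUES (?)",
                 [("login failed",), ("login ok",)])

# First poll picks up everything; the second poll only sees the new row.
first_batch, last_id = poll_new_events(conn, last_id=0)
conn.execute("INSERT INTO events (message) VALUES (?)", ("password changed",))
second_batch, last_id = poll_new_events(conn, last_id)
print(len(first_batch), len(second_batch), last_id)
```

A Time-Based Query follows the same shape with a timestamp column in place of the ID, comparing against the most recent query time instead of the last ID.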

Note: If you are using Red Hat Linux 6.x or later, ensure that you have these libraries or packages installed before installing a connector:

  • X libraries
  • fontconfig \ dejavu-sans-fonts

When installing the 32-bit SmartConnector executable on 64-bit machines, the 32-bit versions of glibc, libXext, libXrender, and libXtst must be installed as well as the 64-bit versions.
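
The numerical code in the syslog header mentioned under Syslog Connectors above packs facility and severity into a single priority value (PRI = facility × 8 + severity). A minimal sketch of decoding it follows; the function name is hypothetical, and this is not the connector's parser, just the arithmetic behind the header.

```python
import re

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message: str):
    """Split the leading <PRI> value into (facility, severity)."""
    match = PRI_RE.match(message)
    if match is None:
        raise ValueError("message does not start with a <PRI> header")
    pri = int(match.group(1))
    # PRI = facility * 8 + severity, so integer division and remainder undo it.
    return pri // 8, pri % 8


facility, severity = decode_pri("<34>Oct 11 22:14:15 mymachine su: 'su root' failed")
print(facility, severity)
```

For the sample message the priority 34 decodes to facility 4 (security/authorization) and severity 2 (critical).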

ESM Install

Hyper-V Configuration

Partition Sizes



  • /tmp – more than 6 GB
  • /opt – more than 100 GB
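
Before starting the installer it can save time to check the free space on both partitions. A minimal sketch, assuming "GB" is read as GiB and that the two mount points are /tmp and /opt as listed above; the function names are hypothetical and this is not part of the ESM installer.

```python
import shutil

GIB = 1024 ** 3

# Thresholds taken from the list above (assumption: GB is treated as GiB).
REQUIREMENTS_GIB = {"/tmp": 6, "/opt": 100}


def has_enough_space(free_bytes: int, required_gib: int) -> bool:
    """The text asks for 'more than' the stated size, hence the strict comparison."""
    return free_bytes > required_gib * GIB


def check_partitions(requirements=REQUIREMENTS_GIB, usage=shutil.disk_usage):
    """Return {path: True/False}; a missing path counts as a failure."""
    results = {}
    for path, required in requirements.items():
        try:
            free = usage(path).free
        except OSError:
            results[path] = False
            continue
        results[path] = has_enough_space(free, required)
    return results


for path, ok in check_partitions().items():
    print(f"{path}: {'OK' if ok else 'insufficient free space or path missing'}")
```

The `usage` parameter exists only so the check can be exercised without touching the real file system.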

CentOS Software Selection

  • GNOME Desktop
    • Compatibility Libraries
    • Development Tools
    • System Administration Tools


ESM bin



Log files /


Properties files



  • Make sure that the partition in which your /tmp directory resides has at least 6 GB of space. Make sure that the partition in which your /opt/arcsight directory resides has at least 100 GB of space.
  • Specifying a Global Event ID Generator ID: global event IDs uniquely identify events across the ArcSight product suite.
  • The Manager host name is used to generate a self-signed certificate. The Common Name (CN) in the certificate is the host name that you specify when prompted
  • The Manager host name is the IP address (for IPv4 only) or the fully-qualified domain name of the machine where the Manager is installed. All clients (for example, the ArcSight Console) use this name to connect to the Manager. For flexibility, Micro Focus recommends using a fully-qualified domain name instead of an IP address.
  • Make sure that the IP address is resolved to localhost in the /etc/hosts file, otherwise, the ESM installation will fail. This applies to IPv4 and IPv6 systems.

  • If you do not want the host name on your DNS server, add a static host entry to the /etc/hosts file to resolve the host name locally.
  • 8443/tcp 22/tcp (ssh)
  • TCP ports used internally for inter-component communication: 1976, 28001, 2812, 3306, 5555, 6005, 6009, 7777, 7778, 7779, 7780, 8005, 8009, 8080, 8088, 8089, 8666, 8765, 8766, 8881, 8808, 8880, 8888, 8889, 9095, 9090, 9123, 9124, 9999, 45450
  • 8443/TCP – SmartConnectors and consoles
  • 9000/TCP – Peering
  • 694/UDP – High Availability module
  • 7789/TCP – High Availability module
  • 22/TCP – SSH login

  • Open the following TCP ports for inter-component communication:

  • 1976, 2812, 3306, 5555, 6005, 6009, 7777, 7778, 7779, 7780, 8005, 8009, 8080, 8088, 8089, 8666, 8765, 8766, 8808, 8880, 8881, 8888, 8889, 9000, 9090, 9095, 9123, 9124, 9999, 28001, 45450

  • The information repository uses ports 3179, 3180, 3181, and 3182.
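
Since these ports must be free for ESM's own components, a quick pre-install check for conflicts can help. The sketch below tries a TCP connection to each port on the local host; the port list is a subset of the inter-component ports above, chosen for illustration, and the function names are hypothetical.

```python
import socket

# Subset of the inter-component ports listed above, for illustration.
PORTS_TO_CHECK = [1976, 2812, 3306, 5555, 8080, 9000]


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def find_conflicts(ports, host: str = "127.0.0.1"):
    """Return the ports from the list that already have a listener."""
    return [port for port in ports if port_in_use(port, host)]


print("Ports already in use:", find_conflicts(PORTS_TO_CHECK))
```

This only detects listeners reachable on the loopback address; a service bound to another interface would need that address passed as `host`.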

  • Port
    • SSH log in (Unix only)
    • DNS requests and responses
    • SmartConnectors and Consoles
    • SMTP to mail server
    • POP3 to mail server, if applicable
    • IMAP to mail server, if applicable
    • RADIUS, if applicable
    • RADIUS, if applicable
    • LDAP to LDAP server, if applicable
    • 636/TCP Outbound LDAP over SSL to LDAP server, if applicable

  • <ARCSIGHT_HOME>/config/jetty/keystore (to prevent the ArcSight Manager private key from being stolen)
  • <ARCSIGHT_HOME>/config/jetty/truststore (with SSL Client authentication only, to prevent injection of new trusted CAs)
  • <ARCSIGHT_HOME>/config/ (has database passwords)
  • <ARCSIGHT_HOME>/config/ (has cluster configuration properties and SSL properties common to persistor, correlator, and aggregator services on the node) This properties file is present on each node in a distributed correlation cluster.
  • <ARCSIGHT_HOME>/config/jaas.config (with RADIUS or SecurID enabled only, has shared node secret)
  • <ARCSIGHT_HOME>/config/ (with SSL Client authentication only, has keystore passwords)
  • <ARCSIGHT_HOME>/reports/ (to protect the report license)
  • <ARCSIGHT_HOME>/reports/archive/* (to prevent archived reports from being stolen)
  • <ARCSIGHT_HOME>/jre/lib/security/cacerts (to prevent injection of new trusted CAs)
  • <ARCSIGHT_HOME>/lib/* (to prevent injection of malicious code)
  • <ARCSIGHT_HOME>/rules/classes/* (to prevent code injection)
  • The xmlrpc.accept.ips property restricts access for ArcSight Consoles.

  • The agents.accept.ips property restricts access for SmartConnectors.

  • For registration, the SmartConnectors need to be in xmlrpc.accept.ips as well, so that they can be registered. (Being “registered” does not mean you can then remove them.)

    • The format for specifying subnets is quite flexible, as shown in the following example:

    • xmlrpc.accept.ips=

    • agents.accept.ips=10.*.*.*,

System Requirements for ESM 7.2:

  • Community Enterprise Operating System (CentOS) 7.6 and 6.10



Hardware (Minimum / Mid-Range / High Performance):

  • Processors: 8 cores (16 preferred) / 32 cores / 40 cores
  • Memory: 48 GB RAM (64 preferred) / 192 GB RAM / 512 GB RAM
  • Hard Disk: Six 600 GB disks (1.5 TB) (RAID 10), 10,000 RPM / 20 1 TB disks (10 TB) (RAID 10), 15,000 RPM / 12 TB (RAID 10), solid state

Linux Install

Download and install CentOS 7.6

// Use CentOS 7.6
// Boot into Troubleshooting -> Install CentOS 7 in basic graphics mode

Download the ArcSightESMSuite-7.0.0.xxxx.1.tar

scp  root@

//Install TMUX for remote installations

yum install tmux
tmux list-sessions
tmux attach -t number-of-session

// USB Mount
fdisk -l
mkdir /mnt/usb
mount -v -t auto /dev/sdf1 /mnt/usb
cd /mnt/usb/
umount /dev/sdf1

// NIC on laptop: enp0s31f6
nmtui edit enp0s31f6

// Add the hostname to IP address mapping in the hosts file
nano /etc/hosts

// Unarchive installer

// Create the arcsight user with GUID and SU rights
// Create a folder called esm_installer
chown arcsight: esm_installer
tar xvf ArcSightESMSuite-7.0.0.xxxx.1.tar
cd Tools
sudo ./

// Copy the license files to the same location

// Expected limits: open files 65536, max user processes 10240
ulimit -a
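
The two limits noted above (open files 65536, max user processes 10240) can also be checked programmatically for the current user. A minimal sketch using Python's standard `resource` module on Linux; the function names are hypothetical and the check only looks at soft limits.

```python
import resource

# Values from the note above.
REQUIRED_OPEN_FILES = 65536
REQUIRED_MAX_PROCESSES = 10240


def limit_ok(current: int, required: int) -> bool:
    """A soft limit is acceptable if it is unlimited or at least the required value."""
    return current == resource.RLIM_INFINITY or current >= required


def check_limits():
    """Compare the current soft limits with the required values."""
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc_soft, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "open files": limit_ok(nofile_soft, REQUIRED_OPEN_FILES),
        "max user processes": limit_ok(nproc_soft, REQUIRED_MAX_PROCESSES),
    }


print(check_limits())
```

Run it as the arcsight user, since limits are applied per user and per session.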

// Download and set the time zone
wget tzdata-2019b-1.el7.noarch.rpm /opt/work/
rpm -Uvh /opt/work/

sudo yum install tzdata -y
timedatectl list-timezones
timedatectl list-timezones | grep Australia
timedatectl set-timezone "Asia/Kolkata"
timedatectl set-timezone America/Los_Angeles
timedatectl set-timezone UTC
timedatectl set-time 15:58:30
timedatectl set-time 2015-11-20
timedatectl status
timedatectl | grep local
timedatectl set-local-rtc 1
timedatectl set-local-rtc 0
timedatectl set-ntp true

// Switch to the arcsight user (prompts for the password)
su arcsight
// Log in as the arcsight user on the console and install
/etc/init.d/arcsight_services stop all
/opt/arcsight/manager/bin/arcsight tzupdater /opt/arcsight /opt/arcsight/manager/lib/jre-tools/tzupdater
/etc/init.d/arcsight_services start all

//Starting the installer

chmod +x /tmp/esm_install/ArcSightESMSuite.bin

chown -R arcsight:arcsight ../Tools

// Error: You are installing this product on an unsupported platform.
// If you are installing on a later CentOS version, you might need to downgrade the version string manually, then update it later
sudo nano /etc/centos-release
sudo nano /etc/redhat-release
// Change: CentOS Linux release 7.7.1908 (Core)
// To:     CentOS Linux release 7.6 (Core)

// LOGIN into CONSOLE as arcsight
./ArcSightESMSuite.bin -i console

/opt/arcsight/manager/bin/arcsight firstbootsetup -boxster -soft -i console

/opt/arcsight/kubernetes/scripts/ > /tmp/ca.crt

// To install the time zone update package after you complete the ESM installation
/etc/init.d/arcsight_services stop all

/opt/arcsight/manager/bin/arcsight tzupdater /opt/arcsight
/etc/init.d/arcsight_services start all

// As arcsight user

// Install ESM Login under user account: arcsight into Console and install
/opt/arcsight/manager/bin/arcsight firstbootsetup -boxster -soft -i console

IMPORTANT: The root user must run the following script to start up required services:

// START SERVICES as arcsight user
/etc/init.d/arcsight_services start
/etc/init.d/arcsight_services stop all
/etc/init.d/arcsight_services start all

//Set the hostname in local hosts file

//Applications/Google\\ Chrome --ignore-certificate-errors &> /dev/null &

// Access https://arcsight:8443

// On a Chrome SSL error page, type "thisisunsafe"

// Remove ESM
su arcsight

Remove all files in /tmp and /opt/arcsight: rm -r *

The volume or partition required for installation of the /opt/arcsight directory does not contain the minimum of 50GB of space to successfully install arcsight


df /opt/arcsight

df /opt/arcsight 	50GB
df /tmp	 		6GB

df -Th

echo 1 > /sys/block/sda/device/rescan

pvresize /dev/sda3
lvextend -l +100%FREE -r /dev/mapper/centos-root

Resizing the Linux Root Partition in a Gen2 Hyper-V VM


/opt/arcsight/connector/replay_pd/current/bin/arcsight agents

/opt/arcsight/connector/replay/current/bin/arcsight agents

/sbin/service arcsight_services start

/sbin/service arcsight_services start manager

/sbin/service arcsight_services stop

/sbin/service arcsight_services stop manager

tail -f /opt/arcsight/var/logs/manager/default/server.std.log

/opt/arcsight/manager/bin/arcsight deploylicense

Installation Options
0- ArcSight Content Management - This package contains resources to track content that is being managed across multiple ESM systems.
1- ArcSight ESM HA Monitoring - This package contains resources to track High Availability (HA) status and changes.
2- ArcSight Transformation Hub Monitoring - This package contains resources for monitoring Transformation Hub.
3- Security Threat Monitoring - This package contains default security threat monitoring content.
4- Threat Intelligence Platform - This package contains default content for threat intelligence platform.
4- Threat Intelligence Platform - This package contains default content for threat intelligence platform.

Install ArcSight Console

  • Download software



  1. Disable Hyper-Threading. This setting exists on most server-class processors (for example, Intel processors) that support hyper-threading. AMD processors do not have an equivalent setting.
  2. Disable Intel VT-d. This setting is specific to Intel processors and is likely to be present on most recent server-class processors. AMD processors have an equivalent setting called AMD-Vi.
  3. Set Power Regulator to Static High Performance. This setting tells the CPU(s) to always run at high speed, rather than slowing down to save power when the system senses that load has decreased. Most recent CPUs have an equivalent setting.
  4. Set Thermal Configuration to Increased Cooling. This setting increases the server fan speed to avoid issues with the increased heat that results from constantly running the CPU(s) at high speed.
  5. Enable the Minimum Processor Idle Power Package State setting. This setting tells the CPU not to use any of its C-states (various states of power saving in the CPU).
  6. Set Power Profile to Maximum Performance. This setting results in the following changes:
    • QPI power management (the link between physical CPU sockets) is disabled.
    • PCIe support is forced to Gen 2.
    • C-states are disabled.
    • Lower speed settings on the CPUs are disabled so that the CPUs constantly run at high speed.

Silent Deployment using Terraform


ArcSight SmartConnector Install

// Ensure the full Java version is installed on CentOS
[arcsight@vm-esm700-demo ~]$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
[arcsight@vm-esm700-demo ~]$

Micro Focus has many product lines that are of interest for cybersecurity integrations:


[Figure: ESM 101]

[Figure: ESM 101]

User Roles

[Figure: ESM_101_7.0P1.pdf, page 14 of 161]

[Figure: ESM 101]

ArcSight Connectors

ArcSight Connectors automate the process of collecting and managing logs from any device and in any format through normalization and categorization of logs into a unified format known as Common Event Format (CEF), which is now an industry standard for log format. You can use this unified data for searching, reporting, analyzing or storing logs. ArcSight Connectors also manage ongoing updates, upgrades, configuration changes and administration of distributed deployments through a centralized web-based interface. They can be deployed as software or on an appliance.
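
To show what the unified format looks like in practice, here is a minimal sketch that parses a CEF message: seven pipe-delimited header fields followed by a key=value extension. The parser and its field names are hypothetical and deliberately simplified (for example, it does not handle escaped "=" inside extension values); the sample event is made up for illustration.

```python
import re

HEADER_FIELDS = ["version", "device_vendor", "device_product", "device_version",
                 "signature_id", "name", "severity"]

# Split on "|" unless it is escaped as "\|".
_PIPE = re.compile(r"(?<!\\)\|")
# key=value pairs; a value runs until the next " key=" or the end of the string.
_EXT = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line: str) -> dict:
    """Parse a CEF line into header fields plus an 'extension' dictionary."""
    start = line.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF message")
    parts = _PIPE.split(line[start + 4:], maxsplit=7)
    if len(parts) < 7:
        raise ValueError("CEF header needs seven fields")
    event = dict(zip(HEADER_FIELDS, (p.replace("\\|", "|") for p in parts[:7])))
    extension = parts[7] if len(parts) > 7 else ""
    event["extension"] = dict(_EXT.findall(extension))
    return event


sample = ("CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|"
          "src=192.0.2.1 dst=198.51.100.2 msg=worm stopped on host")
event = parse_cef(sample)
print(event["name"], event["extension"]["dst"])
```

Because every source ends up with the same header fields, searches, reports and correlation rules can be written once against names like `severity` or `src` rather than per vendor.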

[Figure: ESM 101]

ArcSight Connectors help you with:

  • Scale easily to manage extreme machine data across IT
  • Reduce the cost of handling large volumes of logs and events in various formats
  • Automate the process of managing connectors to collect audit-quality log data
  • Share, upload, or download connectors within your ArcSight community
  • Seamlessly integrate with the ArcSight platform
  • Broadest set of built-in connectors that collect, aggregate, filter, and parse the logs
  • Managing log records in hundreds of different formats from hundreds of vendors
  • Patented technology to normalize and categorize logs that enables full-text English searching on rich metadata
  • High compression of log data up to 10:1 to reduce your storage costs significantly
  • Automate bandwidth management with low footprint

FlexConnector: The FlexConnector framework is a software development kit (SDK) that enables you to create your own SmartConnector tailored to the nodes on your network and their specific event data. FlexConnector types include file reader, regular expression file reader, time-based database reader, syslog, and Simple Network Management Protocol (SNMP) readers.

Forwarding Connector: The Forwarding Connectors forward events between multiple Managers in a hierarchical ESM deployment, and/or to one or more Logger deployments.

ArcSight Manager

The ArcSight Manager is the heart of the solution. It is a Java-based server that drives analysis, workflow, and services. It also correlates output from a wide variety of security systems. The Manager writes events to the CORR-Engine as they stream into the system. It simultaneously processes them through the correlation engine, which evaluates each event with network model and vulnerability information to develop real-time threat summaries. ESM comes with default configurations and standard foundation use cases consisting of filters, rules, reports, data monitors, dashboards, and network models that make ESM ready to use upon installation.



The Correlation Optimized Retention and Retrieval (CORR) Engine is a proprietary data storage and retrieval framework that receives and processes events at high rates, and performs high-speed searches


Security Use Case and Activate Framework Marketplace

ArcSight Activate Framework

ArcSight Activate Framework is a modular content development framework that allows you to implement ArcSight SIEM quickly and effectively. The framework provides a standard way of creating content. Standardized content means new analysts and engineers can easily review and understand existing content reducing the ramp-up time for new employees. It also opens up the possibility of sharing content with other ArcSight users. Best of all, the base content has been created from 10 years of experience implementing ArcSight in thousands of environments. What does this mean? It is proven and it works! ArcSight Activate Framework makes implementing SIEM easy. It helps you with:

  • Deploy modular content and standardized use cases to implement ArcSight quickly and effectively in your environment with minimal setup required.
  • Enable inexperienced users to create content quickly. Content created is easier to understand reducing training and maintenance costs.
  • Provide a standardized approach to creating content that can be shared between ArcSight installations and within the community to easily keep up on the latest IT security threats. This results in a robust SIEM that is easier to set up and maintain.
  • Leverage proven use cases developed by ArcSight SIEM experts to provide a robust implementation to increase your effectiveness and deployment success.

Interactive Discovery

ArcSight Interactive Discovery (AID) is a separate software application that augments Pattern Discovery, dashboards, reports, and analytical graphics. AID provides enhanced historical data analysis and reporting capabilities using a comprehensive selection of pre-built interactive statistical graphics. You can use AID to:

  • Quickly gain visibility into your complex security data
  • Explore and drill down into security data with precision control and flexibility
  • Accelerate discovery of hard-to-find events that may be dangerous
  • Present the state of security in compelling visual summaries
  • Build a persuasive, non-technical call to action
  • Prove IT Security value and help justify budgets

Pattern Discovery

Pattern Discovery can automatically detect subtle, specialized, or long-term patterns that might otherwise go undiscovered in the flow of events. You can use Pattern Discovery to:

  • Discover zero-day attacks: Because Pattern Discovery does not rely on encoded domain knowledge (such as predefined rules or filters), it can discover patterns that otherwise go unseen, or are unique to your environment.
  • Detect low-and-slow attacks: Pattern Discovery can process up to a million events in just a few seconds (excluding read-time from the disk). This makes Pattern Discovery effective at capturing even low-and-slow attack patterns.
  • Profile common patterns on your network: New patterns discovered from current network traffic are like signatures for a particular subset of network traffic. By matching against a repository of historical patterns, you can detect attacks in progress. The patterns discovered in an event flow that either originate from or target a particular asset can be used to categorize those assets. For example, a pattern originating from machines that have a back door (unauthorized program that initiates a connection to the attacker) installed can all be visualized as a cluster. If you see the same pattern originating from a new asset, it is a strong indication that the new asset also has a back door installed.
  • Automatically create rules: The patterns discovered can be transformed into a complete rule set with a single mouse click. These rules are derived from data patterns unique to your environment, whereas predefined rules must be generic enough to work in many customer environments.

Pattern Discovery is a vital tool for preventive maintenance and early detection in your ongoing security management operations. Using periodic, scheduled analysis, you can always be scanning for new patterns over varying time intervals to stay ahead of new exploitative behavior.

Logger

ArcSight Logger is an event data storage appliance that is optimized for extremely high event throughput. Logger stores security events on board in compressed form, but can always retrieve unmodified events on demand for historical analysis-quality litigation data. Logger can be deployed stand-alone to receive events from syslog messages or log files, or to receive events in Common Event Format from SmartConnectors. Logger can forward selected events as syslog messages to ESM. Multiple Loggers work together to scale up to support high sustained input rates. Event queries are distributed across a peer network of Loggers.

Content, Solutions, and CIPs for ESM and Logger

ArcSight ESM Compliance Insight Package for the Payment Card Industry (PCI) version 4.1 is now generally available. It can be downloaded by licensed customers from the HP support web site. The solution guide and release notes can be found here.

What’s New?

ESM Compliance Insight Package for PCI 4.1 contains the following important updates:

  • Support for PCI requirements specified in Payment Card Industry Data Security Standard 3.2 (PCI DSS 3.2)
  • Support for logs generated by applications subject to Payment Application Data Security Standard 3.2 (PA DSS 3.2)

About ESM Compliance Insight Package for PCI:

The ESM Compliance Insight Package for PCI provides a system of reports and real-time checks specifically designed to monitor systems that contain cardholder data, manage vulnerability and access control, monitor networks, and maintain security policies to help demonstrate to stakeholders and auditors that the controls over your company’s credit card data systems expose little or no risk.



ESM uses objects called resources to manage event-processing logic. A resource defines the properties, values, and relationships used to configure the functions that ESM performs. Resources can also be the output of such a configuration (such as archived reports, or Pattern Discovery snapshots and patterns).

ESM has more than 30 different types of resources and comes with hundreds of these resources already configured to give you functionality as soon as the product is installed. These resources are presented in the Navigator panel of the ArcSight Console.

Modeling Resources

The Network Model enables you to build a business-oriented view of data derived from physical information systems. These distinctions help ESM to clearly identify events in your network, providing additional layers of detail for correlation. The Actor Model creates a real-time user model that maps humans or agents to activity in applications and on the network. Once the actor model is in place, you can use category models to visualize relationships among actors, and correlation to determine if their activity is above board.

  • Assets
  • Asset Ranges
  • Asset Categories
  • Zones
  • Networks
  • Customers
  • Vulnerabilities
  • Locations
  • Actors
  • Category Models
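As a rough illustration of how a network model adds context to an event, the Python sketch below tags an IP address with a zone. The zone names and address ranges are hypothetical, and ESM's own zone and asset resolution is considerably richer than this.

```python
import ipaddress

# Hypothetical zones; a real deployment would define its own ranges.
ZONES = {
    "DMZ": ipaddress.ip_network("192.0.2.0/24"),
    "Internal": ipaddress.ip_network("10.0.0.0/8"),
}

def zone_for(address):
    """Return the name of the first zone containing the address, else 'Unknown'."""
    ip = ipaddress.ip_address(address)
    for name, network in ZONES.items():
        if ip in network:
            return name
    return "Unknown"
```

A correlation rule can then reason about "traffic from the DMZ to an internal asset" rather than about raw addresses.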

Correlation Resources

Correlation is a process that discovers the relationships between events, infers the significance of those relationships, prioritizes them, then provides a framework for taking action.

  • Filters
  • Rules
  • Data Monitors
  • Active Lists
  • Session Lists
  • Integration Commands
  • Pattern Discovery

Monitoring and Investigation Resources

Active channels and dashboards are tools that monitor all the activity that ESM processes for your network. Each of these views enables you to drill down on a particular event or series of events in order to investigate their details. Saved searches are those you run on a regular basis. They include query statements, the associated field set, and a specified time range. Search filters contain only the query statements. You define and save searches and search filters in the ArcSight Command Center, and export these resources as packages in the ArcSight Console.

  • Active Channels
  • Field Sets
  • Saved Searches and Search Filters
  • Dashboards
  • Query Viewers

Workflow and User Management Resources

Workflow refers to the way in which people in your organization are informed about incidents, how incidents are escalated to other users, and how incident responses are tracked.

  • Annotations
  • Cases
  • Stages
  • Users and User Groups
  • Notifications
  • Knowledge Base
  • Reference Pages

Reporting Resources

Reporting resources work together to create batch-oriented functions used to analyze incidents, find new patterns, and report on system activity.

  • Reports
  • Queries
  • Trends
  • Templates
  • Focused Reports

Administration Resources

Administration resources are tools that manage ESM’s daily maintenance and long-term health.

  • Packages
  • Files
  • Storage and storage volumes
  • Retention periods

Standard Content

Standard content is a series of coordinated resources that address common enterprise network security and ESM management tasks. Many of these resources are installed automatically with ESM to provide essential system health and status operations. Others are presented as install-time options organized by category.

  • ArcSight Administration
  • ArcSight System

Content Synchronization and Management

Content synchronization provides the ability to publish content from one ESM instance to multiple ESM instances. Synchronization is managed through the creation of supported packages, establishment of ESM subscribers, and scheduling the publication of content.

  • Packages

Normalising Event Data

Normalize means to conform to an accepted standard or norm. Because networks are heterogeneous environments, each device has a different logging format and reporting mechanism. You may also have logs from remote sites where security policies and procedures may be different, with different types of network devices, security devices, operating systems and application logs. Because the formats are all different, it is difficult to extract information for querying without normalizing the events first. The following examples are logs from different sources that each report on the same packet traveling across the network. These logs represent a remote printer buffer overflow that connects to IIS servers over port 80.

Check Point:

“14” “21Nov2016” “12:10:29” “eth-s1p4c0” “ip.of.firewall” “log” “accept” “www-http” “” “” “tcp” “4” “1355” “” “” “” “” “” “” “” “” “” “firewall” “len 68”

Cisco Router:

Nov 21 15:10:27: %SEC-6-IPACCESSLOGP: list 102 permitted tcp ->, 1 packet

Cisco PIX:

Nov 21 2016 12:10:28: %PIX-6-302001: Built inbound TCP connection 125891 for faddr gaddr laddr


Snort:

[**] [1:971:1] WEB-IIS ISAPI .printer access [**] [Classification: Attempted Information Leak] [Priority: 3] 11/21-12:10:29.100000 -> TCP TTL:63 TOS:0x0 ID:5752 IpLen:20 DgmLen:1234 DF ***AP*** Seq: 0xB13810DC Ack: 0xC5D2E066 Win: 0x7D78 TcpLen: 32 TCP Options (3) => NOP NOP TS: 493412860 0 [Xref => [Xref =>

In order to productively store this diverse data in a common data store, SmartConnectors evaluate which fields are relevant and arrange them in a common schema. The choice of fields is content driven, not based on syntactic differences between what Check Point may call target address and what Cisco calls destination address. To normalize, SmartConnectors use a parser to pull out those values from the event and populate the corresponding fields in the schema. Here is a very simple example of these same alerts after they have been normalized.

[Image: normalized event example from ESM 101]
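The parsing step can be sketched in a few lines of Python. The two log formats below are invented for illustration (they are not real Check Point or Cisco formats), and the schema field names are assumptions; the point is only that two different syntaxes end up as the same normalized record.

```python
import re

def parse_vendor_a(line):
    """Hypothetical key=value format: 'src=<ip> dst=<ip> dport=<n>'."""
    kv = dict(pair.split("=", 1) for pair in line.split())
    return {
        "source_address": kv["src"],
        "target_address": kv["dst"],
        "target_port": int(kv["dport"]),
    }

# Hypothetical arrow format: 'permitted tcp <ip> -> <ip>:<port>'
_VENDOR_B = re.compile(r"tcp (?P<src>[\d.]+) -> (?P<dst>[\d.]+):(?P<port>\d+)")

def parse_vendor_b(line):
    match = _VENDOR_B.search(line)
    if match is None:
        raise ValueError(f"unrecognised vendor B line: {line!r}")
    return {
        "source_address": match["src"],
        "target_address": match["dst"],
        "target_port": int(match["port"]),
    }

# Two syntaxes describing the same packet normalize to the same fields.
event_a = parse_vendor_a("src=192.0.2.10 dst=198.51.100.20 dport=80")
event_b = parse_vendor_b("permitted tcp 192.0.2.10 -> 198.51.100.20:80")
```

Once both events share one schema, a single query on `target_port` covers every device type.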

Time stamp

Another factor in normalization is converting timestamps to a common format. Since the devices may all use different time zones, ESM normalization converts the timestamps to UTC (GMT).
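A small Python sketch of that conversion follows. The timestamps echo the sample logs above, but the UTC offsets assigned to each device (UTC-8 for the firewall, UTC-5 for the router) are assumptions made for the example, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_utc(local_text, fmt, utc_offset_hours):
    """Parse a device-local timestamp and convert it to UTC.

    utc_offset_hours is the device's offset from UTC (an assumed, fixed value here).
    """
    device_tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_text, fmt).replace(tzinfo=device_tz)
    return local.astimezone(timezone.utc)

# Assumed offsets: firewall at UTC-8, router at UTC-5.
firewall_utc = to_utc("21Nov2016 12:10:29", "%d%b%Y %H:%M:%S", -8)
router_utc = to_utc("2016 Nov 21 15:10:27", "%Y %b %d %H:%M:%S", -5)
```

Under those assumed offsets the two readings, three hours apart on the wall clock, land within two seconds of each other in UTC, which is what lets a correlation engine treat them as the same incident.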

Event Severity

During the normalization process, the SmartConnector collects data about the level of danger associated with a particular event as interpreted by the data source that reported the event to the connector. These data points, device severity and agent severity, become factors in calculating the event’s overall priority, described in the “Evaluate the Priority Formula” section of ESM 101.

Device severity captures the language used by the data source to describe its interpretation of the danger posed by a particular event. For example, if a network IDS detects a DHCP packet that does not contain enough data to conform to the DHCP format, the device flags this as a high-priority exploit.

Agent severity is the translation of the device severity into ESM-normalized values. For example, Snort uses a device severity scale of 1-10, whereas Check Point uses a scale of high, medium and low. ESM normalizes these values into a single agent severity scale. The default ESM scale is Low, Medium, High, and Very High. An event can also be classified as AgentSeverity Unknown if the data source did not provide a severity rating.
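The translation can be pictured as a simple lookup. In the Python sketch below, the target scale (Low, Medium, High, Very High, Unknown) comes from the text above, while the numeric cut-off points and the function name are assumptions chosen for illustration, not ESM's actual mapping.

```python
def agent_severity(device_severity):
    """Map a device-reported severity onto one normalized scale.

    Accepts a word ('low', 'medium', 'high'), a 1-10 number, or None.
    The numeric thresholds are illustrative assumptions.
    """
    if device_severity is None or device_severity == "":
        return "Unknown"

    text = str(device_severity).strip().lower()
    words = {"low": "Low", "medium": "Medium", "high": "High"}
    if text in words:
        return words[text]
    if not text.isdigit():
        return "Unknown"

    score = int(text)
    if score <= 3:
        return "Low"
    if score <= 6:
        return "Medium"
    if score <= 8:
        return "High"
    return "Very High"
```

With a mapping like this, `agent_severity("high")` and `agent_severity(8)` both yield "High", so content written against agent severity does not need to know which device produced the event.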

Event Categories

Like the logs themselves, different security devices also include a model for describing the characteristics of the events they process. But no two devices or vendors use the same event-characteristic model. To solve this problem, ArcSight has also developed a common model for describing events, which enables you to understand the real significance of a particular event as reported from different devices. This common model also enables you to write device-independent content that can correlate events with normalized characteristics. This model is expressed as event categories, and the SmartConnector assigns them using default criteria, which can be configured during connector setup. Event categories are a series of six criteria that translate the core meaning of an event from the system that generated it into a common format. These six criteria, taken individually or together, are a central tool in ESM’s analysis capability.

[Images: two screenshots from ESM 101]
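To show what device-independent content looks like in practice, here is a hedged Python sketch. The six field names (object, behavior, outcome, technique, device group, significance) are recalled from ArcSight documentation rather than taken from this article, and the category values and function name are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Categories:
    # Six criteria; names recalled from ArcSight docs, values are illustrative.
    obj: str
    behavior: str
    outcome: str
    technique: str
    device_group: str
    significance: str

def is_failed_login(cat):
    """Device-independent test: an authentication attempt that failed."""
    return cat.behavior.startswith("/Authentication") and cat.outcome == "/Failure"

# The same filter matches events from two unrelated (made-up) device types.
unix_event = Categories("/Host/Operating System", "/Authentication/Verify",
                        "/Failure", "", "/Operating System", "/Informational/Warning")
ids_event = Categories("/Host/Application", "/Authentication/Verify",
                       "/Failure", "/Brute Force", "/IDS/Network", "/Suspicious")
```

The filter never mentions a vendor or a log format, which is the property the paragraph above describes.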

Correlation is a four-dimensional process that draws upon the network model, the priority formula, and optionally, Pattern Discovery to discover, infer meaning, prioritize, and act upon events that meet specific conditions. For example, various systems on a network may report the following events:

  • UNIX operating system: multiple failed log-ins
  • IDS: Attempted brute force attack
  • Windows operating systems: multiple failed log-ins

A correlation rule puts these data points together and detects five or more failed log-ins in a one-minute period targeting the same source. Based on these facts, this combination of events is considered an attempted brute force attack. The Windows operating system next reports a successful log-in from the same source. The attempted brute force attack followed by a successful login from the same source elevates the risk that the attack may have been successful. To verify whether an attack was successful, you can analyze the volume of traffic going to the Windows target. In this case, a sudden spike in traffic to this target can verify that a brute force attack was successful. ESM’s correlation tools use statistical analysis, Boolean logic, and aggregation to find events with particular characteristics you specify. Rules can then take automated action to protect your network.
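The brute force example can be sketched as a small sliding-window rule in Python. This is an illustration of the logic described above, not ESM's rule engine; the event tuple layout, function name and default thresholds (five failures in 60 seconds) are assumptions drawn from the example.

```python
from collections import defaultdict, deque

def brute_force_then_success(events, threshold=5, window=60):
    """Flag a successful log-in that follows a burst of failures from the same source.

    events: iterable of (epoch_seconds, source, outcome), sorted by time,
            where outcome is 'failure' or 'success'.
    Returns a list of (time, source) for each suspicious success.
    """
    recent_failures = defaultdict(deque)  # source -> failure times inside the window
    suspected = set()                     # sources that reached the threshold
    alerts = []

    for ts, source, outcome in events:
        if outcome == "failure":
            times = recent_failures[source]
            times.append(ts)
            # Drop failures that have aged out of the window.
            while times and ts - times[0] > window:
                times.popleft()
            if len(times) >= threshold:
                suspected.add(source)
        elif outcome == "success" and source in suspected:
            alerts.append((ts, source))
            suspected.discard(source)
    return alerts

# Made-up events: five failures in 40 seconds, then a success from the same source.
events = [(t, "203.0.113.7", "failure") for t in (0, 10, 20, 30, 40)]
events.append((55, "203.0.113.7", "success"))
alerts = brute_force_then_success(events)
```

A real rule would go on to check the traffic spike mentioned above before raising the priority further; that step is left out of this sketch.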




ArcSight is developing Open and Extensible integration with BigData Analytics Kafka and Hadoop technology;