Threat modeling and automated response based on security big data

As enterprise information security management matures, we have found that classifying digital assets and building association analysis models, in the course of protecting and governing enterprise information and data, has helped enterprises discover new business opportunities, a benefit that enterprise users value highly. Threat modeling and the response and disposal techniques applied to security data have played a major role in this process. There are still shortcomings, however: at this stage most enterprises’ security data processing capabilities cannot unearth the full value of these assets, so enterprises also face many challenges in threat detection and response. In this issue of “Interview with Niuren”, we invited Mr. Shi Zehuan, Technical Director of Logeasy Security Products, to give an in-depth analysis of the technical characteristics and future development of data collection, cleaning, detection, and response.

Shi Zehuan, Technical Director of Logeasy Security Products

1. As enterprise business becomes increasingly digital, enterprise digital assets are growing exponentially, and protecting them has become more and more important. What challenges do you think enterprise users face at this stage in log and data processing and in security detection?

Shi Zehuan: In practice, companies generally purchase security products from different vendors to build their own security defense system, which means many different types of data need to be collected. Depending on threat scenarios and other factors, different types of data (such as security device, network device, system log, application log, and traffic data) are collected and then normalized. During normalization, if the platform lacks built-in parsing rules and flexible parsing capabilities, each data source has to be parsed one by one. This consumes a great deal of implementation time and leaves too little effort for analyzing and delivering the upper-layer application scenarios.

Based on Logeasy’s practical experience, I believe log processing and security detection form one end-to-end process of unified threat management, covering full data collection, data cleaning, data storage and query, data analysis, and threat modeling. It faces three main challenges:

The first is a lack of basic log processing capabilities, such as data collection and cleaning, in the user environment. This directly affects the construction of security threat detection models. In many environments we have seen that, after logs are brought under unified management, the actual results are unremarkable and the system ends up as little more than a log storage platform. Security data assets are not effectively mined and used, and their deeper value goes unrealized. In-depth management and effective use of security data assets is therefore a current challenge for unified log management.

The second is too much alarm noise. As enterprise security management and defense systems develop, even some small and medium-sized enterprises generate tens of thousands of alarms from their security devices and systems every day, and it is impractical to analyze and respond to them one by one. At the same time, real attack behavior has to be extracted from the noise (false positives generated by security devices). Matching alarms against contextual information (asset information, vulnerability information, etc.) to improve alarm accuracy is an effective way to address this challenge.

The third is a shortage of security personnel. As enterprise security management and defense systems develop, the workload of most enterprise security staff keeps growing. Equipment maintenance, daily, weekly, and monthly reports, security compliance management, project management, and similar tasks already consume most of the available manpower, leaving little for routine security operations (such as threat detection, threat analysis, and threat response). This is one reason the MDR market is gradually heating up. Automated response is also a way for enterprises to handle some of the repetitive security tasks mentioned above.

We have built a parsing rule base into the SIEM security big data analysis platform. It supports data preprocessing for mainstream devices and systems at home and abroad, and supports a variety of parsing methods (such as regular expression parsing, selection parsing, KV parsing, XML parsing, JSON parsing, data desensitization, custom rule parsing, and field completion), which greatly improves the delivery efficiency of data cleaning. We have also defined a set of data standards covering different types and brands of security-related data, so that the data is standardized in a uniform way.
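As an illustration of the parsing methods listed above, the following is a minimal Python sketch of regular expression parsing, KV parsing, JSON parsing, data desensitization, and field normalization. It is not Logeasy’s implementation: the log formats, the regular expression, the field names, and the `FIELD_MAP` mapping are all assumptions made for the example.

```python
import json
import re

# Hypothetical pattern for a syslog-style firewall line; real device formats differ.
FIREWALL_RE = re.compile(
    r"(?P<timestamp>\S+) (?P<device>\S+) action=(?P<action>\w+) "
    r"src=(?P<src_ip>[\d.]+) dst=(?P<dst_ip>[\d.]+)"
)

def parse_regex(line):
    """Regular expression parsing: named groups become fields."""
    match = FIREWALL_RE.match(line)
    return match.groupdict() if match else {}

def parse_kv(line):
    """KV parsing: split whitespace-separated 'key=value' tokens."""
    return dict(token.split("=", 1) for token in line.split() if "=" in token)

def parse_json(line):
    """JSON parsing: the log line is already a JSON object."""
    return json.loads(line)

def desensitize(record, fields=("user",)):
    """Data desensitization: mask all but the first character of string fields."""
    masked = dict(record)
    for name in fields:
        value = masked.get(name)
        if isinstance(value, str) and value:
            masked[name] = value[0] + "*" * (len(value) - 1)
    return masked

def normalize(record, mapping):
    """Rename vendor-specific field names to one shared schema."""
    return {mapping.get(key, key): value for key, value in record.items()}

# Hypothetical vendor-to-standard field mapping.
FIELD_MAP = {"src": "src_ip", "dst": "dst_ip", "sourceAddress": "src_ip"}

kv_event = normalize(parse_kv("action=deny src=10.0.0.5 dst=10.0.0.9 user=alice"), FIELD_MAP)
json_event = normalize(parse_json('{"sourceAddress": "10.0.0.5", "user": "bob"}'), FIELD_MAP)
regex_event = parse_regex("2024-01-01T00:00:00Z fw01 action=deny src=10.0.0.5 dst=10.0.0.9")

print(desensitize(kv_event))
print(desensitize(json_event))
print(regex_event)
```

After normalization, events from different sources share the same `src_ip` field, so one detection rule can be applied to all of them.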

Logeasy SIEM Security Big Data Analysis Platform Logical Topology

2. In response to the problems above, what SIEM security big data analysis platform solutions are currently available in the industry, and what are their features?

Shi Zehuan: Take Logeasy’s security analysis platform as an example. It is a threat detection, response, and analysis platform built on Beaver, a self-developed high-performance search engine. Beaver meets enterprise users’ basic needs for security big data search and security threat modeling, and the platform provides capabilities such as security posture, threat disposal, investigation and analysis, asset management, vulnerability management, rule management, task management, and intelligence management. Using both long-term historical data and real-time data, it detects, analyzes, and responds to internal and external threats to the enterprise, and its automation capabilities help users shorten the time to discover and respond to threats and improve security operation efficiency.

There is a term in the industry called “threat hunting”: security personnel form a hypothesis and then actively analyze and verify security data around it. The starting point for investigation is generally an alarm or abnormal event, such as a change in user rights (suspected privilege escalation), or an anomaly in some indicator, such as a surge in the number of DNS requests or in the entropy of DNS subdomain fields. Logeasy’s security analysis platform, built on the self-developed search engine Beaver, hunts for certain types of threats flexibly and quickly through SPL (Search Processing Language). SPL is a processing language developed specifically for searching and analyzing unstructured data such as logs. It implements hundreds of functions and commands, covering the needs of daily security analysis work, and it connects to a variety of machine learning algorithms to support anomaly detection in security scenarios.
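The DNS subdomain entropy indicator mentioned above can be sketched in a few lines. The example below is written in Python rather than SPL, because the source does not show SPL syntax. It computes the Shannon entropy of the subdomain part of each query and flags values above a threshold. The threshold of 3.5 bits per character and the sample domains are assumptions for illustration only and would need tuning on real data.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def subdomain_of(fqdn, registered_domain):
    """Return the labels to the left of the registered domain, or '' if none."""
    suffix = "." + registered_domain
    return fqdn[: -len(suffix)] if fqdn.endswith(suffix) else ""

def suspicious_queries(queries, registered_domain, threshold=3.5):
    """Flag queries whose subdomain entropy exceeds an assumed threshold."""
    flagged = []
    for fqdn in queries:
        score = shannon_entropy(subdomain_of(fqdn, registered_domain))
        if score > threshold:
            flagged.append((fqdn, round(score, 2)))
    return flagged

queries = [
    "www.example.com",
    "mail.example.com",
    "a9f3k2x8q7z1m4b6c0d5.example.com",  # random-looking label, as seen in DNS tunneling
]
flagged = suspicious_queries(queries, "example.com")
print(flagged)
```

High entropy alone is not proof of an attack, since content delivery and telemetry domains can also look random. In a hunt it serves as a starting hypothesis to verify against other data.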

Whether the attack involves a perimeter breach, lateral movement within the intranet, or another scenario, the relevant security data can be analyzed and processed with different SPL functions to find possible anomalies. In addition, the Logeasy security analysis platform provides a graph analysis function. By visualizing enterprise users’ security data and related information, such as asset and vulnerability information, as an attack relationship graph, it can surface entities that may be abnormal. These entities may be an IP, a host, a user, a domain name, and so on. Further investigation of these entities then reveals threat alerts, abnormal events, and the correlations among them, which supports the exploration, investigation, and analysis of security threats and risks. The distinguishing feature of Logeasy is therefore that it is based on the self-developed search engine Beaver, provides users with flexible security analysis and threat modeling capabilities through SPL and graph analysis, and correlates security data across different dimensions (security device alarms, traffic, host logs, application logs, intelligence information, asset information, vulnerability information, etc.) in order to uncover possible abnormal events in the enterprise network and trace back the attack chain.
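To make the idea of an attack relationship graph concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not the platform’s graph engine: the entity naming scheme (`ip:`, `host:`, `user:`, `domain:`), the sample edges, and the "at least three neighbors" rule are assumptions.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (entity, entity) pairs."""
    graph = defaultdict(set)
    for left, right in edges:
        graph[left].add(right)
        graph[right].add(left)
    return graph

def high_degree_entities(graph, min_neighbors=3):
    """Entities linked to at least min_neighbors others, most connected first."""
    hits = [(node, len(peers)) for node, peers in graph.items() if len(peers) >= min_neighbors]
    return sorted(hits, key=lambda item: (-item[1], item[0]))

def connected_component(graph, start):
    """All entities reachable from start: a rough view of one attack chain."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for peer in graph.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen

# Hypothetical relationships extracted from alerts, logs, and asset data.
edges = [
    ("ip:203.0.113.7", "host:web01"),
    ("ip:203.0.113.7", "host:web02"),
    ("ip:203.0.113.7", "domain:bad.example"),
    ("host:web01", "user:svc_backup"),
    ("host:db01", "user:dba"),
]
graph = build_graph(edges)
suspects = high_degree_entities(graph)
chain = connected_component(graph, "ip:203.0.113.7")
print(suspects)
print(sorted(chain))
```

Starting from the most connected entity and expanding to everything reachable from it mirrors the investigation flow described above: find a suspicious entity, then follow its relationships.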

3. What analysis and threat detection rules do the security analysis platform products on the market rely on?

Shi Zehuan: First, we group data types into two dimensions. One is the network dimension, such as data from firewalls, WAFs, and other network and security devices, as well as traffic data for protocols such as HTTP, DNS, TLS, SMB, and DHCP. The other is the endpoint dimension, such as host system logs (Linux, Windows, AIX, etc.) and HIDS/EDR data. Different rules can be generated for specific security scenarios and different data sources. The threat detection rule base of our security analysis platform is built mainly on a library of more than 1,000 rule scenarios, and it is continuously updated and iterated based on the external situation, project practice, and security research.

At present there are two main approaches in the market: blacklist detection and whitelist detection. Blacklist detection generally implements rule scenarios through aggregation, statistics, and correlation analysis (feature matching, intelligence correlation, time series correlation, etc.). For example, a security device (such as a WAF or IPS) reports an attack source: an unknown attacker uses one IP address to launch multiple different types of exploit attempts, and the alarms all show the attempts as unsuccessful. If the attacked object (asset) then shows abnormal behavior within a certain period of time (the length of this period needs to be determined by measurement), such as a new account appearing or the permissions of an existing account changing, then from the attacker’s perspective a WAF bypass or IPS bypass may have occurred (a low-probability event). The two events (the multiple exploit attempts and the abnormal account behavior) are then correlated. This can be configured as an association rule, and an alarm triggered by it deserves closer attention. This security scenario is therefore one threat detection rule.
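The association rule in this example can be sketched as follows in Python. This is a simplified illustration under stated assumptions, not a rule taken from the platform: the event field names, the minimum of three distinct attack types, and the 24-hour window are hypothetical values that would have to be measured and tuned in a real environment.

```python
from datetime import datetime, timedelta

def correlate(attack_alerts, account_events, min_attack_types=3, window=timedelta(hours=24)):
    """Link assets hit by several exploit types to later account anomalies on the same asset."""
    # Group exploit alerts by (attacking IP, targeted asset).
    grouped = {}
    for alert in attack_alerts:
        grouped.setdefault((alert["src_ip"], alert["asset"]), []).append(alert)

    findings = []
    for (src_ip, asset), alerts in grouped.items():
        attack_types = {a["attack_type"] for a in alerts}
        if len(attack_types) < min_attack_types:
            continue  # Not enough variety to suggest a determined attacker.
        first_attack = min(a["time"] for a in alerts)
        last_attack = max(a["time"] for a in alerts)
        for event in account_events:
            in_window = first_attack <= event["time"] <= last_attack + window
            if event["asset"] == asset and in_window:
                findings.append({
                    "src_ip": src_ip,
                    "asset": asset,
                    "anomaly": event["type"],
                    "attack_types": sorted(attack_types),
                })
    return findings

t0 = datetime(2024, 1, 1, 10, 0)
attack_alerts = [
    {"src_ip": "203.0.113.7", "asset": "web01", "attack_type": "sql_injection", "time": t0},
    {"src_ip": "203.0.113.7", "asset": "web01", "attack_type": "xss", "time": t0 + timedelta(minutes=5)},
    {"src_ip": "203.0.113.7", "asset": "web01", "attack_type": "path_traversal", "time": t0 + timedelta(minutes=10)},
    {"src_ip": "198.51.100.2", "asset": "web02", "attack_type": "sql_injection", "time": t0},
]
account_events = [
    {"asset": "web01", "type": "new_account", "time": t0 + timedelta(hours=2)},
    {"asset": "web02", "type": "permission_change", "time": t0 + timedelta(hours=1)},
]
findings = correlate(attack_alerts, account_events)
print(findings)
```

Only `web01` produces a finding here: `web02` saw a single attack type, so its permission change is not linked to the exploit attempts by this rule.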

The other approach, whitelist detection, generally takes the form of anomaly detection. In enterprises and institutions, most events (at the system layer and at the network layer) are normal, and abnormal events are generally low-probability events. We need to find these low-probability events, such as uncommon commands being executed, uncommon parent-child process relationships, processes that appear for the first time, accounts that appear for the first time, and dormant accounts (for example, with no login activity for 30 days) that become active again. It is therefore necessary to build a normal baseline from historical data and then compare real-time data against it to discover abnormal behavior.
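A baseline comparison of this kind can be sketched in Python as below. The sketch covers three of the cases named above (first-seen process, first-seen account, dormant account becoming active). The event structure and field names are assumptions, and the 30-day dormancy period is taken from the example in the text.

```python
from datetime import datetime, timedelta

def build_baseline(history):
    """From historical events, record known processes per host and last login per account."""
    processes = {}
    last_login = {}
    for event in history:
        if event["kind"] == "process":
            processes.setdefault(event["host"], set()).add(event["name"])
        elif event["kind"] == "login":
            previous = last_login.get(event["account"])
            if previous is None or event["time"] > previous:
                last_login[event["account"]] = event["time"]
    return processes, last_login

def detect(event, processes, last_login, dormant_after=timedelta(days=30)):
    """Compare one real-time event with the baseline and return anomaly labels."""
    anomalies = []
    if event["kind"] == "process":
        if event["name"] not in processes.get(event["host"], set()):
            anomalies.append("first_seen_process")
    elif event["kind"] == "login":
        previous = last_login.get(event["account"])
        if previous is None:
            anomalies.append("first_seen_account")
        elif event["time"] - previous > dormant_after:
            anomalies.append("dormant_account_login")
    return anomalies

now = datetime(2024, 3, 1, 9, 0)
history = [
    {"kind": "process", "host": "web01", "name": "nginx"},
    {"kind": "login", "account": "alice", "time": now - timedelta(days=45)},
    {"kind": "login", "account": "bob", "time": now - timedelta(days=1)},
]
processes, last_login = build_baseline(history)

print(detect({"kind": "process", "host": "web01", "name": "nc"}, processes, last_login))
print(detect({"kind": "login", "account": "alice", "time": now}, processes, last_login))
print(detect({"kind": "login", "account": "carol", "time": now}, processes, last_login))
```

In practice the baseline would be rebuilt or updated continuously, so that a process seen once and confirmed as legitimate does not keep raising alarms.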

4. Based on your own practical experience, how should automated response and manual response work together in security operations? What is the current split between them in enterprises?

Shi Zehuan: Based on our research and practice, we believe the prerequisite for automated response is alarm accuracy. The threat detection model must be able to output accurate analysis results before these alarms are handed to the automated response platform (or module) for handling. If the false alarm rate is high and the alarms are very noisy, automated response is meaningless and will instead disrupt the business. SIEM is therefore the prerequisite for implementing SOAR.

What does the current process for automated response to security incidents look like? For example, when the platform detects a web attack event at the perimeter (a simple scenario: one source address launches multiple SQL injection attempts, or launches several types of attack vectors), it can automatically run an intelligence query on the source address and use the result to determine whether the attacking IP has been tagged as malicious. If it is tagged as malicious and is already in the platform’s ban list, the system ends the response process. If it is not in the ban list, the system further determines whether the IP address is appearing for the first time or has appeared many times before, and automatically links with the perimeter security devices to ban the IP address for a length of time that depends on how often it has appeared. This is a common automated response process.
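The decision flow described above can be sketched as a small Python function. It is a simplified illustration, not a vendor playbook. The ban durations in `ban_hours`, the in-memory intelligence table, and the `block` callback standing in for the perimeter device interface are assumptions. The source also does not say what happens when the IP is not tagged as malicious, so handing it to manual review is an assumption as well.

```python
def ban_hours(times_seen):
    """Assumed escalation policy: repeat offenders are banned for longer."""
    if times_seen <= 1:
        return 1
    if times_seen < 5:
        return 24
    return 168

def respond(src_ip, is_malicious, ban_list, seen_counts, block):
    """Run the ban decision for one attacking IP and report what was done."""
    if not is_malicious(src_ip):
        return "manual_review"       # Assumption: untagged IPs go to an analyst.
    if src_ip in ban_list:
        return "already_banned"      # Already blocked: end the response process.
    seen_counts[src_ip] = seen_counts.get(src_ip, 0) + 1
    hours = ban_hours(seen_counts[src_ip])
    block(src_ip, hours)             # Link with the perimeter security device.
    ban_list.add(src_ip)
    return f"banned_{hours}h"

# Hypothetical stand-ins for the intelligence service and the firewall interface.
INTEL = {"203.0.113.7": True, "192.0.2.10": False}
calls = []

def lookup(ip):
    return INTEL.get(ip, False)

def block(ip, hours):
    calls.append((ip, hours))

ban_list, seen_counts = set(), {}
first = respond("203.0.113.7", lookup, ban_list, seen_counts, block)
second = respond("203.0.113.7", lookup, ban_list, seen_counts, block)
third = respond("192.0.2.10", lookup, ban_list, seen_counts, block)
print(first, second, third)
```

Expiry of bans is not modeled here. A real deployment would also remove the IP from the ban list when the ban ends, so that a returning attacker is counted again and banned for longer.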

Manual response mainly refers to handling security events or suspicious clues that are not covered by the automated security knowledge base (that is, for which there is no corresponding Playbook). It also includes analysis work (similar to the threat hunting mentioned above), because it is a process of analyzing various problems and making decisions for different security scenarios. From our point of view, automated response requires manual verification in the early stage, to judge whether a certain type of security incident can be handled by a fixed automated analysis and response process. The process also needs to be reviewed across the departments of the enterprise (for example, led by the security department, with business-related and network-related departments taking part). Once there is no objection, an automated response process can be established.

In terms of cooperation between automated and manual response, it is therefore difficult to automate the response to all security incidents. Automation is derived from manual analysis and response, which will always remain important. As for the specific split, provided that the relevant security operation technologies and process systems are in place, we believe 80% of security incidents should be handled by automated response processes, with people focusing on in-depth correlation analysis and response for the remaining 20%.

5. How do mainstream security analysis platform products on the market implement security orchestration and automated response (SOAR), and what technical routes do they follow?

Shi Zehuan: The overseas SOAR market is more mature than the domestic one. At present we see two main technical routes. One is oriented toward Case Management, represented by Splunk Phantom. The other integrates the ChatOps concept, from which features such as the war room are derived, and is represented by Demisto (acquired by Palo Alto Networks and renamed Cortex XSOAR). Their ultimate purpose is the same: to reduce the handling time of security incidents and improve response efficiency.

The Case Management approach is driven by the Event and the Case (formed from one or more events), and automates the response across the whole process through a predefined Playbook (script). The implementation and concept of this route are very clear, and it is close to a decision-making approach. To achieve SOAR capabilities it therefore needs visual process orchestration (quickly defining playbooks by drag and drop), componentization (application management), and task management.

In terms of architecture, the first level is the Playbook, which contains the decision steps of the process (basic capabilities such as filtering, judgment, formatting, and manual review) and application components (such as an interface to a certain type of security device, or a custom API). The second level is the application (App), which integrates all the interfaces of one product and makes them available for selection and invocation in the Playbook. The third level is the action (Action), which corresponds to a specific interface, such as an intelligence query interface or an IP query interface. The fourth level is the asset: if an enterprise has deployed 10 firewalls, for example, these are 10 assets, and when orchestrating a Playbook it is necessary to define which asset to link with. The fifth level is the user: when the system links with an asset, it needs an account on the security device that has the corresponding response permissions.

During linkage there are generally two types of actions: “read” actions and “write” actions. A “read” action obtains information from a security device or other third-party system through its interface. A “write” action adds, updates, or deletes policies on a security device or other third-party system through its interface, for example writing an IP address into a firewall’s blacklist to block a malicious IP. Users are used to control the permissions for these actions.
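The five levels (Playbook, application, action, asset, user) and the read/write distinction can be modeled with a few Python dataclasses. This is an illustrative data model only, not the schema of Splunk Phantom or any other product. All class names, the in-memory firewall blacklist, and the `can_write` flag are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Set

class ActionKind(Enum):
    READ = "read"    # Obtain information from a device.
    WRITE = "write"  # Add, update, or delete policy on a device.

@dataclass
class User:
    name: str
    can_write: bool  # Whether this device account may change policy.

@dataclass
class Asset:
    name: str
    user: User
    blacklist: Set[str] = field(default_factory=set)  # Stand-in for device state.

@dataclass
class Action:
    name: str
    kind: ActionKind
    run: Callable[[Asset, str], bool]

@dataclass
class App:
    name: str
    actions: Dict[str, Action]

    def call(self, action_name, asset, argument):
        """Invoke one action on one asset, enforcing the user's permissions."""
        action = self.actions[action_name]
        if action.kind is ActionKind.WRITE and not asset.user.can_write:
            raise PermissionError(f"{asset.user.name} may not write to {asset.name}")
        return action.run(asset, argument)

@dataclass
class Playbook:
    name: str
    steps: List[Callable[[dict], None]]

    def execute(self, context):
        for step in self.steps:
            step(context)
        return context

def is_blocked(asset, ip):
    return ip in asset.blacklist

def block_ip(asset, ip):
    asset.blacklist.add(ip)
    return True

firewall_app = App("firewall", {
    "is_blocked": Action("is_blocked", ActionKind.READ, is_blocked),
    "block_ip": Action("block_ip", ActionKind.WRITE, block_ip),
})
fw01 = Asset("fw01", User("soar_rw", can_write=True))
fw02 = Asset("fw02", User("soar_ro", can_write=False))

def check_step(ctx):
    ctx["blocked"] = firewall_app.call("is_blocked", ctx["asset"], ctx["ip"])

def block_step(ctx):
    if not ctx["blocked"]:
        ctx["blocked"] = firewall_app.call("block_ip", ctx["asset"], ctx["ip"])

playbook = Playbook("block_malicious_ip", [check_step, block_step])
result = playbook.execute({"asset": fw01, "ip": "203.0.113.7"})
print(result["blocked"], fw01.blacklist)
```

Running the same playbook against `fw02` raises `PermissionError`, because its account is read-only. That is the role the user level plays in controlling which assets a playbook may change.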

The second route is essentially the same as the first, but it integrates the ChatOps concept. After a security incident occurs, responding to it requires closer cooperation between departments or between different people, as well as more intelligent recommendation of appropriate disposal actions. This route extends that logic and concept to SOAR. In effect, it draws on the historical experience of offensive and defensive confrontation, and achieves more intelligent automated response by strengthening active invocation and in-depth cooperation among the products and modules of the security system.

6. What do you think will be the development trend of security response and disposal in the future? What new development features will there be?

Shi Zehuan: From our research and practice, security response will continue to develop toward automation and intelligence. Application scenarios will also become richer, going beyond a series of analysis and judgment steps that end in banning an IP or locking an account. As security orchestration and automated response develop, they will free security personnel from repetitive security operations (such as threat management), so that they can devote more energy to security analysis. This makes it possible to discover latent or more dangerous security risks, which are often more harmful to enterprises. We therefore also believe that the exploration of threat hunting scenarios will go deeper.

At the same time, automated responses can also be smarter. For example, after triggering an alarm, the platform can conduct a comprehensive evaluation based on past cases of the same type, and recommend an appropriate disposal strategy and solution for the user. Therefore, in our opinion, automation and intelligence are the two development characteristics of future security response and disposal.
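One very simple way to picture recommendation from past cases is a frequency count, sketched below in Python. This is only an illustration of the idea. The case records, the `effective` flag, and the "most frequent effective disposal" rule are assumptions, and a real platform would likely use richer similarity measures.

```python
from collections import Counter

def recommend(alert_type, past_cases):
    """Suggest the disposal most often marked effective for past cases of the same type."""
    effective = [
        case["disposal"]
        for case in past_cases
        if case["alert_type"] == alert_type and case["effective"]
    ]
    if not effective:
        return None  # No precedent: leave the decision to an analyst.
    return Counter(effective).most_common(1)[0][0]

# Hypothetical case history.
past_cases = [
    {"alert_type": "sql_injection", "disposal": "ban_ip", "effective": True},
    {"alert_type": "sql_injection", "disposal": "ban_ip", "effective": True},
    {"alert_type": "sql_injection", "disposal": "patch_app", "effective": True},
    {"alert_type": "sql_injection", "disposal": "lock_account", "effective": False},
    {"alert_type": "brute_force", "disposal": "lock_account", "effective": True},
]
print(recommend("sql_injection", past_cases))
print(recommend("phishing", past_cases))
```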

Safety Cow Review

To combat advanced and sophisticated threats such as adversarial machine learning, enterprises need to adopt more advanced solutions. Logeasy has long been committed to research on the intelligent processing of data and log information, and meets enterprise users’ basic needs for massive data search and security threat modeling through its self-developed high-performance search engine and the SPL language. By using big data and artificial intelligence methods to intelligently detect, and automatically respond to, threats hidden in log information, it achieves faster security threat detection and response, improves the efficiency of enterprise security operations teams, and provides an efficient solution for enterprise security.
