Automatic Extraction of Threat Actions from Cyber Threat Intelligence Sources
Type: Master CS
Period: 2018 - 2019
If you are interested, please contact:
Dr. Andreas Peter (SCS)
Dr. Christin Seifert (DMB)
State-of-the-art cyber threat detection mechanisms rely largely on cyber threat intelligence collected from diverse sources, which results in very heterogeneous data known as “cyber threat reports” (CTRs). A CTR can be any (electronic) textual description produced to inform about a cyber threat, its characteristics, and its targets. The ability of security experts to transform such CTRs into effective detection checks is a key process, measured by Key Performance Indicators (KPIs) such as speed, precision, and expressiveness. This Master's thesis project investigates how to improve and (semi-)automate this process and, eventually, enhance the resulting detection checks.
CTRs available today can be grouped into two main families depending on their degree of comprehensiveness. On the one hand, reports describing newly discovered threats that are still active in the wild usually include brief overviews and little information about available indicators. On the other hand, reports about well-known threats, generally published months after first discovery and as the result of a forensic analysis, include in-depth descriptions of their technical functioning as well as information on targets and motivations. The Master's thesis project will primarily focus on the former category of reports and, therefore, on processing brief textual descriptions and identifying any valuable security-related information. Examples of reports of this kind are freely available through threat intelligence sharing programs such as IBM X-Force (https://exchange.xforce.ibmcloud.com) and AlienVault (https://otx.alienvault.com).
Starting from a single CTR, the main research goal is to automatically process all available text (as well as any additional text from referenced links and documents) to select and extract information related to at least one of the following concepts:
- Indicators of compromise (IoCs) such as IP addresses, file hashes, user names, or any other information describing artifacts that may be observed in networks or hosts being compromised by the described threat
- Attack tactics and techniques used by the described threat to fulfill its goal, in relation to the ATT&CK framework provided by MITRE at https://attack.mitre.org
- Attack steps performed by the described threat in relation to a kill chain of choice (e.g., Lockheed Martin’s Cyber Kill Chain)
- Possible countermeasures against the described threat that may be included in the report itself or linked to the related tactics and techniques (e.g., as for the “actions” provided along with tactics and techniques of the ATT&CK framework)
To achieve this goal, techniques such as machine learning (e.g., supervised and unsupervised methods) and natural language processing will be used. The techniques of choice will be embedded within a proof-of-concept tool whose capabilities are expected to meet the following requirements:
- Ability to ingest the textual data of a given CTR
- Ability to process the available data and identify information related to the four concepts mentioned above
- Ability to produce machine-readable outputs describing all extracted information
The proof-of-concept tool will be built on top of any suitable open-source technology and tested with available data coming from free-of-charge threat intelligence data sources as mentioned above.
Reference (first entry into the topic)
G. Husari, E. Al-Shaer, M. Ahmed, B. Chu, X. Niu: TTPDrill: Automatic and Accurate Extraction of Threat Actions from Unstructured Text of CTI Sources. ACSAC 2017, pp. 103-115, ACM.
Contact
The project will be co-supervised by Dr. Andreas Peter (SCS; email@example.com) and Dr. Christin Seifert (DMB; firstname.lastname@example.org).
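Illustrative sketch
The three tool requirements listed above (ingesting CTR text, identifying concept-related information, and producing machine-readable output) can be illustrated with a very small Python sketch for the simplest of the four concepts, indicators of compromise. This is not the method of the thesis project or of TTPDrill: the regular expressions, the function names extract_iocs and to_machine_readable, the JSON layout, and the sample report text are all assumptions made for illustration. A real tool would need far more robust patterns (e.g., defanged addresses, domains, other hash types) and learned models for the remaining concepts.

```python
import json
import re

# Illustrative patterns only (an assumption, not part of the project description).
# They cover plain IPv4 addresses and MD5/SHA-256 hex digests, nothing else.
IOC_PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}


def extract_iocs(report_text):
    """Return a dict mapping each IoC type to a sorted list of unique matches."""
    return {
        name: sorted(set(pattern.findall(report_text)))
        for name, pattern in IOC_PATTERNS.items()
    }


def to_machine_readable(report_id, report_text):
    """Serialize the extracted IoCs as JSON (hypothetical output layout)."""
    return json.dumps(
        {"report_id": report_id, "iocs": extract_iocs(report_text)},
        indent=2,
    )


if __name__ == "__main__":
    # Made-up report text; the IPs are from documentation-reserved ranges.
    sample = (
        "The dropper contacts 192.0.2.10 and 198.51.100.7; "
        "payload MD5 is d41d8cd98f00b204e9800998ecf8427e."
    )
    print(to_machine_readable("example-ctr-1", sample))
```

Pattern matching of this kind only addresses well-structured indicators. Mapping free text to ATT&CK techniques, kill chain steps, or countermeasures is where the machine learning and natural language processing techniques named in the project description would come in.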