One of the latest CS-AWARE deliverables identified external information sources that can help analyse cybersecurity incidents at local public administrations (LPAs) and raise cybersecurity awareness. The main task was to find a way to assess the relevance and quality of those sources (and of additional sources that have emerged over the past year), and to create a shortlist of sources relevant to continuous data collection and analysis in CS-AWARE. The sources analysed and ranked with the quality indicators were primarily threat intelligence platforms that provide information on cyber threats online: a combination of cyber intelligence sources and information sharing tools, cyber intelligence data feeds, and vulnerability data.

After the initial analysis of external information sources, it became apparent that many providers are active, especially in the field of threat intelligence, and that a shortlist of high-quality information sources applicable to the CS-AWARE context is necessary: the body of knowledge shared by the different providers is not expected to differ to such an extent that collection from all sources is required. Analysis of related work has shown that only limited work has been done on providing a reliable scoring system, ideally based on quantitative criteria, and that no readily available metric or scoring system suits our needs. An interesting approach is presented in (Meier, Scherrer, Gugelmann, Lenders, & Vanbever, 2018), where a ranking algorithm similar to Google's PageRank is proposed to assess cyber threat intelligence feeds based on the criteria “completeness of information”, “accuracy of information” and “speed”. However, to the best of our knowledge, such a system is not yet available for public use. Another interesting approach is presented in (Hanson, 2015), where the Admiralty Code (“Validity of Claim”, “Reliability of Source”) is used in an automated system to evaluate information. Again, to the best of our knowledge, such an approach has not been applied to assess the quality of threat intelligence, and we thus cannot rely on such results. Most other relevant related work is concerned with qualitative indicators applicable to specific contexts, none of which fulfilled the requirements of CS-AWARE.

Based on the findings in related work, and given that no readily available metric or scoring system exists for our context, it was decided to define a set of indicators/metrics to assess, on a qualitative level, the quality of the relevant information sources. We identified six main indicators, each split into sub-categories, which are described in detail in Table 1.

Table 1: Indicators used to assess the quality of information sources
Indicator Explanation
1 Quality of Data An indicator to assess the expected quality of data from an information source. It was decided to assess the quality based on the complexity of information that is shared by a source, according to state-of-the-art threat intelligence concepts.
1.1 Indicators An indicator is a collection of cyber security relevant information containing patterns that can be used to detect suspicious or malicious cyber activity.
1.2 Sightings A sighting is an observation that someone has shared with the community, without adding additional intelligence to it.
1.3 Courses of Action A course of action is information concerning how to prevent or mitigate an event shared by e.g. an indicator.
1.4 Vulnerabilities A vulnerability is a weakness in a hardware or software appliance that could be exploited to breach the appliance.
2 Provider Classification An indicator to assess the type of sharing an information sharing provider does, and how much original information can be expected from this provider.
2.1 Data Feed Provider A data feed provider is the entity that produces cyber security information, or shares received information with minimal or no additional intelligence added to it.
2.1.1 Provides Original Data A data feed provider that is the original provider of this information, which has been shared in one form or another by the source of e.g. an incident.
2.1.2 Provides Aggregated Data A feed provider that aggregates data originating from various sources.
2.2 Intelligence Platform A provider that adds intelligence/analysis to the information that is shared with the provider in one form or another.
2.3 Report Provider A provider that provides e.g. statistical information in form of reports rather than data feeds.
3 Licensing Options An indicator to assess the licensing of data use and/or access to the data source API.
3.1 Open (Publicly available) The data is freely available to collect and use.
3.2 Restricted use Some restrictions apply as to how the data can be used (e.g. academic or commercial context).
3.3 Commercial The data provider has commercial interest and provides the data for a fee.
3.4 Information Reuse Specifies how the data provided by a data source can be reused. Options include commercial, academic or personal use.
3.4.1 Commercial use allowed The data can be reused in a commercial context; it is allowed to offer services and collect fees for services based on this data.
3.4.2 Academic use allowed The data can be used in an academic context without restrictions; restrictions apply in other contexts.
3.4.3 Personal use allowed The data can be used for personal use without restrictions; restrictions apply in other contexts.
4 Interoperability/Standards An indicator to assess the interoperability of a tool provider with state-of-the-art cybersecurity threat exchange standards and relevant tools/libraries.
4.1 STIX1 Supports the STIX1 threat expression standard.
4.2 STIX2 Supports the STIX2 threat expression standard.
4.3 TAXII Supports the TAXII threat exchange protocol standard.
4.4 OpenIOC Supports the OpenIOC cybersecurity artefact description standard.
4.5 RSS Supports the RSS feed standard.
4.6 JSON Supports the JSON data exchange format.
4.7 CSV Supports the CSV data expression standard.
4.8 Plain Text Supports plain text data expression.
5 Advanced API An indicator to assess if a data source supports or enables relevant advanced API features.
5.1 Filtering based on time Supports filtering based on time for data access. Relevant for data collection to only collect entries since last access.
5.2 Filtering based on content Supports filtering based on content. Relevant for context specific data collection.
6 Context applicable content An indicator to assess if a data source provides data that is in general relevant to the CS-AWARE context.
6.1 Vulnerabilities The data source provides information about vulnerabilities.
6.2 Threats The data source provides information about threats.
6.3 Campaigns The data source provides advanced intelligence about cybersecurity campaigns.
6.4 Hashes The data source provides cybersecurity relevant hashes (e.g. malware hashes).
6.5 Recommendations The information source provides general cybersecurity recommendations.
6.6 Incidents (Sightings) The information source provides observed incidents or sightings without advanced intelligence.
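
Indicators 5.1 and 5.2 describe what a collector needs from a source's API. As a minimal sketch, assuming feed entries are already available as in-memory records, time-based and content-based filtering might look like the following. The `FeedEntry` type, its fields, and the sample values are hypothetical and are not taken from any real source.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FeedEntry:
    """Hypothetical feed record; real sources define their own schemas."""
    published: datetime
    title: str
    tags: frozenset


def entries_since(entries, last_access):
    """Indicator 5.1: keep only entries published after the last collection run."""
    return [e for e in entries if e.published > last_access]


def entries_matching(entries, wanted_tags):
    """Indicator 5.2: keep only entries that share at least one tag of interest."""
    wanted = frozenset(wanted_tags)
    return [e for e in entries if e.tags & wanted]


# Illustrative data only.
SAMPLE = [
    FeedEntry(datetime(2018, 5, 1, tzinfo=timezone.utc), "Old advisory",
              frozenset({"vulnerability"})),
    FeedEntry(datetime(2018, 6, 1, tzinfo=timezone.utc), "New malware hash",
              frozenset({"hash", "malware"})),
    FeedEntry(datetime(2018, 6, 2, tzinfo=timezone.utc), "New advisory",
              frozenset({"vulnerability"})),
]

last_run = datetime(2018, 5, 15, tzinfo=timezone.utc)
fresh = entries_since(SAMPLE, last_run)                 # time filter first
relevant = entries_matching(fresh, {"vulnerability"})   # then content filter
print([e.title for e in relevant])
```

A source that supports both filters on the server side spares the collector from downloading and discarding entries it has already seen or does not need, which is why these features are assessed as indicators.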

The shortlisting was based on assigning an importance value (weight) to each of the indicators in Table 1. A weight of 5 marks the highest priority for CS-AWARE, a weight of 1 the lowest. The resulting shortlist is shown in Table 2: the roughly 100 sources analysed were reduced to a final eleven. The shortlisting produced meaningful results and in many cases confirmed the analysts’ intuition about which threat intelligence sources are most relevant for our context.
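
The weighting step can be illustrated with a minimal sketch. It assumes that each source is assessed as the set of indicators it satisfies and that a source's score is the sum of the weights of those indicators. This aggregation rule, the weights, and the source names below are hypothetical and are not the values used in CS-AWARE.

```python
# Hypothetical weights (1 = lowest priority, 5 = highest priority).
WEIGHTS = {
    "3.1 Open": 5,
    "4.2 STIX2": 5,
    "6.1 Vulnerabilities": 4,
    "5.1 Filtering based on time": 3,
    "4.7 CSV": 1,
}

# Hypothetical assessments: the indicators each source satisfies.
ASSESSMENTS = {
    "source-a": {"3.1 Open", "4.2 STIX2", "6.1 Vulnerabilities"},
    "source-b": {"3.1 Open", "4.7 CSV"},
    "source-c": {"4.2 STIX2", "5.1 Filtering based on time"},
}


def score(satisfied, weights):
    """Sum the weights of all indicators a source satisfies (assumed rule)."""
    return sum(weights[name] for name in satisfied)


def shortlist(assessments, weights, top_n):
    """Rank sources by descending score; ties are broken alphabetically."""
    ranked = sorted(
        assessments,
        key=lambda source: (-score(assessments[source], weights), source),
    )
    return ranked[:top_n]


for name in shortlist(ASSESSMENTS, WEIGHTS, top_n=2):
    print(name, score(ASSESSMENTS[name], WEIGHTS))
```

With these made-up inputs, `source-a` scores 14, `source-c` scores 8 and `source-b` scores 6, so a shortlist of two keeps `source-a` and `source-c`.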

Table 2: Shortlist of information sources
Position Source Name Source URL
1 MISP Platform http://www.misp-project.org/index.html
2 Anomali Staxx https://www.anomali.com/community/staxx
3 HailATaxii http://hailataxii.com
4 Soltra Edge https://soltra.com/en/
5 US-CERT AIS https://www.us-cert.gov/ais
6 CVEDetails https://www.cvedetails.com/
7 Abuse.ch https://abuse.ch/
8 AlienVault OTX https://otx.alienvault.com/
9 Blocklist http://www.blocklist.de
10 NIST NVD https://nvd.nist.gov/
11 Collaborative Research Into Threats (CRITs) https://github.com/crits/crits

University of Vienna
InnoSec

SOURCES:

Meier, R., Scherrer, C., Gugelmann, D., Lenders, V., & Vanbever, L. (2018). FeedRank: A tamper-resistant method for the ranking of cyber threat intelligence feeds. 10th International Conference on Cyber Conflict (CyCon), pp. 321-344.
Hanson, J. M. (2015). The Admiralty Code: A Cognitive Tool for Self-Directed Learning. International Journal of Learning, Teaching and Educational Research, pp. 97-115.