Open source intelligence (OSINT) tools help companies collect, analyze, and apply public data to improve business operations, strengthen security practices, and reduce risk exposure.
But what exactly is “public data”? Where does it cross the line with private information, and how can companies ensure they’re using OSINT ethically? Let’s dive in.
Public Perception: Understanding the Scope of Open Source Data
Open source data is information that can be captured, used, and shared without explicit permission from the content creator or the subject of the data. Common examples include demographic data published on government websites, posts made on social media platforms, open chat forums, or statistics reported by research sites.
But this is just the tip of the iceberg. Consider social media sites. User agreements can change and affect the rights to and ownership of posts, photos, or videos. Lengthy terms and conditions may include clauses stating that once content is posted it becomes the site’s property, which can then be labeled as public data.
There are several restrictions on who may collect certain types of data. This is especially true of the United States Intelligence Community and the Department of Defense, which are subject to Intelligence Oversight rules and regulations. The European Union has stringent data collection regulations focused on marketing cookies and other third-party collection techniques used by a host of companies and governments to collect everything from private demographic data to location details.
Most open-source information can be categorized into two buckets: publicly available information, or PAI, and commercially available information, or CAI. PAI is essentially everything we can see online without a paywall or logging into a website. CAI is any data or information that can be bought openly; this can be marketing data, net-flow internet data, or geospatial imagery. The resulting product, which includes synthesis and analysis of the information, is OSINT.
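The PAI and CAI buckets described above can be expressed as a simple decision rule. The following is a minimal sketch, not a legal test: the `Source` fields, the `SourceCategory` names, and the order of the checks are assumptions made for illustration, and real collection decisions also depend on terms of service and applicable regulation.

```python
from dataclasses import dataclass
from enum import Enum


class SourceCategory(Enum):
    PAI = "publicly available information"
    CAI = "commercially available information"
    OUT_OF_SCOPE = "out of scope"


@dataclass(frozen=True)
class Source:
    # All fields are hypothetical attributes an analyst would fill in by hand.
    name: str
    requires_login: bool
    behind_paywall: bool
    openly_for_sale: bool


def categorize(source: Source) -> SourceCategory:
    """Apply the article's definitions of PAI and CAI to one source."""
    # PAI: visible online without a paywall and without logging in.
    if not source.requires_login and not source.behind_paywall:
        return SourceCategory.PAI
    # CAI: restricted, but anyone can buy access openly.
    if source.openly_for_sale:
        return SourceCategory.CAI
    # Everything else (private accounts, protected data) is not open source.
    return SourceCategory.OUT_OF_SCOPE
```

For example, `categorize(Source("geospatial imagery vendor", True, True, True))` returns `SourceCategory.CAI`, while a source that requires a login and is not sold openly is treated as out of scope.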
Ethical Collection: Exploring OSINT Operations
OSINT focuses on the ethical collection of data — capturing and using information that falls within the public domain. For companies just getting started with OSINT, however, it can be challenging to understand what constitutes ethical operation and what tips the scales into problematic data collection.
Ethical OSINT tools only capture data from public sources, which falls into the PAI category of accessing information. Attempting to obtain data by bypassing paywalls or collecting information under false pretenses creates two possible problems. First, it may put companies on the wrong side of regulatory compliance, especially if they’re capturing and using customers’ personal data. Second, there’s no guarantee this data is accurate. While public data often has a digital “paper” trail, private or secured data may be unconnected to other results, calling its accuracy and authenticity into question. The collection of information must also be auditable and reconstructable. This is essential for law enforcement agencies and law firms conducting investigations that must maintain judicial integrity.
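One way to make collection auditable and reconstructable is to log every item as it is captured and chain the log entries together with hashes. The sketch below is illustrative only and does not describe any particular vendor's implementation; the `CollectionRecord` and `CollectionLog` names and their fields are assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CollectionRecord:
    source_url: str
    collected_at: str     # ISO 8601 timestamp, UTC
    content_sha256: str   # hash of the content exactly as collected
    prev_hash: str        # hash of the previous record ("" for the first)

    def record_hash(self) -> str:
        # Serialize with sorted keys so the hash is stable across runs.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class CollectionLog:
    """Append-only log in which each record commits to the one before it."""

    def __init__(self):
        self.records = []

    def append(self, source_url: str, content: bytes,
               collected_at: Optional[str] = None) -> CollectionRecord:
        record = CollectionRecord(
            source_url=source_url,
            collected_at=collected_at or datetime.now(timezone.utc).isoformat(),
            content_sha256=hashlib.sha256(content).hexdigest(),
            prev_hash=self.records[-1].record_hash() if self.records else "",
        )
        self.records.append(record)
        return record

    def verify(self) -> bool:
        # Recompute each link; an edited earlier record breaks the chain.
        expected_prev = ""
        for record in self.records:
            if record.prev_hash != expected_prev:
                return False
            expected_prev = record.record_hash()
        return True
```

With this design, changing any earlier record makes `verify()` return `False`, because the next record no longer points at a matching hash. Detecting a change to the most recent record would additionally require storing the latest hash somewhere outside the log.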
Ethical collection is restricted to data that has been shared freely by individuals, published by government agencies, or made available by private companies. One good example of ethical data collection is information published by market research websites. While on the surface this data may appear private, it falls under the purview of public data if the terms and conditions signed by users allow companies to sell or share their data with third-party firms.
OSINT is shorthand for the operation of collecting and analyzing public data. OSINT operations don’t have a defined or standardized process. However, a methodical approach to investigations is highly beneficial and recommended, and it can be tailored to fit any organization. Conducting an investigation manually can be extremely time- and resource-intensive. Without a methodical approach and the right OSINT platform, tools, and software, an investigation can produce data errors and duplications, in turn reducing the value of the results.
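As a small illustration of how a methodical step can limit duplication, collected items can be compared on a normalized URL and a fingerprint of their text before analysis. This is a minimal sketch under stated assumptions: the item layout (dictionaries with `url` and `text` keys) and the normalization rules are hypothetical, and real pipelines usually need fuzzier matching.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def text_fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and ignoring case."""
    canonical = " ".join(text.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(items):
    """Keep the first occurrence of each (URL, text) pair, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = (normalize_url(item["url"]), text_fingerprint(item["text"]))
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Two captures of the same post that differ only in letter case, spacing, or a trailing `#comments` fragment collapse to a single entry, while the same text found at a different URL is kept as a separate item.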
ShadowDragon has developed OSINT tools capable of accessing multiple public data sources simultaneously to discover commonalities and report connections.
Along with knowing what OSINT is, it’s also important to understand what it isn’t. Put simply, it isn’t the be-all and end-all. While analysis of public data may put companies or agencies on the right track, OSINT analysis doesn’t always provide a complete answer. Instead, OSINT acts as a jumping-off point for more in-depth questioning. In addition, there are clear limits on what OSINT can capture. Data that is encrypted or password-protected lies outside the purview of ethical open source intelligence.
Effective Application of OSINT
Collecting PAI or using CAI within the confines of ethical use can be constraining in an investigation. ShadowDragon was built with the purpose-driven investigator or analyst in mind, to catch bad people doing bad things. If compromising the ethics of collection puts an investigation at risk, it shouldn’t be done.
Transparency and trust are essential to maintaining ethical integrity. ShadowDragon is an ethical OSINT company, made in the United States, where all of our developers are located. We’ve completed a SOC 2 Type II audit, which assesses a service provider’s security controls and practices under a cybersecurity compliance framework. ShadowDragon has also certified under the EU-U.S. Data Privacy Framework, a program administered by the United States Department of Commerce. We also maintain a ‘Trust Center’ on our website for all the latest privacy agreements and commitments to our customers.