Data Security Posture Management (DSPM) is a combination of practices and tools that aim to protect what often matters most in a company's information security program: the data itself. If you've used Cloud Security Posture Management (CSPM) tools in the past, think of DSPM as a data-centric version of that.
Instead of looking at your entire environment at once, these tools help you focus on the data that matters. In general, most DSPM products are built for cloud environments (IaaS, PaaS) and provide features such as:
- Discovery, cataloging and classification
- Activity monitoring
- Access governance
- Data lineage
- Some SaaS support, for example, for classifying data in Google Workspace
While the rest of this post mentions multiple AWS-centric technologies, most DSPM vendors also support different cloud service providers.
How DSPM works
Most current DSPM vendors offer flexible deployment scenarios, which usually include the ability to deploy the ‘scanners’ in your own cloud environment. This has the clear benefit of reducing the amount of data that must be shared directly with the vendor, and it also reduces cloud networking costs.
These scanners might be virtual machines (EC2), containers, or functions, but the concept remains the same: they inventory data stores using cloud APIs, then connect to them and sample a subset of the data. Some vendors have more advanced algorithms than others for deciding what to scan and how much of it to scan. The idea is that you don’t need to scan an entire petabyte-scale data store to determine that it contains, for example, medical data.
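To make the sampling idea concrete, here is a minimal Python sketch. It does not reflect any vendor's actual algorithm: the `CLASSIFIERS` patterns, the `sample_and_classify` helper, and the toy records are all assumptions made up for illustration, and real products use far richer detection logic than two regular expressions.

```python
import random
import re

# Hypothetical classifiers; real products ship far richer rule sets.
CLASSIFIERS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def sample_and_classify(records, sample_size, seed=0):
    """Scan a random sample of records and return the data types detected."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sample = rng.sample(records, min(sample_size, len(records)))
    found = set()
    for record in sample:
        for label, pattern in CLASSIFIERS.items():
            if pattern.search(record):
                found.add(label)
    return found


# A toy "data store": every tenth record carries an SSN-shaped value.
records = [
    f"order {i} ssn 123-45-{i:04d}" if i % 10 == 0 else f"order {i} shipped"
    for i in range(10_000)
]

# Only 500 of the 10,000 records are read.
detected = sample_and_classify(records, sample_size=500)
print(detected)
```

Because one record in ten is sensitive here, a 5% sample is overwhelmingly likely to surface it. Rarer data needs a larger or smarter sample, which is where vendors differ.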
The results from these scanners are typically sent to a SaaS console, from where you can manage the solution as well as look at the results, which sometimes include redacted examples of the data found.
Activity monitoring and access governance usually compile permissions, roles, and policies to show who or what could have access to data, and use logs such as CloudTrail or data store-specific logs to show who *actually* accessed the data.
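The gap between potential and actual access can be sketched as a simple set difference. The example below is an illustration only: the `unused_access` helper, the store and principal names, and the shape of the log events are hypothetical, and real tools have to resolve much messier policy and log formats first.

```python
def unused_access(granted, access_log):
    """Return, per data store, principals with access who never used it."""
    used = {}
    for event in access_log:
        used.setdefault(event["store"], set()).add(event["principal"])
    return {
        store: sorted(principals - used.get(store, set()))
        for store, principals in granted.items()
    }


# Who *could* read each store, as compiled from roles and policies.
granted = {
    "customers-db": {"app-role", "analyst", "intern"},
    "public-assets": {"app-role"},
}

# Who *actually* did, as seen in access logs.
access_log = [
    {"store": "customers-db", "principal": "app-role"},
    {"store": "customers-db", "principal": "analyst"},
    {"store": "public-assets", "principal": "app-role"},
]

report = unused_access(granted, access_log)
print(report)
```

Principals that show up in the report are candidates for having their access removed.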
Data lineage uses those same logs, sometimes combined with a ‘Shazam-for-data’ type of algorithm that can detect similar data, even when it lives on different platforms. For example, it might be able to tell you a database was backed up in production and restored in a development environment, where it's less protected!
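One simple way to approximate this kind of similarity detection is to hash the values in each store and compare the resulting sets with Jaccard similarity. This is a stand-in technique chosen for illustration, not a description of how any particular vendor does it, and the `fingerprint` and `jaccard` helpers and the sample data are hypothetical.

```python
import hashlib


def fingerprint(values):
    """Hash each value so stores can be compared without sharing raw data."""
    return {hashlib.sha256(str(v).encode("utf-8")).hexdigest() for v in values}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# A production column, a dev copy with some test rows mixed in, and
# an unrelated column from another store.
prod_emails = [f"user{i}@example.com" for i in range(100)]
dev_emails = prod_emails[:90] + [f"test{i}@example.com" for i in range(10)]
unrelated = [f"sku-{i}" for i in range(100)]

restored_score = jaccard(fingerprint(prod_emails), fingerprint(dev_emails))
unrelated_score = jaccard(fingerprint(prod_emails), fingerprint(unrelated))
print(round(restored_score, 2), unrelated_score)
```

A high score between a production store and a development store is a hint that production data was copied there. At real scale, exact set comparison is usually replaced by sampling or sketching techniques.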
Putting it together, the system inventories data stores, scans them to detect sensitive data (usually using built-in rules that can be customized), and finally shows you which data stores don’t comply with your policies. The theory is that if you detect a thousand unencrypted data stores, it is much more efficient to start by fixing those that contain sensitive data rather than those that only contain public data!
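The prioritization step can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `DataStore` record, the `SENSITIVE_TYPES` policy, and the store names are hypothetical, and a real product would evaluate many more policies than encryption alone.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    name: str
    encrypted: bool
    data_types: set = field(default_factory=set)  # labels found by scanning


SENSITIVE_TYPES = {"medical", "us_ssn", "payment_card"}  # hypothetical policy


def prioritized_violations(stores):
    """Return unencrypted stores, those holding sensitive data first."""
    violations = [s for s in stores if not s.encrypted]
    # False sorts before True, so stores with sensitive data come first.
    # sorted() is stable, so the original order is kept within each group.
    return sorted(violations, key=lambda s: not (s.data_types & SENSITIVE_TYPES))


stores = [
    DataStore("marketing-site", encrypted=False, data_types={"public"}),
    DataStore("claims-archive", encrypted=False, data_types={"medical"}),
    DataStore("billing-db", encrypted=True, data_types={"payment_card"}),
    DataStore("hr-exports", encrypted=False, data_types={"us_ssn"}),
]

ordered = [s.name for s in prioritized_violations(stores)]
print(ordered)
```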
Using JupiterOne for DSPM
JupiterOne is not a DSPM, but as a platform that supports a very wide variety of use cases, it can still help you secure your data. On its own, it won't scan data stores to tell you what they contain. However, if your environment already has labels or tags for data classification, you can use JupiterOne to track data stores and their expected policies, alert on violations, and investigate access privileges.
First, ensure you are using the AWS integration in JupiterOne. This will import information about a multitude of data stores into your JupiterOne environment. You can leverage dashboards that allow you to zoom in on your data security, but you can also query ‘DataStore’ objects and filter more specifically.
For example, to find S3 buckets that do not have a classification tag, you could run:
You can then find your AWS accounts with the most S3 data, by running a query such as:
To list S3 buckets that are not tagged as public and that do not use AES256:
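As a rough illustration of what these three lookups check, here is a plain Python sketch over a hypothetical list of bucket records. It is not J1QL and does not reflect JupiterOne's actual schema; the field names (`classification`, `encryption`, `size_gb`, `account`) and the sample values are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical bucket records; field names are illustrative only.
buckets = [
    {"name": "logs", "account": "prod", "size_gb": 900,
     "classification": None, "encryption": "AES256"},
    {"name": "claims", "account": "prod", "size_gb": 400,
     "classification": "confidential", "encryption": None},
    {"name": "website", "account": "web", "size_gb": 50,
     "classification": "public", "encryption": None},
    {"name": "scratch", "account": "dev", "size_gb": 120,
     "classification": "internal", "encryption": "aws:kms"},
]

# 1. Buckets without a classification tag.
untagged = [b["name"] for b in buckets if not b["classification"]]

# 2. Accounts ranked by total S3 data, largest first.
size_by_account = Counter()
for b in buckets:
    size_by_account[b["account"]] += b["size_gb"]
largest_accounts = size_by_account.most_common()

# 3. Buckets not tagged as public that do not use AES256.
not_public_not_aes256 = [
    b["name"] for b in buckets
    if b["classification"] != "public" and b["encryption"] != "AES256"
]

print(untagged, largest_accounts, not_public_not_aes256)
```

Note that the third check is literal: a bucket encrypted with a different scheme, such as the `aws:kms` value in the sample data, is still listed because it does not use AES256.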
JupiterOne also comes with built-in alerts for common data store misconfigurations. If you do use a DSPM tool and it can apply tags to objects directly, you can then use JupiterOne to track that this is being done effectively.
When is DSPM most valuable?
DSPM is typically more valuable for organizations that have a lot of data, but there are other factors at play, such as when:
- You have a lot of data that isn't already classified.
- Deploying new data stores can be done manually, or at least doesn't go through an official code review process with automated detections for classification and configuration.
- The data stores that you use are all compatible with your DSPM vendor.
- You have a large cloud environment that consists of multiple different applications. For example, a large health insurance company with hundreds of AWS or Azure accounts where manual deployments are, or were, possible will likely detect sensitive data they didn't know about much more frequently than a year-old startup that only deploys via Terraform.
- You store a lot of data that is sensitive.
- You have to be compliant with security and privacy regulations and other standards.
What to consider when shopping for DSPM?
There are many factors that are important when shopping for a DSPM solution. If you have determined that you need one, start by asking:
- Do the main features the vendor supports cover those you need? For example, not all vendors offer data lineage, so check for it if you require it.
- Do they support most of the technologies you use to store data? A DSPM tool is useless if it's unable to scan your data stores. Most of them support the common managed data stores of popular cloud service providers, but if you use something a bit more exotic extensively, make sure to check. For example, JupiterOne uses Neptune, something that is not needed by most companies, and therefore supported by few DSPM vendors (Teleskope.ai supports it!).
- Can they run scanners in your cloud environment? That way, you can minimize the data that is sent back to the vendor.
- Does their pricing model work for you? Some charge by data volume in bytes, others by data store count, and those with a SaaS data security component may charge by user. Keep in mind that if you are running the scanners in your own infrastructure, you will also be paying your cloud provider for the infrastructure that performs the scans. How does your DSPM vendor allow you to effectively scan a subset of data to save money while intelligently targeting to improve coverage?
- Does it allow you to create custom policies for data stores?
- Does it allow you to create custom classifiers for data types, or does it have support built-in for everything important that you think you have?
- When performing a proof of concept, be sure to measure false positive rates!
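For the proof-of-concept measurement in the last item, one possible approach is to manually review a set of items and compare the tool's verdict with the ground truth. The sketch below assumes each reviewed item is recorded as a hypothetical `(flagged_by_tool, truly_sensitive)` pair; the `poc_metrics` helper and the counts are made up for illustration.

```python
def poc_metrics(findings):
    """Summarize reviewed items; each is (flagged_by_tool, truly_sensitive)."""
    tp = sum(1 for flagged, truth in findings if flagged and truth)
    fp = sum(1 for flagged, truth in findings if flagged and not truth)
    tn = sum(1 for flagged, truth in findings if not flagged and not truth)
    fn = sum(1 for flagged, truth in findings if not flagged and truth)
    return {
        # Share of non-sensitive items that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        # Share of the tool's alerts that were correct.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of truly sensitive items that the tool caught.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# 100 manually reviewed items from a hypothetical proof of concept.
reviewed = (
    [(True, True)] * 40      # correctly flagged
    + [(True, False)] * 10   # false positives
    + [(False, False)] * 45  # correctly ignored
    + [(False, True)] * 5    # missed sensitive data
)

metrics = poc_metrics(reviewed)
print(metrics)
```

Looking at precision and recall alongside the false positive rate helps avoid picking a tool that is quiet only because it misses sensitive data.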
So, do you need a dedicated DSPM?
Most DSPM tools are recent. Gartner's 2021 Hype Cycle for data security puts DSPM in the early "Innovation Trigger" category. A couple of years have passed since then, and the tools have improved.
Improvements to application networking and encryption have enabled us to use mobile devices on unsafe networks securely, and to assume all networks might be dangerous. Protecting data directly instead of only focusing on the surrounding infrastructure is a concept that is gaining traction.
JupiterOne's graph is a powerful way to improve your data security, even without a DSPM. If your environment requires automated discovery and classification of data, JupiterOne can also be combined with a DSPM for ultimate power.
The Data Security Maturity Model (DSMM) is a new model, published during the RSA Conference this year, which Sounil Yu and I have contributed to over the past few months. It brings a uniquely ‘data-centric’ approach not found in other models. DSPM tools can help us achieve some of the goals defined in the DSMM, and adding JupiterOne to the mix makes that future even more promising.
Let's make 2023 the year of data security together!