Page 110 - Cyber Defense eMagazine April 2023
The first step in protecting data is knowing where it resides, who accesses it, and where it goes. This
seemingly simple process is called data mapping. It involves discovering, assessing, and classifying your
application's data flows.
Data mapping entails using manual, semi-automated, and fully automated tools to survey and list every service, database, storage system, and third-party resource that makes up your data processes and touches data records.
Mapping your application data flows will give you a holistic view of your app's moving parts and help you
understand the relationships between different data components, regardless of storage format, owner,
or location (physical or logical).
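To make the idea concrete, a data map can be modeled as a directed graph linking services to the data stores they touch, with a classification tag on each flow. The sketch below is a minimal illustration in Python; all service and store names are hypothetical.

```python
# Minimal sketch of a data map: a directed graph of services and the
# data stores they send data to, with a sensitivity tag per flow.
# All service/store names below are hypothetical examples.

from collections import defaultdict

class DataMap:
    def __init__(self):
        # (source, destination) -> set of classification tags
        self.flows = defaultdict(set)

    def add_flow(self, source, destination, classification):
        """Record that `source` sends data of `classification` to `destination`."""
        self.flows[(source, destination)].add(classification)

    def stores_touching(self, classification):
        """List every destination that receives data with the given tag."""
        return sorted({dst for (src, dst), tags in self.flows.items()
                       if classification in tags})

dmap = DataMap()
dmap.add_flow("checkout-service", "orders-db", "PII")
dmap.add_flow("checkout-service", "payments-gateway", "PCI")
dmap.add_flow("analytics-job", "reports-bucket", "aggregated")

# Which stores hold personal data?
print(dmap.stores_touching("PII"))  # → ['orders-db']
```

Even this toy structure supports the compliance questions that motivate mapping in the first place, such as "where does PII end up?"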
Don’t expect an easy ride.
Mapping your data for compliance, security, interoperability, or integration purposes is easier said than
done. Here are the hurdles you can expect to face.
Depicting a moving target
Depending on your application's overall size and complexity, a manual data mapping process can take
weeks or even months. Since most applications that require data mapping are thriving and growing
projects, you’ll often find yourself chasing an expanding codebase and newly deployed data stores spread across microservices and distributed data-processing tasks. However you spin it, your data map is obsolete as soon as it’s complete.
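One way to cope with that drift is to re-run discovery on a schedule and diff each snapshot against the last known map. A minimal sketch, assuming the inventories below (which are hypothetical) come from your discovery tooling:

```python
# Sketch: compare two discovery snapshots to surface data stores that
# appeared or disappeared since the map was last updated.
# The inventory contents below are hypothetical examples.

def map_drift(previous, current):
    """Return (new_stores, removed_stores) between two inventory snapshots."""
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)

last_map = {"orders-db", "users-db", "reports-bucket"}
todays_scan = {"orders-db", "users-db", "reports-bucket", "events-queue"}

new, removed = map_drift(last_map, todays_scan)
print("new:", new, "removed:", removed)  # → new: ['events-queue'] removed: []
```

The point is not the set arithmetic but the cadence: a map that is only rebuilt once stops being a map.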
The ease of data distribution
Why do new data stores pop up faster than you can map them? Because it’s so easy to deploy new data-
based features, microservices, and workflows using cloud-based tools and services. As your application
grows, so does the number of data-touching services. Furthermore, since developers love to experiment
with new technologies and frameworks, you may find yourself dealing with a complex containerized
infrastructure (with Docker and Kubernetes clusters) that may have been a breeze to deploy, but is a
nightmare to map.
The horrors of legacy code
As enterprises undertake digital transformation of their legacy systems, they must address the data used
and created by those systems. In many cases, especially with established enterprises, whoever originally
wrote and maintained the legacy code is no longer with the company. So it’s up to you to explore the
intricacies of service interconnectivity and data standardization in an outdated environment with limited
visibility or documentation.
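When documentation is gone, even a crude static scan can help locate undocumented data touchpoints, for example by searching a legacy source tree for database connection strings. The sketch below is illustrative only; the regex patterns are assumptions and would need tailoring to the stack in question.

```python
# Sketch: scan legacy source text for likely database connection strings
# as a first pass at finding undocumented data touchpoints.
# The patterns are illustrative, not exhaustive.

import re

CONNECTION_PATTERNS = [
    re.compile(r"jdbc:\w+://[^\s\"']+"),                    # JDBC URLs
    re.compile(r"(?:postgres|mysql|mongodb)://[^\s\"']+"),  # URI-style DSNs
    re.compile(r"Data Source=[^;\"']+", re.I),              # .NET-style strings
]

def find_connection_strings(source_text):
    """Return every substring that looks like a DB connection string."""
    hits = []
    for pattern in CONNECTION_PATTERNS:
        hits.extend(pattern.findall(source_text))
    return hits

legacy_snippet = 'conn = connect("postgres://billing:****@db01:5432/ledger")'
print(find_connection_strings(legacy_snippet))
```

A scan like this yields leads, not answers; each hit still has to be traced through the service interconnections the text describes.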