Data Masking


Data masking, also called data obfuscation, is the process of modifying sensitive data so that it has little or no value to unauthorized intruders while remaining usable by software and authorized personnel.

It is a way to create a fake but realistic version of an organization's data. The aim is to protect sensitive data while providing a functional alternative when real data is not needed.

The data masking process changes the values of the data while keeping the same format.

The goal is to create a version that cannot be deciphered or reverse engineered.

There are several ways to alter the data:
  • character shuffling
  • word or character substitution
  • encryption
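
As a rough illustration of the first two approaches, the sketch below shuffles characters and substitutes characters while keeping the original format (digits stay digits, letters stay letters, punctuation is untouched). The function names are hypothetical and chosen for this example only; a seeded `random.Random` is used just to make the demo repeatable and would not be appropriate for protecting real data.

```python
import random
import string


def shuffle_characters(value: str, rng: random.Random) -> str:
    """Character shuffling: reorder the existing characters."""
    chars = list(value)
    rng.shuffle(chars)
    return "".join(chars)


def substitute_preserving_format(value: str, rng: random.Random) -> str:
    """Character substitution: swap each digit or letter for a random one
    of the same kind, leaving separators such as '-' in place."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)  # keep punctuation and spacing so the format survives
    return "".join(out)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for a repeatable demo only
    print(substitute_preserving_format("4111-1111-1111-1111", rng))
    print(shuffle_characters("Smith", rng))
```

Note that shuffling a single value keeps all of its original characters, so on its own it offers weak protection for short values.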

Why is Data Masking Important?

Here are several reasons data masking is essential for many organizations:
  • Data masking mitigates several critical threats: data loss, data exfiltration, insider threats or account compromise, and insecure interfaces with third-party systems.
  • Reduces data risks associated with cloud adoption.
  • Makes data useless to an attacker, while maintaining many of its inherent functional properties.
  • Allows sharing data with authorized users, such as testers and developers, without exposing production data.
  • Can be used for data sanitization: normal file deletion still leaves traces of data on storage media, while sanitization replaces the old values with masked ones.

Data Masking Types:

  • Static Data Masking
  • Deterministic Data Masking
  • On-the-Fly Data Masking
  • Dynamic Data Masking
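
Deterministic masking means the same input value is always replaced by the same masked value. One possible way to get that property, sketched below, is to derive the replacement from a keyed hash of the input. The surname list, the function name, and the key are assumptions made for this example, not part of any tool described here.

```python
import hashlib
import hmac

# Hypothetical pool of replacement values for this example.
FAKE_SURNAMES = ["Jones", "Garcia", "Nguyen", "Okafor", "Larsen", "Patel"]


def mask_deterministic(value: str, key: bytes, choices=FAKE_SURNAMES) -> str:
    """Pick a replacement from `choices` using an HMAC of the input,
    so equal inputs always map to the same output under the same key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(choices)
    return choices[index]


if __name__ == "__main__":
    key = b"demo-key"  # assumption: a real key would come from a secrets store
    print(mask_deterministic("Smith", key))
    print(mask_deterministic("Smith", key))  # same result as the line above
```

With such a small pool, different inputs will often share a replacement; a larger pool or a format-preserving scheme would reduce that.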

Data Masking Techniques:

  • Data Encryption
  • Data Scrambling
  • Nulling Out
  • Value Variance
  • Data Substitution
  • Data Shuffling
  • Pseudonymization
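
Three of these techniques are simple enough to sketch on a list of records. The example below assumes rows are plain dictionaries and uses hypothetical helper names; it is meant only to show the idea behind nulling out, value variance, and data shuffling.

```python
import random


def null_out(rows, column):
    """Nulling out: replace every value in the column with None."""
    return [{**row, column: None} for row in rows]


def vary_values(rows, column, max_fraction, rng):
    """Value variance: shift each number by a random percentage
    within +/- max_fraction of its original value."""
    return [
        {**row, column: round(row[column] * (1 + rng.uniform(-max_fraction, max_fraction)), 2)}
        for row in rows
    ]


def shuffle_column(rows, column, rng):
    """Data shuffling: keep the real values but reassign them across rows."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed for a repeatable demo only
    staff = [
        {"name": "Ann", "salary": 50000.0},
        {"name": "Bo", "salary": 62000.0},
        {"name": "Cy", "salary": 71000.0},
    ]
    print(null_out(staff, "name"))
    print(vary_values(staff, "salary", 0.10, rng))
    print(shuffle_column(staff, "salary", rng))
```

Shuffling and variance keep column statistics roughly realistic, while nulling out removes the value entirely and may break code that expects it to be present.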

Data Masking Best Practices:

  • Determine Project Scope
  • Ensure Referential Integrity
  • Secure the Data Masking Algorithms

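Referential integrity here means that a key masked in one table must be masked to the same value everywhere it appears, so joins still work after masking. A minimal sketch, assuming two hypothetical tables held as lists of dictionaries and a secret key that is kept out of the masked environment:

```python
import hashlib
import hmac


def mask_key(value: str, secret: bytes) -> str:
    """Map an identifier to a stable 12-character hex token."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def mask_tables(customers, orders, secret: bytes):
    """Mask customer_id identically in both tables so the join survives."""
    masked_customers = [
        {**c, "customer_id": mask_key(c["customer_id"], secret), "name": "REDACTED"}
        for c in customers
    ]
    masked_orders = [
        {**o, "customer_id": mask_key(o["customer_id"], secret)}
        for o in orders
    ]
    return masked_customers, masked_orders


if __name__ == "__main__":
    secret = b"demo-secret"  # assumption: loaded from a secrets store in practice
    customers = [{"customer_id": "C001", "name": "Ann Smith"}]
    orders = [{"order_id": 1, "customer_id": "C001"}]
    print(mask_tables(customers, orders, secret))
```

Because anyone holding the key can recompute the mapping for a guessed identifier, the key and the masking routine need the same protection as the original data.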
Data Masking Tools:


  • Consistent over multiple applications and databases.
  • XML and CSV file support.
  • Built-in synthetic data generators.
  • HTML audit / GDPR reporting.
  • Test data automation with REST API.
  • Web Portal for easy provisioning.
  • High performance on large data sets.
  • Free trial version available.
  • Easy to install and use.
  • Native support for all major relational databases.
  • English documentation only.
  • Development of templates requires Windows.
  • Execution of templates can be done on Windows or Linux.

IRI FieldShield

  • Multi-source data profiling, discovery (search), and classification.
  • Broad array of masking functions (including FPE) to de-identify and anonymize PII.
  • Assures referential integrity across schema and multi-DB/file scenarios.
  • Built-in re-ID risk scoring and audit trails for GDPR, HIPAA, PCI DSS, etc.
  • High performance, without the need for a central server.
  • Simple metadata and multiple graphical job design options.
  • Works with DB subsetting, synthesis, reorg, migration, and ETL jobs in Voracity, plus leading DB cloning, encryption key management, TDM portals, and SIEM environments.
  • Fast support, and affordable (especially relative to IBM, Oracle, Informatica).
  • 1NF structured data support only; DarkShield needed for BLOBs, etc.
  • Free IRI Workbench IDE is a thick client Eclipse UI (not web-based).
  • DDM requires FieldShield API call, or premium proxy server option.

Accutive Data Discovery & Masking

  • Data Discovery – Enables efficient identification of sensitive data that is subject to regulatory compliance standards such as GDPR, PCI-DSS, HIPAA, GLBA, OSFI/PIPEDA, and FERPA.
  • Mask Link Technology – Ability to consistently and repeatedly mask source data to the same value (e.g. Smith will always be masked to Jones) across multiple databases.
  • Multiple Data Sources and Destinations – Data can be moved from any major source type to any major destination type such as Oracle, DB2, MySQL, and SQL Server (e.g. data can be moved from a flat file to an Oracle database).
  • API support – Include data masking in your data processing pipeline.
  • User-friendly, configurable interface.
  • Cost-effective solution with transparent pricing models.
  • Performs masking configurations rapidly with a built-in progress display.
  • Groovy scripting to customize application behavior requires some knowledge of programming.
  • Not currently available in languages other than English, French, Spanish, and German.

IRI DarkShield

  • Built-in data classification and simultaneous ability to search, mask, and report.
  • Multiple search methods and masking functions, including fuzzy match and NER.
  • Deletion function for GDPR (and similar) right to be forgotten laws.
  • Integrates with SIEM/DOC environments and multiple logging conventions for audit.
  • High speed, multi-source, no need to mask in cloud or compromise control of data.
  • Consistent ciphertext assures referential integrity in structured and unstructured data.
  • Shares data classes, masking functions, engine, and job design GUI with FieldShield.
  • Proven worldwide, but still affordable (or free with FieldShield in Voracity subscriptions).
  • Standalone and embedded image capabilities are limited by OCR and may need tweaking.
  • API requires custom ‘glue code’ for cloud, DB, and big data sources.
  • Price options may seem complex in mixed data source and use case scenarios.

IRI CellShield EE

  • Wide range of ergonomic PII searching and masking methods.
  • Supports formulas and multi-byte character sets.
  • Leverages data classes, top masking functions, and search parameters of DarkShield GUI.
  • Excel charts intelligently display discovered and masked data across multiple sheets.
  • High-performance masking of very large and/or multiple sheets at once.
  • Consistent ciphertext assures referential integrity in sheets and other data sources.
  • Search and mask audit column results, plus log exports to email, Splunk, and Datadog.
  • Documented in-app and online. Easily upgradable from low-cost Personal Edition.
  • Only compatible with MS Excel 2007 or higher (not other sheet apps).
  • SharePoint and macro support are still in development.
  • Free trial is only for Enterprise Edition (EE), not low-cost Personal Edition (PE).

Oracle Data Masking and Subsetting

  • Discovers complex data and its relationships automatically.
  • Wide masking plan library and enhanced application models.
  • Comprehensive data masking transformations.
  • Fast, secure, and versatile.
  • Offers various ways of masking data.
  • Supports non-Oracle databases as well.
  • Takes less time to run.
  • High cost.
  • Less secure for development and testing environments.


  • End-to-end data masking, with reporting on the masking performed.
  • Masking combined with data virtualization to speed up data delivery.
  • Easy to use, as no training is required to mask data.
  • Migrates data reliably across sites, on-premises, or in the cloud.
  • Easy and timely recovery of records.
  • Virtualization of databases.
  • Data refreshing is fast.
  • High cost.
  • SQL Server database support is slow and limited.
  • Relies on old NFS protocols.

Informatica Persistent Data Masking

  • Supports robust data masking.
  • Creates and integrates the masking process from a single location.
  • Features to handle a large volume of databases.
  • Wide connectivity and customized application support.
  • Decreases the risk of a data breach via a single audit trail.
  • Improves the quality of development, testing, and training activities.
  • Easy deployment on workstations.
  • The UI needs more work.

Microsoft SQL Server Data Masking

  • Simplifies application design and coding by securing the data in the database.
  • Does not change or transform the data stored in the database.
  • Lets the database administrator choose how much sensitive data to expose, with minimal effect on the application.
  • End users are prevented from viewing sensitive data.
  • Creating a mask on a column does not prevent updates to that column.
  • Applications do not need to be changed to read data.
  • Data is fully accessible when tables are queried by a privileged user.
  • Masked data can be unmasked with the CAST command in an ad-hoc query.
  • Masking cannot be applied to columns such as encrypted, FILESTREAM, or COLUMN_SET columns.
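
The points above describe dynamic masking: stored values stay as they are, and masking happens when the data is read, depending on who is asking. The sketch below illustrates that idea in plain Python. It is a conceptual model with hypothetical names and a made-up mask pattern, not SQL Server's implementation or API.

```python
def mask_email(email: str) -> str:
    """Keep the first character and hide the rest behind a fixed pattern."""
    return email[:1] + "XXX@XXXX.com"


def read_rows(rows, can_unmask: bool):
    """Return copies of the rows, masking the email column at read time
    unless the caller is privileged. The stored rows are never modified."""
    if can_unmask:
        return [dict(row) for row in rows]
    return [{**row, "email": mask_email(row["email"])} for row in rows]


if __name__ == "__main__":
    stored = [{"id": 1, "email": "alice@example.com"}]
    print(read_rows(stored, can_unmask=False))  # masked view for ordinary users
    print(read_rows(stored, can_unmask=True))   # full view for privileged users
    print(stored)                               # underlying data is unchanged
```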

IBM InfoSphere Optim Data Privacy

  • Masks private data on request.
  • Decreases risk by locking data.
  • Speeds up the application of data privacy.
  • A secure environment for application testing.
  • Easily abstracts data with no coding.
  • Advanced data masking features.
  • Smart filtering abilities.
  • The UI needs work.
  • Complex architecture.

CA Test Data Manager

  • Creates synthetic test data for data testing.
  • Creates future test scenarios and unexpected outcomes.
  • Stores data for reuse.
  • Creates virtual copies of test data.
  • Different filters and templates are available to mask data.
  • No additional permission is required to access the production data.
  • Very fast tools to mask data.
  • Works only on Windows.
  • Complex user interface.
  • Automating everything is not easy.

Compuware Test Data Privacy

  • Reduces complexity through codeless masking.
  • Completes data normalization into and out of the masking process.
  • Dynamic privacy rules for sensitive test data elements such as account numbers, card numbers, etc.
  • Allows discovering and masking data within a larger field.
  • Easy to use and fast.
  • Secures test data against breaches.
  • Applies privacy rules to test data to make it more secure.
  • Complex user interface.

NextLabs Data Masking

  • Helps in classifying and sorting data.
  • Monitors data movement and its usage.
  • Prevents access to specific data.
  • Notifications on risky actions and irregularities.
  • Can be installed easily on each workstation.
  • Helps prevent data breaches.
  • Good data safety across CAD, PLM, and email.
  • Software compatibility problems with PLM software.
  • Implementation is sometimes difficult for suppliers and vendors.


  • Quick and easy installation.
  • Flexible and robust, with less time needed to create workflows.
  • Easy and robust integration with SQL Server, BizTalk, etc.
  • Custom SSIS agenda to mask data.
  • Speed up development.
  • No learning curves.
  • Create data by just the “INSERT” command.
  • In startups the growth is fast but the progress slows down in developed industries.
  • Limited control of data.

Apache Atlas
