Here you will learn a few ways to redact PII from your data before you send it to Dimension Lab's platform.

PII stands for Personally Identifiable Information, and all digital organizations must explain how they handle it. Dimension Lab has a policy of not ingesting or storing any PII. Below are some example PII redaction libraries you can use to clean your data.

Several public libraries are useful for redacting PII, including offerings from Amazon and Microsoft as well as JavaScript libraries. These libraries already have the logic built in to look for a variety of PII types.

Here are some examples of the types of PII that should/can be removed:

PII category: PII types

  • Personal: NAME, ADDRESS, PHONE, EMAIL, AGE
  • National: SSN, PASSPORT_NUMBER, DRIVER_ID
  • Financial: BANK_ACCOUNT_NUMBER, BANK_ROUTING, CREDIT_DEBIT_NUMBER, CREDIT_DEBIT_CVV, CREDIT_DEBIT_EXPIRY, PIN

PII Libraries

Microsoft offers a PII redaction function through its Azure AI Language product.

Amazon offers a PII redaction function through its Amazon Comprehend product.
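
As a rough illustration of how the output of a detection service can be applied, the sketch below assumes the response shape documented for Amazon Comprehend's detect_pii_entities call: a list of entities, each with Type, BeginOffset, and EndOffset. The helper names redact_by_offsets and detect_with_comprehend are hypothetical, EndOffset is treated as an exclusive end index, and the service call itself needs the boto3 package and configured AWS credentials.

```python
from typing import Dict, List


def redact_by_offsets(text: str, entities: List[Dict]) -> str:
    """Replace each detected span with a [TYPE] placeholder.

    `entities` is assumed to follow the shape returned by Amazon
    Comprehend's detect_pii_entities: dicts with Type, BeginOffset
    and EndOffset keys (EndOffset treated as exclusive).
    """
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = ent["BeginOffset"], ent["EndOffset"]
        text = text[:start] + f"[{ent['Type']}]" + text[end:]
    return text


def detect_with_comprehend(text: str, language_code: str = "en") -> List[Dict]:
    """Hypothetical wrapper; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so redact_by_offsets has no dependencies

    client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    return response["Entities"]


# Hand-written entity list standing in for a real service response.
sample = "Call Jane at 555-0100."
entities = [
    {"Type": "NAME", "BeginOffset": 5, "EndOffset": 9},
    {"Type": "PHONE", "BeginOffset": 13, "EndOffset": 21},
]
print(redact_by_offsets(sample, entities))
```

Keeping the offset-based replacement separate from the service call means the same helper could be reused with any detector that reports a type and character offsets.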

Node

  • deep-redact - a full-object, in-process PII redaction library, highly configurable
  • @zakyyudha/habibi - a full-object, in-process PII redaction library, highly configurable
  • @coffeeandfun/remove-pii - a simplified, single-field PII redaction library, light configuration
  • redact-pii - a simplified, single-field PII redaction library, light configuration (Deprecated)

Python

  • Microsoft Presidio - an open-source, comprehensive PII redaction SDK, highly configurable
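
A minimal sketch of how Presidio's analyzer and anonymizer engines are typically combined is shown below. The function name redact_with_presidio is hypothetical; it assumes the presidio-analyzer and presidio-anonymizer packages (and the spaCy language model the analyzer relies on) are installed, and it uses the engines' default settings rather than any configuration tuned to your data.

```python
def redact_with_presidio(text, analyzer=None, anonymizer=None, language="en"):
    """Detect PII with a Presidio analyzer, then replace it with an anonymizer.

    Engines can be passed in (useful for reuse across many records);
    otherwise default engines are created, which requires the
    presidio-analyzer and presidio-anonymizer packages.
    """
    if analyzer is None:
        from presidio_analyzer import AnalyzerEngine
        analyzer = AnalyzerEngine()
    if anonymizer is None:
        from presidio_anonymizer import AnonymizerEngine
        anonymizer = AnonymizerEngine()

    # Step 1: find PII spans. Step 2: rewrite the text using those spans.
    findings = analyzer.analyze(text=text, language=language)
    result = anonymizer.anonymize(text=text, analyzer_results=findings)
    return result.text
```

With the packages installed, a call such as redact_with_presidio("My phone number is 212-555-5555") returns the text with detected entities replaced. Creating the engines once and passing them in avoids reloading the language model for every record.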

Docker

  • Microsoft Presidio - an open-source, comprehensive PII redaction SDK, highly configurable

This list is a selection to get you started in your search. Each library can be configured to remove names, email addresses, phone numbers, Social Security numbers, and other identifying information. You should determine whether any of them meets your PII goals.
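
One way to judge whether a library meets your goals is to run sample records through it and then scan the output for anything that still looks like PII. The sketch below is only an illustration of that idea: the patterns are deliberately simple, US-centric assumptions, the name find_residual_pii is hypothetical, and a clean result does not prove the data is free of PII.

```python
import re

# Deliberately simple, US-centric patterns; real data will need broader rules.
RESIDUAL_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_residual_pii(text):
    """Return {pii_type: [matches]} for anything the patterns still find."""
    hits = {}
    for pii_type, pattern in RESIDUAL_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[pii_type] = matches
    return hits


# Example: output from a redaction library that missed a phone number.
redacted = "Contact [NAME] at [EMAIL] or 212-555-0187."
print(find_residual_pii(redacted))
```

Running a check like this over a representative sample of your own data gives a quick signal about which PII types a given library configuration misses.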

Dimension Lab also has a library that can help coordinate cleaning PII from the data you pass to our platform; see Using PII redaction NPM libraries.