US20150113659A1 - Consistent data masking - Google Patents



Publication number
US20150113659A1
Authority
US
United States
Prior art keywords
data
masking
service providers
implemented method
service provider
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/298,058
Inventor
Noel H. E. D'Costa
Peter Hagelund
David J. Henderson
Robert J. Oakley
Ritesh Tandon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GlobalFoundries Inc
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US14/298,058
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: OAKLEY, ROBERT J., D'COSTA, NOEL H. E., HAGELUND, PETER, HENDERSON, DAVID J., TANDON, RITESH
Publication of US20150113659A1
Assigned to GLOBALFOUNDRIES U.S. 2 LLC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Assigned to GLOBALFOUNDRIES INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: GLOBALFOUNDRIES U.S. 2 LLC, GLOBALFOUNDRIES U.S. INC.
Assigned to GLOBALFOUNDRIES U.S. INC. RELEASE BY SECURED PARTY (SEE DOCUMENT FOR DETAILS). Assignors: WILMINGTON TRUST, NATIONAL ASSOCIATION

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • H04L67/42

Definitions

  • Present invention embodiments relate to masking data, and more specifically, to masking data objects consistently across a plurality of different data resources to protect privacy.
  • Data privacy is a concern for enterprises around the world. Collection, disclosure, and protection of consumers' nonpublic personal information or personally identifiable information (e.g., medical history, financial information, etc.) are governed by a range of laws and regulations (e.g., the Gramm-Leach Bliley Act; the Health Insurance Portability and Accountability Act; the European Union Data Protection Directive; privacy laws in Canada, Japan, and Australia; the Payment Card Industry Data Security Standard; the Interagency Guidelines for Safeguarding Customer Information; Basel II operational controls and Sarbanes-Oxley internal controls; etc.).
  • Data masking capabilities are embedded in most commercially available Extract, Transform, and Load (ETL) and Test Data Management (TDM) products.
  • Some database products and application software e.g., enterprise resource planning (ERP) applications, customer relationship management (CRM) applications, human capital management (HCM) applications, etc.
  • point solutions have been developed to fill particular needs. Many companies build their own data masking solution to fit their situation if they can find no other appropriate tool.
  • a system masks data objects across a plurality of different data resources.
  • the system comprises a processor configured to include a plurality of service providers to mask the data objects, wherein each service provider corresponds to a different type of data masking for the data objects.
  • An interface provides access to the plurality of service providers from different data-consumers to mask the data objects according to the corresponding types of data masking, wherein resulting masked data maintains relational integrity across the different data resources.
  • Embodiments of the present invention further include a method and computer program product for masking data objects across a plurality of different data resources in substantially the same manners described above.
  • FIG. 1 depicts an example computing environment for an embodiment of the present invention.
  • FIG. 2 depicts a block diagram of a masking module according to an embodiment of the present invention.
  • FIG. 3 depicts a flow diagram illustrating an example manner of masking information using a public interface according to an embodiment of the present invention.
  • FIG. 4 depicts an example form of an input parameter string for a service provider for credit card numbers according to an embodiment of the present invention.
  • Present invention embodiments relate to masking data objects (e.g., replacing persons' names with fictional names, obscuring all or part of credit card numbers, etc.) consistently across a plurality of data resources to protect privacy.
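  • As an illustration of obscuring part of a credit card number, the following is a minimal C sketch and not the masking algorithm of the embodiments. The function name mask_ccn_keep_last, the fill character 'X', and the rule of keeping the trailing digits are all assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the described API).
 * Replaces every digit except the last `keep` digits with 'X'.
 * Separators such as spaces or dashes are left untouched. */
void mask_ccn_keep_last(char *ccn, size_t keep)
{
    /* First pass: count the digits so we know how many to hide. */
    size_t digits = 0;
    for (const char *p = ccn; *p != '\0'; ++p) {
        if (isdigit((unsigned char)*p)) {
            ++digits;
        }
    }

    /* Second pass: overwrite the leading digits only. */
    size_t to_mask = digits > keep ? digits - keep : 0;
    for (char *p = ccn; *p != '\0' && to_mask > 0; ++p) {
        if (isdigit((unsigned char)*p)) {
            *p = 'X';
            --to_mask;
        }
    }
}
```

  • Because the output depends only on the input string, the same number is obscured the same way wherever it appears. A value with no more digits than the number to keep is returned unchanged.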
  • a large organization may support computing platforms with a variety of operating systems (e.g., AIX, z/OS, Linux, etc.) and data sources (e.g., relational databases based on different relational database management systems (RDBMSs), flat files, spreadsheets; Extensible Markup Language (XML) files, comma separated values (CSV) files, etc.).
  • persons' names may appear in both a relational database and a CSV file, and the organization may conduct research or test new software using the data masked in such a way that each name is always replaced with the same corresponding fictional name, whether in the database or the CSV file.
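  • One way such repeatable replacement could be achieved is a keyed hash into a dictionary of fictional names, so that no lookup table of real names has to be shared between the database and the CSV file. The sketch below is an assumption-laden illustration: the dictionary contents, the FNV-1a hash, the seed parameter, and the function names are not taken from the embodiments.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative replacement dictionary; a real provider would load a far larger one. */
static const char *const FICTIONAL_NAMES[] = {
    "Alex Morgan", "Jamie Rivera", "Taylor Brooks", "Jordan Ellis",
    "Casey Nolan", "Riley Hayes", "Morgan Price", "Avery Stone"
};
#define FICTIONAL_NAME_COUNT (sizeof FICTIONAL_NAMES / sizeof FICTIONAL_NAMES[0])

/* 32-bit FNV-1a hash, mixed with a caller-supplied seed. */
static uint32_t fnv1a(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical helper: the same (name, seed) pair always yields the same
 * fictional name, regardless of which data resource the name came from. */
const char *mask_name(const char *real_name, uint32_t seed)
{
    return FICTIONAL_NAMES[fnv1a(real_name, seed) % FICTIONAL_NAME_COUNT];
}
```

  • With so small a dictionary, distinct real names will often map to the same fictional name, and FNV-1a is not a cryptographic hash; the sketch shows only the repeatability property.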
  • Different applications, which may interact with each other in an integrated manner, may use the masking services provided by a present invention embodiment and produce consistent results.
  • a set of masking service providers (also referred to as providers) encapsulate data masking algorithms for particular types of data objects (e.g., national identity number (NID) (e.g., Social Security Number (SSN), Canadian Social Insurance Number (SIN), etc.), credit card number (CCN), names, addresses, etc.) within a uniform application programming interface (API), so that different providers may be used with minimal changes to the software calling the API.
  • the API may be used by applications written in a variety of programming languages (e.g., C, C++, Cobol, etc.).
  • masking may be incorporated via the API into Extract, Transform, and Load (ETL) tools, Hadoop platforms, etc.
  • a masking grammar provides a high-level syntax that enables access to the masking service providers from high level programming and scripting languages (e.g., Perl, Lua, etc.), user-defined functions within a database, dynamic masking clients, etc. Regardless of the manner in which the data masking capabilities provided by a present invention embodiment are used, the same data may be masked identically and consistently.
  • a service provider interface allows users to implement their own masking service providers and plug them into the common framework so they may be used in the same manner as other masking service providers.
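  • A plug-in arrangement of this kind can be sketched in C as a registry of function pointers that share one signature. The type mask_fn, the functions register_provider and find_provider, the provider keyword "REDACT", and the redact_all provider are hypothetical names for illustration; the actual service provider interface is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical uniform signature: mask `in` into `out` (capacity `cap`).
 * Returns 0 on success, -1 on failure. */
typedef int (*mask_fn)(const char *in, char *out, size_t cap);

typedef struct {
    const char *name;   /* provider keyword used by callers */
    mask_fn     mask;   /* masking routine for that keyword */
} provider_entry;

#define MAX_PROVIDERS 16
static provider_entry registry[MAX_PROVIDERS];
static size_t registry_len = 0;

/* Add a pre-defined or user-written provider to the common framework. */
int register_provider(const char *name, mask_fn fn)
{
    if (name == NULL || fn == NULL || registry_len == MAX_PROVIDERS) {
        return -1;
    }
    registry[registry_len].name = name;
    registry[registry_len].mask = fn;
    ++registry_len;
    return 0;
}

/* Look a provider up by keyword; NULL if it was never registered. */
mask_fn find_provider(const char *name)
{
    for (size_t i = 0; i < registry_len; ++i) {
        if (strcmp(registry[i].name, name) == 0) {
            return registry[i].mask;
        }
    }
    return NULL;
}

/* Example user-written provider: overwrite every character with '*'. */
int redact_all(const char *in, char *out, size_t cap)
{
    size_t n = strlen(in);
    if (n + 1 > cap) {
        return -1;
    }
    memset(out, '*', n);
    out[n] = '\0';
    return 0;
}
```

  • Calling code selects a provider by keyword only, so swapping one provider for another requires no change beyond the keyword.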
  • a further aspect of a present invention embodiment is to perform masking within a database server system.
  • a set of user-defined functions (UDFs) and user-defined table functions (UDTFs) are installed and invoked within a database. These functions use the masking grammar to enable use of the masking service providers inside Structured Query Language (SQL) queries.
  • This allows masking to be performed within the database and may be invoked via a database stored procedure to control unit of work commits and rollbacks. For example, a user may make a full copy of a database, and then execute a SQL statement including a user-defined function (UDF) to perform masking in-place on the copy.
  • a user may apply masking using the UDF while creating or copying a table in the database using a SQL statement. Since the UDF is an object in the database, the masking is performed within the database and may consume less time than if the data were extracted from the database, processed by a masking operation, and re-inserted into the database.
  • a still further aspect of a present invention embodiment is to provide dynamic masking (also referred to as “on the fly” masking). For example, a query may be made against a non-masked data source using a client application, and sensitive data in the result set may be masked dynamically based on the security profile of the end-user making the request.
  • Yet another aspect of a present invention embodiment is to provide a masking-on-demand application, including a command line interface, that provides convenient masking of common, non-relational file formats (e.g., CSV, XML, etc.) stored within various file systems (e.g., POSIX, Windows, Hadoop, etc.) and relational data sources.
  • a wizard-driven front end places the power of the data masking service providers at the fingertips of the user without the complexity of implementing masking in a formal system (e.g., a test data management system, ETL system, etc.).
  • An example environment for present invention embodiments is illustrated in FIG. 1.
  • the environment includes one or more server systems 100, one or more client or end-user systems 110, and one or more data sources 120.
  • Server systems 100 and client systems 110 may be remote from each other and communicate over a network 12.
  • Network 12 may be implemented by any number of any suitable communications media (e.g., wide area network (WAN), local area network (LAN), Internet, intranet, etc.).
  • server systems 100 , client systems 110 , and data sources 120 may be local to each other, and communicate via any appropriate local communication medium (e.g., local area network (LAN), hardwire, wireless link, intranet, etc.).
  • a server system 100 may include one or more applications 102 and masking module 104 .
  • Application 102 uses masking module 104 to mask information from data sources 120 .
  • Applications 102 may include user-created applications and/or other applications or utilities (e.g., a test data management suite, masking-on-demand application, user-defined functions, etc.) that use masking module 104 via API 202 ( FIG. 2 ) to mask data from one or more data sources 120 .
  • the application and masking module may be implemented across plural server systems. Alternatively, the application and/or masking module may reside on a client system 110 or other computer system in communication with the data sources.
  • Client systems 110 enable users to communicate with the application, masking module, and/or data sources (e.g., via network 12 ).
  • the client systems may present any graphical user (e.g., GUI, etc.) or other interface (e.g., command line prompts, menu screens, etc.) to receive commands from users and interact with the application, masking module, data sources and/or other modules or services.
  • Data sources 120 may include, e.g., relational databases, flat files, spreadsheets, comma separated value (CSV) files, etc.
  • Data sources 120 contain information accessed by application 102 including information that may be subject to masking.
  • Server systems 100 and client systems 110 may be implemented by any conventional or other computer systems preferably equipped with a display or monitor, a base (e.g., including at least one processor 20, memories 30 and/or internal or external network interface or communications devices 10 (e.g., modem, network cards, etc.)), optional input devices (e.g., a keyboard, mouse, or other input device), and any commercially available and custom software (e.g., masking module software).
  • the masking module may include one or more modules or units to perform the various functions of present invention embodiments described below (e.g., managing resources, hashing, masking data, etc.), may be implemented by any combination of any quantity of software and/or hardware modules or units, and may reside within memory 30 of a server system and/or client systems for execution by processor 20 .
  • A block diagram of masking module 104 according to an embodiment of the present invention is illustrated in FIG. 2.
  • the masking module includes public application programming interface (API) 202, service manager 204, service provider API 206, service providers 210, utilities 212, and operating system (OS) interface 214.
  • the masking module may be implemented in a module framework with layers of functionality in separate libraries loosely coupled by the APIs.
  • Public API 202 is used by application 102 to communicate with the masking module (e.g., to apply masking to data from data source 120 ).
  • public API 202 may provide a C API comprising externalized functions callable from application 102 .
  • the public API may be used (e.g., via wrappers, mixed-language linking, etc.) by applications built using a variety of other programming languages (e.g., COBOL, C++, etc.).
  • Public API 202 supports a masking provider grammar that allows high level languages and scripting languages (e.g., Lua, Perl, etc.) to gain access to services provided by the masking module.
  • the public API (and the back-end, in general) is independent of the data source. This provides the flexibility to support structured and unstructured data sources without limitation.
  • the calling application is responsible for extracting data from a data source and passing the data to the masking module via the public API.
  • the input and output data structures represent data as rows and columns/fields within the rows. Standard data types are used to represent various types of data (e.g. integer, char, null terminated strings, date, time, etc.)
  • Service manager 204 manages global resources for masking module 104 and data being transported from public API 202 to individual masking service providers 210 .
  • Service providers interface (SPI) 206 is a C interface point to and from each masking service provider 210 .
  • Masking service providers 210 may include pre-defined masking service providers (for masking, e.g., a person's age, credit card number (CCN), e-mail address, national identity, city, country, etc.) and user-written masking service providers. User-written service providers may be added into masking module 104 or may reside external to the masking module.
  • masking module 104 may include utility functions 212 (e.g., hashing functions, table lookup functions, swapping functions, etc.) that are exposed via service provider API 206 for use by pre-defined and/or user-written masking service providers.
  • the masking service providers are data source agnostic and support virtually all data types and character sets (e.g., ASCII, Unicode, Multi-byte, etc.).
  • Operating system (OS) interface 214 handles operating system-specific functions (e.g., input/output, logging, exception handling, etc.) for the masking module for each of the supported environments (e.g., AIX, Linux, Windows, Solaris, Hewlett-Packard Unix (HP-UX), z/OS, etc.).
  • OS interface 214 may handle operating system-specific functions for applications (e.g., in an embodiment-provided masking on demand application).
  • A manner of interacting with masking module 104 from application 102 according to an embodiment of the present invention is illustrated in FIG. 3.
  • application 102 makes an initial call to the masking module via a Provider_FrmwInit function of public API 202 at step 301.
  • the masking module receives control (e.g., of program execution on processor 20 ), loads other libraries (e.g., operating system specific libraries), acquires resources (e.g., memory for data to be masked, log file handles, etc.), and initializes itself to provide data masking services for any of the available masking service providers 210 .
  • the application prepares a data structure for communicating information to the masking module.
  • This structure identifies the specific masking service provider needed by the application and control parameters to drive execution of the masking service provider.
  • the application then calls the masking module via a Provider_Init function of public API 202 to initialize (e.g., load dictionaries, set processing options, etc.) the specified masking service provider.
  • the masking module receives control, interprets the input structure, acquires resources, loads a library containing the specified masking service provider, and initializes the service provider for data masking.
  • the masking module returns a token identifier to the application. This token identifier is passed by the application to the masking module in subsequent service calls to identify the specified and initialized masking service provider or masking service provider instance from any other masking service providers that may be operating in the same process.
  • the application prepares the input structure with one or more input buffers for the data to be masked and with the token identifier returned from the Provider_Init function call.
  • the masking module may process masking tasks as single entities or in user-defined batch sizes.
  • the application then calls a Provider_Service function of public API 202 to mask the data identified in the one or more input buffers.
  • the masking module receives control, interprets the token identifier, interprets the input buffer(s), masks the data, and returns the masked data to the application.
  • the masked data is returned either in the input buffer(s), or optionally, in corresponding output buffer(s).
  • the application determines whether more data remains to be masked. If so, processing returns to step 305 . Otherwise, at step 308 , the application calls a Provider_Term function of the public API (passing the token identifier in the call) to terminate use of the specified masking service provider by that application.
  • the masking module receives control, interprets the token identifier, releases resources, and terminates the masking service provider specified by the token identifier for the application.
  • the application calls the masking module via a Provider_FrmwTerm function of the public API to allow the masking module framework to be terminated.
  • the masking module receives control, releases resources, and terminates the masking module framework environment.
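  • The call sequence described above (framework initialization, provider initialization returning a token, repeated service calls, provider termination, framework termination) can be sketched with stand-in functions. Everything below is hypothetical: the demo_ functions are stubs written for this example, their signatures are assumptions rather than the public API, the "PRO=CCN" string assumes a keyword=value form for the PRO parameter, and the masking rule is a placeholder.

```c
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 4

typedef struct { int in_use; } instance;
static int framework_ready = 0;
static instance instances[MAX_INSTANCES];

/* Stands in for Provider_FrmwInit: acquire framework-wide resources. */
int demo_frmw_init(void)
{
    memset(instances, 0, sizeof instances);
    framework_ready = 1;
    return 0;
}

/* Stands in for Provider_Init: returns a token (>= 0), or -1 on failure. */
int demo_provider_init(const char *params)
{
    if (!framework_ready || strncmp(params, "PRO=", 4) != 0) {
        return -1;
    }
    for (int t = 0; t < MAX_INSTANCES; ++t) {
        if (!instances[t].in_use) {
            instances[t].in_use = 1;
            return t;
        }
    }
    return -1;
}

/* Stands in for the service call: masks the buffer in place. */
int demo_provider_service(int token, char *buf)
{
    if (token < 0 || token >= MAX_INSTANCES || !instances[token].in_use) {
        return -1;
    }
    for (; *buf != '\0'; ++buf) {
        *buf = '#';   /* placeholder masking rule */
    }
    return 0;
}

/* Stands in for Provider_Term: release the instance named by the token. */
int demo_provider_term(int token)
{
    if (token < 0 || token >= MAX_INSTANCES || !instances[token].in_use) {
        return -1;
    }
    instances[token].in_use = 0;
    return 0;
}

/* Stands in for the framework termination call. */
int demo_frmw_term(void)
{
    framework_ready = 0;
    return 0;
}

/* Drive the full sequence over a batch; returns rows masked, or -1. */
int mask_batch(char *rows[], size_t count)
{
    if (demo_frmw_init() != 0) {
        return -1;
    }
    int token = demo_provider_init("PRO=CCN");
    if (token < 0) {
        demo_frmw_term();
        return -1;
    }
    int masked = 0;
    for (size_t i = 0; i < count; ++i) {
        if (demo_provider_service(token, rows[i]) == 0) {
            ++masked;
        }
    }
    demo_provider_term(token);
    demo_frmw_term();
    return masked;
}
```

  • The token lets several provider instances coexist in one process; once an instance has been terminated, a service call with its token is rejected.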
  • An example using the masking service provider for credit card numbers illustrates the masking grammar.
  • Example keywords and parameters are described, followed by examples of the use of the CCN provider in a UDF and within a Lua script.
  • An input parameter string contains control information using the masking grammar.
  • An example form of the input parameter string for a CCN service provider according to an embodiment of the present invention is illustrated in FIG. 4 .
  • a required parameter named PRO specifies the masking service provider.
  • a required parameter FLDDEFn describes the attributes of a field.
  • the n suffix correlates to the index of the field, argument or field-name specified in the query or expression.
  • FLDDEF1 describes the attributes of the first field.
  • FLDDEF2 describes the attributes of the second field, etc.
  • the FLDDEF parameter includes sub-parameters enclosed within parenthesis to separate them from other parameters.
  • a required FLDDEF sub-parameter NAME specifies the field name.
  • a required FLDDEF sub-parameter named DT (or DATATYPE) specifies the data type of the field.
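  • To make the idea of parenthesized sub-parameters concrete, the sketch below extracts one sub-parameter value from a group such as (NAME=CCNCol,DT=VARCHAR_SZ). The exact grammar is given in FIG. 4 and is not reproduced in this text, so the comma separator, the key=value form, the absence of whitespace, and the function name flddef_get are all assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of sub-parameter `key` from a
 * parenthesized group into `out` (capacity `cap`).
 * Returns 0 on success, -1 if the key is absent or the value does not fit. */
int flddef_get(const char *group, const char *key, char *out, size_t cap)
{
    size_t klen = strlen(key);
    const char *p = strchr(group, '(');
    if (p == NULL) {
        return -1;
    }
    ++p;  /* step past the opening parenthesis */

    while (*p != '\0' && *p != ')') {
        /* `end` points at the ',' or ')' that closes this key=value item. */
        const char *end = p + strcspn(p, ",)");
        size_t item_len = (size_t)(end - p);

        if (item_len > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = item_len - klen - 1;
            if (vlen + 1 > cap) {
                return -1;
            }
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 0;
        }
        p = (*end == ',') ? end + 1 : end;
    }
    return -1;
}
```

  • For the group above, asking for NAME yields CCNCol and asking for DT yields VARCHAR_SZ, while asking for an absent key such as LEN fails.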
  • Example values, and their characteristics, that may be assigned to the DT sub-parameter include the following:
  • the date is contained within three consecutive short integers. The first is a signed short that contains the year, the second is an unsigned short that contains the month and the third is an unsigned short that contains the day. In a C-type structure format the date appears as:
  • typedef struct s_odbc_date { signed short Year; unsigned short Month; unsigned short Day; } ODPP_ODBC_DATE;
  • in a packed decimal field, each byte of storage holds two decimal digits, except the rightmost (last) byte, which holds one digit followed by the sign indicator in its rightmost part.
  • the standard signs used are 0xF for positive numbers and 0xD for negative numbers.
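  • The packed decimal layout can be illustrated by a small decoder. This is a sketch under stated assumptions: the function name packed_to_long is hypothetical, a sign indicator of 0xD is treated as negative and any other sign value as positive, and overflow of the result is not checked.

```c
#include <stddef.h>

/* Hypothetical helper: decode a packed decimal field of `len` bytes.
 * Every byte holds two 4-bit digits, except the last byte, whose low
 * 4 bits hold the sign indicator.
 * Returns 0 on success, -1 on empty input or an invalid digit. */
int packed_to_long(const unsigned char *field, size_t len, long *out)
{
    if (field == NULL || out == NULL || len == 0) {
        return -1;
    }
    long value = 0;
    for (size_t i = 0; i < len; ++i) {
        unsigned hi = field[i] >> 4;
        unsigned lo = field[i] & 0x0Fu;

        if (hi > 9) {
            return -1;
        }
        value = value * 10 + (long)hi;

        if (i + 1 < len) {
            /* Not the last byte: the low half is also a digit. */
            if (lo > 9) {
                return -1;
            }
            value = value * 10 + (long)lo;
        } else if (lo == 0x0Du) {
            /* Last byte: the low half is the sign indicator. */
            value = -value;
        }
    }
    *out = value;
    return 0;
}
```

  • For example, the three bytes 0x12 0x34 0x5F decode to 12345, and the two bytes 0x12 0x3D decode to -123.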
  • the time is contained in three consecutive unsigned shorts. The first contains the hour, the second contains the minute and the third contains the second. In a C-type structure format the time appears as:
  • typedef struct s_odbc_time { unsigned short Hour; unsigned short Minute; unsigned short Second; } ODPP_ODBC_TIME;
  • the timestamp is contained in a consecutive arrangement of six shorts followed by an unsigned integer. The first is a signed short that contains the year, the second is an unsigned short that contains the month, the third is an unsigned short that contains the day, the fourth is an unsigned short that contains the hour, the fifth is an unsigned short that contains the minute, and the sixth is an unsigned short that contains the second. At the end of this consecutive arrangement is an unsigned integer that contains the fractional second. In a C-type structure format the timestamp appears as:
  • typedef struct s_odbc_timestamp { signed short Year; unsigned short Month; unsigned short Day; unsigned short Hour; unsigned short Minute; unsigned short Second; unsigned int Fraction; } ODPP_ODBC_TIMESTAMP;
  • some information is not needed because it can be determined within the UDF.
  • an optional FLDDEF sub-parameter named LEN (or LENGTH) specifies the length of a character field as an integer value. This parameter is required only when this information is not available within the environment in which the masking module is executing, and is used only with character data types (e.g., CHAR, VARCHAR, VARCHAR_SZ, WCHAR, WVARCHAR, WVARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, DATETIME_VARCHAR, DATETIME_WCHAR, DATETIME_WSZ, DATETIME_WVARCHAR).
  • PRE PRECISION
  • an optional FLDDEF sub-parameter named CP (or CODEPAGE) specifies the code page of the data. This parameter is required only when: the type of data is CHAR, VARCHAR, VARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, or DATETIME_VARCHAR; and this information is not available within the masking module executing environment, the CP/CODEPAGE parameter was not specified outside of the FLDDEF, or the code page of the data for the subject FLDDEF is different than the CP/CODEPAGE specified outside of the FLDDEF.
  • an optional FLDDEF sub-parameter named CPT (or CPTYPE) specifies the code page type. This parameter is required only when: the type of data is CHAR, VARCHAR, VARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, or DATETIME_VARCHAR; the CP/CODEPAGE sub-parameter is specified; and this information is not available within the ODPP executing environment, the CPT/CPTYPE parameter was not specified outside of the FLDDEF, or the source of the data for the subject FLDDEF is different than the CPT/CPTYPE specified outside of the FLDDEF.
  • Code page type values and the corresponding data sources:
      Type Values         Data Source
      DBZ or DB2zOS       DB2 z/OS
      DB2 or DB2LUW       DB2-LUW
      ORA or ORACLE       Oracle
      SYB or SYBASE       Sybase
      ODBC                ODBC
      IFX or INFORMIX     Informix
      MSS or SQLSERVER    MS SQL Server
      TD or TERADATA      Teradata
      NZ or NETEZZA       Netezza
      ANY                 any DBMS
      NONE                no DBMS
  • When the source of the input data is a DBMS, a DBMS-type code page type value is required. This ensures that the masking module handles the data using DBMS-specific code pages.
  • If the origin of the data is DBMS specific but not tied to any one DBMS, then the value should be specified as ANY.
  • If the origin of the data is a non-DBMS source, then the value should be specified as NONE.
  • Parameters that are specified within the input parameter string and that are used by more than one of the masking service provider-specific grammars include CP (or CODEPAGE), CPT (or CPTYPE), and DLIM (or DISCARDLIMIT).
  • CP specifies the code page of the data for all data-related input. This parameter may be optionally overridden within a FLDDEFn-type parameter when there is a difference in the code pages between fields within the same syntax expression. This allows the masking module to handle data expressed in different code pages between different fields.
  • the default is UTF-8 (Unicode).
  • the parameter takes an integer value that specifies the codepage or character-set identifier.
  • CPT is an optional parameter that specifies the code page type. This code page type applies to all data-related input. This parameter may be optionally overridden within a FLDDEFn-type parameter when there is a difference in the code page types between fields within the same syntax expression. This allows the masking module to handle data expressed in different DBMS-specific code pages.
  • the DLIM parameter specifies the number of failed rows that should be discarded or ignored before a process takes an action.
  • the particular action depends on the specific implementation (e.g. Lua, UDF, etc.).
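  • A discard limit of this kind amounts to a counter with a threshold. The sketch below is hypothetical: the type discard_tracker and the function discard_record_failure are illustrative names, and it assumes the action is triggered once the number of failed rows exceeds the limit, a boundary the text does not state.

```c
/* Hypothetical bookkeeping for a DLIM-style discard limit. */
typedef struct {
    unsigned long limit;    /* value given for DLIM */
    unsigned long failed;   /* failed rows discarded so far */
} discard_tracker;

/* Record one failed row.
 * Returns 1 once the failures exceed the limit (caller should act), else 0. */
int discard_record_failure(discard_tracker *t)
{
    ++t->failed;
    return t->failed > t->limit;
}
```

  • With a limit of 2, the first two failed rows are silently discarded and the third signals the caller, which then takes whatever action its environment defines.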
  • the input to a masking module-based UDF is specified with the following format:
  • OptimMask<ret-type> is the name of the ODPP-type UDFs.
  • <ret-type> is the return data type from the UDF, which is based upon the categorization of data types that are supported within each DBMS.
  • the terms argument-1, . . . argument-n are the input arguments to the UDF. At least one argument is required as the object of the UDF. This argument may be any type of SQL expression supported by the hosting DBMS. In many cases, this will simply be the name of the source column.
  • the string ‘ODPP-provider-input-syntax’ is the syntax expression that is input to the ODPP-specific service provider, for example:
  • OptimMaskStr800Latin is the name of the masking module-based UDF, which can return VARCHAR string of max. 800 characters;
  • CCNCol is the table column-name to be masked;
  • the method of masking is repeatable.
  • Scripts may be used for customized column processing with a database. These scripts may invoke masking module 104 to mask data values.
  • a call to a masking service provider from a Lua script uses the same masking grammar as described above in the context of a UDF. For example, the following Lua code may be used to generate a masked value via the masking service provider for credit card numbers (CCN).
  • VALUE source.field.getvalue(“CreditCardNum”) -- get CreditCardNum field value
  • the environment of the present invention embodiments may include any number of computer or other processing systems (e.g., client or end-user systems, server systems, etc.) and storage systems (e.g., file systems, databases, or other repositories), arranged in any desired fashion, where the present invention embodiments may be applied to any desired type of computing environment (e.g., cloud computing, client-server, network computing, mainframe, stand-alone systems, etc.).
  • the computer or other processing systems employed by the present invention embodiments may be implemented by any number of any personal or other type of computer or processing system (e.g., desktop, laptop, PDA, mobile devices, etc.), and may include any commercially available operating system and any combination of commercially available and custom software (e.g., database software, communications software, etc.).
  • These systems may include any types of monitors and input devices (e.g., keyboard, mouse, voice recognition, touch screen, etc.) to enter and/or view information.
  • the various functions of the computer or other processing systems may be distributed in any manner among any number of software and/or hardware modules or units, processing or computer systems and/or circuitry, where the computer or processing systems may be disposed locally or remotely of each other and communicate via any suitable communications medium (e.g., LAN, WAN, intranet, Internet, hardwire, modem connection, wireless, etc.).
  • the functions of the present invention embodiments may be distributed in any manner among various server systems, end-user/client and/or any other intermediary processing devices including third party client/server processing devices.
  • the software and/or algorithms described above and illustrated in the flow charts may be modified in any manner that accomplishes the functions described herein.
  • the functions in the flow charts or description may be performed in any order that accomplishes a desired operation.
  • Application 102 , masking module 104 , and some or all components thereof may be coupled in any manner (e
  • the communication network may be implemented by any number of any types of communications network (e.g., LAN, WAN, Internet, Intranet, VPN, etc.).
  • the computer or other processing systems of the present invention embodiments may include any conventional or other communications devices to communicate over the network via any conventional or other protocols.
  • the computer or other processing systems may utilize any type of connection (e.g., wired, wireless, etc.) for access to the network.
  • Local communication media may be implemented by any suitable communication media (e.g., local area network (LAN), hardwire, wireless link, Intranet, etc.).
  • the system may employ any number of data storage systems and structures to store information.
  • the data storage systems may be implemented by any number of any conventional or other databases, file systems, caches, repositories, warehouses, etc.
  • the present invention embodiments may employ any number of any type of user interface (e.g., Graphical User Interface (GUI), command-line, prompt, etc.) for obtaining or providing information, where the interface may include any information arranged in any fashion.
  • the interface may include any number of any types of input or actuation mechanisms (e.g., buttons, icons, fields, boxes, links, etc.) disposed at any locations to enter/display information and initiate desired actions via any suitable input devices (e.g., mouse, keyboard, touch screen, pen, etc.).
  • the present invention embodiments are not limited to the specific tasks, algorithms, parameters, data, or network/environment described above, but may be utilized for any type of data object masking.
  • aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Abstract

According to one embodiment of the present invention, a system masks data objects across a plurality of different data resources. The system comprises a processor configured to include a plurality of service providers to mask the data objects, wherein each service provider corresponds to a different type of data masking for the data objects. An interface provides access to the plurality of service providers from different data-consumers to mask the data objects according to the corresponding types of data masking, wherein resulting masked data maintains relational integrity across the different data resources. Embodiments of the present invention further include a method and computer program product for masking data objects across a plurality of different data resources in substantially the same manners described above.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. patent application Ser. No. 14/058,556, entitled “CONSISTENT DATA MASKING” and filed Oct. 21, 2013, the disclosure of which is incorporated herein by reference in its entirety.
  • BACKGROUND
  • 1. Technical Field
  • Present invention embodiments relate to masking data, and more specifically, to masking data objects consistently across a plurality of different data resources to protect privacy.
  • 2. Discussion of the Related Art
  • Data privacy is a concern for enterprises around the world. Collection, disclosure, and protection of consumers' nonpublic personal information or personally identifiable information (e.g., medical history, financial information, etc.) are governed by a range of laws and regulations (e.g., the Gramm-Leach-Bliley Act; the Health Insurance Portability and Accountability Act; the European Union Data Protection Directive; privacy laws in Canada, Japan, and Australia; the Payment Card Industry Data Security Standard; the Interagency Guidelines for Safeguarding Customer Information; Basel II operational controls and Sarbanes-Oxley internal controls; etc.).
  • To address these concerns, data masking capabilities are embedded in most commercially available Extract, Transform, and Load (ETL) and Test Data Management (TDM) products. Some database products and application software (e.g., enterprise resource planning (ERP) applications, customer relationship management (CRM) applications, human capital management (HCM) applications, etc.) also include data masking capabilities. In addition, point solutions have been developed to fill particular needs. Many companies build their own data masking solution to fit their situation if they can find no other appropriate tool.
  • Many large enterprises employ dozens of mission critical software applications, of which some are commercial, off the shelf applications while others are customer-created. These applications may share account information about the company's clients, products, and services, which may be subject to masking. The applications may interact with each other. In addition, an end-user may view the data using more than one of the applications. When the applications are used with a varied set of operating systems and data sources, an enterprise may have to piece together a data masking strategy from various niche and/or custom solutions. These disparate solutions will use different algorithms, resulting in inconsistently masked data.
  • BRIEF SUMMARY
  • According to one embodiment of the present invention, a system masks data objects across a plurality of different data resources. The system comprises a processor configured to include a plurality of service providers to mask the data objects, wherein each service provider corresponds to a different type of data masking for the data objects. An interface provides access to the plurality of service providers from different data-consumers to mask the data objects according to the corresponding types of data masking, wherein resulting masked data maintains relational integrity across the different data resources. Embodiments of the present invention further include a method and computer program product for masking data objects across a plurality of different data resources in substantially the same manners described above.
  • BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
  • Generally, like reference numerals in the various figures are utilized to designate like components.
  • FIG. 1 depicts an example computing environment for an embodiment of the present invention.
  • FIG. 2 depicts a block diagram of a masking module according to an embodiment of the present invention.
  • FIG. 3 depicts a flow diagram illustrating an example manner of masking information using a public interface according to an embodiment of the present invention.
  • FIG. 4 depicts an example form of an input parameter string for a service provider for credit card numbers according to an embodiment of the present invention.
  • DETAILED DESCRIPTION
  • Present invention embodiments relate to masking data objects (e.g., replacing persons' names with fictional names, obscuring all or part of credit card numbers, etc.) consistently across a plurality of data resources to protect privacy. In an example scenario, a large organization may support computing platforms with a variety of operating systems (e.g., AIX, z/OS, Linux, etc.) and data sources (e.g., relational databases based on different relational database management systems (RDBMSs), flat files, spreadsheets, Extensible Markup Language (XML) files, comma separated values (CSV) files, etc.). An embodiment of the present invention allows the organization to mask the data in a manner that preserves relational integrity between data objects in different data sources. For example, persons' names may appear in both a relational database and a CSV file, and the organization may conduct research or test new software using the data masked in such a way that each name is always replaced with the same corresponding fictional name, whether in the database or the CSV file. Different applications, which may interact with each other in an integrated manner, may use the masking services provided by a present invention embodiment and produce consistent results.
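  • The consistency property described above (the same name always maps to the same fictional name, whatever the data source) can be illustrated with a minimal sketch. This is not the masking algorithm of the embodiment: the FNV-1a hash, the small replacement table, and the function name mask_name are assumptions chosen only to show a deterministic, source-independent mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement dictionary; a real provider would load a much
   larger one. */
static const char *const FICTIONAL_NAMES[] = {
    "Avery Stone", "Blake Rivers", "Casey Monroe", "Devon Hart",
    "Ellis Ward", "Finley Brooks", "Harper Quinn", "Jordan Vale"
};
#define NAME_COUNT (sizeof FICTIONAL_NAMES / sizeof FICTIONAL_NAMES[0])

/* 32-bit FNV-1a hash: deterministic, so equal inputs give equal hashes. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Map a real name to a fictional one. The result depends only on the input
   value, not on whether it came from a database row or a CSV record. */
const char *mask_name(const char *real_name)
{
    assert(real_name != NULL);
    return FICTIONAL_NAMES[fnv1a(real_name) % NAME_COUNT];
}
```

  • Because the mapping depends only on the input value, a name read from a relational table and the same name read from a CSV file receive the same replacement. An unkeyed hash over a tiny table is used here only for brevity; it is not suitable for protecting real data.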
  • One aspect of a present invention embodiment is to provide a common set of masking services via a flexible, common interface. A set of masking service providers (also referred to as providers) encapsulate data masking algorithms for particular types of data objects (e.g., national identity number (NID) (e.g., Social Security Number (SSN), Canadian Social Insurance Number (SIN), etc.), credit card number (CCN), names, addresses, etc.) within a uniform application programming interface (API), so that different providers may be used with minimal changes to the software calling the API. The API may be used by applications written in a variety of programming languages (e.g., C, C++, Cobol, etc.). For example, masking may be incorporated via the API into Extract, Transform, and Load (ETL) tools, Hadoop platforms, etc. A masking grammar provides a high-level syntax that enables access to the masking service providers from high level programming and scripting languages (e.g., Perl, Lua, etc.), user-defined functions within a database, dynamic masking clients, etc. Regardless of the manner in which the data masking capabilities provided by a present invention embodiment are used, the same data may be masked identically and consistently.
  • Another aspect of a present invention embodiment is to facilitate user additions to the set of masking service providers. A service provider interface (SPI) allows users to implement their own masking service providers and plug them into the common framework so they may be used in the same manner as other masking service providers.
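  • One common way to realize such a plug-in point in C is a table of function pointers keyed by provider name. The sketch below assumes that design; the types and functions (mask_fn, provider, register_provider, find_provider, mask_all_x) are hypothetical and are not the SPI of the embodiment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider entry point: masks `in` into `out` (capacity
   `out_size`), returning 0 on success and -1 on failure. */
typedef int (*mask_fn)(const char *in, char *out, size_t out_size);

typedef struct {
    const char *name;   /* e.g., "CCN" */
    mask_fn     mask;
} provider;

#define MAX_PROVIDERS 16
static provider registry[MAX_PROVIDERS];
static size_t   registry_count = 0;

/* Add a pre-defined or user-written provider to the common registry. */
int register_provider(const char *name, mask_fn fn)
{
    assert(name != NULL && fn != NULL);
    if (registry_count == MAX_PROVIDERS) return -1;
    registry[registry_count].name = name;
    registry[registry_count].mask = fn;
    registry_count++;
    return 0;
}

/* Look a provider up by name; returns NULL when it is not registered. */
const provider *find_provider(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < registry_count; ++i)
        if (strcmp(registry[i].name, name) == 0) return &registry[i];
    return NULL;
}

/* Example user-written provider: replaces every character with 'X'. */
int mask_all_x(const char *in, char *out, size_t out_size)
{
    size_t n = strlen(in);
    if (n + 1 > out_size) return -1;
    memset(out, 'X', n);
    out[n] = '\0';
    return 0;
}
```

  • A caller that registers mask_all_x under some name can then invoke it through find_provider in exactly the same way as any other registered provider, which is the property the SPI is meant to give user-written providers.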
  • A further aspect of a present invention embodiment is to perform masking within a database server system. A set of user-defined functions (UDFs) and user-defined table functions (UDTFs) are installed and invoked within a database. These functions use the masking grammar to enable use of the masking service providers inside Structured Query Language (SQL) queries. This allows masking to be performed within the database and may be invoked via a database stored procedure to control unit of work commits and rollbacks. For example, a user may make a full copy of a database, and then execute a SQL statement including a user-defined function (UDF) to perform masking in-place on the copy. Alternatively, a user may apply masking using the UDF while creating or copying a table in the database using a SQL statement. Since the UDF is an object in the database, the masking is performed within the database and may consume less time than if the data were extracted from the database, processed by a masking operation, and re-inserted into the database.
  • A still further aspect of a present invention embodiment is to provide dynamic masking (also referred to as “on the fly” masking). For example, a query may be made against a non-masked data source using a client application, and sensitive data in the result set may be masked dynamically based on the security profile of the end-user making the request.
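  • As a minimal sketch of this idea, the function below decides at read time whether to return a credit card number in the clear or with all but its last four digits obscured. The user_profile structure, its single flag, and the choice to keep the last four digits are assumptions made for illustration; the embodiment does not specify them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical security profile: whether the end-user may see clear values. */
typedef struct {
    int may_view_sensitive;
} user_profile;

/* Copy `ccn` to `out`. Unless the user is cleared, replace all but the last
   four digits with '*', leaving separators such as '-' untouched.
   Returns 0 on success, -1 if `out` is too small. */
int render_ccn(const user_profile *user, const char *ccn,
               char *out, size_t out_size)
{
    assert(user != NULL && ccn != NULL && out != NULL);
    size_t len = strlen(ccn);
    if (len + 1 > out_size) return -1;
    memcpy(out, ccn, len + 1);
    if (user->may_view_sensitive) return 0;

    int digits_kept = 0;
    for (size_t i = len; i-- > 0; ) {           /* walk from the right */
        if (out[i] < '0' || out[i] > '9') continue;
        if (digits_kept < 4) digits_kept++;
        else out[i] = '*';
    }
    return 0;
}
```

  • The stored value is never modified; only the copy placed in the result set differs according to the profile of the requesting user.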
  • Yet another aspect of a present invention embodiment is to provide a masking-on-demand application, including a command line interface, that provides convenient masking of common, non-relational file formats (e.g., CSV, XML, etc.) stored within various file systems (e.g., POSIX, Windows, Hadoop, etc.) and relational data sources. A wizard-driven front end places the power of the data masking service providers at the fingertips of the user without the complexity of implementing masking in a formal system (e.g., a test data management system, ETL system, etc.).
  • An example environment for present invention embodiments is illustrated in FIG. 1. Specifically, the environment includes one or more server systems 100, one or more client or end-user systems 110, and one or more data sources 120. Server systems 100 and client systems 110 may be remote from each other and communicate over a network 12.
  • Network 12 may be implemented by any number of any suitable communications media (e.g., wide area network (WAN), local area network (LAN), Internet, intranet, etc.). Alternatively, any number of server systems 100, client systems 110, and data sources 120 may be local to each other, and communicate via any appropriate local communication medium (e.g., local area network (LAN), hardwire, wireless link, intranet, etc.).
  • A server system 100 may include one or more applications 102 and masking module 104. Application 102 uses masking module 104 to mask information from data sources 120. Applications 102 may include user-created applications and/or other applications or utilities (e.g., a test data management suite, masking-on-demand application, user-defined functions, etc.) that use masking module 104 via API 202 (FIG. 2) to mask data from one or more data sources 120. The application and masking module may be implemented across plural server systems. Alternatively, the application and/or masking module may reside on a client system 110 or other computer system in communication with the data sources.
  • Client systems 110 enable users to communicate with the application, masking module, and/or data sources (e.g., via network 12). The client systems may present any graphical user (e.g., GUI, etc.) or other interface (e.g., command line prompts, menu screens, etc.) to receive commands from users and interact with the application, masking module, data sources and/or other modules or services.
  • Data sources 120 (e.g., relational databases, flat files, spreadsheets, comma separated value (CSV) files, etc.) contain information accessed by application 102 including information that may be subject to masking.
  • Server systems 100 and client systems 110 may be implemented by any conventional or other computer systems preferably equipped with a display or monitor, a base (e.g., including at least one processor 20, memories 30 and/or internal or external network interface or communications devices 10 (e.g., modem, network cards, etc.)), optional input devices (e.g., a keyboard, mouse, or other input device), and any commercially available and custom software (e.g., masking module software).
  • The masking module may include one or more modules or units to perform the various functions of present invention embodiments described below (e.g., managing resources, hashing, masking data, etc.), may be implemented by any combination of any quantity of software and/or hardware modules or units, and may reside within memory 30 of a server system and/or client systems for execution by processor 20.
  • A block diagram of masking module 104 according to an embodiment of the present invention is illustrated in FIG. 2. The masking module includes public application programming interface (API) 202, service manager 204, service provider API 206, service providers 210, utilities 212, and operating system (OS) interface 214. The masking module may be implemented in a module framework with layers of functionality in separate libraries loosely coupled by the APIs.
  • Public API 202 is used by application 102 to communicate with the masking module (e.g., to apply masking to data from data source 120). For example, public API 202 may provide a C API comprising externalized functions callable from application 102. In addition, the public API may be used (e.g., via wrappers, mixed-language linking, etc.) by applications built using a variety of other programming languages (e.g., COBOL, C++, etc.). Public API 202 supports a masking provider grammar that allows high level languages and scripting languages (e.g., Lua, Perl, etc.) to gain access to services provided by the masking module.
  • The public API (and the back-end, in general) is independent of the data source. This provides the flexibility to support structured and unstructured data sources without limitation. The calling application is responsible for extracting data from a data source and passing the data to the masking module via the public API. The input and output data structures represent data as rows and columns/fields within the rows. Standard data types are used to represent various types of data (e.g., integer, char, null-terminated strings, date, time, etc.).
  • Service manager 204 manages global resources for masking module 104 and data being transported from public API 202 to individual masking service providers 210. Service provider interface (SPI) 206 is a C interface point to and from each masking service provider 210. Masking service providers 210 may include pre-defined masking service providers (for masking, e.g., a person's age, credit card number (CCN), e-mail address, national identity, city, country, etc.) and user-written masking service providers. User-written service providers may be added into masking module 104 or may reside external to the masking module. In addition, masking module 104 may include utility functions 212 (e.g., hashing functions, table lookup functions, swapping functions, etc.) that are exposed via service provider API 206 for use by pre-defined and/or user-written masking service providers. The masking service providers are data source agnostic and support virtually all data types and character sets (e.g., ASCII, Unicode, Multi-byte, etc.).
  • Operating system (OS) interface 214 handles operating system-specific functions (e.g., input/output, logging, exception handling, etc.) for the masking module for each of the supported environments (e.g., AIX, Linux, Windows, Solaris, Hewlett-Packard UniX (HP-UX), z/OS, etc.). In addition, OS interface 214 may handle operating system-specific functions for applications (e.g., in an embodiment-provided masking on demand application).
  • A manner of interacting with masking module 104 from application 102 according to an embodiment of the present invention is illustrated in FIG. 3. In particular, application 102 makes an initial call to the masking module via a Provider_FrmwInit function of public API 202 at step 301.
  • At step 302, the masking module receives control (e.g., of program execution on processor 20), loads other libraries (e.g., operating system specific libraries), acquires resources (e.g., memory for data to be masked, log file handles, etc.), and initializes itself to provide data masking services for any of the available masking service providers 210.
  • At step 303, the application prepares a data structure for communicating information to the masking module. This structure identifies the specific masking service provider needed by the application and control parameters to drive execution of the masking service provider. The application then calls the masking module via a Provider_Init function of public API 202 to initialize (e.g., load dictionaries, set processing options, etc.) the specified masking service provider.
  • At step 304, the masking module receives control, interprets the input structure, acquires resources, loads a library containing the specified masking service provider, and initializes the service provider for data masking. The masking module returns a token identifier to the application. This token identifier is passed by the application to the masking module in subsequent service calls to identify the specified and initialized masking service provider or masking service provider instance from any other masking service providers that may be operating in the same process.
  • At step 305, the application prepares the input structure with one or more input buffers for the data to be masked and with the token identifier returned from the Provider_Init function call. The masking module may process masking tasks as single entities or in user-defined batch sizes. The application then calls a Provider_Service function of public API 202 to mask the data identified in the one or more input buffers.
  • At step 306, the masking module receives control, interprets the token identifier, interprets the input buffer(s), masks the data, and returns the masked data to the application. The masked data is returned either in the input buffer(s), or optionally, in corresponding output buffer(s).
  • At step 307, the application determines whether more data remains to be masked. If so, processing returns to step 305. Otherwise, at step 308, the application calls a Provider_Term function of the public API (passing the token identifier in the call) to terminate use of the specified masking service provider by that application.
  • At step 309, the masking module receives control, interprets the token identifier, releases resources, and terminates the masking service provider specified by the token identifier for the application.
  • At step 310, the application calls the masking module via a Provider_FrmwTerm function of the public API to allow the masking module framework to be terminated.
  • At step 311, the masking module receives control, releases resources, and terminates the masking module framework environment.
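  • The call sequence of steps 301 through 311 can be sketched with a small stand-in framework. The demo_* functions below are hypothetical and deliberately do not reproduce the signatures of the public API, which are not given here; they only mirror the order of calls and the role of the token identifier. The stand-in masking step (every digit becomes '9') is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_INSTANCES 4

static int framework_ready = 0;
static int instance_in_use[MAX_INSTANCES];

/* Steps 301-302: initialize the framework once per process. */
int demo_frmw_init(void)
{
    memset(instance_in_use, 0, sizeof instance_in_use);
    framework_ready = 1;
    return 0;
}

/* Steps 303-304: initialize a provider instance from a parameter string
   (not interpreted in this sketch) and return its token, or -1 on error. */
int demo_provider_init(const char *params)
{
    assert(params != NULL);
    if (!framework_ready) return -1;
    for (int t = 0; t < MAX_INSTANCES; ++t)
        if (!instance_in_use[t]) { instance_in_use[t] = 1; return t; }
    return -1;
}

static int token_valid(int token)
{
    return framework_ready && token >= 0 && token < MAX_INSTANCES
           && instance_in_use[token];
}

/* Steps 305-306: mask the buffer in place for the instance named by token. */
int demo_provider_service(int token, char *buf)
{
    if (!token_valid(token) || buf == NULL) return -1;
    for (; *buf != '\0'; ++buf)
        if (*buf >= '0' && *buf <= '9') *buf = '9';
    return 0;
}

/* Steps 308-309: release the provider instance identified by the token. */
int demo_provider_term(int token)
{
    if (!token_valid(token)) return -1;
    instance_in_use[token] = 0;
    return 0;
}

/* Steps 310-311: shut the framework down. */
int demo_frmw_term(void)
{
    framework_ready = 0;
    return 0;
}
```

  • An application would call demo_frmw_init once, obtain a token from demo_provider_init, call demo_provider_service in a loop while data remains (step 307), and then call demo_provider_term and demo_frmw_term. Keeping per-instance state behind a token is what lets several initialized providers coexist in one process.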
  • An example using the masking service provider for credit card numbers (CCNs) illustrates the masking grammar. Example keywords and parameters (some common to all masking service providers, some specific to the CCN provider) are described, followed by examples of the use of the CCN provider in a UDF and within a Lua script. An input parameter string contains control information using the masking grammar. An example form of the input parameter string for a CCN service provider according to an embodiment of the present invention is illustrated in FIG. 4. A required parameter named PRO (or PROVIDER) specifies the masking service provider. For example, the term PRO=CCN specifies that the provider for credit card numbers is requested.
  • A required parameter FLDDEFn describes the attributes of a field. The n suffix correlates to the index of the field, argument or field-name specified in the query or expression. For example, FLDDEF1 describes the attributes of the first field, FLDDEF2 describes the attributes of the second field, etc. The FLDDEF parameter includes sub-parameters enclosed within parentheses to separate them from other parameters.
  • In particular, a required FLDDEF sub-parameter NAME specifies the field name. For example: FLDDEF1=(NAME=FIELD1) indicates that field number 1 is named “FIELD1.”
  • A required FLDDEF sub-parameter named DT (or DATATYPE) specifies the data type of the field. Example values, and their characteristics, that may be assigned to the DT sub-parameter include the following:
  • i) CHAR
  • Fixed size character data which is left justified and space padded.
  • ii) DATE
  • The date is contained within three consecutive short integers. The first is a signed short that contains the year, the second is an unsigned short that contains the month and the third is an unsigned short that contains the day. In a C-type structure format the date appears as:
  • typedef struct s_odbc_date
    {
    signed short  Year;
    unsigned short  Month;
    unsigned short  Day;
    } ODPP_ODBC_DATE;
  • iii) DATETIME_CHAR
  • This is fixed size character data containing a date-time value that is left justified and space padded.
  • iv) DATETIME_SZ
  • This is a character data string containing a date-time value that is left justified, space padded and terminated by a NULL character.
  • v) DATETIME_VARCHAR
  • This is a variable size character data starting with a short integer value which indicates the length, in bytes, of the character date-time value that follows.
  • vi) DATETIME_WCHAR
  • This is a fixed size wide-character data containing a date-time value that is left justified and space padded.
  • vii) DATETIME_WSZ
  • This is a wide character data string containing a date-time value that is left justified, space padded and terminated by a NULL character.
  • viii) DATETIME_WVARCHAR
  • This is a variable size wide character data starting with a short integer value which indicates the length, in bytes, of the wide character date-time value that follows.
  • ix) DECIMAL370
  • This is an IBM mainframe 370/MVS/ESA/zOS packed decimal encoded buffer. A packed decimal field has two decimal digits expressed in each byte of storage except the rightmost (last) byte. The last byte holds one decimal digit followed by the sign indicator in its rightmost part. The standard signs used are 0xF for positive numbers and 0xD for negative numbers.
  • x) DOUBLE
  • This is a double precision floating point number. Range of values: 1.7E+/−308 (15 digits).
  • xi) FLOAT
  • This is a floating point number. Range of values: 3.4E+/−38 (7 digits).
  • xii) INTEGER
  • This is a 4-byte signed integer. Range of values: −2,147,483,648 to 2,147,483,647.
  • xiii) LONG_LONG
  • This is an 8-byte signed numeric value. Range of values: −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
  • xiv) ORA_VARNUM
  • This is an Oracle VARNUM-type. It is similar to the Oracle external datatype NUMBER except that the first byte contains the length of the number representation. The length value does not include the length byte itself. The user must reserve 22 bytes to use the longest possible VARNUM, where the 1st byte is the length and bytes 2 through 22 contain the 21-byte binary format of the Oracle NUMBER-type.
  • xv) SMALLINT
  • This is a 2-byte signed integer value. Range of values: −32,768 to 32,767.
  • xvi) TIME
  • The time is contained in three consecutive unsigned shorts. The first contains the hour, the second contains the minute and the third contains the second. In a C-type structure format the time appears as:
  • typedef struct s_odbc_time
    {
    unsigned short  Hour;
    unsigned short  Minute;
    unsigned short  Second;
    } ODPP_ODBC_TIME;
  • xvii) TIMESTAMP
  • The timestamp is contained in a consecutive arrangement of six shorts followed by an unsigned integer. The first is a signed short that contains the year, the second is an unsigned short that contains the month, the third is an unsigned short that contains the day, the fourth is an unsigned short that contains the hour, the fifth is an unsigned short that contains the minute, the sixth is an unsigned short that contains the second, and at the end of this consecutive arrangement is an unsigned integer that contains the fractional second. In a C-type structure format the timestamp appears as:
  • typedef struct s_odbc_timestamp
    {
    signed short  Year;
    unsigned short  Month;
    unsigned short  Day;
    unsigned short  Hour;
    unsigned short  Minute;
    unsigned short  Second;
    unsigned int  Fraction;
    } ODPP_ODBC_TIMESTAMP;
  • xviii) U_INTEGER
  • This is a 4-byte unsigned integer value. Range of values: 0 to 4,294,967,295.
  • xix) U_LONG_LONG
  • This is an 8-byte unsigned numeric value. Range of values: 0 to 18,446,744,073,709,551,615.
  • xx) U_SMALLINT
  • This is a 2-byte unsigned integer value. Range of values: 0 to 65,535.
  • xxi) U_TINYINT
  • This is a single-byte unsigned integer value. Range of values: 0 to 255.
  • xxii) VARCHAR
  • This indicates character data starting with a short integer value which indicates the length, in bytes, of the character data to follow.
  • xxiii) VARCHAR_SZ
  • This indicates a character data string that is terminated by a NULL character.
  • xxiv) WCHAR
  • This is fixed-size wide character data that is left-justified and space-padded.
  • xxv) WVARCHAR
  • This is wide character data starting with a short integer value which indicates the length, in bytes, of the wide character data to follow.
  • xxvi) WVARCHAR_SZ
  • This is a wide character data string that is terminated by a NULL character.
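  • To make a few of the binary layouts above concrete, the following is a minimal Python sketch and not part of the described masking module. It assumes little-endian byte order and no structure padding for the TIMESTAMP layout, a 2-byte little-endian length prefix for VARCHAR, and the sign values stated above for packed decimal. All function names are hypothetical.

```python
import struct


def decode_packed_decimal(buf: bytes) -> int:
    """Decode a packed decimal buffer: two digits per byte, sign in the last half-byte."""
    if not buf:
        raise ValueError("empty packed decimal buffer")
    digits = []
    for byte in buf[:-1]:
        digits.append(byte >> 4)      # left digit of the byte
        digits.append(byte & 0x0F)    # right digit of the byte
    digits.append(buf[-1] >> 4)       # the last byte carries a single digit ...
    sign = buf[-1] & 0x0F             # ... followed by the sign indicator
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit in packed decimal buffer")
    value = int("".join(str(d) for d in digits))
    # Assumption: 0xD marks a negative number; any other sign value is treated as positive.
    return -value if sign == 0x0D else value


def unpack_timestamp(buf: bytes) -> dict:
    """Unpack the TIMESTAMP layout: one signed short, five unsigned shorts, one unsigned int."""
    names = ("Year", "Month", "Day", "Hour", "Minute", "Second", "Fraction")
    return dict(zip(names, struct.unpack("<h5HI", buf)))


def read_varchar(buf: bytes, encoding: str = "utf-8") -> str:
    """Read VARCHAR data: a short length prefix followed by that many bytes of text."""
    (length,) = struct.unpack_from("<H", buf, 0)
    return buf[2:2 + length].decode(encoding)
```

  • For instance, the three bytes 0x12 0x34 0x5D decode to −12345 under these assumptions, because the final half-byte 0xD marks the value as negative.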
  • For example, the expression FLDDEF1=(NAME=FIELD1, DT=WCHAR) specifies that field number 1 is named “FIELD1” and has a data type of WCHAR.
  • In some cases, e.g., for some UDFs, some information is not needed because it can be determined within the UDF.
  • An optional FLDDEF sub-parameter named LEN (or LENGTH) specifies the length of a character field as an integer value. This parameter is required only when this information is not available within the environment in which the masking module is executing, and is used only with character data types (e.g., CHAR, VARCHAR, VARCHAR_SZ, WCHAR, WVARCHAR, WVARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, DATETIME_VARCHAR, DATETIME_WCHAR, DATETIME_WSZ, DATETIME_WVARCHAR). For example, the expression FLDDEF1=(LEN=10, NAME=FIELD1, DT=WVARCHAR) specifies that field number 1 has data type WVARCHAR and is ten characters long.
  • An optional FLDDEF sub-parameter named PRE (or PRECISION) specifies the precision of a numeric field. This parameter is required only when this information is not available within the masking module executing environment. The value of this sub-parameter is an integer that specifies the precision of the field. For example, the expression FLDDEF1=(PRE=5, NAME=FIELD2, DT=DOUBLE) indicates that the field named FIELD2 has a precision of five digits.
  • An optional FLDDEF sub-parameter named SCA (or SCALE) specifies the scale of a numeric field. This parameter is required only when this information is not available within the masking module executing environment. Its value is a short integer that specifies the scale of the field. For example, the expression FLDDEF1=(PRE=5, SCA=2, NAME=FIELD2, DT=DOUBLE) indicates that the field named FIELD2 has a precision of five and a scale of two.
  • An optional FLDDEF sub-parameter named CP (or CODEPAGE) specifies the code page of the data. This parameter is required only when: the type of data is CHAR, VARCHAR, VARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, or DATETIME_VARCHAR; and this information is not available within the masking module executing environment, the CP/CODEPAGE parameter was not specified outside of the FLDDEF, or the code page of the data for the subject FLDDEF is different than the CP/CODEPAGE specified outside of the FLDDEF. This parameter takes an integer value that specifies the codepage or character-set identifier. For example, FLDDEF1=(CP=1252, NAME=FIELD3, DT=CHAR) specifies code page 1252.
  • An optional FLDDEF sub-parameter named CPT (or CPTYPE) specifies the code page type. This parameter is required only when: the type of data is CHAR, VARCHAR, VARCHAR_SZ, DATETIME_CHAR, DATETIME_SZ, or DATETIME_VARCHAR; the CP/CODEPAGE sub-parameter is specified; and this information is not available within the ODPP executing environment, the CPT/CPTYPE parameter was not specified outside of the FLDDEF, or the source of the data for the subject FLDDEF is different than the CPT/CPTYPE specified outside of the FLDDEF.
  • The following Table 1 identifies the code page type abbreviations based upon the data source:
  • TABLE 1
    Code page type abbreviations
    Type Values          Data Source
    DBZ or DB2zOS        DB2 z/OS
    DB2 or DB2LUW        DB2-LUW
    ORA or ORACLE        Oracle
    SYB or SYBASE        Sybase
    ODBC                 ODBC
    IFX or INFORMIX      Informix
    MSS or SQLSERVER     MS SQL Server
    TD or TERADATA       Teradata
    NZ or NETEZZA        Netezza
    ANY                  any DBMS
    NONE                 no DBMS
  • In many cases, the source of the input data is a DBMS in which case a DBMS-type code page type value is required. This ensures that the masking module handles the data using DBMS-specific code pages. When the origin of the data is DBMS specific but not tied to any one DBMS, then the value should be specified as ANY. When the origin of the data is from a non-DBMS source, then the value should be specified as NONE.
  • An example expression using CP and CPT is the following: FLDDEF1=(CP=943, CPT=DB2, NAME=FIELD4, DT=VARCHAR). This expression specifies that the code page for the field is 943 and that the code page type is DB2.
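  • As an illustration of how a FLDDEF expression of the form described above might be read, the following Python sketch parses one expression, maps the long sub-parameter names to their short forms, and checks CPT against the abbreviations of Table 1. It is a sketch under stated assumptions rather than the masking module's parser: the function name, the INDEX key, the case-insensitive matching, and the simple comma splitting are all assumptions.

```python
import re

# Long sub-parameter names mapped to the short names used in the text.
ALIASES = {"LENGTH": "LEN", "PRECISION": "PRE", "SCALE": "SCA",
           "CODEPAGE": "CP", "CPTYPE": "CPT"}

# Table 1: each accepted abbreviation (upper-cased) mapped to its data source.
CODE_PAGE_TYPES = {
    "DBZ": "DB2 z/OS", "DB2ZOS": "DB2 z/OS",
    "DB2": "DB2-LUW", "DB2LUW": "DB2-LUW",
    "ORA": "Oracle", "ORACLE": "Oracle",
    "SYB": "Sybase", "SYBASE": "Sybase",
    "ODBC": "ODBC",
    "IFX": "Informix", "INFORMIX": "Informix",
    "MSS": "MS SQL Server", "SQLSERVER": "MS SQL Server",
    "TD": "Teradata", "TERADATA": "Teradata",
    "NZ": "Netezza", "NETEZZA": "Netezza",
    "ANY": "any DBMS", "NONE": "no DBMS",
}

FLDDEF_RE = re.compile(r"^\s*FLDDEF(\d+)\s*=\s*\((.*)\)\s*$", re.IGNORECASE)


def parse_flddef(expr: str) -> dict:
    """Parse one FLDDEFn=(KEY=VALUE, ...) expression into a dictionary."""
    match = FLDDEF_RE.match(expr)
    if match is None:
        raise ValueError(f"not a FLDDEF expression: {expr!r}")
    field = {"INDEX": int(match.group(1))}   # hypothetical key holding the n of FLDDEFn
    for pair in match.group(2).split(","):
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in sub-parameter: {pair!r}")
        key = key.strip().upper()
        key = ALIASES.get(key, key)          # LENGTH -> LEN, CPTYPE -> CPT, ...
        value = value.strip()
        if key in ("LEN", "PRE", "SCA", "CP"):
            field[key] = int(value)          # these sub-parameters take integer values
        elif key == "CPT":
            if value.upper() not in CODE_PAGE_TYPES:
                raise ValueError(f"unknown code page type: {value!r}")
            field[key] = value.upper()
        else:
            field[key] = value               # NAME, DT, and anything else stay as text
    return field
```

  • Under these assumptions, parsing FLDDEF1=(LENGTH=10, NAME=FIELD1, DT=WVARCHAR) yields a LEN of 10, a NAME of FIELD1, and a DT of WVARCHAR.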
  • Parameters that are specified within the input parameter string and that are used by more than one of the masking service provider specific grammars include CP (or CODEPAGE), CPT (or CPTYPE), and DLIM (or DISCARDLIMIT).
  • CP specifies the code page of the data for all data-related input. This parameter may be optionally overridden within a FLDDEFn-type parameter when there is a difference in the code pages between fields within the same syntax expression. This allows the masking module to handle data expressed in different code pages between different fields. The default is UTF-8 (Unicode). The parameter takes an integer value that specifies the codepage or character-set identifier.
  • CPT is an optional parameter that specifies the code page type. This code page type applies to all data-related input. This parameter may be optionally overridden within a FLDDEFn-type parameter when there is a difference in the code page types between fields within the same syntax expression. This allows the masking module to handle data expressed in different DBMS-specific code pages.
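  • The override rule for CP and CPT can be sketched as a simple lookup in which a value given inside a FLDDEFn parameter takes precedence over the value given for the whole expression. This is an illustrative Python sketch only. The function name and the dictionary representation are assumptions, and because the text gives the default code page as UTF-8 without a numeric identifier, the sketch uses the string "UTF-8" as a placeholder.

```python
DEFAULT_CP = "UTF-8"  # placeholder: the text names UTF-8 but gives no numeric identifier


def effective_code_page(global_params: dict, field_params: dict) -> tuple:
    """Return the (CP, CPT) pair that applies to one field.

    A value inside the field's FLDDEF wins over the expression-wide value.
    CP falls back to the UTF-8 default; CPT is optional and may be None.
    """
    cp = field_params.get("CP", global_params.get("CP", DEFAULT_CP))
    cpt = field_params.get("CPT", global_params.get("CPT"))
    return cp, cpt
```

  • With an expression-wide CP of 1252 and CPT of ORA, a field whose FLDDEF specifies CP=943 resolves to code page 943 while keeping the ORA code page type.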
  • The DLIM parameter specifies the number of failed rows that should be discarded or ignored before a process takes an action. The particular action depends on the specific implementation (e.g., Lua, UDF, etc.). For example, the expression DLIM=10 specifies that up to ten failed rows are to be discarded before that action is taken.
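  • One possible reading of the discard limit is sketched below in Python. The text leaves the resulting action to the implementation, so this sketch makes two assumptions: a failed row is signalled by a ValueError from the masking callable, and the action taken once more than DLIM rows have failed is to raise an exception. The names mask_rows, DiscardLimitExceeded, and demo_mask are hypothetical.

```python
class DiscardLimitExceeded(Exception):
    """Raised when more rows have failed than the discard limit allows."""


def mask_rows(rows, mask, dlim):
    """Mask each row, discarding failed rows until the discard limit is exceeded."""
    masked, discarded = [], 0
    for row in rows:
        try:
            masked.append(mask(row))
        except ValueError:
            discarded += 1
            if discarded > dlim:
                # Assumed action: stop processing and report the failure.
                raise DiscardLimitExceeded(
                    f"{discarded} rows failed, limit is {dlim}")
    return masked, discarded


def demo_mask(value: str) -> str:
    """Toy masking function for the usage example: accepts digit strings only."""
    if not value.isdigit():
        raise ValueError(f"not a digit string: {value!r}")
    return "X" * len(value)
```

  • With a limit of 1, the rows "12", "ab", "34" produce two masked values and one discarded row; with a limit of 0, the same input triggers the assumed action at the first failure.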
  • The input to a masking module-based UDF is specified with the following format:
  • OptimMask<ret-type> ( argument-1 , ... argument-n , ‘ODPP-provider-input-syntax’ ).
  • The term OptimMask<ret-type> is the name of the ODPP-type UDFs. <ret-type> is the return data type from the UDF which is based upon the categorization of data types that are supported within each DBMS. The terms argument-1, . . . argument-n are the input arguments to the UDF. At least one argument is required as the object of the UDF. This argument may be any type of SQL expression supported by the hosting DBMS. In many cases, this will simply be the name of the source column. The string ‘ODPP-provider-input-syntax’ is the syntax expression that is input to the ODPP-specific service provider, for example:
  • SELECT CCNCol, OptimMaskStr800Latin(CCNCol,
     'pro=ccn, mtd=repeatable, flddef1=(name=CCNvc, dt=char)')
     AS MaskedCCN FROM TestTable;
  • In the above example, OptimMaskStr800Latin is the name of the masking module-based UDF, which can return a VARCHAR string of at most 800 characters; CCNCol is the name of the table column to be masked; and ‘pro=ccn, mtd=repeatable, flddef1=(name=CCNvc, dt=char)’ is the ODPP syntax that requests the masking module CCN service provider. The method of masking is repeatable.
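  • The provider input syntax in this example is a comma-separated list of key=value pairs in which a value may itself be a parenthesized list. The following Python sketch shows one way such a string could be split without breaking the nested FLDDEF group. It is not the masking module's parser: the function names, the lower-casing of keys, and the nested-dictionary result are assumptions made for illustration.

```python
def split_top_level(text: str) -> list:
    """Split on commas that are not enclosed in parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    if depth != 0:
        raise ValueError("unbalanced '('")
    parts.append(text[start:])
    return [p.strip() for p in parts if p.strip()]


def parse_provider_syntax(text: str) -> dict:
    """Parse 'key=value, key=(key=value, ...)' into a (possibly nested) dictionary."""
    params = {}
    for part in split_top_level(text):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in parameter: {part!r}")
        key, value = key.strip().lower(), value.strip()
        if value.startswith("(") and value.endswith(")"):
            value = parse_provider_syntax(value[1:-1])   # nested group such as flddef1
        params[key] = value
    return params
```

  • Applied to the syntax string of the example, this yields pro=ccn and mtd=repeatable at the top level and a nested flddef1 entry holding name=CCNvc and dt=char.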
  • Scripts (e.g., Lua scripts) may be used for customized column processing with a database. These scripts may invoke masking module 104 to mask data values. A call to a masking service provider from a Lua script uses the same masking grammar as described above in the context of a UDF. For example, the following Lua code may be used to generate a masked value via the masking service provider for credit card numbers (CCN).
  • VALUE = source.field.getvalue("CreditCardNum")
        -- get CreditCardNum field value
    MASK_VALUE = OptimMaskStr800Latin(VALUE,
       'pro=ccn, mtd=repeatable, ' ..
       'flddef1=(name=CCNvc,dt=char)')
        -- request the CCN provider with repeatable masking
    target.field.setvalue(MASK_VALUE)
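  • Both the UDF and the Lua examples request repeatable masking, meaning that the same input value always produces the same masked value, which is what allows masked data to stay consistent across data resources. The text does not disclose how the CCN service provider achieves this, so the following Python sketch substitutes a generic keyed-hash technique (HMAC-SHA256) purely to illustrate the property. The function name and the key handling are hypothetical, and the sketch does not preserve card-number validity such as check digits.

```python
import hashlib
import hmac


def mask_digits_repeatable(value: str, key: bytes) -> str:
    """Replace each ASCII digit with a digit derived from a keyed hash of the whole value.

    The same value and key always give the same result; non-digit
    characters such as separators are kept in place.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out, used = [], 0
    for ch in value:
        if ch in "0123456789":
            # Take the next digest byte (cycling if needed) and reduce it to one digit.
            out.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)
```

  • Because the result depends only on the input value and the key, two tables that store the same card number receive the same masked number when masked with the same key, so joins on that column still match after masking.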
  • It will be appreciated that the embodiments described above and illustrated in the drawings represent only a few of the many ways of implementing embodiments for masking data objects consistently across a plurality of different data resources to protect privacy.
  • The environment of the present invention embodiments may include any number of computer or other processing systems (e.g., client or end-user systems, server systems, etc.) and storage systems (e.g., file systems, databases, or other repositories), arranged in any desired fashion, where the present invention embodiments may be applied to any desired type of computing environment (e.g., cloud computing, client-server, network computing, mainframe, stand-alone systems, etc.). The computer or other processing systems employed by the present invention embodiments may be implemented by any number of any personal or other type of computer or processing system (e.g., desktop, laptop, PDA, mobile devices, etc.), and may include any commercially available operating system and any combination of commercially available and custom software (e.g., database software, communications software, etc.). These systems may include any types of monitors and input devices (e.g., keyboard, mouse, voice recognition, touch screen, etc.) to enter and/or view information.
  • The various functions of the computer or other processing systems may be distributed in any manner among any number of software and/or hardware modules or units, processing or computer systems and/or circuitry, where the computer or processing systems may be disposed locally or remotely of each other and communicate via any suitable communications medium (e.g., LAN, WAN, intranet, Internet, hardwire, modem connection, wireless, etc.). For example, the functions of the present invention embodiments may be distributed in any manner among various server systems, end-user/client and/or any other intermediary processing devices including third party client/server processing devices. The software and/or algorithms described above and illustrated in the flow charts may be modified in any manner that accomplishes the functions described herein. In addition, the functions in the flow charts or description may be performed in any order that accomplishes a desired operation. Application 102, masking module 104, and some or all components thereof may be coupled in any manner (e.g., statically linked, dynamically linked, inline, within the same process or separate processes, within the same or separate processors, etc.).
  • The communication network may be implemented by any number of any types of communications network (e.g., LAN, WAN, Internet, Intranet, VPN, etc.). The computer or other processing systems of the present invention embodiments may include any conventional or other communications devices to communicate over the network via any conventional or other protocols. The computer or other processing systems may utilize any type of connection (e.g., wired, wireless, etc.) for access to the network. Local communication media may be implemented by any suitable communication media (e.g., local area network (LAN), hardwire, wireless link, Intranet, etc.).
  • The system may employ any number of data storage systems and structures to store information. The data storage systems may be implemented by any number of any conventional or other databases, file systems, caches, repositories, warehouses, etc.
  • The present invention embodiments may employ any number of any type of user interface (e.g., Graphical User Interface (GUI), command-line, prompt, etc.) for obtaining or providing information, where the interface may include any information arranged in any fashion. The interface may include any number of any types of input or actuation mechanisms (e.g., buttons, icons, fields, boxes, links, etc.) disposed at any locations to enter/display information and initiate desired actions via any suitable input devices (e.g., mouse, keyboard, touch screen, pen, etc.).
  • It is to be understood that the software of the present invention embodiments could be developed by one of ordinary skill in the computer arts based on the functional descriptions contained in the specification and flow charts illustrated in the drawings. Further, any references herein of software performing various functions generally refer to computer systems or processors performing those functions under software control. The computer systems of the present invention embodiments may alternatively be implemented by any type of hardware and/or other processing circuitry.
  • The present invention embodiments are not limited to the specific tasks, algorithms, parameters, data, or network/environment described above, but may be utilized for any type of data object masking.
  • The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises”, “comprising”, “includes”, “including”, “has”, “have”, “having”, “with” and the like, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
  • The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
  • The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
  • As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
  • Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • Aspects of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (7)

What is claimed is:
1. A computer implemented method of masking data objects across a plurality of different data resources comprising:
accessing a plurality of service providers to mask the data objects, wherein each service provider corresponds to a different type of data masking for the data objects, via an interface that provides access to the plurality of service providers from different data-consumers to mask the data objects according to the corresponding types of data masking, wherein resulting masked data maintains relational integrity across the different data resources.
2. The computer implemented method of claim 14, wherein the data-consumers include at least one of user generated applications and user defined functions.
3. The computer implemented method of claim 14, wherein the interface includes a grammar to enable data-consumers implemented by at least one of computing and scripting languages to access the plurality of service providers to mask the data objects.
4. The computer implemented method of claim 14, wherein at least one service provider includes a user generated service provider performing a corresponding type of data masking, and the interface includes:
a provider interface to enable the data-consumers to access at least one user generated service provider.
5. The computer implemented method of claim 14, wherein the interface includes an Application Programming Interface (API).
6. The computer implemented method of claim 14, wherein each data resource includes one of a data source, a data application, and an operating platform.
7. The computer implemented method of claim 14, wherein the data-consumers are associated with different contexts of data and applications and the data masking of the service providers is applied to the different contexts.
US14/298,058 2013-10-21 2014-06-06 Consistent data masking Abandoned US20150113659A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/298,058 US20150113659A1 (en) 2013-10-21 2014-06-06 Consistent data masking

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/058,556 US9621680B2 (en) 2013-10-21 2013-10-21 Consistent data masking
US14/298,058 US20150113659A1 (en) 2013-10-21 2014-06-06 Consistent data masking

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/058,556 Continuation US9621680B2 (en) 2013-10-21 2013-10-21 Consistent data masking

Publications (1)

Publication Number Publication Date
US20150113659A1 true US20150113659A1 (en) 2015-04-23

Family

ID=52827425

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/058,556 Expired - Fee Related US9621680B2 (en) 2013-10-21 2013-10-21 Consistent data masking
US14/298,058 Abandoned US20150113659A1 (en) 2013-10-21 2014-06-06 Consistent data masking

Country Status (1)

Country Link
US (2) US9621680B2 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965648B1 (en) 2017-04-06 2018-05-08 International Business Machines Corporation Automatic masking of sensitive data
US10032043B2 (en) * 2015-06-29 2018-07-24 International Business Machines Corporation Masking sensitive data in mobile applications
US10417435B2 (en) * 2015-12-01 2019-09-17 Oracle International Corporation Replacing a token with a mask value for display at an interface
US10460129B2 (en) 2017-01-12 2019-10-29 Ca, Inc. System and method for managing cooperative synthetic identities for privacy protection through identity obfuscation and synthesis
US20200065410A1 (en) * 2018-08-23 2020-02-27 Mastercard International Incorporated Systems and methods for validating database integrity
US20200082010A1 (en) * 2018-09-06 2020-03-12 International Business Machines Corporation Redirecting query to view masked data via federation table
US10592693B2 (en) 2017-01-12 2020-03-17 Ca, Inc. System and method for analyzing cooperative synthetic identities
US10867063B1 (en) * 2019-11-27 2020-12-15 Snowflake Inc. Dynamic shared data object masking
US11184333B2 (en) * 2016-12-05 2021-11-23 Intecrowd, LLC Human capital management data transfer systems
US20220164477A1 (en) * 2020-11-20 2022-05-26 Paypal, Inc. Detecting leakage of personal information in computing code configurations
US11368466B2 (en) 2019-09-18 2022-06-21 David Michael Vigna Data classification of columns for web reports and widgets
US11451371B2 (en) * 2019-10-30 2022-09-20 Dell Products L.P. Data masking framework for information processing system
EP4198785A4 (en) * 2020-09-01 2024-02-07 Huawei Tech Co Ltd Data masking method, data masking apparatus and storage device

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10242000B2 (en) 2016-05-27 2019-03-26 International Business Machines Corporation Consistent utility-preserving masking of a dataset in a distributed environment
CN107992937B (en) * 2016-10-26 2021-12-03 北京大学深圳研究生院 Unstructured data judgment method and device based on deep learning
US10842897B2 (en) 2017-01-20 2020-11-24 Éclair Medical Systems, Inc. Disinfecting articles with ozone
CN109460676A (en) * 2018-10-30 2019-03-12 全球能源互联网研究院有限公司 A kind of desensitization method of blended data, desensitization device and desensitization equipment
US11227065B2 (en) * 2018-11-06 2022-01-18 Microsoft Technology Licensing, Llc Static data masking
US11200218B2 (en) 2019-04-17 2021-12-14 International Business Machines Corporation Providing consistent data masking using causal ordering
US20200356559A1 (en) 2019-05-08 2020-11-12 Datameer, Inc. Query Combination In A Hybrid Multi-Cloud Database Environment
US20210382986A1 (en) * 2020-06-03 2021-12-09 ArecaBay, Inc. Dynamic, Runtime Application Programming Interface Parameter Labeling, Flow Parameter Tracking and Security Policy Enforcement
CN113158233B (en) * 2021-03-29 2023-06-27 重庆首亨软件股份有限公司 Data preprocessing method and device and computer storage medium
US11921868B2 (en) 2021-10-04 2024-03-05 Bank Of America Corporation Data access control for user devices using a blockchain

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070110224A1 (en) * 2005-11-14 2007-05-17 Accenture Global Services Gmbh Data masking application
US20090100527A1 (en) * 2007-10-10 2009-04-16 Adrian Michael Booth Real-time enterprise data masking
US20090204631A1 (en) * 2008-02-13 2009-08-13 Camouflage Software, Inc. Method and System for Masking Data in a Consistent Manner Across Multiple Data Sources
US20100131518A1 (en) * 2008-11-25 2010-05-27 Safenet, Inc. Database Obfuscation System and Method
US20110321120A1 (en) * 2010-06-24 2011-12-29 Infosys Technologies Limited Method and system for providing masking services
US20120096567A1 (en) * 2008-05-29 2012-04-19 James Michael Ferris Systems and methods for management of secure data in cloud-based network
US20140123303A1 (en) * 2012-10-31 2014-05-01 Tata Consultancy Services Limited Dynamic data masking
US20140172806A1 (en) * 2012-12-19 2014-06-19 Salesforce.Com, Inc. Systems, methods, and apparatuses for implementing data masking via compression dictionaries
US20150067886A1 (en) * 2012-02-21 2015-03-05 Green Sql Ltd Dynamic data masking system and method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6157955A (en) 1998-06-15 2000-12-05 Intel Corporation Packet processing system including a policy engine having a classification unit
US20060074897A1 (en) 2004-10-04 2006-04-06 Fergusson Iain W System and method for dynamic data masking
US20070016637A1 (en) 2005-07-18 2007-01-18 Brawn John M Bitmap network masks
US7974942B2 (en) 2006-09-08 2011-07-05 Camouflage Software Inc. Data masking system and method
US7917770B2 (en) 2006-10-10 2011-03-29 Infosys Technologies Ltd. Configurable data masking for software testing
FR2924880A1 (en) 2007-12-07 2009-06-12 France Telecom METHOD AND SYSTEM FOR TRANSFERRING OBJECTS


Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10032043B2 (en) * 2015-06-29 2018-07-24 International Business Machines Corporation Masking sensitive data in mobile applications
US10417435B2 (en) * 2015-12-01 2019-09-17 Oracle International Corporation Replacing a token with a mask value for display at an interface
US11184333B2 (en) * 2016-12-05 2021-11-23 Intecrowd, LLC Human capital management data transfer systems
US10460129B2 (en) 2017-01-12 2019-10-29 Ca, Inc. System and method for managing cooperative synthetic identities for privacy protection through identity obfuscation and synthesis
US10592693B2 (en) 2017-01-12 2020-03-17 Ca, Inc. System and method for analyzing cooperative synthetic identities
US9965648B1 (en) 2017-04-06 2018-05-08 International Business Machines Corporation Automatic masking of sensitive data
US10489608B2 (en) 2017-04-06 2019-11-26 International Business Machines Corporation Automatic masking of sensitive data
US11074366B2 (en) 2017-04-06 2021-07-27 International Business Machines Corporation Automatic masking of sensitive data
US20200065410A1 (en) * 2018-08-23 2020-02-27 Mastercard International Incorporated Systems and methods for validating database integrity
US11100085B2 (en) * 2018-08-23 2021-08-24 Mastercard International Incorporated Systems and methods for validating database integrity
US20200082010A1 (en) * 2018-09-06 2020-03-12 International Business Machines Corporation Redirecting query to view masked data via federation table
US11030212B2 (en) * 2018-09-06 2021-06-08 International Business Machines Corporation Redirecting query to view masked data via federation table
US11368466B2 (en) 2019-09-18 2022-06-21 David Michael Vigna Data classification of columns for web reports and widgets
US11451371B2 (en) * 2019-10-30 2022-09-20 Dell Products L.P. Data masking framework for information processing system
US11055430B2 (en) * 2019-11-27 2021-07-06 Snowflake Inc. Dynamic shared data object masking
CN113261000A (en) * 2019-11-27 2021-08-13 斯诺弗雷克公司 Dynamic shared data object masking
US10867063B1 (en) * 2019-11-27 2020-12-15 Snowflake Inc. Dynamic shared data object masking
US11574072B2 (en) * 2019-11-27 2023-02-07 Snowflake Inc. Dynamic shared data object masking
EP4198785A4 (en) * 2020-09-01 2024-02-07 Huawei Tech Co Ltd Data masking method, data masking apparatus and storage device
US20220164477A1 (en) * 2020-11-20 2022-05-26 Paypal, Inc. Detecting leakage of personal information in computing code configurations
US11755776B2 (en) * 2020-11-20 2023-09-12 Paypal, Inc. Detecting leakage of personal information in computing code configurations

Also Published As

Publication number Publication date
US9621680B2 (en) 2017-04-11
US20150113656A1 (en) 2015-04-23

Similar Documents

Publication Publication Date Title
US9621680B2 (en) Consistent data masking
US8352478B2 (en) Master data framework
US8626573B2 (en) System and method of integrating enterprise applications
US9348870B2 (en) Searching content managed by a search engine using relational database type queries
US20190034476A1 (en) Converting a language type of a query
US10423396B1 (en) Transforming non-apex code to apex code
US8180758B1 (en) Data management system utilizing predicate logic
KR101099152B1 (en) Automatic task generator method and system
US7689580B2 (en) Search based application development framework
US20090049422A1 (en) Method and system for modeling and developing a software application
GB2519779A (en) Triplestore replicator
US20110252049A1 (en) Function execution using sql
US6421666B1 (en) Mechanism for sharing ancillary data between a family of related functions
EP4022427A1 (en) Generating software artifacts from a conceptual data model
US8005785B2 (en) Apparatus and method for routing composite objects to a report server
CN110109983B (en) Method and device for operating Redis database
US20050289115A1 (en) Integrating best practices into database design
US9026561B2 (en) Automated report of broken relationships between tables
US11176314B2 (en) XML schema description code generator
CN112182080A (en) Data integration system and data processing method based on data integration system
Purba An Approach for Establishing Enterprise Data Standards
Vajantri et al. An Apache Calcite-based polystore variation for federated querying of heterogeneous healthcare sources
Baklarz et al. DB2 Universal Database V8 for Linux, UNIX, and Windows Database Administration Certification Guide
Batra et al. Embedded databases & Berkeley
Bannert timeseriesdb: Manage and Archive Time Series Data in Establishment Statistics with R and PostgreSQL

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:D'COSTA, NOEL H. E.;HAGELUND, PETER;HENDERSON, DAVID J.;AND OTHERS;SIGNING DATES FROM 20131016 TO 20131018;REEL/FRAME:033050/0821

AS Assignment

Owner name: GLOBALFOUNDRIES U.S. 2 LLC, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:036550/0001

Effective date: 20150629

AS Assignment

Owner name: GLOBALFOUNDRIES INC., CAYMAN ISLANDS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GLOBALFOUNDRIES U.S. 2 LLC;GLOBALFOUNDRIES U.S. INC.;REEL/FRAME:036779/0001

Effective date: 20150910

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: GLOBALFOUNDRIES U.S. INC., NEW YORK

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WILMINGTON TRUST, NATIONAL ASSOCIATION;REEL/FRAME:056987/0001

Effective date: 20201117