When planning for new or revised data platforms, or migrating data from on-premise to cloud platforms, data controls should be considered in addition to core business requirements. The following are key areas to consider:
- New regulations and standards regarding data security and privacy
- Digital rights management for data
- Metadata regarding data quality.
Considering these areas during data design allows appropriate processes and controls to be incorporated into the data architecture. Data privacy, digital rights, and data quality management will become increasingly complex over time, and so designing and implementing methods into the data architecture to manage those aspects will yield an enormous return on investment. Such investment in design is worthwhile whether your platform is cloud-based or on-premise.
Planning for security and privacy in data organization and management
Regulations and standards around data security and data privacy should be considered during the requirements phase when designing data platforms across the enterprise. Significant regulations have been introduced in the European Union and California, with other jurisdictions expected to follow suit.
In the EU, the General Data Protection Regulation (GDPR) defines “rules relating to the protection of natural persons with regard to the processing of personal data and rules relating to the free movement of personal data,” and it applies to the personal data of EU residents regardless of where the processing takes place. It directs how consumer data is to be managed, retained, and disclosed to third parties, among other aspects, and non-compliance carries stiff financial penalties, in addition to reputational risk and the risk of actual harm. Ensuring that your design allows for automated compliance will place your organization at a material advantage over those relying on manual effort.
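As a simple illustration of what automated compliance can look like, the sketch below flags records that have exceeded a retention window, based on PII and retention metadata held against a dataset in a catalog. The schema, field names, and retention period are hypothetical assumptions for illustration, not a prescribed GDPR implementation.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical catalog entry: a dataset tagged with PII and retention metadata.
@dataclass
class DatasetPolicy:
    name: str
    contains_pii: bool
    retention_days: int          # how long records may be kept
    last_consent_review: date    # when the lawful basis was last confirmed

def records_due_for_erasure(policy: DatasetPolicy, record_dates: list[date],
                            today: date) -> list[date]:
    """Return record dates that have exceeded the retention window."""
    if not policy.contains_pii:
        return []
    cutoff = today - timedelta(days=policy.retention_days)
    return [d for d in record_dates if d < cutoff]

# Example: flag records older than an assumed 730-day retention policy.
policy = DatasetPolicy("customer_contacts", True, 730, date(2024, 1, 15))
stale = records_due_for_erasure(policy, [date(2021, 3, 1), date(2024, 6, 1)],
                                today=date(2024, 7, 1))
print(stale)  # [datetime.date(2021, 3, 1)]
```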
Meanwhile in California, the California Consumer Privacy Act (CCPA), which represents a significant step forward in US consumer data-privacy rights, protects “consumer rights relating to the access to, deletion of, and sharing of personal information that is collected by businesses.” Although the CCPA is less enforceable outside California than GDPR is outside Europe, in-scope organizations must be able to comply with a number of stringent data storage, retrieval, and notice provisions, while also meeting certain data security standards for data relating to California consumers.
Regarding standards, ISO 27001 and ISO 27701 address information security management and privacy information management, respectively, from the perspective of developing controls around people, processes, technology, and data, including PII (personally identifiable information). The New York State Department of Financial Services (NYDFS) Cybersecurity Regulation (23 NYCRR 500), implemented in 2017, is closely aligned with ISO 27001 and prescribes additional measures that New York financial institutions must adopt around cybersecurity, data security, and data privacy.
The above regulations and standards should be considered for both on-premise and cloud environments. Cloud implementations increase the number of aspects to be evaluated, since data traverses the Internet. Those aspects include: increased security and encryption during data movement; third-party Internet access; and the location and jurisdiction of data. All are resolvable, however, given adequate analysis and a willingness to future-proof the data architecture.
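The sketch below illustrates how those cloud-specific aspects might be expressed as automated control checks over dataset metadata. The attribute names, approved regions, and rules are assumptions for illustration, not a specific cloud provider’s API or a compliance checklist.

```python
# Illustrative control check for cloud-hosted datasets: classification,
# region, encryption, and third-party access are assumed metadata fields.
ALLOWED_REGIONS = {"eu-west-1", "eu-central-1"}   # jurisdictions approved for PII

dataset = {
    "name": "trades_eu",
    "classification": "confidential-pii",
    "region": "eu-west-1",
    "encrypted_in_transit": True,
    "third_party_access": ["vendor_x"],
}

def violations(ds: dict) -> list[str]:
    """Return control violations for a single dataset descriptor."""
    issues = []
    if ds["classification"].endswith("pii") and ds["region"] not in ALLOWED_REGIONS:
        issues.append("PII stored outside approved jurisdictions")
    if not ds["encrypted_in_transit"]:
        issues.append("data movement is not encrypted")
    if ds["third_party_access"] and ds["classification"].startswith("confidential"):
        issues.append("third-party access requires review")
    return issues

print(violations(dataset))  # ['third-party access requires review']
```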
Managing digital rights across the enterprise
In many enterprises, licensed data is used in systems, applications, and reporting. This is particularly the case for market data, instrument data, and party data. At the contract level, terms may be imposed on who may use the data, where, and how:
- ‘Who’ could encompass named users, the permitted number of users, or a specific department or division.
- ‘Where’ might be limited to an office address, a state or province, or a country.
- ‘How’ might cover access through a specific system or report, or whether the data can be used generally or is restricted from certain uses such as machine learning, derived data, or external reporting.
Logging, managing and reporting on this ‘rights-of-use’ information can be complex and time-consuming without the proper processes and procedures in place. Data platforms that automatically associate this metadata with existing or planned data catalogs can optimize enterprise access to licensed data in compliance with contractual agreements. Having such built-in access rights allows advanced or quantitative users to access compliant data in sandbox environments, largely disintermediating IT for any self-serve reporting, data analytics or machine learning.
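As a rough sketch of how such rights-of-use metadata might be attached to a catalog entry and enforced at access time, the example below checks a request against ‘who’, ‘where’, and ‘how’ terms. The field names and contract terms are hypothetical, not a particular catalog product’s model.

```python
from dataclasses import dataclass, field

# Hypothetical rights-of-use metadata held against a licensed dataset's
# catalog entry; the contract terms below are illustrative only.
@dataclass
class UsageRights:
    licensed_users: set[str]
    permitted_locations: set[str]          # e.g. office, state, or country codes
    prohibited_uses: set[str] = field(default_factory=set)

def can_access(rights: UsageRights, user: str, location: str, purpose: str) -> bool:
    """Check a request against the 'who', 'where', and 'how' contract terms."""
    return (user in rights.licensed_users
            and location in rights.permitted_locations
            and purpose not in rights.prohibited_uses)

rights = UsageRights(
    licensed_users={"analyst_a", "analyst_b"},
    permitted_locations={"US-NY", "US-NJ"},
    prohibited_uses={"machine_learning", "external_reporting"},
)
print(can_access(rights, "analyst_a", "US-NY", "internal_reporting"))  # True
print(can_access(rights, "analyst_a", "US-NY", "machine_learning"))    # False
```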
Integrating data quality across the enterprise
Data quality should begin within the initial source systems, not at the point of reporting or downstream consumption. Determining the quality of data at the source, and identifying and resolving data and source-system issues through automated and well-governed processes, can go a long way toward improving the quality of data accessed downstream through reports and other systems.
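A minimal sketch of source-level validation follows, assuming a small set of rules evaluated against source records before the data moves downstream; the rules and records are illustrative, not a specific data quality tool’s API.

```python
# Illustrative source records and data quality rules (hypothetical fields).
source_rows = [
    {"trade_id": "T1", "notional": 1_000_000, "currency": "USD"},
    {"trade_id": "T2", "notional": None, "currency": "USD"},
    {"trade_id": "T3", "notional": 250_000, "currency": "usd"},
]

rules = {
    "notional_present": lambda r: r["notional"] is not None,
    "currency_upper_iso": lambda r: r["currency"].isupper() and len(r["currency"]) == 3,
}

# Evaluate each rule at the source so issues can be logged and routed for
# remediation before the data flows downstream.
for row in source_rows:
    failed = [name for name, check in rules.items() if not check(row)]
    if failed:
        print(f"{row['trade_id']}: failed {failed}")
# T2: failed ['notional_present']
# T3: failed ['currency_upper_iso']
```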
It is important not to stop at source remediation, data quality issue management, and lineage documentation. Data accessed and transformed via ETL or ELT processes (extract, transform, load or extract, load, transform), and then landed in data lakes, data warehouses, and data marts, should be validated for accuracy as well. A report showing summed or aggregated data should return the same result as summing the source data manually. As data branches out through multiple stages (source > data lake > data warehouse > data mart > reports, dashboards, or analytics), it is easy to introduce incorrect calculation or aggregation methods. Validating data quality through all stages of usage and lineage, and automatically providing the associated metadata to the enterprise, will ensure accurate and complete reporting, increase organizational confidence in data, and reinforce a positive data culture.
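The sketch below shows a simple stage-to-stage reconciliation of an aggregated figure, assuming totals computed independently at the source, data mart, and report layers; the dataset names, figures, and tolerance are illustrative assumptions.

```python
# Totals computed independently at each stage of the pipeline (illustrative).
source_totals = {"notional_usd": 12_450_000.00}
mart_totals = {"notional_usd": 12_450_000.00}
report_totals = {"notional_usd": 12_449_000.00}   # e.g. a dropped late trade

def reconcile(label: str, expected: float, actual: float, tolerance: float = 0.01):
    """Compare two stage totals and report a break if they diverge."""
    diff = abs(expected - actual)
    status = "OK" if diff <= tolerance else f"BREAK (diff={diff:,.2f})"
    print(f"{label}: {status}")

reconcile("source -> data mart", source_totals["notional_usd"], mart_totals["notional_usd"])
reconcile("data mart -> report", mart_totals["notional_usd"], report_totals["notional_usd"])
# source -> data mart: OK
# data mart -> report: BREAK (diff=1,000.00)
```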
Including the right resources during data platform planning
Satisfying core business requirements is central to the success of any data platform or data migration. As technology advances, however, and the enterprise’s need for trusted data expands, it’s increasingly critical to take into account:
- Security and privacy requirements
- Digital rights management
- Data quality enablement.
This is in addition to addressing other data controls such as general data governance and data cataloging. Having the right subject matter experts in the room will massively improve your chances of delivering a fit-for-purpose solution. Few companies have these aspects seamlessly embedded into their legacy data architecture, and their speed is throttled and their risk profile heightened as a consequence. When making change, whether incremental or a wholesale move to the cloud, taking the opportunity to implement these three items, no matter how small the scale, will accelerate your organization’s learning, reduce risk, and ultimately enhance its ability to innovate and deliver on its strategy.