Data architecture: the basis of a differentiating strategy

10 July 2023

Data architecture is a fundamental element of successful information management and business organisation systems.

It integrates the models, policies and rules that govern what data will be collected and how it will be stored, classified and exploited through the available technology infrastructure.

Data architecture is as critical to the sound management of a company as its corporate strategy, so both must be designed and implemented with care. The two are related: a flaw in the design of the corporate strategy can lead to failures in data management and, consequently, in the company’s organisation. A migration process, for example, can become a real headache if the database design is flawed.

A good example of well-defined data architecture is provided by those companies that, on learning of the changes the GDPR would introduce, adapted their databases before the regulation came into force.

Big data and the origin of data architecture

To understand what data architecture is all about, you need to know what big data is: ‘large volumes of data of all kinds that cannot be analysed using conventional IT tools’. Thus, the objective of big data tools is none other than to analyse data and information in an intelligent way, in order to help in decision-making.

On the other hand, the objective of data architecture is to define the origin and types of data necessary for business development. The system designed to achieve this must be simple enough to be understood by stakeholders, as well as consistent and stable. Therefore, data architecture does not seek to define a universal design methodology, but to develop techniques to help deploy and produce information spaces.

Data architecture planning and design

In general, data architecture is designed and developed during the planning stage of a new system to establish how data will be processed, stored, used and accessed. Thus, in order to design an efficient system, control the flow of data and ensure its protection, it is important to know the relationship and type of management required for each type of data from the outset.
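As a minimal sketch of what recording the management required for each type of data at the planning stage could look like, the Python snippet below keeps a small catalogue. The class `DataTypePolicy`, its fields, the roles and the sample entries are all hypothetical, chosen for illustration only; they do not come from any standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataTypePolicy:
    """Hypothetical planning record for one type of data."""
    name: str
    storage: str                  # where the data will be stored
    contains_personal_data: bool  # whether extra protection is required
    allowed_roles: frozenset      # who may access the data


# Illustrative entries only; a real catalogue would come from the requirements.
CATALOGUE = {
    policy.name: policy
    for policy in (
        DataTypePolicy("customer_profile", "relational_db", True,
                       frozenset({"sales", "support"})),
        DataTypePolicy("web_clickstream", "object_storage", False,
                       frozenset({"analytics"})),
    )
}


def can_access(role: str, data_type: str) -> bool:
    """Return True if the role may access the data type; unknown types are denied."""
    policy = CATALOGUE.get(data_type)
    return policy is not None and role in policy.allowed_roles


def personal_data_types() -> list:
    """List the data types flagged as personal data, sorted by name."""
    return sorted(p.name for p in CATALOGUE.values() if p.contains_personal_data)
```

Writing these decisions down from the outset makes it possible to answer questions such as "which data needs protection?" or "who may read this?" before any system is built.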

11 functions required in data management

With regard to data management, DAMA International defines eleven necessary functions:

  1. Data governance: planning, supervision, and control in the management and use of data.
  2. Data architecture: establishment of models, policies and rules for managing data.
  3. Data modeling & design: design of the database, and management of the implementation and technical support.
  4. Data storage: definition of the storage location, and the amount and type of data to be stored.
  5. Data security: protection of privacy and confidentiality.
  6. Data integration & interoperability: transport and consolidation of data.
  7. Documents & contents: establishment of the rules to be applied to data outside the databases.
  8. Reference & master data: management of shared data to reduce the amount of redundant information, improve data quality and obtain a global view of the information.
  9. Data warehousing & BI: management of analytical data processing and access to data that will support decision making.
  10. Metadata: indexing of the information contained in a database.
  11. Data quality: definition, control and improvement of data quality according to the needs of the project.

Data architecture in data model development

The data architecture of a company has to be one of the pillars on which the development of the business data model is based. To define it, the following aspects must be taken into account:

  • The configuration of the database.
  • The way the data is stored.
  • The metadata architecture.
  • The data integration model(s).

The guidelines chosen in the definition and planning of a data architecture should provide for linkage to other business models and provide some flexibility for the organisation to develop the data as needed and without impediment. For example, they should take into account that the data collected and stored can be exploited at other times by different business units, and not only for the one for which it was collected in the first place.

In many cases, this development will require the company to adapt to market circumstances and demands. For example, when new data protection legislation arises, as has happened in Europe with the GDPR, the data architecture will have to be adapted to the new situation, both in terms of the new rules and in terms of what customers demand in relation to the protection of their information.

In establishing the foundations of a data architecture, the information skeleton of a company is put in place. In this process, there are several factors that cannot be overlooked, for example, the present and future information needs of the company, and the quality of the data models. To this end, it is advisable to define the corporate information management strategy around three points:

  • Development of standards applicable to all perspectives of the data model.
  • Data model quality review.
  • Version management and data model integration processes.
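Version management of the data model can be illustrated with a small sketch, under the assumption that each version of the model is described as a mapping from entity names to sets of field names. The function `diff_models` and the two sample versions are hypothetical; the example only compares fields and treats any removed field as a potentially breaking change.

```python
def diff_models(old: dict, new: dict) -> dict:
    """Compare two data model versions given as {entity: set_of_fields}.

    Only field-level differences are reported. A removed field is treated
    as a potentially breaking change for systems that still read it.
    """
    added, removed = [], []
    for entity in sorted(set(old) | set(new)):
        old_fields = old.get(entity, set())
        new_fields = new.get(entity, set())
        added += [f"{entity}.{f}" for f in sorted(new_fields - old_fields)]
        removed += [f"{entity}.{f}" for f in sorted(old_fields - new_fields)]
    return {"added": added, "removed": removed, "breaking": bool(removed)}


# Hypothetical model versions, for illustration only.
MODEL_V1 = {"customer": {"id", "name", "email"}}
MODEL_V2 = {
    "customer": {"id", "name"},
    "consent": {"customer_id", "granted_at"},
}

report = diff_models(MODEL_V1, MODEL_V2)
print(report)
```

A review step of this kind could be run whenever a new version of the model is proposed, so that removed fields are discussed before they affect other business units.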

It should be borne in mind that data architecture designs created within an organisation can be reused to generate other systems, for example, the data architectures of new subsidiaries opened by the company that made the original design. This can reduce costs and improve the quality of databases, especially if the architectures being reused have proved successful.

The data architecture development cycle

The development of a data architecture, which always precedes the development of the system, is divided into four stages:

  • Requirements. This phase focuses on capturing, documenting and prioritising requirements that influence the data architecture. Special emphasis needs to be placed on data quality, as it plays a crucial role in the requirements. For example, if the data obtained is redundant, incomplete or does not relate to the desired information as set out in the data architecture, it is still data, but it cannot be considered good quality, as it does not meet the requirements.
  • Design. This is the most complex stage of the data architecture, since it is the moment in which the structures that compose it are defined. Patterns and design tactics are used to create it. At this point it is also necessary to choose the technologies to be used for data management, storage and processing.
  • Documentation. Once the architecture design has been created, it must be communicated to the other actors involved in its development, and to do this successfully, it is necessary to document the architecture design in detail.
  • Evaluation. After the documentation stage, it is important to evaluate the design to identify potential problems. Doing this early, before coding begins, is an advantage, as correcting defects at this point costs less than correcting them after the system is built.
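The data quality criteria mentioned in the requirements stage (redundant or incomplete data) can be sketched as a simple check. This is a minimal illustration, assuming records are plain dictionaries; the function `quality_report`, the field names and the sample records are hypothetical.

```python
def quality_report(records, required_fields, key_field):
    """Flag incomplete and redundant records.

    A record is incomplete if a required field is missing, None or empty.
    A record is redundant if its key was already seen in an earlier record.
    """
    incomplete = []   # (record index, list of missing fields)
    duplicates = []   # indexes of records repeating an earlier key
    seen_keys = set()
    for index, record in enumerate(records):
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            incomplete.append((index, missing))
        key = record.get(key_field)
        if key is not None:
            if key in seen_keys:
                duplicates.append(index)
            else:
                seen_keys.add(key)
    return {
        "incomplete": incomplete,
        "duplicates": duplicates,
        "meets_requirements": not incomplete and not duplicates,
    }


# Hypothetical sample data, for illustration only.
SAMPLE = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},               # incomplete: empty email
    {"id": 1, "email": "a@example.com"},  # redundant: repeats id 1
]

result = quality_report(SAMPLE, required_fields=("id", "email"), key_field="id")
print(result)
```

In practice the required fields and the key would be taken from the documented requirements, so that "good quality" is measured against what the architecture actually asks for.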

If the design of the database architecture is well defined and fine-tuned, and complies with all legal requirements in terms of storage and management, later problems will be avoided. In this sense, analysis is the key instrument for continuous improvement.