Isolating the data source from the internal view of data

The basic principle to keep in mind in deciding what definitions and ABL code go into this layer is that, to the greatest extent possible, you should isolate all code that has knowledge of the data source in the data access layer and its procedures. This allows you to define an internal data structure (an internal "schema", if you will) that is the best representation of your data for your application logic. In many cases, because of the evolution of a database design over time, the re-use of databases inherited from older applications, and other factors, your current database schema may not be an ideal way for your business logic to refer to and manage your data. An older database may have a number of shortcomings that you want to be able to mask in newly built business objects and business logic, including:

The database may not be properly normalized, which can mean that you need to execute complex queries in order to relate data in one table to another, or that there is redundant data that requires fields in multiple tables to be updated when data in another table changes.

The database may have inconsistent naming conventions, or similar data may be stored in different places in different ways. This could result in a single entity such as a customer number being stored in different tables under different names or in different formats or even in different data types. You will want your internal representation to be consistent.

The database may contain overly large tables that have accumulated many fields over time, as various special end-user features required more and more fields to hold values used by only a subset of your users, or for other reasons. Often these fields could be logically grouped according to when or by whom they are needed, so that the internal tables do not use or see all the fields at the same time.

Conversely, a single data entity is sometimes split across multiple tables that were developed at different times and must be joined together. Internally these might better be seen as fields in a single table.

Often, you would like to denormalize the data internally but not in the database. For example, it may be helpful to include the customer name along with customer number internally, if they need to be viewed together, while you would not want to repeat the customer name in multiple tables that all used the customer number as a key.

These are just a few of the considerations that would lead you to want to provide a different internal view of data from how it is actually stored. Even when the database is well designed, there will likely be times when you still would want the internal view of the data to be different, for example, when you need to present a denormalized view or specially filtered subset of the data to the application.
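As a rough illustration of these points, the following sketch shows a data access function that hides inconsistent field names and data types and returns a consistent, denormalized internal view. It is written in Python as a language-neutral illustration, not in ABL, and every table, field, and function name in it (`cust_no`, `CustomerNum`, `load_orders`, and so on) is hypothetical.

```python
from dataclasses import dataclass

# Hypothetical legacy rows: the same customer number is stored under
# different names and different data types in two tables.
legacy_customers = [{"cust_no": "0042", "nm": "Acme Ltd"}]
legacy_orders = [{"ord_id": 1, "CustomerNum": 42, "amt": 99.5}]


@dataclass
class OrderRow:
    """Internal view: consistent names plus a denormalized customer name."""
    order_id: int
    customer_number: int
    customer_name: str
    amount: float


def load_orders(customers, orders):
    """Data access code: the only place that knows the legacy field names."""
    # Normalize the customer number to an integer, whatever its stored form.
    names = {int(c["cust_no"]): c["nm"] for c in customers}
    return [
        OrderRow(
            order_id=o["ord_id"],
            customer_number=int(o["CustomerNum"]),
            # Denormalize internally: carry the name alongside the number.
            customer_name=names.get(int(o["CustomerNum"]), ""),
            amount=float(o["amt"]),
        )
        for o in orders
    ]


internal_orders = load_orders(legacy_customers, legacy_orders)
```

Code that consumes `internal_orders` sees only `customer_number` and `customer_name`. The stored data itself is not denormalized.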

If you have an existing application, it is likely that you cannot (and should not) undertake an extensive cleanup of the database schema and data as an initial part of a transformation effort. There may be various reasons for this. If you have a large installed user base, it may not be feasible to convert their databases to a new form all at once. Also, you may have large numbers of reports and business logic procedures that you expect to be able to reuse or easily repackage in the newer version of your application, even as you adapt your application for a distributed environment and in other ways that help it conform to the guidelines of the OpenEdge Reference Architecture. If you make extensive changes to the database schema, you may improve the quality of the data representation, but at the expense of a great deal more up-front design and development work and more disruption to your current installed base.

Isolating the database schema specifics in the data access layer of your modernized procedures lets you clean up or otherwise change the underlying schema independently of how the application logic uses the data. When you make changes to the schema, only the mapping code in the data access layer needs to change. If you have older procedures that reference the database directly, then these can be kept isolated from the new internal definitions until you are prepared to rework or replace them.
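A minimal sketch of this idea follows, again in Python rather than ABL and using an in-memory SQLite database purely for illustration. The table names, column names, and the two mapping dictionaries are assumptions made up for the example. The point is that `total_balance`, which stands in for business logic, is unchanged when the schema is cleaned up; only the mapping passed to the data access function differs.

```python
import sqlite3

# Mapping from internal field names to physical columns (hypothetical names).
# This is the only thing that changes when the schema changes.
CUSTOMER_MAP_V1 = {"customer_number": "cust_no", "customer_name": "nm", "balance": "bal"}
CUSTOMER_MAP_V2 = {"customer_number": "customer_id", "customer_name": "name", "balance": "balance"}


def fetch_customers(conn, table, field_map):
    """Data access code: builds the query from the mapping and returns
    rows keyed by internal field names."""
    select_list = ", ".join(f"{col} AS {name}" for name, col in field_map.items())
    cursor = conn.execute(f"SELECT {select_list} FROM {table}")
    names = [d[0] for d in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


def total_balance(customers):
    """Business logic: refers only to internal field names."""
    return sum(c["balance"] for c in customers)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (cust_no INTEGER, nm TEXT, bal REAL)")
conn.execute("INSERT INTO cust VALUES (1, 'Acme Ltd', 100.0), (2, 'Lift Tours', 50.0)")
before = total_balance(fetch_customers(conn, "cust", CUSTOMER_MAP_V1))

# Simulate a schema cleanup: new table and column names, same data.
conn.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT, balance REAL)")
conn.execute("INSERT INTO customer SELECT cust_no, nm, bal FROM cust")
after = total_balance(fetch_customers(conn, "customer", CUSTOMER_MAP_V2))
```

Older code that still queries `cust` directly keeps working alongside the new mapping until it is reworked or replaced.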

In addition, some of your data may come from a source other than an OpenEdge database or a database accessible through an OpenEdge DataServer. In this case the actual Data-Source object will not be useful to you. Your data may come from an unmanaged source such as a flat file, a spreadsheet, or an XML or JSON document. Or it might be read dynamically from a data streaming device. Even so, you can still define a data access procedure that uses FILL event procedures to populate the Business Entity's ProDataSet with your own custom code. This means that, as with database Data-Sources, the rest of your application does not need to know about the specifics of where the data comes from or how it is managed once it gets beyond the business logic of the application.
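The general shape of this arrangement can be sketched as follows. This is a conceptual Python illustration only: it does not reproduce the ProDataSet or FILL event API, and the class, function, and field names (`CustomerEntity`, `fill_from_csv`, `custNum`, and so on) are all hypothetical. The entity delegates population to whatever fill routine it is given, so the consuming code cannot tell a flat file from a JSON document.

```python
import csv
import io
import json


def fill_from_csv(text):
    """Custom fill code for a flat-file source (hypothetical column layout)."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"customer_number": int(r["id"]), "customer_name": r["name"]}
        for r in reader
    ]


def fill_from_json(text):
    """Custom fill code for a JSON document (hypothetical key names)."""
    return [
        {"customer_number": int(r["custNum"]), "customer_name": r["custName"]}
        for r in json.loads(text)
    ]


class CustomerEntity:
    """Stand-in for a business entity: holds rows and delegates filling."""

    def __init__(self, fill_callback):
        self._fill_callback = fill_callback
        self.rows = []

    def fill(self, raw):
        # The entity does not know or care what kind of source this is.
        self.rows = self._fill_callback(raw)
        return self.rows


csv_entity = CustomerEntity(fill_from_csv)
csv_rows = csv_entity.fill("id,name\n42,Acme Ltd\n")

json_entity = CustomerEntity(fill_from_json)
json_rows = json_entity.fill('[{"custNum": "42", "custName": "Acme Ltd"}]')
```

Both entities end up holding rows with the same internal field names, which is the property the rest of the application relies on.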