Week 3: Architectural Design for the Data-source layer

The problem: how to let the domain layer query the data source layer without exposing implementation details. What if the domain layer communicates with the database directly?

There are a few problems:

• We want loose coupling between the domain layer and the data source layer, so we can freely change the database.
• Domain layer developers should not have to be SQL experts.
• Database admins should not have to read domain code.
• Parsing responses from the database is not part of domain logic.

Patterns:

Table Data Gateway: An object that acts as a gateway to a database table. One instance handles all the rows in the table. It encapsulates queries as methods.

Consider a person table in a database, accessed through a person gateway. The ResultSet that the gateway returns is the set of all rows that matched a query to the database. The domain is thereby decoupled from the database.
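The person-table gateway described here can be sketched roughly as follows. This is a minimal illustration, not the course's own implementation: it assumes Python's standard sqlite3 module, a hypothetical person table with id, first_name and last_name columns, and hypothetical method names. A plain list of row tuples stands in for the ResultSet.

```python
import sqlite3


class PersonGateway:
    """Table data gateway: one instance handles every row of the person table."""

    def __init__(self, connection):
        self._conn = connection

    def find_all(self):
        # Returns the raw result set; the caller decides how to use the rows.
        return self._conn.execute(
            "SELECT id, first_name, last_name FROM person ORDER BY id"
        ).fetchall()

    def find_by_last_name(self, last_name):
        return self._conn.execute(
            "SELECT id, first_name, last_name FROM person "
            "WHERE last_name = ? ORDER BY id",
            (last_name,),
        ).fetchall()

    def insert(self, first_name, last_name):
        cursor = self._conn.execute(
            "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
            (first_name, last_name),
        )
        self._conn.commit()
        return cursor.lastrowid

    def delete(self, person_id):
        self._conn.execute("DELETE FROM person WHERE id = ?", (person_id,))
        self._conn.commit()


# Hypothetical schema, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

gateway = PersonGateway(conn)
gateway.insert("Ada", "Lovelace")
gateway.insert("Alan", "Turing")
print(gateway.find_by_last_name("Turing"))  # [(2, 'Alan', 'Turing')]
```

The domain code calls find_by_last_name and never sees SQL, so the queries can change without touching the callers.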

Table Data Gateway Pros:
• Simplicity: It is a simple pattern that works well for many applications.
• Precise mapping: The mapping between database and gateway is simple, with one gateway class per database table.
• Compatibility (table module): The table data gateway is highly compatible with the table module pattern, because it returns a result set for the table module classes to operate on.

Table Data Gateway Cons:
• Incompatibility (domain model): It is not as compatible with the domain model pattern as other data-source patterns are.
• Scalability: It does not scale well as domain complexity increases.

Row Data Gateway: An object that acts as a gateway to a single record within a table. There is one instance per row in the table, and its attributes correspond to the columns of the database table. A person finder is used to obtain person gateway instances, each of which is used to update a single person row.
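A rough sketch of the finder and gateway pair may help here. It assumes Python's standard sqlite3 module and a hypothetical person table with id, first_name and last_name columns. PersonGateway.load is the name used in these notes; the other method names and the signatures are assumptions made for illustration.

```python
import sqlite3


class PersonGateway:
    """Row data gateway: one instance per row; attributes mirror the columns."""

    def __init__(self, connection, person_id, first_name, last_name):
        self._conn = connection
        self.id = person_id
        self.first_name = first_name
        self.last_name = last_name

    @classmethod
    def load(cls, connection, row):
        # The single place where a raw database record becomes an object.
        person_id, first_name, last_name = row
        return cls(connection, person_id, first_name, last_name)

    def update(self):
        # Writes this instance's attributes back to its own row only.
        self._conn.execute(
            "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
            (self.first_name, self.last_name, self.id),
        )
        self._conn.commit()


class PersonFinder:
    """Looks up rows and hands back one gateway instance per row."""

    def __init__(self, connection):
        self._conn = connection

    def find(self, person_id):
        row = self._conn.execute(
            "SELECT id, first_name, last_name FROM person WHERE id = ?",
            (person_id,),
        ).fetchone()
        return PersonGateway.load(self._conn, row) if row is not None else None


# Hypothetical schema and data, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO person (first_name, last_name) VALUES ('Ada', 'Byron')")
conn.commit()

finder = PersonFinder(conn)
person = finder.find(1)
person.last_name = "Lovelace"
person.update()
print(finder.find(1).last_name)  # Lovelace
```

A transaction script would work with person.first_name and person.last_name as ordinary attributes, which is where the type-safety benefit comes from.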

Row Data Gateway Pros:
• Type safety: A conversion between the database record and the object is still required, but it is now done in one place (PersonGateway.load).
• Precise mapping: The mapping between the database and the gateways is straightforward, with one gateway class per table.
• Compatibility (transaction script): One of the weaknesses of the transaction script pattern is duplicate code across transactions. With the row data gateway pattern, some of that duplicate code is refactored into the gateway class and shared by many transaction scripts.

Row Data Gateway Cons:
1. Boilerplate code: We have to write a lot of boilerplate, including new classes with getters and setters, to act as the gateways. This adds maintenance overhead.
2. Database coupling: This pattern tends to result in close coupling with the underlying database, because the gateway objects reflect the columns of the tables.
3. Incompatibility (domain model): When the domain model pattern is used at the domain layer, the row data gateway pattern results in three data representations: one at the domain layer, one for the gateway, and one in the database. Only two are needed. For this reason, other patterns should be used with the domain model pattern, e.g. active record.

Active Record: An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data. It is an extension of the row data gateway pattern, since its objects additionally contain domain logic.
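As a minimal sketch of the idea, the class below holds a row's data, its database access, and one domain rule together. It assumes Python's standard sqlite3 module and a hypothetical person table with id, name and dependents columns. The exemption rule and its constants are invented purely to have some domain logic to show.

```python
import sqlite3


class Person:
    """Active record: wraps one person row, its database access, and domain logic."""

    BASE_EXEMPTION = 1500       # hypothetical business-rule constants
    DEPENDENT_EXEMPTION = 750

    def __init__(self, connection, name, dependents, person_id=None):
        self._conn = connection
        self.id = person_id
        self.name = name
        self.dependents = dependents

    @classmethod
    def find(cls, connection, person_id):
        row = connection.execute(
            "SELECT id, name, dependents FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            return None
        return cls(connection, row[1], row[2], person_id=row[0])

    def save(self):
        # Insert a new row, or update the existing one this object wraps.
        if self.id is None:
            cursor = self._conn.execute(
                "INSERT INTO person (name, dependents) VALUES (?, ?)",
                (self.name, self.dependents),
            )
            self.id = cursor.lastrowid
        else:
            self._conn.execute(
                "UPDATE person SET name = ?, dependents = ? WHERE id = ?",
                (self.name, self.dependents, self.id),
            )
        self._conn.commit()

    def exemption(self):
        # Domain logic lives in the same class as the database access.
        return self.BASE_EXEMPTION + self.DEPENDENT_EXEMPTION * self.dependents


# Hypothetical schema, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, dependents INTEGER)"
)

ada = Person(conn, "Ada", 2)
ada.save()
loaded = Person.find(conn, ada.id)
print(loaded.exemption())  # 3000
```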

Active Record Pros:
• Type safety, precise mapping, and compatibility (transaction script): These three pros are inherited, since the active record pattern is an extension of the row data gateway pattern.
• High cohesion: This pattern promotes higher cohesion than the row data gateway pattern by placing logic related to the table rows in one class.

Active Record Cons:
• Domain logic coupling: The pattern promotes higher cohesion than row data gateway, but it also couples data with domain logic, which reduces re-use. This can be mitigated by putting the database methods in a base class and adding the domain logic in a subclass that inherits them.
• Database coupling: Like row data gateway, it encourages close coupling with the database.
• Scalability: It does not scale well as domain complexity increases. Close coupling with the database schema also makes it difficult to use OOP constructs such as inheritance.
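The base-class mitigation mentioned in the first con could look roughly like this. It is a sketch under assumptions: Python's standard sqlite3 module, a hypothetical person table with id, name and dependents columns, and invented class and method names (PersonRecord, has_dependents).

```python
import sqlite3


class PersonRecord:
    """Base class: holds the row's data and all database access, no domain rules."""

    def __init__(self, connection, name, dependents, person_id=None):
        self._conn = connection
        self.id = person_id
        self.name = name
        self.dependents = dependents

    @classmethod
    def find(cls, connection, person_id):
        row = connection.execute(
            "SELECT id, name, dependents FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        # cls(...) builds whichever subclass find() was called on.
        return cls(connection, row[1], row[2], person_id=row[0]) if row else None

    def insert(self):
        cursor = self._conn.execute(
            "INSERT INTO person (name, dependents) VALUES (?, ?)",
            (self.name, self.dependents),
        )
        self._conn.commit()
        self.id = cursor.lastrowid


class Person(PersonRecord):
    """Subclass: adds only domain logic on top of the inherited database methods."""

    def has_dependents(self):
        return self.dependents > 0


# Hypothetical schema, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, dependents INTEGER)"
)

ada = Person(conn, "Ada", 2)
ada.insert()
loaded = Person.find(conn, ada.id)
print(loaded.has_dependents())  # True
```

The database methods can then be re-used by other subclasses with different domain rules, although the subclass still inherits the coupling to the table's columns.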

Data Mapper: A layer of mappers that moves data between objects and a database while keeping them independent of each other and of the mapper itself. The mapper separates the in-memory domain objects from the underlying database. It is highly suitable for use with the domain model pattern, because it keeps the domain objects focused on domain logic.

Note the difference between this and the row data gateway. In the row data gateway, the PersonGateway object held the domain object attributes and converted query results into PersonGateway instances. In the data mapper, the Person object, which is the domain object, knows nothing about the database. Instead, the Person object is passed into the mapper's update method.
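The contrast can be sketched as follows: Person carries no SQL at all, and a separate mapper moves it to and from the database. This is an illustration under assumptions, using Python's standard sqlite3 and dataclasses modules and a hypothetical person table with id, first_name and last_name columns. The name PersonMapper and every method name other than update are invented.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Domain object: no SQL, no connection, no knowledge of the database."""

    first_name: str
    last_name: str
    id: Optional[int] = None

    def full_name(self):
        return f"{self.first_name} {self.last_name}"


class PersonMapper:
    """Moves Person objects to and from the person table."""

    def __init__(self, connection):
        self._conn = connection

    def find(self, person_id):
        row = self._conn.execute(
            "SELECT id, first_name, last_name FROM person WHERE id = ?",
            (person_id,),
        ).fetchone()
        if row is None:
            return None
        return Person(first_name=row[1], last_name=row[2], id=row[0])

    def insert(self, person):
        cursor = self._conn.execute(
            "INSERT INTO person (first_name, last_name) VALUES (?, ?)",
            (person.first_name, person.last_name),
        )
        self._conn.commit()
        person.id = cursor.lastrowid

    def update(self, person):
        # The domain object is passed in; it never touches the database itself.
        self._conn.execute(
            "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
            (person.first_name, person.last_name, person.id),
        )
        self._conn.commit()


# Hypothetical schema, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

mapper = PersonMapper(conn)
ada = Person("Ada", "Byron")
mapper.insert(ada)
ada.last_name = "Lovelace"
mapper.update(ada)
print(mapper.find(ada.id).full_name())  # Ada Lovelace
```

Because Person depends on nothing database-related, it can be tested or re-used without a database, and the schema can change by editing only the mapper.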

Data Mapper Pros:
• Loose coupling: It decouples database access from domain objects.
• Compatibility: The pattern is highly compatible with the domain model pattern. The data mapper can create domain model objects from the database.
• Re-use: It allows high re-use of both the database and the domain layer.

Data Mapper Cons: Complexity: An extra layer is added to act as the mapper. This increases complexity, so it is only a worthwhile trade-off if the domain logic is particularly complex.

Choosing between the patterns:

Table data gateway: Used when the domain is simple. It fits well with the table module pattern.

Row data gateway: Used when the domain is simple and design-time type safety is desired. It fits well with the transaction script pattern. It should not be used if there is a high likelihood of the database schema changing.

Active record: Used when the domain logic is complicated, but not complicated enough to warrant the use of the domain model pattern. It should not be used if there is a high likelihood of the database schema changing.

Data mapper: Used when the domain logic is particularly complex, in conjunction with the domain model pattern. It should also be used if there is a good chance that the data source layer will change, or that the domain logic will change without a change to the underlying data. This includes cases where the “change” is re-using one of these layers in another application.