No matter what industry we’re working in, systems integration is a common task we have to execute as software engineers. Most often we integrate internal systems in the products we build, but as product needs evolve, we have to perform integrations with external systems as well.
To make this possible, we follow different types of enterprise integration patterns, each with distinct advantages and disadvantages. Keep reading to explore the different types of integration patterns, which are File Transfer, Shared Database, Remote Procedure Invocation, and Messaging.
Breaking Down Different Integration Patterns
File Transfer

One way to integrate two or more components is through file transfer. In file transfer integration, one component produces data files, and another consumes them, usually through a polling mechanism.
- The component that produces the files and the one that consumes them are decoupled since neither needs to know how the other works internally. However, the file structure is now the contract between both components, and if we change it inadvertently, it can cause integration issues.
- Since time can pass between a file being produced and a file being consumed, the components' data can get out of sync. Depending on the business process, this staleness may or may not be acceptable.
- We can generate files more frequently; however, this takes a lot of resources, and we need to make sure that the components that consume them don’t lose any of the files.
- Data produced by one component could have a different meaning to another component, which can cause information inconsistency.
Neutral Points to Consider
- The component that consumes the files usually needs to do some format processing and conversion.
- The components need to agree on file naming conventions and the folder structure to store the files.
- Sometimes we need to agree on a schedule for when files are produced and when they are consumed, since generating them can take time; this is something the business needs to consider.
- A locking mechanism is required to avoid a consumer component reading a file while the producer component is writing it.
- We need a deletion policy and mechanism to remove files that are no longer needed.
Shared Database

The shared database is a very well-known and easy-to-set-up enterprise integration pattern, since all we need is a database (SQL or NoSQL) that is central to multiple components.
- Transaction management is provided by the database, which maintains the data’s consistency, integrity, and durability.
- Data format and structure are consistent since we use a database that clearly defines these aspects.
- Data can't take on different meanings across components since all of them must agree up front on what the shared data means.
- A shared SQL database can become a bottleneck since multiple components read from and write to the same database. Usually, our components are distributed across different places, which means we need to replicate the database and use distributed locking, and that can easily lead to performance issues. NoSQL databases largely avoid this bottleneck; however, relational databases are still widely used when transactions need ACID properties.
- Defining a single schema that is useful for all the components is not always an easy task since each component usually has its own needs. NoSQL databases allow us to change the schema dynamically; however, in practical terms, if we have multiple components using the same NoSQL database, it’s also required to define a schema that works for all of them.
- Since the data is not encapsulated, any component can change the data without further notice, and this could cause issues for other components using the same data.
- All the components are coupled to one single database, which makes it difficult to evolve the schema or replace the database later.
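A minimal sketch of the pattern, using an in-memory SQLite database to stand in for the central store (a real deployment would use a shared database server). The `orders` table and the two component functions are hypothetical, chosen to show the shared schema acting as the contract and the database providing the transactions:

```python
import sqlite3

def open_shared_db() -> sqlite3.Connection:
    """Create the shared store. The schema below is the contract that
    every component must agree on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
    return conn

def ordering_component_create(conn: sqlite3.Connection, order_id: int) -> None:
    # `with conn:` gives us a transaction: commit on success, rollback on error.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, ?)", (order_id, "NEW"))

def shipping_component_ship(conn: sqlite3.Connection, order_id: int) -> None:
    # A second component writes the same rows directly -- the data is not
    # encapsulated, so both components must share one meaning of "status".
    with conn:
        conn.execute("UPDATE orders SET status = ? WHERE id = ?", ("SHIPPED", order_id))

def current_status(conn: sqlite3.Connection, order_id: int) -> str:
    row = conn.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()
    return row[0]
```

Because both components touch the same table, a change to the `orders` schema or to the meaning of `status` affects every component at once, which is exactly the coupling described above.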
Remote Procedure Invocation
When we talk about remote procedure invocation, we think of remote services: one component invokes a method located in another, remote component. We pass in the required data, and the remote method knows how to process it. Some of the most relevant technologies in this category are SOAP services, RESTful services, and, more recently, GraphQL for querying services.
- We can implement a microservices architecture and all its advantages:
- Increased agility due to a shorter development time
- Faster and more reliable automated deployment
- Better testability since the logic of each service can be tested independently
- Higher scalability
- The data and the logic to process the data are encapsulated in the remote method
- Each component changes its data without affecting other components, which also helps us avoid different meanings of the same data
- Coupling still exists since, to invoke the remote procedure, the caller needs to know the contract of the callee, and the callee needs to be available during the call
- Since this is all done remotely, there could be some network failures or latency, and we need to design our components to deal with these scenarios
- Integration and operational complexity can increase since logic and data are distributed across multiple components
- Performance can degrade if we have network latency or if one of the components in the call flow has performance issues
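Since the caller must tolerate network failures and latency, a common design is to wrap the remote call in a retry policy with backoff. The sketch below is illustrative: `TransientNetworkError` and the flaky endpoint are stand-ins for real transport errors and a real REST/SOAP/GraphQL call, and retrying like this is only safe when the remote method is idempotent:

```python
import time

class TransientNetworkError(Exception):
    """Stands in for a timeout or connection reset from the remote call."""

def call_with_retries(remote_call, *, attempts=3, backoff_seconds=0.01):
    """Invoke a remote procedure, retrying transient failures with linear backoff.

    `remote_call` is any callable wrapping the actual transport (an HTTP
    request to a REST endpoint, a SOAP call, a GraphQL query, ...).
    """
    for attempt in range(1, attempts + 1):
        try:
            return remote_call()
        except TransientNetworkError:
            if attempt == attempts:
                raise                                  # retries exhausted: surface the error
            time.sleep(backoff_seconds * attempt)      # wait longer after each failure

def make_flaky_endpoint(failures: int):
    """Hypothetical fake endpoint that fails `failures` times, then answers."""
    state = {"left": failures}
    def endpoint():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientNetworkError("connection reset")
        return {"status": "ok"}
    return endpoint
```

In production, this policy would typically be combined with per-call timeouts and a circuit breaker so a slow callee doesn't degrade the whole call flow.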
Messaging

Finally, we come to messaging, a way to asynchronously transfer and process data between two components using messages. The participants in this type of integration are the producers, the broker, and the consumers.
- Messaging allows us to implement an event-driven architecture with all its advantages:
- Agility since we can have different teams building the producer/consumer components independently
- Better deployment since each producer/consumer component can be deployed and operate independently
- High performance since producer/consumer components run asynchronously
- High scalability since we can deploy additional instances of the producers and consumers to distribute the processing load
- There’s no coupling between the component that produces the message and the one that consumes it. They don’t need to know each other’s internal details, and they don’t need to be available simultaneously
- Data staleness is mitigated: when something changes, the producer component can quickly publish a message, and consumers are notified and can process it immediately
- Testability of the message flow across the different producer/consumer components could be difficult since multiple components could be processing the same message simultaneously
- Complexity can increase since the asynchronous processing of the data is distributed across multiple producer/consumer components, potentially making the full data flow hard to follow. We also need to handle message-processing errors and the learning curve of the messaging technology
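To make the producer/broker/consumer roles concrete, here is a minimal in-process sketch that uses a thread-safe queue to stand in for the broker (a real system would use a dedicated broker such as RabbitMQ or Kafka). The sentinel-based shutdown is an illustrative simplification:

```python
import queue
import threading

def producer(broker: queue.Queue, events: list) -> None:
    """Publish events without knowing who consumes them or when."""
    for event in events:
        broker.put(event)          # fire-and-forget: no knowledge of consumers
    broker.put(None)               # sentinel marking end of stream (illustrative)

def consumer(broker: queue.Queue, processed: list) -> None:
    """Pull messages from the broker and process them as they arrive."""
    while True:
        event = broker.get()
        if event is None:          # stop on the sentinel
            break
        processed.append(event.upper())   # "process" the message

def run_pipeline(events: list) -> list:
    broker = queue.Queue()
    processed: list = []
    worker = threading.Thread(target=consumer, args=(broker, processed))
    worker.start()                 # consumer runs asynchronously from the producer
    producer(broker, events)
    worker.join()
    return processed
```

The producer and consumer share only the message format, not each other's internals, and scaling out is a matter of starting more consumer threads (or, with a real broker, more consumer processes).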
What’s the Right Enterprise Integration Pattern for You?
In a modern greenfield scenario, we most often integrate components using the remote procedure invocation pattern with a microservices architecture, or we use messaging to process data asynchronously in an event-driven architecture.
However, the choice of one integration pattern over another shouldn't be guided only by what's the most modern solution; external factors, like working with legacy systems that only allow us to integrate following a particular pattern, also play a role. With this in mind, we always need to define a solution that weighs the pros and cons of each integration style.
At Wizeline, our Solutions team can help you define an architecture that considers the different constraints of your project and define an appropriate integration strategy. Contact us today by emailing firstname.lastname@example.org to get started!