There is a continual pressure to reduce information latency times in the quest to keep a finger on the pulse of a business. This pressure stems from the rapid changes in the business landscape that have shaped IT strategy over the past four or five years. The move to real time information delivery demands just-in-time reporting. This demand is not restricted to the operational or transactional reports; the demand has expanded to include the analytics and business intelligence as well. Traditionally, these analytics and business intelligence applications were sourced from data warehouses that would be updated in a nightly batch window. As such, a day’s latency was the norm for analytics, and sufficed to know how the business did at the end of the last business day. However, in today’s world, it is necessary to know how the business is doing every hour. This is especially true for high volume rapid transaction systems such as financials, retail and airlines.
The demand for real time reporting and analytics is translated to stringent requirements on the backed data integration systems. The traditional flow of data to a data warehouse is via the ETL routines. These routines need to be architected to meet the near-real time latency requirements. Some typical challenges in a real time data integration environment are:
- Necessity to conduct business 24/7 is reducing batch windows. As more business is done across time zones and over the web, many organizations are faced with the problem of shrinking batch windows, making it more difficult for traditional ETL tools to extract data in the short time available.
- Large volumes of information are difficult to handle in a batch window. As more information is gathered – such as online transaction data, inventory data, and customer information – the effort involved in moving it to the warehouse increases drastically. Many organizations are finding that an eight hour batch window is no longer sufficient for traditional ETL (extract, transform, load) tools to integrate all of the needed data.
- The need to track all changes for auditing purposes. Organizations need to comply with regulations, which often requires them to continuously track all changes to data and not just the net result of those changes.
- Growing need to detect and react to business events as they happen. Many organizations are looking for ways to detect business events in production systems and have those events trigger a response in another system. For example, a cell phone company would like to send a text message to a customer running low on minutes asking if him if he would like to purchase more.
- Organizations want to deploy new applications using data on legacy systems without paying for an increase in workload. Often, legacy systems are already maxed out and new capacity is very expensive. Organizations want new applications on UNIX or Windows to avoid this cost, but integrating the data from those legacy systems without increasing the load on them is a key challenge
- Increasing need to keep data in sync across the enterprise. Customers want up-to-the-minute access to order, payment and inventory data so they can buy products, pay bills and check delivery status online. Employees need much of the same so they can better service customers and make wise business decisions. To accomplish this, eCommerce data needs to be in sync with business applications and data needs to flow in real-time across the enterprise.
For more information, the reference for this post is here (Link to IBM)