What are transactional databases?
Transactional databases are optimized for running production systems—everything from websites to banks to retail stores. These databases excel at reading and writing individual rows of data very quickly while maintaining data integrity.
Transactional databases are row-stores, which means that data is stored on disk as rows, rather than columns. Row-stores are great when you need to know everything about one customer in the user table since you can grab only the data you need. But are not so great when you’re trying to count the customers in a particular ZIP code, since you have to load not just the ZIP column, but the name, address and user_id columns as well.
Transactional databases aren’t specifically built for analytics, but often become de facto analytic environments because they’re already in place as production databases. Because they’ve been around for decades, they’re familiar, accessible, and ubiquitous.
If your organization does not have a pre-existing separate analytics stack, one of the quickest ways to start doing analytics is to create a replica of your transactional database. This ensures that analytic queries don’t accidentally impede business-critical production queries while requiring minimal additional setup. The downside is that these databases are designed for processing transactions, not analysis. Using them for analytics is a great place to start, but you may run into limitations and need workarounds sooner than you would on an analytics-specific setup.
Ensuring data integrity
Transactional databases are architected to be ACID compliant, which ensures that writes to the database either succeed or fail together, maintaining a high level of data integrity when writing data to the database. Transactional databases are therefore critical for business transactions where a high-level of data integrity is necessary (the canonical example is banking where you want a whole transaction—debit from one account and credit to another—to either succeed or fail).
Because transactional databases are designed to run production systems, they’re very good at operations that must complete in milliseconds. If you’re doing analytics on a transactional replica of your production database, the replica will likely be nearly in sync with the main database, i.e. sub-second latency.
Monitoring operational systems
Working with data from transactional databases to provide a real-time operational snapshot is a perfect analytical use case for transactional databases because there’s so little latency introduced by the replica. If you’re trying to monitor support workloads or inventory or another operational system and need to make decisions based on data that’s as fresh as possible, replicating the production database may be the best option.
If even one part of the transaction fails, the entire transaction fails. In this way, every transaction must succeed 100% in order to successfully be committed to the database.
A transaction is either written to the database (bringing the database from one valid state to another) or the transaction is reverted.
Transactions that haven’t completed yet cannot be acted on or modified by other transactions.
Once a transaction is written to the database, it will stay there, even in the event of a database failure.