Mirroring and Protecting Cassandra Databases In The Cloud


Summary: A leading provider of Data as a Service for data-driven enterprise applications turned to Imanis Data to protect its critical customer data assets stored in Cassandra databases in the cloud and to enable rapid application iteration.


Industry: Technology (software). The customer is an award-winning software company that manages all types of customer data, including multi-domain master data, transaction and interaction data, and third-party, public, and social data, across industries ranging from healthcare and life sciences to retail and entertainment.


Big Data Environment: This Imanis Data customer has standardized on DataStax Enterprise (DSE) as its underlying NoSQL database. All databases and applications are hosted in the Amazon Web Services (AWS) cloud. The customer currently serves its clients from six 6-node DSE clusters storing 36 terabytes of data.


Challenges: The customer was using its engineering resources to write scripts for protecting Cassandra databases. The backup scripts ran nightly on each of the DSE clusters and failed frequently. Each time, engineers had to be called in to debug and fix these complex scripts before the Cassandra backups could complete. The scripts also backed up every replica of the data, which drove up the customer's Amazon storage bills.


Creating test and development environments with production data also relied on inefficient scripts. Engineers had to wait days for a non-production environment, which slowed application development. These challenges were an unnecessary drain on valuable engineering resources and took engineers away from other business-critical projects.


Solution: The customer has deployed a single 3-node Imanis Data cluster to back up all six DSE clusters. Deploying and configuring the Imanis Data software for the six clusters took less than an hour, and the entire configuration was done through our web-based user interface. This greatly simplified backup and recovery and freed valuable engineers from writing and maintaining scripts.


All backups are deduplicated, encrypted, and stored in Amazon S3. Imanis Data’s content-aware deduplication significantly reduced backup storage requirements by storing one backup copy instead of every replica. Copying the backup data to low-cost Amazon S3 reduced the customer's backup storage costs further.
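
The storage effect of keeping one copy rather than every replica can be illustrated with a minimal sketch. It is a generic hash-based deduplication example, not Imanis Data's actual algorithm; the replica count of three, the sample blocks, and the function name dedup_store are assumptions made purely for illustration.

```python
import hashlib


def dedup_store(replica_blocks):
    """Keep each distinct block once, keyed by its SHA-256 digest.

    replica_blocks is a list of replicas, each a list of byte blocks.
    (Hypothetical helper for illustration only.)
    """
    store = {}
    for blocks in replica_blocks:
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()
            # A block already seen on another replica is not stored again.
            store.setdefault(digest, block)
    return store


# Assumed setup: three replicas that each hold the same three blocks.
blocks = [b"customer-row-1", b"customer-row-2", b"customer-row-3"]
replicas = [list(blocks) for _ in range(3)]

raw_bytes = sum(len(b) for replica in replicas for b in replica)
store = dedup_store(replicas)
stored_bytes = sum(len(b) for b in store.values())

print(f"bytes across all replicas: {raw_bytes}")
print(f"bytes after deduplication: {stored_bytes}")
```

Under these assumptions the deduplicated store holds one third of the bytes that a replica-by-replica backup would; the real ratio depends on the replication factor and on how much data actually repeats.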


The same Imanis Data cluster is also used to spin up test and development clusters from production data. Using our RESTful API, the customer has integrated Imanis Data into its workflows and dashboard. Developers can now create test and development clusters quickly, without writing any scripts.
