Santosh Avasarala – Technical Blogs
http://www.easywaytech.com/blog

Tombstones in Cassandra
http://www.easywaytech.com/blog/index.php/2016/07/19/tombstones-in-cassandra/
Tue, 19 Jul 2016 06:36:31 +0000

When load testing our apps deployed in the AWS cloud, we saw some really long read timeouts on all of the column families, under a load of roughly 500 threads performing inserts, updates and deletes through the DataStax Java driver v2.1.6.

Investigating Cassandra logs revealed:

“Scanned over 100000 tombstones; query aborted”

Tracing our CQL queries confirmed that the reads were scanning tombstones.

Because of these read latencies caused by tombstones, we did further research on how Cassandra deals with deletes.

Cassandra is a distributed system with immutable SSTables; deletes are done differently compared to a relational database. It stores all changes as immutable events. It can’t simply go back and mutate a record like a relational database would do.

When writing to Cassandra, the following actions take place:

  • The write is appended to the commit log on disk and inserted into an in-memory table (the memtable)
  • Once the memtable reaches its configured size limit, it is flushed to disk
  • The flushed memtable is written out as a new, immutable SSTable in the column family; existing SSTables are never modified
  • If compaction thresholds are reached, a compaction is run. Compaction is a maintenance process that merges SSTables to optimize the data structures on disk and to reclaim unused space
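The write path above can be sketched with a toy in-memory model. This is only an illustration, not Cassandra's implementation: the class `ToyStore`, its `memtable_limit` (counted in entries for simplicity) and the choice to compact only on request are all assumptions made for the example, and it assumes each flush produces a new immutable SSTable.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy model of the write path; names and limits are hypothetical."""
    memtable_limit: int = 3                          # flush threshold, in entries
    commit_log: list = field(default_factory=list)
    memtable: dict = field(default_factory=dict)
    sstables: list = field(default_factory=list)     # oldest first, never modified

    def write(self, key, value):
        self.commit_log.append((key, value))         # 1. append to the commit log
        self.memtable[key] = value                   # 2. insert into the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # 3. the memtable becomes a new, sorted, immutable SSTable
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()                      # flushed entries are now on "disk"

    def compact(self):
        # 4. merge all SSTables into one; the newest value for each key wins
        merged = {}
        for table in self.sstables:                  # oldest to newest
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))]

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):        # newest SSTable first
            if key in table:
                return table[key]
        return None
```

Note that an update never touches an older SSTable: the old value stays on disk until compaction merges the files, which is why deletes need a marker of their own.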

What is a Tombstone?

A tombstone is a marker that records a deletion. Deletes are performed by writing a tombstone to Cassandra, just like any other write.

Cassandra also marks TTL data with a tombstone once the requested amount of time has expired. Tombstones are kept for a period of time defined by ‘gc_grace_seconds’, which is set per column family and defaults to 864000 seconds (10 days). Once that period has passed, the tombstone and the data it covers are removed automatically during the normal compaction process.
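A minimal sketch of this lifecycle, assuming a table modelled as a plain dictionary of key to (value, write time). The helper names `delete`, `read` and `compact` are made up for the example; only the 864000-second default for gc_grace_seconds comes from Cassandra itself.

```python
GC_GRACE_SECONDS = 864000      # Cassandra's default: 10 days

TOMBSTONE = object()           # sentinel standing in for a deletion marker


def delete(table, key, now):
    """A delete is just another write: it stores a tombstone with a timestamp."""
    table[key] = (TOMBSTONE, now)


def read(table, key):
    """A tombstoned key reads as if it were absent."""
    cell = table.get(key)
    if cell is None or cell[0] is TOMBSTONE:
        return None
    return cell[0]


def compact(table, now, gc_grace=GC_GRACE_SECONDS):
    """Keep everything except tombstones older than the grace period."""
    return {
        key: (value, written_at)
        for key, (value, written_at) in table.items()
        if not (value is TOMBSTONE and now - written_at >= gc_grace)
    }
```

Until the grace period has passed, the tombstone still occupies space and still has to be skipped by every read that touches it.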

How can it become critical?

When a delete is issued, Cassandra writes the tombstone to all the replicas that are available at that time. A replica that is down at the time of the delete misses the tombstone and has to receive it after it comes back. Tombstones are kept for the grace period so that this can happen. If the node fails to come back within the grace period, deleted values can become readable again: the tombstone only made it to a limited set of replicas and was then cleaned up, while the returning node still holds the original value.
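The resurrection scenario can be reproduced with three toy replicas. This is a sketch under simplifying assumptions: each replica is a dictionary of key to (value, timestamp), a read returns the cell with the newest timestamp, and `repair` simply copies the newest cell to every replica. All function names are hypothetical.

```python
TOMBSTONE = "<tombstone>"      # marker value standing in for a deletion


def reconcile(replicas, key):
    """Read from all replicas; the cell with the newest timestamp wins."""
    cells = [r[key] for r in replicas if key in r]
    if not cells:
        return None
    value, _ = max(cells, key=lambda cell: cell[1])
    return None if value == TOMBSTONE else value


def repair(replicas, key):
    """Copy the newest cell for the key to every replica."""
    cells = [r[key] for r in replicas if key in r]
    if cells:
        newest = max(cells, key=lambda cell: cell[1])
        for r in replicas:
            r[key] = newest


def purge_tombstones(replica):
    """Compaction after the grace period: tombstones are dropped."""
    for key in [k for k, (v, _) in replica.items() if v == TOMBSTONE]:
        del replica[key]


def scenario(repair_in_time):
    a, b, c = {"k": ("v1", 1)}, {"k": ("v1", 1)}, {"k": ("v1", 1)}
    # Replica c is down, so the delete only reaches a and b.
    a["k"] = (TOMBSTONE, 2)
    b["k"] = (TOMBSTONE, 2)
    if repair_in_time:
        repair([a, b, c], "k")             # c learns about the delete
    for replica in (a, b, c):              # grace period expires everywhere
        purge_tombstones(replica)
    return reconcile([a, b, c], "k")
```

Without the repair, `scenario(False)` returns the old value "v1" because replica c never saw the tombstone; with the repair, `scenario(True)` returns None.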

Too many tombstones can cause your query to fail; this is a safeguard against OOM errors and poor performance. Our load tests showed that more than 100000 tombstones were being scanned by our queries on Cassandra v2.0.5.
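The safeguard can be pictured as a counter inside the read loop. The sketch below is an assumption about the general shape only, not Cassandra's code: `scan` and `TombstoneLimitExceeded` are hypothetical names, and None stands in for a tombstone. The default of 100000 mirrors the tombstone_failure_threshold setting in cassandra.yaml.

```python
class TombstoneLimitExceeded(Exception):
    """Raised when a single read has to skip too many tombstones."""


def scan(cells, failure_threshold=100000):
    """Return the live values, aborting once too many tombstones are scanned."""
    live = []
    tombstones = 0
    for value in cells:
        if value is None:                  # None stands in for a tombstone
            tombstones += 1
            if tombstones > failure_threshold:
                raise TombstoneLimitExceeded(
                    f"Scanned over {failure_threshold} tombstones; query aborted"
                )
        else:
            live.append(value)
    return live
```

Every tombstone in the scanned range costs work even though it contributes nothing to the result, so a heavily deleted partition can be slow to read long before the limit is hit.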

We finally found the fix after sharing our test results with the DataStax support team. Following their suggestion, we upgraded Cassandra from v2.0.5 to v2.1.14 and the Java driver to v3.0.1, which resolved these issues. We also run ‘nodetool repair’ routinely on all the nodes to repair inconsistencies across the replicas whenever a failure occurs.

Need for Microservices
http://www.easywaytech.com/blog/index.php/2016/06/20/need-for-microservices/
Mon, 20 Jun 2016 12:45:42 +0000

Microservices architecture is an approach that supports agile development and delivery of large, complex applications. Let’s see how it differs from the usual monolithic and SOA architectures.

A monolithic architecture is a method of developing applications in which the front-end UI, the core business logic and the database layer are implemented in a single application. The application can still be modular internally, and it is very simple to deploy and test. It can be scaled horizontally by running multiple copies of the application behind a load balancer, but the application keeps growing as features are added, which makes scaling, agile development and continuous deployment increasingly complex.

This is where microservices come in handy: they tackle these complexities by decomposing the modules into small, individual services that can be hosted separately. Each microservice is aligned to a specific business function. Each backend service exposes a REST API, and most services consume APIs provided by other services.

With this approach, each microservice can be scaled independently, to the level it needs. Also, instead of sharing a single database schema across the whole application, each service can have its own schema, which keeps the services loosely coupled.

Benefits of Microservices

  • Each service can be developed and deployed independently of other services – easier to deploy new versions of services frequently
  • Easier to scale development. It enables you to organize the development effort around multiple teams
  • Each service is simple and lightweight

We use AWS Elastic Compute Cloud (EC2) in our organization to deploy and scale our applications using a microservices architecture. We use the following libraries built by Netflix to develop microservices in Java:

  • Ribbon – Load balancing, with support for multiple protocols (HTTP, TCP, UDP) in an asynchronous and reactive model
  • Hystrix – Latency and fault tolerance for rapid recovery, with thread and semaphore isolation and circuit breakers
  • Archaius – Dynamic configuration with typed properties, offering high-throughput, thread-safe configuration operations
  • Eureka – Service discovery and load balancing at the middle tier
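To give a feel for what a circuit breaker does, here is a minimal sketch of the idea in plain Python. It does not use or reproduce the Hystrix API: the class `CircuitBreaker`, its `failure_threshold` parameter and the `call` method are hypothetical, and the timed recovery step that a real breaker performs after a cool-down period is left out.

```python
class CircuitBreaker:
    """Stops calling a failing dependency after repeated errors."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.failures = 0                  # consecutive failures seen so far

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def call(self, func, fallback):
        if self.is_open:
            return fallback()              # short-circuit: do not hit the service
        try:
            result = func()
        except Exception:
            self.failures += 1             # count the failure, degrade gracefully
            return fallback()
        self.failures = 0                  # a success resets the count
        return result
```

Once the breaker is open, callers get the fallback immediately instead of waiting on a service that is already struggling, which keeps one slow dependency from tying up threads across the whole system.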

To summarize, the Microservices architecture pattern is a better choice for complex, evolving applications despite the implementation challenges.
