Wednesday, 27 April 2016 18:35

Making Ruby on Rails Scale

So what is Scalability?

Scalability is the ability of an application to satisfactorily service and respond to the required number of users or requests. The variable is the number of users or requests; what counts as "satisfactory service" is subjective. Typically, the scalability of an application is defined by the choices made by its original designers and developers, and by the constraints given to them by the "customer".

Twitter, for example, was not designed from the ground up to support millions of users, and therefore had scalability and reliability issues. It's not their fault; they became victims of the project's success. Having said that, it's not a trivial or cheap exercise to recover from a scalability issue (or a poorly written application) and come out with your brand reputation intact - so you have to hand it to those guys.

Best then to get it right from the outset (easier said than done).

Design

We don't want to get paralysed by chewing constantly over the design. But it's much easier to fix problems at the design stage than in production, so we need to do some planning from the outset.

Rule 1 - Get the scalability parameters from the business plan and keep the client informed of the limitations of the design from the get-go

The client should have built a business plan for the project; if they haven't, it should be ringing your alarm bells. The business plan should dictate the scalability requirements for the design. The plan should give you sufficient information to determine the platform you should be using for the application.

Rule 2 - Challenge whether Ruby on Rails is the most appropriate platform for this application

Cube will be writing on the limitations of Ruby on Rails v2.x in a further blog.

Rule 3 - It's horses for courses, folks! Waterfall for big objectives, agile for small utility-type applications.

If the application is sizeable, and/or involves a medium to large population base, then you're going to need to use at least some of the waterfall techniques for design. Sure, you can mix in some agile techniques where appropriate to keep things moving. But for big projects, pure agile development is going to lead you into a cul-de-sac in an articulated lorry, and it ain't going to be easy to escape from that one!

Rule 4 - Map out the objectives and make absolutely sure you understand the requirements for the application

So map out the objectives and goals for the application in written form. You'll need to speak to all the stakeholders via interviews, and group sessions if need be. If you can write it down easily, then you have put sufficient thought into the application; if it's hard to write down, it needs further thought.

Rule 5 - If you can't write it down easily, you haven't put in sufficient thought

The map is going to form the basis of the specification of the application. Once you've completed and reviewed the map, the application architect takes on the mantle. The architect will map, and agree with the client, the actors, roles, processes and artifacts, and crucially the interactions between them. You should then be in a position to develop and agree the views. These are best mocked up in pictorial form (whichever suits the team best). You then need to review the pictorial views against the actor/role/process/artifact model. This technique is designed to ensure that the views will be usable in the real world; it also provides good information as to what data and methods will be required for the controller and model design.

Model-View-Controller Architecture

Shoulda been called the VCM model in Rails, since the model and view never directly interact. Below is my interpretation of the Rails architecture.

Ruby on Rails Architecture

Rule 6 - Stick with the Rails rules for Views, Models, and Controllers from the outset

The purpose of the view

The view is purely there to provide an interface to the user; it supplies data to and retrieves data from the controller (CSS, HTML, JavaScript, JSON, XML, CSV, PNG, etc.). So no putting heavyweight code in there, such as complex validation in embedded JavaScript. If you want to do asynchronous validation (by that I mean in real time, before the form is submitted), use a JavaScript function and generate it from the controller using one of the frameworks (AJAX, jQuery, MooTools). (AJAX would be my favourite for this task.) Views should not interact with anything other than the controller.

The purpose of the controller

The controller is like the traffic cop marshalling requests: it takes requests from the view, parses them, handles sessions and cookies, and submits data to and requests data from the model. It also serves to provide security for the application. It should be mean and lean. If it's not, then you need to rethink and refactor.

The purpose of the model

Models validate, store and retrieve data from the database, and deal with the business logic. It's where all the hard work is done; in the traffic analogy it's the articulated lorry - it does all the heavy lifting and transport.

I can understand that sometimes you want to use a stored procedure to make the database server do the work, since it allows the logic to be split into an 'n' tier architecture. I'd resist that until I was absolutely sure that there was no other way.
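The division of labour between controller and model described above can be sketched in plain Ruby, without Rails. This is only an illustration: the `Order` and `OrdersController` names, the validation rules, and the hash-shaped responses are all invented for the example. In a real Rails application the model would be an ActiveRecord class and would use its validations.

```ruby
# Plain-Ruby sketch of a lean controller and a model that owns the rules.
# Order, OrdersController and the validation rules are hypothetical.

class Order
  attr_reader :quantity, :unit_price, :errors

  def initialize(quantity:, unit_price:)
    @quantity = quantity
    @unit_price = unit_price
    @errors = []
  end

  # Validation lives in the model, not in the view or the controller.
  def valid?
    @errors = []
    @errors << "quantity must be positive" unless quantity.is_a?(Integer) && quantity > 0
    @errors << "unit price must not be negative" unless unit_price.is_a?(Numeric) && unit_price >= 0
    @errors.empty?
  end

  # Business logic also lives in the model.
  def total
    quantity * unit_price
  end
end

class OrdersController
  # The controller only parses the request, delegates to the model,
  # and chooses a response.
  def create(params)
    order = Order.new(quantity: Integer(params.fetch("quantity")),
                      unit_price: Float(params.fetch("unit_price")))
    if order.valid?
      { status: 201, body: { total: order.total } }
    else
      { status: 422, body: { errors: order.errors } }
    end
  rescue ArgumentError, KeyError, TypeError
    # Raised by Integer(), Float() or fetch when the request is malformed.
    { status: 400, body: { errors: ["malformed request"] } }
  end
end
```

The controller stays a few lines long because every rule about what makes an order acceptable sits in the model; if `create` starts to grow, that is the signal to refactor.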

Object to Relational Mapping

The next exercise is to perform the object to relational mapping and database design. It's absolutely crucial you get this right - it's very difficult to play around with database design once you've a couple of million rows in place, and some initially happy and expectant customers.

Rule 7 - Use the Rails conventions for Object, Table, and Relationship mapping

Unless you are porting a legacy database and have no other choice, stick with the Rails conventions for table and attribute naming. I've spent plenty of time regretting early decisions made in the belief that I was right and they were wrong.

Rule 8 - Perform a sensibility check on the design

Finally, perform a sensibility check of your design against the actor/role/process/artifact model, just to make sure that you haven't missed anything. If you can't make this diagram look clean and readable, then it's likely that the design needs more work.

Ruby on Rails

Let's talk about pure Ruby (as opposed to JRuby)

Ruby is an object-oriented, interpreted language, developed in 1995 by Yukihiro “matz” Matsumoto. Ruby is an extremely elegant language, which allows the programmer an immense degree of freedom. That freedom, though, comes at a price. As an interpreted language, Ruby is not the greyhound of the language world (1.9.0 of ruby vs C++ vs Java - 89.3, 1.6). But at least, with the advent of Ruby 1.9, we have the ability to use multiple OS threads; in versions before 1.9 we could only use a single OS thread.

The major limitation is the Global Interpreter Lock (GIL), which prevents more than one OS thread from running at a time. So what does this mean? Well, we can't take advantage of multiple CPU cores, and we have an IO blocking issue. The reason for the lock is the lack of certainty that the application is thread safe.

Thread Safety

So what's thread safety, and why has it not been implemented before? Threads share the same memory address space, so the same variable can be written by several threads at once. Which one is the write (sic) one, and how do you implement thread safety?

The answer is to lock the function/method right at the start, and drop the lock at the end. The problem is that it's expensive in resource terms, 'cos you're blocking on the whole function. A more refined approach is to put locks around the writes to the variable; this is less expensive but more complicated, and such locks ain't easy to debug. A simpler option is to make variables write-once. This is the approach adopted in the dataflow gem.

Thread safety is only important in parallelism, so should you bother? If you need an application to scale, then a simple method is to add more hardware, but that only works if you have built in the tools to make best use of the additional hardware.
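To make the locking and write-once options concrete, here is a minimal sketch using Ruby's standard `Mutex`. The `SafeCounter` and `WriteOnce` classes and the `count_with_threads` helper are invented for the illustration; in particular, `WriteOnce` only shows the general idea of a write-once variable and is not the dataflow gem's API.

```ruby
require "thread" # Mutex; needed on older Rubies, harmless on newer ones

# Lock around the write: only the read-modify-write step is serialised.
class SafeCounter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }
  end

  def value
    @lock.synchronize { @count }
  end
end

# Write-once variable: the first write wins, any later write is an error,
# so readers never have to wonder which write was the right one.
class WriteOnce
  def initialize
    @lock = Mutex.new
    @set = false
    @value = nil
  end

  def set(value)
    @lock.synchronize do
      raise "already assigned" if @set
      @value = value
      @set = true
    end
    value
  end

  def get
    @lock.synchronize { @value }
  end
end

# Hypothetical helper: hammer one counter from several threads.
def count_with_threads(thread_count, increments_per_thread)
  counter = SafeCounter.new
  threads = Array.new(thread_count) do
    Thread.new { increments_per_thread.times { counter.increment } }
  end
  threads.each(&:join)
  counter.value
end
```

With the lock in place, `count_with_threads(8, 1000)` always returns 8000. Without it, the result would depend on how the interpreter happened to schedule the threads.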

Application Partitioning

The majority of system architects took the fork and exec daemon approach.

Let's assume that we have an application in which clients submit a form periodically, administrators perform administration, analysts analyse, and managers generate MIS reports. It's possible to partition the application so that different instances of the application handle each community; we may even adopt a different strategy for each community.
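As a rough sketch of the partitioning idea, the routing decision can be written as a lookup from community to a pool of application instances. The host names, the pool sizes, and the `instance_for` helper are all assumptions made for the example; in practice this decision would more likely live in the load balancer or web server configuration than in Ruby code.

```ruby
# Hypothetical mapping of user communities to application instances.
PARTITIONS = {
  client:        ["app-clients-1.example", "app-clients-2.example"],
  administrator: ["app-admin-1.example"],
  analyst:       ["app-analysis-1.example"],
  manager:       ["app-reports-1.example"]
}.freeze

# Pick the instance that should serve a request. Users within one
# community are spread over that community's pool by user id.
def instance_for(role, user_id)
  pool = PARTITIONS.fetch(role) { raise ArgumentError, "unknown community: #{role}" }
  pool[user_id % pool.size]
end
```

Because each community has its own pool, a long-running MIS report on the manager instance cannot hold up form submissions from clients, and each pool can be sized or tuned separately.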

Message Queuing

Applications sometimes require large blocks of code and complex database calls, which block other, simpler operations whilst executing. One solution is to use message queuing. Basically, messages are passed to a queue, and at the end of the queue is a background task execution server. It pops messages off the queue and executes the task, passing results back via the messaging queue. It's possible to perform asynchronous operations within a page using this technique. A significant number of message queuing components are available, for example Apache's ActiveMQ using the STOMP protocol.
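The flow described above can be sketched with Ruby's built-in thread-safe `Queue` standing in for a real broker. This is an in-process illustration only: the `BackgroundWorker` class, its method names, and the `slow_task` placeholder are invented, and a production setup would put a separate messaging server and worker process where the two queues and the thread are.

```ruby
require "thread" # Queue; needed on older Rubies, harmless on newer ones

# In-process stand-in for a message broker plus a background task server:
# one queue carries tasks in, another carries results back.
class BackgroundWorker
  def initialize
    @tasks = Queue.new
    @results = Queue.new
    @thread = Thread.new { run }
  end

  # Called by the web-facing code; returns immediately.
  def submit(id, payload)
    @tasks << [id, payload]
  end

  # Blocks until the next result is available.
  def next_result
    @results.pop
  end

  # Ask the worker to stop once it has drained earlier messages.
  def shutdown
    @tasks << :stop
    @thread.join
  end

  private

  def run
    loop do
      message = @tasks.pop # blocks until a message arrives
      break if message == :stop
      id, payload = message
      @results << [id, slow_task(payload)]
    end
  end

  # Placeholder for the expensive work (hypothetical): sum of squares.
  def slow_task(numbers)
    numbers.inject(0) { |sum, n| sum + n * n }
  end
end
```

The caller submits a task and carries on; the result is picked up later, for example by a follow-up request from the page. With a single worker, results come back in the order the tasks were submitted.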

However, the database is going to become the blocking factor, even with connection pooling. Rails inherently, at least as of 2.2, does not handle different database connections concurrently, and that's where you need to think about alternative approaches such as message queuing, but that involves making significant changes to your code. If you're doing this at the design stage, then great, but make sure that the messaging engine you are going to use is appropriate.

Data Partitioning

Let's say you've got a huge database, and it has become or will become the blocking factor; then you need to split the database up. You use a single database instance, or a cluster of them, as an index server. You perform a lookup on the record you require from the index server, and then access the record from the appropriate database instance. Currently, this technique is beyond Rails, so you'll need to perform some fancy footwork to get this going robustly.

The De-coupled approach

An alternative approach to thread safety is to use a reinvention of the fork and exec daemon approach, using the HTTP server to handle and distribute incoming requests and generating multiple processes to handle them. The basic concept is to use Apache with mod-cluster to generate several Mongrel servers. The principle is shown below. It's also possible to do this with several other web servers. An alternative is to use IBM's web server with JRuby.
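The process-per-worker idea behind the de-coupled approach can be illustrated in a few lines of Ruby. This is a toy sketch, not how Apache and Mongrel are actually wired together: the parent process plays the part of the front-end web server, the forked children play the part of the application servers, and the `handle` and `dispatch` names are invented for the example. It assumes a platform where `fork` is available (a Unix-like system running standard Ruby).

```ruby
# Pre-fork sketch: requests are dealt out to separate worker processes,
# so no memory is shared and thread safety never comes into it.
# Requires fork, so it will not run on Windows or JRuby.

# Hypothetical request handler, run inside a worker process.
def handle(request)
  "#{request.upcase} handled by pid #{Process.pid}"
end

def dispatch(requests, worker_count)
  # Deal the requests out round-robin, one batch per worker.
  batches = Array.new(worker_count) { [] }
  requests.each_with_index { |req, i| batches[i % worker_count] << req }

  workers = batches.map do |batch|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      batch.each { |req| writer.puts(handle(req)) }
      writer.close
      exit!(0) # leave without running exit hooks inherited from the parent
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  # Collect each worker's responses, then reap the worker process.
  workers.flat_map do |pid, reader|
    lines = reader.read.split("\n")
    reader.close
    Process.wait(pid)
    lines
  end
end
```

Calling `dispatch(%w[a b c d e], 2)` returns five response lines produced by two different process ids, neither of which is the parent's. Each worker has its own interpreter and its own memory, which is why this approach sidesteps the lock discussed earlier at the cost of more memory per worker.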
