Thursday, November 30, 2006

In Praise of Automated Tests: A Built In Break

In the good old days of slow computers, slow compilers, and tens or hundreds of thousands of lines of code that had to be built before you could test, compiling provided a nice break in your work cycle:

"Why are you just sitting there? Get back to work."
Programmer's automatic reply: "I'm compiling...."

Faster compilers, faster computers, and scripting languages all conspired to remove this perk of the business. Thankfully, automated testing has brought it back:

"Why are you just sitting there?"
New reply: "I'm testing."

Who can argue against testing?

Wednesday, November 29, 2006

Rails Security Checklist

Security is a dull topic, but will become exciting if you screw it up. To avoid such excitement, here is a checklist for reviewing security in models, controllers, and views. If you see holes, add a comment and I'll update the list.

Security checklist for models:
  • Use attr_accessible (or attr_protected if you must) to explicitly identify attributes that are accessible by .create and .update_attributes. Just because you don't expose an attribute on an edit form doesn't mean that someone won't try to post a value to it. I prefer attr_accessible over attr_protected as it fails on the side of safety when new fields are added to a model - you have to explicitly expose new fields.
  • Make sure queries use the Rails bind variable facility for parameters, not string concatenation or Ruby's handy #{...} interpolation syntax.
  • Use validations to prevent bad input.
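The first two items can be sketched in plain Ruby, without Rails. This is a minimal illustration under stated assumptions, not ActiveRecord itself: the Account class, its ACCESSIBLE whitelist, and the accounts table name are all hypothetical stand-ins.

```ruby
# Plain-Ruby stand-in for a model; Account and ACCESSIBLE are hypothetical names.
class Account
  ACCESSIBLE = [:name, :email].freeze  # whitelist, in the spirit of attr_accessible

  attr_reader :name, :email, :admin

  def initialize
    @admin = false
  end

  # Mimics mass assignment: only whitelisted keys are copied from the params.
  def update_attributes(params)
    params.each do |key, value|
      next unless ACCESSIBLE.include?(key.to_sym)
      instance_variable_set("@#{key}", value)
    end
    self
  end
end

account = Account.new
account.update_attributes("name" => "Pat", "admin" => true)
puts account.name    # Pat
puts account.admin   # false - the posted "admin" value was ignored

# Interpolation splices hostile input into the SQL text itself...
hostile = "x' OR '1'='1"
unsafe_sql = "SELECT * FROM accounts WHERE name = '#{hostile}'"
puts unsafe_sql

# ...whereas a bind-style condition keeps the SQL and the value separate,
# leaving the quoting to the database layer.
safe_condition = ["name = ?", hostile]
p safe_condition
```

The whitelist fails safe: a new attribute stays unassignable until it is added to the list.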
Security checklist for controllers:
  • Make non-action controller methods private (if possible).
  • If non-action controller methods must be public, identify them with hide_action to prevent unwanted execution.
  • Make sure before_filters are in place if necessary for your authorization infrastructure.
  • Move queries from your controller to your model, and see the model checklist above.
  • Check for params[:id] usage - are you sure you can trust it? Check for proper ownership of the record.
  • Check for usage of hidden fields - a user can send anything to you through them, so treat them with suspicion, just as params[:id] should be suspect.
  • Use filter_parameter_logging to keep sensitive unencrypted data (passwords, SSNs, credit card numbers, etc.) out of your server logs.
  • Forget about your view code for a minute, and think about how to protect your controller from posts a malicious user could make to any of your exposed methods. All parameters (whether or not exposed on a form, and whether or not invisible) are subject to length overruns, bypassing of any browser-based validation, attacks with malformed data, etc.
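The ownership check on params[:id] can be sketched in plain Ruby. This is only an illustration under assumptions: the Invoice struct, the in-memory INVOICES list, and find_owned_invoice are hypothetical names standing in for a model and a scoped query.

```ruby
# Hypothetical records; owner_id plays the role of a foreign key to the user.
Invoice = Struct.new(:id, :owner_id, :total)

INVOICES = [
  Invoice.new(1, 10, 250),
  Invoice.new(2, 20, 975)
].freeze

# Looks up a record by id, but only among records the current user owns.
def find_owned_invoice(current_user_id, id_param)
  id = Integer(id_param.to_s, 10)   # rejects anything that is not a plain integer
  INVOICES.find { |invoice| invoice.id == id && invoice.owner_id == current_user_id }
rescue ArgumentError
  nil   # malformed id: treat it the same as "not found"
end

p find_owned_invoice(10, "1")         # user 10 owns invoice 1
p find_owned_invoice(10, "2")         # nil - invoice 2 belongs to user 20
p find_owned_invoice(10, "1 OR 1=1")  # nil - malformed id
```

The point of the design is that the owner is part of the lookup itself, so a guessed or altered id simply finds nothing.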
Security checklist for views:
  • Make sure all data displayed is escaped with the helper method h(string).
  • Eliminate comments in your views that you don't wish the entire world to see.
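The effect of escaping can be seen outside Rails as well, since the h helper corresponds to html_escape in the ERB::Util module of Ruby's standard library. A minimal sketch, where the comment variable is a made-up piece of hostile input:

```ruby
require 'erb'

comment = '<script>alert("x")</script>'   # hypothetical hostile user input

# html_escape (aliased as h) turns markup characters into entities.
puts ERB::Util.html_escape(comment)
# prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;

# The same call inside a template, as a view would use it.
template = ERB.new("<p><%= ERB::Util.h(comment) %></p>")
puts template.result(binding)
```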
What else? (In particular, considerations for REST web services and AJAX need to be added).

Tuesday, November 28, 2006

Ruby on Rails Anti-Pattern?: Primary Key Visibility, Using in Code

Using an autoincrement integer field as the primary key / object id lets the Rails table relationship magic flow, and is a common design pattern in other O/R frameworks. Be careful about using this value for other purposes, however. It is common to see the ID used in ways such as:
  • As a customer number, part number, invoice number, etc. visible to users or printed on documents.
  • Explicitly referenced in your source code to drive business logic.
Why is exposing or referencing the ID a potential problem? It may bite you in several ways:
  • If using IDs to drive business logic, your data is no longer easily portable between databases. This is particularly a problem when working on a project with multiple developers, each running their own local development database. It is possible to keep the IDs of configuration records related to business logic in sync, but it takes some effort. Throw in test databases, test fixtures, production systems, demo systems, etc. and you are signed up for even more synchronization fun.
  • There are surprising and annoying non-technical reasons that may cause the need to change formatting and layout of customer numbers, invoice numbers, part numbers, etc. It is very unpleasant to have those as part of your key structure if you have to restructure them. (A particularly ugly example: systems using Social Security Number (SSN) as a primary key, foreign key, etc. are a nightmare to rework when eliminating/hiding/encrypting SSN for privacy reasons.)
  • Sometime during the lifecycle of your application, it is possible that you will move your production data to a different database instance or platform. While you do have to maintain foreign key integrity among IDs while performing such a task, if the IDs are not exposed you will at least not also have to worry about synchronizing with the outside world to maintain the integrity of the customer numbers, invoice numbers, or part numbers.
What to do instead? Consider incurring the wrath of the database normalization gods:
  • Use a separate, non-primary key field for identifiers visible to your users. This is a bit of work as you have to roll another autoincrement field, driven by a separate table, but is worth considering. Having such a field decouples your database structure (primary and foreign keys, table relationships) from the whims of how users use an identifier as a purchase order number, invoice number, customer number, part number, etc.
  • Use a mapping field of some sort for objects that must be explicitly identified in code to drive business rules or other behavior. This is easy to do, but does incur some run-time overhead as you now reference my_object.some_other_object.map_field instead of my_object.some_other_object_id in your code.
  • In the case of configurable selections against which you may have to code logic, don't use a table (and accompanying model) just to have a database-driven drop-down selection box. If there is no logic or other data associated with the selections, simply store the actual selection value in your record, thereby eliminating the object ID issue altogether.
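The mapping-field idea can be sketched in plain Ruby. The names here (OrderStatus, its code field, Order, shippable?) are hypothetical; the point is only that business logic keys off a stable code rather than an autoincrement ID that may differ between databases.

```ruby
# Hypothetical lookup record: id is the autoincrement key, code is the mapping field.
OrderStatus = Struct.new(:id, :code, :label)
Order = Struct.new(:id, :status)

# Business rule written against the mapping field, never against status.id.
def shippable?(order)
  order.status.code == "approved"
end

# The same status row may receive different IDs in two databases...
dev_status  = OrderStatus.new(3, "approved", "Approved")
prod_status = OrderStatus.new(7, "approved", "Approved")

# ...yet the rule behaves identically in both.
puts shippable?(Order.new(1, dev_status))    # true
puts shippable?(Order.new(1, prod_status))   # true
puts shippable?(Order.new(2, OrderStatus.new(3, "pending", "Pending")))  # false
```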
Related: See c2.com for a great catalog of anti-patterns.

Monday, November 27, 2006

Put Away that Hammer: Use Ruby-fu to avoid unnecessary inheritance in Rails

Overusing inheritance is a well-known pitfall of object oriented languages. Ruby provides insanely great toys to allow you to extend classes (or even objects) without resorting to ugly inheritance hierarchies that are not relevant to your real-world domain model.

While working on Rails projects I have learned to break old habits:
  • Don't inherit from a class simply to extend it.
  • Don't write a wrapper class to extend an existing class.
Instead, do the following:
  • Extend a project class using "include" to include a module.
  • Extend Ruby's classes (careful!) the same way.
  • Or - extend an existing class by declaring it and adding additional methods.
  • In some scenarios, extend an object rather than a class.
Extremely useful articles in greater depth are at Luke Redpath and in Code Snippets. Ruby modules are documented in the Ruby Core API.
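A minimal sketch of three of these techniques in plain Ruby. The Product class and the Describable and Discounted modules are hypothetical names chosen for illustration.

```ruby
# A module holding behavior to mix in.
module Describable
  def describe
    "#{self.class.name}: #{name}"
  end
end

# 1. Extend a project class by including a module.
class Product
  include Describable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# 2. Extend an existing class by declaring it again and adding methods.
class Product
  def shout
    name.upcase
  end
end

# 3. Extend a single object rather than the whole class.
module Discounted
  def discount
    0.1
  end
end

widget = Product.new("widget")
gadget = Product.new("gadget")
gadget.extend(Discounted)

puts widget.describe                    # Product: widget
puts widget.shout                       # WIDGET
puts gadget.discount                    # 0.1
puts widget.respond_to?(:discount)      # false - only gadget was extended
```

None of the three requires a subclass or a wrapper, which is why the inheritance hierarchy can stay limited to relationships that exist in the domain.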

Thursday, November 23, 2006

Running with Scissors: Are you still coding without automated tests?

Do you run with scissors? Play football without a helmet? Drive while drunk? Do you still create software without scripted tests? For quite some time, I was a test driven development hypocrite. I had read the books, read the magazine articles, dabbled with various testing tools, and believed test driven development was the way to go. But for various reasons, I never quite had the gumption to push test driven development into our organization (or even into my own work habits).

Switching to Ruby on Rails finally kicked me into shape. Creating and running scripted tests within Rails is almost too easy not to do - the framework draws you into testing with its excellent capabilities for Unit Tests, Functional Tests, and Integration Tests. I was initially hooked by using unit tests to do initial model testing (reaction: "cool.... this is much nicer than monkey testing various boundary behaviors"), then got the full testing religion after doing some significant code refactoring on code that was fully covered by tests ("Carrumba! I'll never write an application without fully automated tests again!").

One of the common complaints and FUD mentioned about Ruby and Rails is non-existent or weak support for debugging (although reportedly RadRails, and possibly other tools, do have a decent level of debugging capabilities). Interestingly enough, I have found this to be an almost irrelevant issue, as my debugging is almost always done through the tests, not by stepping through code. I don't always follow the exact sequence of "write a test, see it fail, write some code to make it pass", but I do make sure that my code is covered by tests, and I do use the tests to exercise boundary conditions, failure cases, and so on.

Two additional tools we have found to be particularly useful for testing Rails code:
  • rcov: code coverage for ruby - provides coverage analysis with excellent web reporting of the results.
  • Watir - Web Application Testing in Ruby - The Rails tests can give your application and database servers a good beating. But they do run on the application server, only exercising a subset of your hardware and software. When you are ready to test your entire infrastructure from a browser (exercising your ISP, front-end web servers, SSL, etc.), consider Watir. It's not really a load testing tool, but it does let you create browser-based tests scripted in Ruby, is quick and easy to install and use, and is handy if you don't need (or are not ready for) the full effort of a full-blown load testing solution.
In our company, we are convinced that the tradeoff of creating more code up front (often the test code is larger than the code being tested) easily pays for itself in a shortened development schedule. More importantly, we are also convinced that over the lifespan of our application, having a fully automated test suite will give us a huge advantage in long-term agility, will hugely reduce our maintenance costs, and ultimately will have a significant impact on the profitability and valuation of our business.

If you're not using Rails, there is almost certainly a testing tool available for your programming language. There are innumerable references on this subject; a good place to start looking is here.

Wednesday, November 22, 2006

A link for Technorati

Nothing to see here, move along....
Technorati Profile

Tuesday, November 21, 2006

Using Rails in_place_editor and in_place_editor_field

In-place form editing is the first Recipe in the Rails Recipes book, and for good reason. Our application has a screen on which the user enters a dollar amount, from which several other amounts are calculated and displayed for review. A user typically cycles through this process 2 or 3 times before being ready to save their results.

The in_place_editor_field (or in_place_editor) provides a clean solution for this need. The user can click the amount to edit, click ok to submit it (while remaining on the page), and review the calculated results.

I coded the view using in_place_editor_field (which uses parameters identical to text_field - send it your object and column to be edited, and you're in business). In our case, the model object is named coverage, and the field to be edited is annual_amount, so the view contains:

<%= in_place_editor_field('coverage', 'annual_amount') %>

The user can now click the amount, edit it in place, and click OK to submit the results.

To process the results (since we wanted to do some calculations, rather than just saving the entry), we are responsible for providing a controller method (named appropriately as it is automatically called when the user submits their edit) as follows:

def set_coverage_annual_amount
  # store posted amount
  # (available in params[:value])
  # and/or use the results in calculations
end

The edited value was automatically submitted to the method as params[:value], so I did my calculations based on it. A more typical use is to store the submission, but in my case that was deferred until final form submission.

Finally, once the results are submitted and processed, the controller method wants a corresponding view to render its results. In this case, an rjs template was the way to go, so I created the view set_coverage_annual_amount.rjs with just one line of code:

page.replace_html 'amounts_group',
                  :partial => 'subscriber_amounts'

The subscriber_amounts partial form was already available and displayed all of the recalculated fields, so it was just as easy to re-render it as to update each individual field in the rjs template (which would have worked just as well). What did not work was attempting to do the partial render directly within the controller's set_coverage_annual_amount method - this resulted in the edited amount field being replaced with the entire 'amounts_group' div contents.

If you need similar functionality on any piece of data, in_place_editor works in a similar fashion. You display a piece of data identifiable via the DOM (such as by using the span tag with an ID - for this example the ID is 'myamount'). Then follow up with in_place_editor in the same view to mark the item with the necessary AJAX goodness:

<%= in_place_editor('myamount',
    {:url => url_for(:action => "set_coverage_annual_amount")}) %>

Not quite as pretty as in_place_editor_field, as you have to explicitly identify the destination the edit will be posted to, but the capability is still nice to have when you need it.

See the Rails wiki, Shoo.gr, and Shane's Brain Extension for other articles (in addition to the Rails API reference and the Rails Recipes book) that I found particularly helpful on this topic.

Gotcha: Ruby Class Variable Scope is Global

While working on the authorization portion of an application, I was using a class variable to cache lookups of access rights throughout the duration of a user's session. For various obscure reasons that are not relevant, I did not want to cache the rights in a session variable, so I was using a class variable for this purpose. This technique worked nicely for a single web user and during functional testing, but failed miserably with two or more users.

As it turns out (as tested on WEBrick and Mongrel), class variables are shared between web sessions (this holds true for controller classes and for models). Depending upon your prior language and platform background, this may not be news to you, but it was to me, as I believed class variables were isolated by user session. To make matters worse, under both Mongrel and WEBrick on a Windows machine, the class variables ARE isolated by user session. The problem did not manifest itself until the application was deployed to Linux at our hosting site.

Lessons learned:
  • Be careful about assumptions you bring with you from other languages and platforms.
  • Test for simultaneous multiple users - just running your unit, functional, and integration tests for a single user is not sufficient.
  • While the portability of Ruby and Rails code is magnificent, there is at least one subtle difference (probably more) between behavior on platforms.
  • I now have one more reason for needing to buy a Mac for software development.
The good news is that this behavior opens the possibility of selectively sharing cached data between users on a single machine, although this topic needs exploration as there are possibly better methods for doing this.

See RubyCentral here and here for good overviews of class variables. See RailsTips.org for an article explaining the difference between class variables and class instance variables.
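The sharing itself is easy to reproduce in plain Ruby within a single process, with no web server involved. This is a minimal sketch, not the code from the application described above: SharedRightsCache, KeyedRightsCache, and the lookup rule are hypothetical.

```ruby
# Hypothetical cache that stores one result in a class variable.
class SharedRightsCache
  @@rights = nil   # a single slot shared by every caller in this process

  def self.rights_for(user)
    @@rights ||= lookup(user)   # first caller wins; later users get the same answer
  end

  # Stand-in for an expensive access-rights lookup.
  def self.lookup(user)
    user == "alice" ? [:admin] : [:read]
  end
end

# Same idea, but the shared storage is keyed by user.
class KeyedRightsCache
  @@rights = {}

  def self.rights_for(user)
    @@rights[user] ||= SharedRightsCache.lookup(user)
  end
end

p SharedRightsCache.rights_for("alice")  # [:admin]
p SharedRightsCache.rights_for("bob")    # [:admin] - bob is handed alice's rights
p KeyedRightsCache.rights_for("alice")   # [:admin]
p KeyedRightsCache.rights_for("bob")     # [:read]
```

Keying the cache by user keeps the sharing but removes the leak; whether a process-wide cache is the right place for access rights at all is a separate question.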

Monday, November 20, 2006

Returning multiple results from a Ruby method

While it is preferable to have a method call return a single value, sometimes it is necessary to return multiple results. Returning an array or a hash works for this purpose, and array results can be received directly into individual variables.

For example, consider a method that returns a boolean success/failure indicator, along with an integer representing the number of items having that status. The following examples show use of an array, and use of a hash, to return the results:

1. Return the results as an array, receive the values in a list of variables.
Return the results (in this case true, and 23):

class MyClass
  def some_method
    [true, 23]
  end
end

Receive the results in separate variables:

my_object = MyClass.new
status, counter = my_object.some_method

2. Return the results as a hash:
Return the results in a hash (in this case with keys of :status and :counter):

class MyClass
  def some_method
    {:status => true, :counter => 23}
  end
end

Use the results:

my_object = MyClass.new
results = my_object.some_method

#the results are now in the hash
puts results[:status]
puts results[:counter]

Sunday, November 19, 2006

Do you grok Ruby on Rails?

During the last 9 or 10 months I have learned a great deal about programming with Ruby on Rails - enough to say that I am confident and comfortable using it. But I have not mastered Ruby or Rails - I still learn something new almost every day. This blog documents things learned while proceeding beyond the basics, mixed with occasional thoughts about software development in general.

With that said, an explanation of the word grok may be in order:
  • If you have mastered Perl, you grok punctuation.
  • If you have mastered Java (in particular J2EE), you grok XML.
  • If you have been working as an Enterprise Astronaut, you might grok WSDL, BPEL, UML, EAI, CYA, FUD, and a whole boatload of enterprise-speak.
  • If you have mastered .NET, you grok Visual Studio.
  • If you have mastered PHP, you most likely grok spaghetti......
If that's not enough, see wikipedia's entry for grok.