Code & Clay Notes to self. Mainly Ruby/Rails.

Giving an association a custom name

Sometimes, it may be desirable to reference an association with a name that differs from the one generated from the table name.

I have a model named Step. Each step can have many child steps, and each step belongs to one parent step.

class Step < ApplicationRecord
  belongs_to :step, optional: true # a top-level step has no parent
  has_many :steps
end

a = Step.create
b = Step.create
c = Step.create

a.steps << b
a.steps << c

I could get the parent step of b like so:

b.step
# => #<Step:0x00007fdf24b916e8

I don’t think that’s intuitive enough though. Since I’m asking for the parent step, it should be more like:

b.parent
# => #<Step:0x00007fdf24b916e8

Setting up the custom name is simple. I can supply the name of my choosing. Then, all I need to do is specify the class name and foreign key in the options.

class Step < ApplicationRecord
  belongs_to :parent, class_name: 'Step', foreign_key: :step_id, optional: true
  has_many :steps
end

Converting a pgsql file to CSV

I had a request this morning from a client who wanted me to save some tables from their database as CSVs. They provided a .pgsql file which contains all the data and instructions for constructing the database. It’s sort of a bunch of migrations and seeds in one place.

There might be a more straightforward way but I decided to create a new Rails app and build the database from the dump file. I could then export the tables from the console.

There weren’t many steps to it. Firstly I created the database.

rails db:create

I went to config/database.yml to get the name of the database before running:

psql -d rails_app_development -f dumped_db.pgsql

It took a couple of tries. The command’s output informed me I needed to create some users:

createuser --superuser postgres
createuser --superuser rails

Following rails db:drop db:create I ran the psql command again.

Having created the database I ran db:schema:dump to create the schema file. From that I could see which tables I needed to create corresponding models for.

I then went to the console and dumped my models to CSV files.

require 'csv'

# Write every row of the given model to a CSV file, header row first.
def dump_csv(path, model)
  CSV.open(path, "wb") do |csv|
    csv << model.attribute_names
    model.find_each do |row|
      csv << row.attributes.values
    end
  end
end
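To check the shape of the output without a database, here is a minimal sketch of the same header-then-rows pattern using CSV.generate on plain hashes. The rows are made-up stand-ins for what a model would return from attributes; they are not from the client's data.

```ruby
require 'csv'

# Hypothetical stand-in rows; a real run would read these from a model.
rows = [
  { 'id' => 1, 'name' => 'Iorek' },
  { 'id' => 2, 'name' => 'Iofur' }
]

csv_text = CSV.generate do |csv|
  csv << rows.first.keys                 # header row, like model.attribute_names
  rows.each { |row| csv << row.values }  # one line per record
end

puts csv_text
# id,name
# 1,Iorek
# 2,Iofur
```

With the real method, the call would look something like dump_csv('users.csv', User), where User is whichever model I created for the table.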

Be careful when memoizing booleans!

(This post is also published on dev.to)

It took me an hour or so of frustration to figure out why a method was being called multiple times despite my attempt at memoizing its return value.

The problem

My problem looked a bit like this.

def happy?
  @happy ||= post_complete?
end

My intention was that the value of post_complete? would be stored as @happy so that post_complete? would be fired only once.

However, that’s not guaranteed to happen here. post_complete? might be evaluated and its value assigned to @happy every time I call happy?.

Can you see why?

  @happy ||= post_complete?

What’s going on?

The question mark denotes that post_complete? is expected to return a boolean value. But, what if that value is always false?

Another way of writing the statement is:

@happy || @happy = post_complete?

In the above example, I want to know if at least one of the sides is true.

Remember that if the left-hand side of an || statement is falsey, then the right-hand side is evaluated. If the left side is truthy, there’s no need to evaluate the right side – the statement has already been proven to be true – and so the statement short circuits.

If I replace post_complete? with boolean values, it’s easier to see what is happening.

In this example, @happy becomes true:

def happy?
  @happy || @happy = true
  # @happy == true
end

However, in this example, @happy becomes false:

def happy?
  @happy || @happy = false
  # @happy == false
end

In the former, @happy is falsey the first time the method is called, then true on subsequent calls. In that example, the right-hand side is evaluated once only. In the latter, @happy is always falsey (nil on the first call, false after that) and so both sides are always evaluated.

When using the ||= style of memoization, only truthy values will be memoized.

So the problem is that if post_complete? returns false the first time happy? is called, it will be evaluated again on every call until it returns a truthy value.
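A quick way to see this is to count the calls. This is only a sketch: the Post class, its constructor argument and the calls counter are hypothetical, added purely to make the behaviour observable.

```ruby
# Hypothetical class for illustration; the counter is not part of my real code.
class Post
  attr_reader :calls

  def initialize(complete)
    @complete = complete
    @calls = 0
  end

  def happy?
    @happy ||= post_complete?
  end

  private

  # Stands in for an expensive check; counts how often it runs.
  def post_complete?
    @calls += 1
    @complete
  end
end

complete = Post.new(true)
3.times { complete.happy? }
complete.calls #=> 1

incomplete = Post.new(false)
3.times { incomplete.happy? }
incomplete.calls #=> 3
```

The truthy result is stored after the first call, while the false result is recomputed every time.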

A solution

So how do I go about memoizing a false value?

Instead of testing the truthiness of @happy, I could check whether or not it has a value assigned to it. If it has, I can return @happy. If it hasn’t, then I will assign one. I will use the defined? keyword.

The documentation states:

defined? expression tests whether or not expression refers to anything recognizable (literal object, local variable that has been initialized, method name visible from the current scope, etc.). The return value is nil if the expression cannot be resolved. Otherwise, the return value provides information about the expression.

Note that the expression is not executed.

I use it like so:

def happy?
  return @happy if defined? @happy
  @happy = false
end

Referring back to the documentation, there’s one thing I need to be aware of. This isn’t the same as checking for nil or false. It’s a bit counterintuitive, but defined? doesn’t return a boolean. Instead, it returns a string describing the expression:

> @a, $a, a = 1,2,3
> defined? @a
#=> "instance-variable"
> defined? $a
#=> "global-variable"
> defined? a
#=> "local-variable"
> defined? puts
#=> "method"

If I assign nil to a variable, what do you think the return value will be when I call defined? with that variable?

> defined? @b
#=> nil
> @b = nil
#=> nil
> defined? @b
#=> "instance-variable"

So, as long as the variable has been assigned something (even nil), defined? will be truthy. It returns nil only if the variable is uninitialized.

Of course, you can guess what happens when we set the variable’s value to false.

> @c = false
#=> false
> defined? @c
#=> "instance-variable"
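Putting that together with a real method call, a minimal sketch might look like this. As before, the Post class and its calls counter are hypothetical and exist only to show how often post_complete? runs.

```ruby
# Hypothetical class for illustration; the counter is not part of my real code.
class Post
  attr_reader :calls

  def initialize(complete)
    @complete = complete
    @calls = 0
  end

  def happy?
    # Return the stored value, even if it is false or nil.
    return @happy if defined?(@happy)

    @happy = post_complete?
  end

  private

  # Stands in for an expensive check; counts how often it runs.
  def post_complete?
    @calls += 1
    @complete
  end
end

incomplete = Post.new(false)
3.times { incomplete.happy? }
incomplete.calls  #=> 1
incomplete.happy? #=> false
```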

Update: An improved solution

Prompted by Valentin Baca’s comment, I’ve reassessed my original solution. Do I really need to check whether or not the variable is initialised or is checking for nil enough?

@happy.nil? should suffice as I’m only interested in knowing that the variable is nil rather than false. (false and nil are the only falsey values in Ruby.)

I think this version is more readable:

def happy?
  @happy = post_complete? if @happy.nil?
  @happy
end
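One trade-off I should keep in mind: the nil? check treats a stored nil as "not memoized yet", whereas defined? does not. That doesn’t matter for a method that only returns true or false, but it would for one that can return nil. A small sketch, with a made-up Person class and find_mood method that always returns nil:

```ruby
# Hypothetical example; Person and find_mood are invented for illustration.
class Person
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def mood_with_nil_check
    @mood_a = find_mood if @mood_a.nil?
    @mood_a
  end

  def mood_with_defined
    return @mood_b if defined?(@mood_b)

    @mood_b = find_mood
  end

  private

  # Always returns nil and counts how often it runs.
  def find_mood
    @calls += 1
    nil
  end
end

person = Person.new
2.times { person.mood_with_nil_check }
person.calls #=> 2 (nil is never treated as memoized)

2.times { person.mood_with_defined }
person.calls #=> 3 (only one more call)
```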

Wrapping up

I now know that the ||= style of memoization utilizes short-circuiting. If the left-hand side variable is falsey, then the right-hand part of the statement will be evaluated. If that’s an expensive method call which always returns false, then the performance of my program would be impacted. So instead of ||= I can check if the variable is initialized or I can check if it’s nil.

And now I’m happy.

def happy?
  @happy = post_complete? if @happy.nil?
  @happy
end

Value Objects In Ruby

(This post is also published on dev.to)

What is a value object?

A small simple object, like money or a date range, whose equality isn’t based on identity. Martin Fowler

Objects in Ruby are usually considered to be entity objects. Two objects may have matching attribute values but we do not consider them equal because they are distinct objects.

In this example a and c are not equal:

class Panserbjorn
  def initialize(name)
    @name = name
  end  
end

a = Panserbjorn.new('Iorek')
b = Panserbjorn.new('Iofur')
c = Panserbjorn.new('Iorek')

a == c #=> false

# Three distinct objects:
a.object_id #=> 70165973839880
b.object_id #=> 70165971554200
c.object_id #=> 70165971965460

Value objects on the other hand, are compared by value. Two different value objects are considered equal when their attribute values match.

Symbol, String, Integer and Range are examples of value objects in Ruby.

Here, a and c are considered equal despite being distinct objects:

a = 'Iorek'
b = 'Iofur'
c = 'Iorek'

a == b #=> false
a == c #=> true

# Three distinct objects:
a.object_id #=> 70300461022500
b.object_id #=> 70300453210700
c.object_id #=> 70300461053840

How can I create a value object?

Say I want a class to represent the days of the week and I also want instances of that class to be considered equal if they represent the same day. A Sunday object should equal another Sunday object. A Monday object should equal another Monday object, etc…

I might begin with the following class:

class DayOfWeek
  DAYS = {
    1 => 'Sunday',
    2 => 'Monday',
    3 => 'Tuesday',
    4 => 'Wednesday',
    5 => 'Thursday',
    6 => 'Friday',
    7 => 'Saturday'
  }.freeze

  def initialize(day)
    raise ArgumentError, 'Day outside range' unless (1..7).cover?(day)

    @day = day
  end

  def to_i
    day
  end

  def to_s
    DAYS[day]
  end

  private

  attr_accessor :day
end

Now, I am going to instantiate three objects to represent the days of the week on which I eat pizza, pay the milk man, and put out the recycling for collection:

pizza_day = DayOfWeek.new(5)
milk_money_day = DayOfWeek.new(2)
recycling_collection_day = DayOfWeek.new(5)

I know that I eat pizza for dinner the same day that I put out the recycling. I consider these objects to represent the same thing: Thursday. They should be equivalent. But they’re not:

pizza_day == recycling_collection_day #=> false

That’s because they’re not yet value objects. #== compares the identities of the objects.

I should override #==. I will use pry to find out where the method comes from so we can see how it derives its current behaviour.

pizza_day.method(:==).owner #=> BasicObject

DayOfWeek inherits #== from BasicObject.

The page for BasicObject#== states:

== returns true only if obj and other are the same object. Typically, this method is overridden in descendant classes to provide class-specific meaning.

Aha! The class-specific meaning in this case is that I want to compare its instances by value.

I know that these objects expose an integer. It makes sense to compare against that but I don’t want to compare a day with an actual integer. Thursday should not be equivalent to the number 5.

I also know that a DayOfWeek exposes a string as well. It follows that any equivalent days would return matching string and integer values:

class DayOfWeek
  # ...

  def ==(other)
    to_i == other.to_i &&
    to_s == other.to_s
  end

  alias eql? ==

  # ...
end

I have aliased #eql? to #==. The Object documentation explains:

For objects of class Object, eql? is synonymous with ==. Subclasses normally continue this tradition by aliasing eql? to their overridden ==

Bingo! We have value objects. pizza_day and recycling_collection_day are considered equivalent:

pizza_day == recycling_collection_day #=> true

I could also define the other comparison methods (#<=>, <=, <, >=, > and between?), as it makes sense to say that Monday is less than Tuesday or Friday is greater than Thursday, but I have decided that’s not needed for now.
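If I did want ordering later, one possible approach would be to define #<=> and include Comparable, which derives the rest from it. This is a sketch only, using a trimmed-down DayOfWeek rather than the full class above; the is_a? guard is my own assumption about how to handle non-day arguments.

```ruby
# A trimmed-down DayOfWeek, just enough to show ordering.
class DayOfWeek
  include Comparable

  def initialize(day)
    @day = day
  end

  def to_i
    @day
  end

  # Comparable builds <, <=, ==, >=, > and between? on top of <=>.
  # Returning nil means "not comparable", so a day never equals an Integer.
  def <=>(other)
    return nil unless other.is_a?(DayOfWeek)

    to_i <=> other.to_i
  end
end

monday  = DayOfWeek.new(2)
tuesday = DayOfWeek.new(3)
friday  = DayOfWeek.new(6)

monday < tuesday                 #=> true
friday > tuesday                 #=> true
tuesday.between?(monday, friday) #=> true
monday == DayOfWeek.new(2)       #=> true
monday == 2                      #=> false
```

Comparable does not touch #eql? or #hash, so those would still need to be overridden by hand.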

There is however, one more important step that I need to implement. These objects are equivalent, so when used as a hash key I would expect them to point to the same bucket.

The Hash documentation suggests:

Two objects refer to the same hash key when their hash value is identical and the two objects are eql? to each other.

A user-defined class may be used as a hash key if the hash and eql? methods are overridden to provide meaningful behavior. By default, separate instances refer to separate hash keys.

Following that advice, I need to change the default behaviour of #hash. I already know that integers in Ruby are value objects. I can see that equivalent integers always return the same #hash.

a = 1
b = 1

a.object_id #=> 3
b.object_id #=> 3
1.object_id #=> 3

1.hash == 2.hash #=> false
[a,b,1].map(&:hash).uniq.count #=> 1
101.hash == (100 + 1).hash #=> true

The same goes for strings:

a = 'Svalbard'
b = 'Svalbard'

# Note the different object ids:
a.object_id #=> 70253833847520
b.object_id #=> 70253847146940
'Svalbard'.object_id #=> 70253847210020

# The hash values of equivalent strings match:
'Svalbard'.hash == 'Bolvanger'.hash #=> false
[a,b,"Svalbard"].map(&:hash).uniq.count #=> 1
'Svalbard'.hash == ('Sval' + 'bard').hash #=> true

I will generate the hash using its string and integer properties.

  def ==(other)
    to_i == other.to_i &&
    to_s == other.to_s
  end

  alias eql? ==

  def hash
    to_i.hash ^ to_s.hash
  end

Per the example in the documentation, I’ve used the XOR operator (^) to derive the new hash value.

Now that I have overridden #hash, I can see that equivalent DayOfWeek instances point to the same bucket:

day_1 = DayOfWeek.new(1)
day_2 = DayOfWeek.new(1)

day_1 == day_2 #=> true

notes = {} #=> {}
notes[day_1] = 'Rest'
notes[day_2] = 'Party'

notes.length #=> 1
notes #=> {#<DayOfWeek:0x00007fa193e44170 @day=1>=>"Party"}
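The same #eql? and #hash pair is also what Array#uniq and Set rely on. Here is a minimal sketch with a cut-down DayOfWeek. Note that it swaps in a different hashing technique from the XOR above: it hashes an array of the class and the day, which is a common Ruby idiom rather than what I used in the real class. The is_a? check in #== is likewise an assumption of this sketch.

```ruby
require 'set'

# Cut-down DayOfWeek for illustration only.
class DayOfWeek
  attr_reader :day

  def initialize(day)
    @day = day
  end

  def ==(other)
    other.is_a?(DayOfWeek) && day == other.day
  end

  alias eql? ==

  # Alternative to XOR: let Array#hash combine the parts.
  def hash
    [self.class, day].hash
  end
end

days = [DayOfWeek.new(5), DayOfWeek.new(2), DayOfWeek.new(5)]

days.uniq.length   #=> 2
Set.new(days).size #=> 2
```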

Structs

If I want multiple value objects, I might have to override #hash and #== for each class.

I could decide to use structs instead.

A Struct is a convenient way to bundle a number of attributes together, using accessor methods, without having to write an explicit class.

Structs are value objects by default. Of course we now have an idea of how this works. The documentation explains:

Equality—Returns true if other has the same struct subclass and has equal member values (according to Object#==).

Just as we thought!

DayOfWeek = Struct.new(:day) do
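  # Note: a constant assigned inside this block is defined in the
  # enclosing (top-level) scope, not inside DayOfWeek.
  DAYS = {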
    1 => 'Sunday',
    2 => 'Monday',
    3 => 'Tuesday',
    4 => 'Wednesday',
    5 => 'Thursday',
    6 => 'Friday',
    7 => 'Saturday'
  }.freeze

  def to_s
    DAYS[day]
  end

  def to_i
    day
  end
end

a = DayOfWeek.new(1)
b = DayOfWeek.new(2)
c = DayOfWeek.new(1)

a == c #=> true
a == b #=> false
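Structs also come with matching #eql? and #hash, so they behave as I’d expect when used as hash keys. A small sketch, using a throwaway Day struct (a hypothetical name, kept separate from DayOfWeek so it can be run on its own):

```ruby
# Throwaway struct for illustration only.
Day = Struct.new(:day)

notes = {}
notes[Day.new(1)] = 'Rest'
notes[Day.new(1)] = 'Party'

notes.length      #=> 1
notes[Day.new(1)] #=> "Party"

# Equal by value, but still distinct objects:
Day.new(1) == Day.new(1)      #=> true
Day.new(1).equal?(Day.new(1)) #=> false
```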

Summary

We now know the difference between an entity object and a value object. We have learned that we need to override both #hash and #== if our value objects are to be used as hash keys. And, we have learned that structs provide value object behaviour straight out of the box.