When I wrote about good modules and bad modules, I mentioned that an indication of a "bad" module was when it was used to extract code for the sake of code extraction. This usually results in a module that is only being mixed into one class.
When I published the article, I had lots of support from people intimately familiar with the internals of many popular libraries like Active Record, agreeing that such extraction is not helpful and would eventually harm a project. I also had people rise up to defend their use of modules.
This gave me more time to reflect on my views and how I expressed them. I realized that while I casually mentioned some of the problem's "readability," "complexity," and "maintainability," I didn't do much to explain why modules made this harder. I took my experience for granted, and readers deserve more.
I've long used concerns to break a class into smaller modules. I recently discovered that Rails encourages this behavior by allowing inline concerns in your code. At first glance, this looks like it could be useful, but let's not be so quick. Here is an example of a one-off module:
class Todo # Other todo implementation # ... module EventTracking extend ActiveSupport::Concern included do has_many :events before_create :track_creation after_destroy :track_deletion end def notify_next! # end private def track_creation # ... end end include EventTracking end
As written I don't have too many problems with it, other than it doesn't need to be written as a module in the first place. We're only using it in one place; so why do we need to introduce include
-ing a module (a form of multiple inheritance) to do this? It can be rewritten in fewer lines of code without the module.
I touched on this in the previous article. The reason here though is to eventually extract this module into another file. From the docs of the "inline concerns":
Once our chunk of behavior starts pushing the scroll-to-understand-it boundary, we give in and move it to a separate file. At this size, the increased overhead can be a reasonable tradeoff even if it reduces our at-a-glance perception of how things work.
So according to the recommended use, once our module can't easily fit in our brain or our screen, we are encouraged to move it to another file. Now we're left with two files:
class Todo < ActiveRecord::Base # Other todo implementation # ... include EventTracking end
and
module EventTracking extend ActiveSupport::Concern included do has_many :events before_create :track_creation after_destroy :track_deletion end def notify_next! # end private def track_creation # ... end end
This is where I start to have a lot of problems. I mentioned readability earlier, so let's touch on that first.
Bad Readability/Grep-ability
Let's say you're a dev and you want to know why, after you destroy a Todo
, all its events are deleted. You open up app/models/todo.rb
and see that it's hundreds of lines long and there's no way you could possibly read it all. You search the file (CMD+F for the win!), except your search for events
didn't come back with anything meaningful. Lots of references but you don't see where the events
method gets added. Maybe you've been around the block and you search for event
instead to find this line:
include EventTracking
This leads you to a global search for the string EventTracking
which eventually leads you to the file which shows you what you're looking for. Except in my experience it's rarely one module. If we're being encouraged to create new files when a module is too large, we are encouraged to have many modules. Often when searching, you'll find multiple modules with similar names:
include EventPlanning include EventTracking include EventCoordination include EventedWorker include Evanescence include EvanIsAName # ...
Now you have to search through each to discover where the functionality came from. How is this better than putting everything in one file? It's not. It's worse. You had one problem (one really long file), now you have N problems where N is the number of files you have to search through to find the functionality you're looking for.
Favor Composition
I'm not saying you HAVE to put everything in one file. Please, by all means, extract some logic into a custom class and call it. In that scenario, if you have an EventPlanning
class, you might only have to add one method to your main Todo
class:
class Todo < ActiveRecord::Base # Other todo implementation # ... has_many :events before_create :track_creation after_destroy :track_deletion def notify_next! EventPlanning.new(self.events).notify_next! end end
Why do I like this better? In this hypothetical example, the EventPlanning
logic didn't need to have access to EVERY method and EVERY database entry in Todo
. Here when you look at the code, we see two things. If you are searching for the notify_next
! logic, you'll see that it's in the EventPlanning
class. No guess and check required; it is obvious where the logic lives. You also see what gets passed in, in this case events
.
Limited Interface
Once you're inside of an EventPlanning
class, it is much simpler to focus on the core logic of notifying about the next event, and it doesn't need to know about the rest of Todo
.
It may seem like EventPlanning
having access to more methods makes it more powerful; this is true. However with great power comes great ability to make complicated code.
If I write a well-defined EventPlanning
class to take an array of events, it makes it harder for other codes to accidentally impose side effects on our class. The only way you can change the behavior of our class is if we change the code of the class or the input, in this case, a list of events. It also makes it easier to test in isolation, which is a nice bonus.
If you don't totally follow, here's an example of how this feature could be written as a class:
class EventPlanning def initialize(events) @events = events end def last_event @events.last end end
Now here's the example of the same logic coming from a module:
class Todo < ActiveRecord::Base # Other todo implementation # ... def self.last_event events.last end module EventTracking extend ActiveSupport::Concern included do has_many :events before_create :track_creation after_destroy :track_deletion end def notify_next! notify!(last_event) end private def notify!(event) # ... end def track_creation # ... end end include EventTracking end
Here our dev was very clever. They noticed there was already a last_event
method, and they wanted to stay DRY so they reused it for the notify_next!
logic. It works fine for now. What if tomorrow another dev gets a new requirement that says they need to filter private events from people's profiles? The first thing profiles show is the last event, so they change the code to something like this:
class Todo < ActiveRecord::Base # Other todo implementation # ... def self.last_event events.where(private: false).last end module EventTracking extend ActiveSupport::Concern included do has_many :events before_create :track_creation after_destroy :track_deletion end def notify_next! notify!(last_event) end private def notify!(event) # ... end def track_creation # ... end end include EventTracking end
Now the behavior of last_event
is changed. They run the tests, and everything passes. They push to production, and now suddenly people aren't getting notifications for their private events. OHNO! This broke because there was a very weak "contract" between what EventTracking
is trying to do and what the Todo
class provides.
In the case of our EventPlanning
class, there is a very strong "contract," as it only has access to what we pass it. It is much less likely to break.
Concern Confusion
While that is all hypothetical code, let me show you real places where one concern uses methods defined in another file. Here is code out of Active Record:
module ActiveRecord module ConnectionHandling # https://github.com/rails/rails/blob/cf5f55cd30aef0f90300c7c8f333060fe258cd8a/activerecord/lib/active_record/connection_handling.rb#L47-L59 def establish_connection(config = nil) raise "Anonymous class is not allowed." unless name config ||= DEFAULT_ENV.call.to_sym spec_name = self == Base ? "primary" : name self.connection_specification_name = spec_name resolver = ConnectionAdapters::ConnectionSpecification::Resolver.new(Base.configurations) spec = resolver.resolve(config).symbolize_keys spec[:name] = spec_name connection_handler.establish_connection(spec) end # ... end end
As concerns go, this one is concerned with connection handling, which is also the name of the file. In theory, it makes sense to isolate all connection code to one file. Breaking it out into a module seems totally reasonable. However, there are no guarantees that A) only connection handling code goes into that module and B) that module doesn't interact with code written in other modules.
The second case is exactly what happens here if we look at the last line:
connection_handler.establish_connection(spec)
Guess what? The connection_handler
method isn't defined in the connection_handling.rb
file (strangely enough). We split out the code to "make it easier to read," but how can you read code that isn't there? You now have to hunt for that method. Turns out it's in another module, descriptively named Core
.
module ActiveRecord module Core extend ActiveSupport::Concern # ... # https://github.com/rails/rails/blob/cf5f55cd30aef0f90300c7c8f333060fe258cd8a/activerecord/lib/active_record/core.rb#L494-L496 def connection_handler self.class.connection_handler end # ... def self.connection_handler ActiveRecord::RuntimeRegistry.connection_handler || default_connection_handler end def self.connection_handler=(handler) ActiveRecord::RuntimeRegistry.connection_handler = handler end end end
You then have other modules who are aware of the methods provided by ConnectionHandling
, for example in ActiveRecord::Persistence
:
module ActiveRecord module Persistence extend ActiveSupport::Concern # ... def destroy raise ReadOnlyRecord, "#{self.class} is marked as readonly" if readonly? destroy_associations self.class.connection.add_transaction_record(self) # <===================== HERE destroy_row if persisted? @destroyed = true freeze end end end
We can see that Persistence
uses the connection
method, which is defined in a different module, ConnectionHandling
, or at least I think it is. I can tell you that Persistence
does not define the connection
method. But since there are so many modules, it's impossible for me to guarantee you where it came from. That is, unless I run the code and use some debugging magic or unless I spend a good bit of time spelunking in the source. I don't want to do either, but I want to know what methods I can use, what their APIs are, and how they all work together.
Note that I'm not picking on Active Record here. This code was written a long time ago and a lot of people use it. It's very hard to make huge refactors, and a lot of work has gone into making AR quite a bit better over the years. This style of coding is an inherited relic. Also this was THE style of coding at the time. It was very popular and is found in many more projects other than Rails. It just so happens to be such an easy example since it's survived for so long. At the time of its inception, modules seemed like an obvious answer to help with code clarity. The people who work on AR are amazing and do amazing work. It's not fair to judge them on this one short coming. I appreciate all the code that got us this far, and I appreciate even more the people that will take us further tomorrow. Go thank a maintainer today. I'll wait.
Hopefully the Active Record example shows you how convoluted things can get. There are worse examples, but they're not as easy to demonstrate.
You may be thinking this was a fault of a single programmer or a group of programmers in an isolated incident. Those people should have put the connection
method somewhere else, or "hey an easy patch is to put connection_handler
into connection_handling.rb
."
This may help, but it is temporary.
The module provides no guarantees that the code in one module will not interact with the code in another. There are no clearly defined interfaces between how the different modules work. A change in behavior of one method in one file can cause an unanticipated ripple effect.
Yes, testing exists and will help us catch some problems, but that's only part of the story. A good interface makes it easier for people to write side effect-free code. When you can fully comprehend the behaviors and responsibilities of an object and fit it into your head, then it's easier to make bug-free changes. I dare you to try to fit all the behavior of an Active Record module in your head; this also includes EVERY method that it knows about and calls, as well as all the modules it includes. (Hint, it's a lot).
If you don't understand exactly what I mean by classes, take a look at this Sprockets class where I introduced URITar. Go look at it, read the source, read the docs. Do you understand it? Do you think you could modify it? The beauty is what you see is what you get.
In the Sprockets example, I do wish we didn't have to rely on calling methods on the env that is passed in, but that's for another day.
By and large, maintainers tend to agree that smaller composable classes are the best way to isolate code and keep things clean. Much of Active Record has already been extracted into classes. A large amount of that work has been done by @sgrif (who's legal job title according to the Canadian government is "10x Hacker Ninja Guru"). Thank your maintainers.
Slippery Slope
This brings us back to the original code we started out with.
class Todo # Other todo implementation # ... module EventTracking extend ActiveSupport::Concern included do has_many :events before_create :track_creation after_destroy :track_deletion end def notify_next! # end private def track_creation # ... end end include EventTracking end
You may recall that I said this type of inline module is fine as long as it stays in the same file. I'm going to go back and say it isn't fine.
It's not fine, because extracting that code into a module makes it easier to extract into a file and that is most certainly (as hopefully I've shown) not "good." It is a slippery slope.
I'm not saying that module extraction is inherently bad. Rather, it tends to lead to bad things. I understand that you are incredibly disciplined and will only use methods from within the same module. I understand all your coworkers will meticulously comb through their own patches to make sure they do the same. Your boss, your dog, your customers will all know the rules of this file.
Somehow, someway, something will reference a method from one module in another, and all that hard work will be null and void. If you never extract any logic to a single-use module, you never have to worry.
Another thing that could happen is that a coworker sees a few methods in your module they want on another class. They include it, and you never even know. Now your module has gone from being included in one class to being included in a few. Good, right?
Well, we have the same problem. If you need to modify the code, but you don't know exactly how it's being used, you might refactor and delete a method you didn't even know was needed. If you never extracted the code into a module, you wouldn't have to worry.
Programming styles go in and out of fashion. I'm sure once everything has been sufficiently extracted to small classes, we'll have a revolution where we relearn how much fun making a thousand-line-long module is. I also understand that applications are subject to deadlines and beholden to these things called time and money. No code can be perfect. Anyone who says differently is selling something.
I'm not saying that you need to go out and refactor all your single-use modules into classes. I am saying if you find yourself reaching to do a module-method extraction, stop. Consider what road you're starting down and these lessons I've shared. What would a class-based extraction look like instead? If you're not sure, there's a wealth of object-oriented materials for purchase or via conference talks for free on YouTube. Use your best judgment and have fun.