197

I was involved in a programming discussion today where I made some statements that basically assumed axiomatically that circular references (between modules, classes, whatever) are generally bad. Once I got through with my pitch, my coworker asked, "what's wrong with circular references?"

I've got strong feelings on this, but it's hard for me to verbalize concisely and concretely. Any explanation that I may come up with tends to rely on other items that I too consider axioms ("can't use in isolation, so can't test", "unknown/undefined behavior as state mutates in the participating objects", etc.), but I'd love to hear a concise reason for why circular references are bad that don't take the kinds of leaps of faith that my own brain does, having spent many hours over the years untangling them to understand, fix, and extend various bits of code.

Edit: I am not asking about homogenous circular references, like those in a doubly-linked list or pointer-to-parent. This question is really asking about "larger scope" circular references, like libA calling libB which calls back to libA. Substitute 'module' for 'lib' if you like. Thanks for all of the answers so far!

2
  • Does circular reference pertain to libraries and header files? In a workflow, new ProjectB code will be processing a file that's output from legacy ProjectA code. That output from ProjectA is a new requirement driven by ProjectB; ProjectB has a code that facilitates generically determining which fields go where, etc. The point being, legacy ProjectA could reuse code in new ProjectB, and ProjectB would be foolish not to reuse utility code in legacy ProjectA (e.g.: character set detection and transcoding, record parsing, data validation and transformation, etc.).
    – Luv2code
    Oct 2, 2015 at 16:11
  • 1
    @Luv2code It only becomes foolish if you cut and paste the code between projects or possibly when both projects compile and link in the same code. If they're sharing resources like this, put them into a library. Feb 11, 2016 at 1:50

14 Answers 14

265

There are a great many things wrong with circular references:

  • Circular class references create high coupling; both classes must be recompiled every time either of them is changed.

  • Circular assembly references prevent static linking, because B depends on A but A cannot be assembled until B is complete.

  • Circular object references can crash naïve recursive algorithms (such as serializers, visitors and pretty-printers) with stack overflows. The more advanced algorithms will have cycle detection and will merely fail with a more descriptive exception/error message.

  • Circular object references also make dependency injection impossible, significantly reducing the testability of your system.

  • Objects with a very large number of circular references are often God Objects. Even if they are not, they have a tendency to lead to Spaghetti Code.

  • Circular entity references (especially in databases, but also in domain models) prevent the use of non-nullability constraints, which may eventually lead to data corruption or at least inconsistency.

  • Circular references in general are simply confusing and drastically increase the cognitive load when attempting to understand how a program functions.

Please, think of the children; avoid circular references whenever you can.

15
  • 37
    I particularly appreciate the last point, "cognitive load" is something that I am very conscious of but never had a great concise term for it. Oct 14, 2010 at 16:40
  • 6
    Good answer. It would be better if you said something about testing. If modules A and B are mutually dependent, they must be tested together. This means they are not really separate modules; together they are one broken module. Oct 29, 2012 at 18:35
  • 5
    Dependency injection is not impossible with circular references, even with automatic DI. One will just have to be injected with a property rather than as a constructor parameter. Apr 29, 2014 at 15:24
  • 3
    @BlueRaja-DannyPflughoeft: I consider that an anti-pattern, as do many other practitioners of DI, because (a) it's not clear that a property is actually a dependency, and (b) the object being "injected" can't easily keep track of its own invariants. Worse, many of the most sophisticated/popular frameworks like Castle Windsor can't give useful error messages if a dependency can't be resolved; you end up with an annoying null reference instead of a detailed explanation of exactly which dependency in which constructor couldn't be resolved. Just because you can, doesn't mean you should.
    – Aaronaught
    Apr 30, 2014 at 1:53
  • 3
    I wasn't claiming it's a good practice, I was just pointing out it's not impossible as claimed in the answer. Apr 30, 2014 at 3:09
25

A circular reference is twice the coupling of a non-circular reference.

If Foo knows about Bar, and Bar knows about Foo, you have two things that need changing (when the requirement comes that Foos and Bars must no longer know about each other). If Foo knows about Bar, but a Bar doesn't know about Foo, you can change Foo without touching Bar.

Cyclical references can also cause bootstrapping problems, at least in environments that last for a long time (deployed services, image-based development environments), where Foo depends on Bar working in order to load, but Bar also depends on Foo working in order to load.

17

When you tie two bits of code together, you effectively have one large piece of code. The difficulty of maintaining a bit of code is at least the square of its size, and possibly higher.

People often look at single class (/function/file/etc.) complexity and forget that you really should be considering the complexity of the smallest separable (encapsulatable) unit. Having a circular dependency increases the size of that unit, possibly invisibly (until you start trying to change file 1 and realize that also requires changes in files 2-127).

15

They may be bad not by themselves but as an indicator of a possible poor design. If Foo depends on Bar and Bar depends on Foo, it is justified to question why they are two instead of a unique FooBar.

13

Hmm... that depends on what you mean by circular dependence, because there are actually some circular dependencies which I think are very beneficial.

Consider an XML DOM -- it makes sense for every node to have a reference to their parent, and for every parent to have a list of its children. The structure is logically a tree, but from the point of view of a garbage collection algorithm or similar the structure is circular.

5
  • 1
    wouldn't that be a tree? Oct 14, 2010 at 1:31
  • @Conrad: I suppose it could be thought of as a tree, yes. Why? Oct 14, 2010 at 2:14
  • 1
    I don't think of tree's as circular because you can navigate down its children and will terminate (regardless of the parent reference). Unless a node had a child that was also a ancestor which in my mind makes it a graph and not a tree. Oct 14, 2010 at 15:21
  • 5
    A circular reference would be if one of the children of a node looped back to an ancestor. Oct 14, 2010 at 17:04
  • This isn't really a circular dependency (at least not in a way that causes any issues). For example, imagine that Node is a class, which has other references to Node for the children inside itself. Because it's only referencing itself, the class is completely self-contained and isn't coupled to anything else. --- With this argument, you could argue that a recursive function is a circular dependency. It is (at a stretch), but not in a bad way.
    – byxor
    Nov 5, 2016 at 14:00
9

Is like the Chicken or the Egg problem.

There are many cases in which circular reference are inevitable and are useful but, for example, in the following case it doesn't work:

Project A depends on project B and B depends on A. A needs to be compiled to be used in B which requires B to be compiled before A which requires B to be compiled before A which ...

6

While I agree with most of the comments here I would like to plead a special case for the "parent"/"child" circular reference.

A class often needs to know something about its parent or owning class, perhaps default behavior, the name of the file the data came from ,the sql statement that selected the column, or, the location of a log file etc.

You can do this without a circular reference by having a containing class so that what was previously the "parent" is now a sibling, but it is not always possible to re-factor existing code to do this.

The other alternative is to pass all the data a child might need in its constructor, which end up being just plain horrible.

1
  • On a related note, there are two common reasons X might hold a reference to Y: X might want to ask Y to do things on X's behalf, or Y might be expecting X to do things to Y, on Y's behalf. If the only references that exist to Y are for the purpose of other objects wanting to do things on Y's behalf, then the holders of such references should be told that Y's services are no longer needed, and that they should abandon their references to Y at their convenience.
    – supercat
    Oct 15, 2018 at 21:53
5

In database terms, circular references with proper PK/FK relationships make it impossible to insert or delete data. If you can't delete from table a unless the record is gone from table b and you can't delete from table b unless the record is gone from table A, you can't delete. Same with inserts. this is why many databases do not allow you to set up cascading updates or deletes if there is a circular reference because at some point, it becomes not possible. Yes you can set up these kind of relationships with out the PK/Fk being formally declared but then you will (100% of the time in my experience) have data integrity problems. That's just bad design.

5

I'll take this question from modelling point of view.

As long as you don't add any relationships that aren't actually there, you are safe. If you do add them, you get less integrity in data (cause there is a redundancy) and more tightly coupled code.

The thing with the circular references specifically is that I haven't seen a case where they would be actually needed except one - self reference. If you model trees or graphs, you need that and it is perfectly all right because self-reference is harmless from the code-quality point of view (no dependency added).

I believe that at the moment you start to need a not-self reference, immediately you should ask if you can't model it as a graph (collapse the multiple entities into one - node). Maybe there is a case in between where you make a circular reference but modelling it as graph is not appropriate but I highly doubt that.

There is a danger that people think that they need a circular reference but in fact they don't. The most common case is "The-one-of-many case". For instance, you have got a customer with multiple addresses from which one should be marked as the primary address. It is very tempting to model this situation as two separate relationships has_address and is_primary_address_of but it is not correct. The reason is that being the primary address is not a separate relationship between users and addresses but instead it is an attribute of the relationship has address. Why is that? Because its domain is limited to the user's addresses and not to all the addresses there are. You pick one of the links and mark it as the strongest (primary).

(Going to talk about databases now) Many people opt for the two-relationships solution because they understand to "primary" as being a unique pointer and a foreign key is kind of a pointer. So foreign key should be the thing to use, right? Wrong. Foreign keys represent relationships but "primary" is not a relationship. It is a degenerated case of an ordering where one element is above all and the rest is not ordered. If you needed to model a total ordering you would of course consider it as a relationship's attribute because there is basically no other choice. But at the moment you degenerate it, there is a choice and quite a horrible one - to model something that is not a relationship as a relationship. So here it comes - relationship redundancy which is certainly not something to be underestimated. The uniqueness requirement should be imposed in another way, for instance by unique partial indexes.

So, I wouldn't allow a circular reference to occur unless it is absolutely clear that it comes from the thing I am modelling.

(note: this is slightly biased to database design but I would bet it is fairly applicable to other areas too)

3
  • In your opinion should we use circular reference for domain models? Consider this case, there are two entities, Student and Course. The relationship between them is many-to-many. When I fetch the Student entity, I want to know all the courses a student has enrolled in. When I fetch the Course, I want to know all the students who have enrolled in this course. In this case, both the Student and the Course will have a reference to each other. What if I want to support some complex requirement like when I fetch a student, I want to show the other students(coursemates) enrolled in a course?
    – Hem Bhagat
    Jul 10, 2023 at 6:38
  • In a real case, you will probly end up with a third object like "enrollment" that carries a link to a course and a student + additional info like when the student has practice lessons or e.g. grades. If there is no additional info needed for the relation then probly there should be only a link Student -> Course in db explicitly (of course the opposite relation follows implicitly). If there is a need for speed (e.g. to quickly get students for a course), then one can cache the queries. But this is just from my POV ;).
    – clime
    Jul 10, 2023 at 8:33
  • I agree. A link entity makes sense in the case of a many-to-many relationship. But how would you design models in case of a one-to-many relationship? The one-to-many relationship also introduces circular reference in models. Example: One User has Many Orders. When fetching a User, I need all its Orders and when fetching an Order I need the User associated with it. Would you prefer doing a separate query for any one of the Entities and keep the link in the other one? Or would you go with two different domain models for each entity, one with the link and one without the link?
    – Hem Bhagat
    Jul 10, 2023 at 9:42
2

I'd answer that question with another question:

What situation can you give me where keeping a circular reference model is the best model for what you're trying to build?

From my experience, the best model will pretty much never involve circular references in the way I think you mean it. That being said, there are a lot of models where you use circular references all the time, it's just extremely basic. Parent -> Child relationships, any graph model, etc, but these are well known models and I think you're referring to something else entirely.

1
  • 1
    It MAY be that a circular linked list (single-linked or double-linked) would be an excellent data structure for the central event queue for a program that's supposed to "never stop" (stick the important N things on the queue, with a "do not delete" flag set, then simply traverse the queue until empty; when new tasks (transient or permanent) are needed, stick them in a suitable place on the queue; whenever you serve an even without the "do not delete" flag, do it then take it off the queue).
    – Vatine
    Oct 14, 2010 at 14:11
1

Circular references in data structures is sometimes the natural way of expressing a data model. Coding-wise, it's definitely not ideal and can be (to some extent) solved by dependency injection, pushing the problem from code to data.

1

A circular reference construct is problematic, not just from a design standpoint, but from an error catching standpoint as well.

Consider the possibility of a code failure. You haven't placed proper error catching in either class, either because you haven't developed your methods that far yet, or you're lazy. Either way, you don't have an error message to tell you what transpired, and you need to debug it. As a good program designer, you know what methods are related to what processes, so you can narrow it down to those methods relevant to the process that caused the error.

With circular references, your problems have now doubled. Because your processes are tightly bound, you have no way of knowing which method in which class might have caused the error, or from whence the error came, because one class is dependent on the other is dependent on the other. You now have to spend time testing both classes in conjunction to find out which one is really responsible for the error.

Of course, proper error catching resolves this, but only if you know when an error is likely to occur. And if you're using generic error messages, you're still not much better off.

1

Some garbage collectors have trouble cleaning them up, because each object is being referenced by another.

EDIT: As noted by the comments below, this is true only for an extremely naive attempt at a garbage collector, not one that you would ever encounter in practice.

10
  • 11
    Hmm.. any garbage collector tripped up by this isn't a true garbage collector. Oct 14, 2010 at 0:37
  • 11
    I don't know of any modern garbage collector which would have problems with circular references. Circular references are a problem if you're using reference counts, but most garbage collectors are tracing style (where you start with the list of known references and follow them to find all others, collecting everything else). Oct 14, 2010 at 0:40
  • 4
    See sct.ethz.ch/teaching/ws2005/semspecver/slides/takano.pdf who explains the drawbacks to various types of garbage collectors -- if take mark and sweep and start optimizing it to reduce the long pause times (e.g. creating generations), you start to have problems with circular structures (when circular objects are in different generations). If you take reference counts and start fixing the circular reference problem, you end up introducing the long pause times are characteristic of mark and sweep.
    – Ken Bloom
    Oct 14, 2010 at 13:45
  • If a garbage collector looked at Foo and deallocated its memory which in this example references Bar it should handle the removal of Bar. Thus at this point there is no need for garbage collector to go ahead and remove bar because it already did. Or vice versa, if it removes Bar which references Foo it shuold remove Foo too and thus it will not need to go remove Foo because it did so when it removed Bar? Please correct me if I am wrong.
    – Chris
    Oct 14, 2010 at 13:46
  • 1
    In objective-c, circular references make it so the ref count doesn't hit zero when you release, which trips up the garbage collector.
    – Dexter
    Oct 14, 2010 at 14:06
-3

In my opinion having unrestricted references makes program design easier, but we all know that some programming languages lack support for them in some contexts.

You mentioned references between modules or classes. In that case it's a static thing, predefined by the programmer, and it's clearly possible for the programmer to search for a structure that lacks circularity, though it might not fit the problem cleanly.

The real problem comes in circularity in run time data structures, where some problems actually can't be defined in a way that gets rid of circularity. In the end though - it's the problem that should dictate and requiring anything else is forcing the programmer to solve an unnecessary puzzle.

I'd say that's a problem with the tools not a problem with the principle.

3
  • Adding a one sentence opinion doesn't significantly contribute to the post or explain the answer. Could you elaborate upon this?
    – user40980
    May 11, 2014 at 2:11
  • Well two points, the poster actually mentioned references between modules or classes. In that case it's a static thing, predefined by the programmer, and it's clearly possible for the programmer to search for a structure that lacks circularity, though it might not fit the problem cleanly. The real problem comes in circularity in run time data structures, where some problems actually can't be defined in a way that gets rid of circularity. In the end though - it's the problem that should dictate and requiring anything else is forcing the programmer to solve an unnecessary puzzle.
    – Josh S
    May 11, 2014 at 6:37
  • I have found that it makes it easier to get your program up and running but that generally speaking it ultimately makes it harder to maintain the software since you find that trivial changes have cascading effects. A makes calls into B which makes calls back to A which makes calls back to B... I've found it's tough to truly understand the effects of changes of this nature, especially when A and B are polymorphic. May 14, 2014 at 20:06

Not the answer you're looking for? Browse other questions tagged or ask your own question.